From noreply@sourceforge.net Mon Apr 1 02:46:16 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sun, 31 Mar 2002 18:46:16 -0800
Subject: [Patches] [ python-Patches-528022 ] PEP 285 - Adding a bool type
Message-ID:

Patches item #528022, was opened at 2002-03-10 00:45
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: PEP 285 - Adding a bool type

Initial Comment:
Here's a preliminary implementation of the PEP, including unittests checking the promises made in the PEP (test_bool.py) and (some) documentation. With this, 12 tests fail for me (on Linux); I'll look into these later. They appear shallow (mostly doctests dying on True or False where 1 or 0 was expected).

Note: the presence of this patch does not mean that the PEP is accepted -- it just means that a sample implementation exists in case someone wants to explore the effects of the PEP on their code.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-03-31 21:46

Message:
Logged In: YES
user_id=6380

Here's an updated diff (booldiff2.txt). It fixes a refcount bug in bool_repr(), and works with current CVS. With this patch set, 10 standard tests fail for shallow reasons having to do with str() or repr() returning False or True instead of 0 or 1.
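A quick check in today's Python (where the PEP was long since accepted) shows why these failures are shallow: bool subclasses int, so values compare and compute exactly like 0 and 1, and only str()/repr() changed.

```python
import marshal, pickle

# bool subclasses int: arithmetic and comparisons are unchanged,
# only str()/repr() differ -- which is exactly what broke doctests
# written to expect '1' or '0'.
assert isinstance(True, int)
assert True == 1 and False == 0
assert True + True == 2
assert repr(True) == 'True' and str(False) == 'False'

# In today's Python, pickle and marshal both round-trip bools
# (a point raised later in this thread about the 2002 patch).
assert pickle.loads(pickle.dumps(True)) is True
assert marshal.loads(marshal.dumps(False)) is False
```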
Here are the failed tests: test_descr test_descrtut test_difflib test_doctest test_extcall test_generators test_gettext test_richcmp test_richcompare test_unicode

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470

From noreply@sourceforge.net Mon Apr 1 09:38:19 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 01:38:19 -0800
Subject: [Patches] [ python-Patches-537536 ] bug 535444 super() broken w/classmethods
Message-ID:

Patches item #537536, was opened at 2002-03-31 23:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Phillip J. Eby (pje)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: bug 535444 super() broken w/classmethods

Initial Comment:
This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, à la 'descriptor.__get__(self.__obj__, starttype)'.

This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks.

----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2002-04-01 09:38

Message:
Logged In: YES
user_id=6656

Guido gets the fix too.
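For readers without the descriptor tutorial at hand, the change can be sketched in pure Python (today's syntax; the class below is an illustrative model of super, not the C implementation): the MRO is walked past the start class, and the descriptor's __get__ receives the start type as its second argument, which is what makes classmethods bind to the right class.

```python
class Super:
    # Pure-Python sketch of super() in the spirit of the descriptor
    # tutorial; names and structure are illustrative only.
    def __init__(self, type, obj=None):
        self.__type__ = type
        self.__obj__ = obj

    def __getattr__(self, attr):
        if isinstance(self.__obj__, self.__type__):
            starttype = self.__obj__.__class__   # bound to an instance
        else:
            starttype = self.__obj__             # classmethod case: obj is a class
        mro = iter(starttype.__mro__)
        for cls in mro:
            if cls is self.__type__:
                break
        for cls in mro:                          # classes *after* self.__type__
            if attr in cls.__dict__:
                x = cls.__dict__[attr]
                if hasattr(x, '__get__'):
                    # The fix: pass starttype as the second argument,
                    # so classmethod.__get__ binds to the right class.
                    return x.__get__(self.__obj__, starttype)
                return x
        raise AttributeError(attr)

class A:
    @classmethod
    def who(cls):
        return 'A sees ' + cls.__name__

class B(A):
    @classmethod
    def who(cls):
        return 'B -> ' + Super(B, cls).who()

assert B.who() == 'B -> A sees B'
```

The classmethod found on A is bound to B because starttype is forwarded; without it, __get__ has no class to bind to, which is the bug the patch fixes in super_getattro.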
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

From noreply@sourceforge.net Mon Apr 1 11:44:31 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 03:44:31 -0800
Subject: [Patches] [ python-Patches-528022 ] PEP 285 - Adding a bool type
Message-ID:

Patches item #528022, was opened at 2002-03-10 06:45
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: Guido van Rossum (gvanrossum)
Summary: PEP 285 - Adding a bool type

Initial Comment:
Here's a preliminary implementation of the PEP, including unittests checking the promises made in the PEP (test_bool.py) and (some) documentation. With this, 12 tests fail for me (on Linux); I'll look into these later. They appear shallow (mostly doctests dying on True or False where 1 or 0 was expected).

Note: the presence of this patch does not mean that the PEP is accepted -- it just means that a sample implementation exists in case someone wants to explore the effects of the PEP on their code.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-04-01 13:44

Message:
Logged In: YES
user_id=21627

This patch does not support pickling of bools (the PEP should probably spell out how they are pickled). Marshalling of bools does not round-trip (you get back an int).

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-01 04:46

Message:
Logged In: YES
user_id=6380

Here's an updated diff (booldiff2.txt). It fixes a refcount bug in bool_repr(), and works with current CVS.
With this patch set, 10 standard tests fail for shallow reasons having to do with str() or repr() returning False or True instead of 0 or 1. Here are the failed tests: test_descr test_descrtut test_difflib test_doctest test_extcall test_generators test_gettext test_richcmp test_richcompare test_unicode

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470

From noreply@sourceforge.net Mon Apr 1 11:55:01 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 03:55:01 -0800
Subject: [Patches] [ python-Patches-537536 ] bug 535444 super() broken w/classmethods
Message-ID:

Patches item #537536, was opened at 2002-04-01 01:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Phillip J. Eby (pje)
Assigned to: Guido van Rossum (gvanrossum)
Summary: bug 535444 super() broken w/classmethods

Initial Comment:
This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, à la 'descriptor.__get__(self.__obj__, starttype)'.

This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-04-01 13:55

Message:
Logged In: YES
user_id=21627

Please put tests for this stuff into test_descr.py.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-04-01 11:38

Message:
Logged In: YES
user_id=6656

Guido gets the fix too.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

From noreply@sourceforge.net Mon Apr 1 19:41:18 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 11:41:18 -0800
Subject: [Patches] [ python-Patches-537536 ] bug 535444 super() broken w/classmethods
Message-ID:

Patches item #537536, was opened at 2002-03-31 23:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Phillip J. Eby (pje)
Assigned to: Guido van Rossum (gvanrossum)
Summary: bug 535444 super() broken w/classmethods

Initial Comment:
This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, à la 'descriptor.__get__(self.__obj__, starttype)'.

This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks.

----------------------------------------------------------------------

>Comment By: Phillip J. Eby (pje)
Date: 2002-04-01 19:41

Message:
Logged In: YES
user_id=56214

Here's the regression test. It asserts 6 things, 5 of which will fail without the typeobject.c patch to super() in place. Thanks.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-01 11:55

Message:
Logged In: YES
user_id=21627

Please put tests for this stuff into test_descr.py.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-04-01 09:38

Message:
Logged In: YES
user_id=6656

Guido gets the fix too.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

From noreply@sourceforge.net Tue Apr 2 04:11:24 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 20:11:24 -0800
Subject: [Patches] [ python-Patches-537536 ] bug 535444 super() broken w/classmethods
Message-ID:

Patches item #537536, was opened at 2002-03-31 18:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Phillip J. Eby (pje)
Assigned to: Guido van Rossum (gvanrossum)
Summary: bug 535444 super() broken w/classmethods

Initial Comment:
This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, à la 'descriptor.__get__(self.__obj__, starttype)'.

This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-01 23:11

Message:
Logged In: YES
user_id=6380

Accepted, also as a bugfix for 2.2.1 (assuming it works there, not tested). I can check this in in the morning. Thanks all!

----------------------------------------------------------------------

Comment By: Phillip J. Eby (pje)
Date: 2002-04-01 14:41

Message:
Logged In: YES
user_id=56214

Here's the regression test. It asserts 6 things, 5 of which will fail without the typeobject.c patch to super() in place. Thanks.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-01 06:55

Message:
Logged In: YES
user_id=21627

Please put tests for this stuff into test_descr.py.

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-04-01 04:38

Message:
Logged In: YES
user_id=6656

Guido gets the fix too.
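In today's syntax, the kind of check Martin asked for in test_descr.py looks roughly like this (class names are illustrative, not those of the test that was actually committed):

```python
class Base:
    @classmethod
    def tag(cls):
        return cls.__name__

class Derived(Base):
    @classmethod
    def tag(cls):
        # Before the fix, super() mishandled classmethods found along
        # the MRO; the method found on Base must be bound to Derived.
        return super().tag() + '!'

assert Base.tag() == 'Base'
assert Derived.tag() == 'Derived!'
assert Derived().tag() == 'Derived!'   # also binds correctly via an instance
```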
----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470

From noreply@sourceforge.net Tue Apr 2 07:15:15 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 23:15:15 -0800
Subject: [Patches] [ python-Patches-536661 ] splitext performances improvement
Message-ID:

Patches item #536661, was opened at 2002-03-29 09:06
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Nobody/Anonymous (nobody)
Summary: splitext performances improvement

Initial Comment:
After more thought, I must admit that the behavior change in splitext that I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior. The following benchmark says that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one. The diff also patches test_posixpath.py to check the pitfall described by Tim's comments on the patch 536120 page.

def splitext(p):
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt',
)

for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time

def test(f, args):
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass

a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a - c) / (b - c)

a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a - c) / (b - c)

----------------------------------------------------------------------

>Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:15

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-29 19:56

Message:
Logged In: YES
user_id=31435

I like it fine so far as it goes, but I'd like it a lot more if it also patched the splitext and test implementations for other platforms. It's not good that, e.g., posixpath.py and ntpath.py get more and more out of synch over time, and that their test suites also diverge.
----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 10:49

Message:
Logged In: YES
user_id=21627

The patch looks good to me.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

From noreply@sourceforge.net Tue Apr 2 07:28:31 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 01 Apr 2002 23:28:31 -0800
Subject: [Patches] [ python-Patches-536661 ] splitext performances improvement
Message-ID:

Patches item #536661, was opened at 2002-03-29 09:06
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Nobody/Anonymous (nobody)
Summary: splitext performances improvement

Initial Comment:
After more thought, I must admit that the behavior change in splitext that I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior. The following benchmark says that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one. The diff also patches test_posixpath.py to check the pitfall described by Tim's comments on the patch 536120 page.

def splitext(p):
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt',
)

for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time

def test(f, args):
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass

a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a - c) / (b - c)

a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a - c) / (b - c)

----------------------------------------------------------------------

>Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:28

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?

----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:15

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?
----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-29 19:56

Message:
Logged In: YES
user_id=31435

I like it fine so far as it goes, but I'd like it a lot more if it also patched the splitext and test implementations for other platforms. It's not good that, e.g., posixpath.py and ntpath.py get more and more out of synch over time, and that their test suites also diverge.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 10:49

Message:
Logged In: YES
user_id=21627

The patch looks good to me.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

From noreply@sourceforge.net Tue Apr 2 09:24:27 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 02 Apr 2002 01:24:27 -0800
Subject: [Patches] [ python-Patches-536661 ] splitext performances improvement
Message-ID:

Patches item #536661, was opened at 2002-03-29 09:06
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Nobody/Anonymous (nobody)
Summary: splitext performances improvement

Initial Comment:
After more thought, I must admit that the behavior change in splitext that I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior. The following benchmark says that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one. The diff also patches test_posixpath.py to check the pitfall described by Tim's comments on the patch 536120 page.

def splitext(p):
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt',
)

for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time

def test(f, args):
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass

a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a - c) / (b - c)

a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a - c) / (b - c)

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-04-02 11:24

Message:
Logged In: YES
user_id=21627

Sharing code is a good thing. However, how exactly this is done would be critical, since os is such a central module. If you start now and don't get agreement immediately, it may well be that you cannot complete it until Python 2.3.

----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:28

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?
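The duplication Sebastien mentions could plausibly be factored the way the stdlib later did with genericpath.py: one generic rfind()-based helper parameterized by the platform's separator characters. The helper and wrapper names below are hypothetical, and the semantics deliberately match splitext2 above (a leading dot counts as an extension, as in 2002):

```python
def _splitext(p, seps, extsep='.'):
    # Generic version of the rfind()-based splitext: the extension
    # starts at the last extsep, provided it comes after every separator.
    sepindex = max(p.rfind(s) for s in seps)
    dotindex = p.rfind(extsep)
    if dotindex <= sepindex:
        return p, ''
    return p[:dotindex], p[dotindex:]

def posix_splitext(p):      # posixpath would pass just '/'
    return _splitext(p, '/')

def nt_splitext(p):         # ntpath would fold in '\\' and ':'
    return _splitext(p, '/\\:')

# Same answers as the benchmark's splitext2:
assert posix_splitext('a.b/c.d') == ('a.b/c', '.d')
assert posix_splitext('a.b/') == ('a.b/', '')
assert nt_splitext('C:\\x.y') == ('C:\\x', '.y')
```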
----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2002-04-02 09:15

Message:
Logged In: YES
user_id=498191

I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-29 19:56

Message:
Logged In: YES
user_id=31435

I like it fine so far as it goes, but I'd like it a lot more if it also patched the splitext and test implementations for other platforms. It's not good that, e.g., posixpath.py and ntpath.py get more and more out of synch over time, and that their test suites also diverge.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 10:49

Message:
Logged In: YES
user_id=21627

The patch looks good to me.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470

From noreply@sourceforge.net Tue Apr 2 11:21:29 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 02 Apr 2002 03:21:29 -0800
Subject: [Patches] [ python-Patches-511219 ] suppress type restrictions on locals()
Message-ID:

Patches item #511219, was opened at 2002-01-31 15:55
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Cesar Douady (douady)
Assigned to: Nobody/Anonymous (nobody)
Summary: suppress type restrictions on locals()

Initial Comment:
This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict. An exception is made for the builtin insertion and reference in the global dict, to make sure this object exists and to suppress the need for the derived class to take care of this implementation-dependent detail.

The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set: if explicitly passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved). If this cannot be done for backward compatibility reasons, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset, but it seems logical to me to use the information explicitly provided.

Free and cell variables are not managed in this version. If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the methods of the dict in which they reside, and today this dict is not accessible from the Cell object.

Robustness: Currently, the plain test suite passes (with a modification of test_descrtut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. For performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test).
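As a point of comparison, today's Python 3 routes exec()/eval() name lookups through overridden mapping methods whenever the namespace is not a plain dict, which is essentially the hook this patch proposed. A dict subclass with __missing__ can observe every failed lookup (a sketch of the idea, not of the 2002 patch itself):

```python
class Tracing(dict):
    # Every name the executed code fails to find lands here.
    def __missing__(self, key):
        return '<%s?>' % key

ns = Tracing()
exec("result = some_undefined_name", ns)
assert ns['result'] == '<some_undefined_name?>'
```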
----------------------------------------------------------------------

>Comment By: Cesar Douady (douady)
Date: 2002-04-02 13:21

Message:
Logged In: YES
user_id=428521

I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2):

patching file Include/dictobject.h
patching file Include/frameobject.h
patching file Include/object.h
patching file Lib/test/test_descrtut.py
patching file Lib/test/test_subdict.py
patching file Modules/cPickle.c
patching file Objects/classobject.c
patching file Objects/frameobject.c
patching file Python/ceval.c
Hunk #2 succeeded at 1534 (offset 3 lines).
Hunk #4 succeeded at 1613 (offset 3 lines).
Hunk #6 succeeded at 1655 (offset 3 lines).
Hunk #8 succeeded at 1860 (offset 3 lines).
Hunk #10 succeeded at 1889 (offset 3 lines).
Hunk #12 succeeded at 2635 (offset 3 lines).
Hunk #14 succeeded at 2893 (offset 3 lines).
Hunk #16 succeeded at 3038 (offset 3 lines).
Hunk #18 succeeded at 3657 (offset 3 lines).
Hunk #20 succeeded at 3722 (offset 3 lines).
patching file Python/compile.c
Hunk #1 succeeded at 2916 (offset 12 lines).
patching file Python/import.c
Hunk #1 succeeded at 1668 (offset -4 lines).
Hunk #3 succeeded at 1716 (offset -4 lines).
patching file Python/sysmodule.c
Hunk #1 succeeded at 238 (offset -4 lines).

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-03-30 12:27

Message:
Logged In: YES
user_id=6656

And there's precisely no way it's going into 2.2.x.

----------------------------------------------------------------------

Comment By: Cesar Douady (douady)
Date: 2002-03-30 01:08

Message:
Logged In: YES
user_id=428521

To install this patch from Python revision 2.2, follow these steps:
- get the python.diff file from this page
- cd Python-2.2
- run "patch -p1 < python.diff"

----------------------------------------------------------------------

Patches item #511219, was opened at 2002-01-31 14:55
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Cesar Douady (douady)
Assigned to: Nobody/Anonymous (nobody)
Summary: suppress type restrictions on locals()

Initial Comment:
This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict. An exception is made for the builtin insertion and reference in the global dict, to make sure this object exists and to suppress the need for the derived class to take care of this implementation-dependent detail.

The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set: if explicitly passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved). If this cannot be done for backward compatibility reasons, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset, but it seems logical to me to use the information explicitly provided.

Free and cell variables are not managed in this version. If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the methods of the dict in which they reside, and today this dict is not accessible from the Cell object.
Robustness: Currently, the plain test suite passes (with a modification of test_descrtut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. For performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test).

----------------------------------------------------------------------

>Comment By: Michael Hudson (mwh)
Date: 2002-04-02 14:26

Message:
Logged In: YES
user_id=6656

So what? Maybe you misunderstand me. This patch was in the group "Python 2.2.x", which is the group we use for patches that are under consideration for being put into a 2.2.x release of Python (or in other words, a bugfix release of Python 2.2). This patch is not going to go into a bugfix release of Python 2.2 for at least two reasons: (1) it adds what is arguably a new feature and (2) it's big and complicated and so might cause bugs. And now that I've actually looked at the patch, it has even less chance: it would break binary compatibility of extensions. So while I'm not against the patch in general (it looks good, from an eyeballing), it doesn't belong in the 2.2.x group.

----------------------------------------------------------------------

Comment By: Cesar Douady (douady)
Date: 2002-04-02 11:21

Message:
Logged In: YES
user_id=428521

I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2):

patching file Include/dictobject.h
patching file Include/frameobject.h
patching file Include/object.h
patching file Lib/test/test_descrtut.py
patching file Lib/test/test_subdict.py
patching file Modules/cPickle.c
patching file Objects/classobject.c
patching file Objects/frameobject.c
patching file Python/ceval.c
Hunk #2 succeeded at 1534 (offset 3 lines).
Hunk #4 succeeded at 1613 (offset 3 lines).
Hunk #6 succeeded at 1655 (offset 3 lines).
Hunk #8 succeeded at 1860 (offset 3 lines).
Hunk #10 succeeded at 1889 (offset 3 lines).
Hunk #12 succeeded at 2635 (offset 3 lines).
Hunk #14 succeeded at 2893 (offset 3 lines).
Hunk #16 succeeded at 3038 (offset 3 lines).
Hunk #18 succeeded at 3657 (offset 3 lines).
Hunk #20 succeeded at 3722 (offset 3 lines).
patching file Python/compile.c
Hunk #1 succeeded at 2916 (offset 12 lines).
patching file Python/import.c
Hunk #1 succeeded at 1668 (offset -4 lines).
Hunk #3 succeeded at 1716 (offset -4 lines).
patching file Python/sysmodule.c
Hunk #1 succeeded at 238 (offset -4 lines).

----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2002-03-30 11:27

Message:
Logged In: YES
user_id=6656

And there's precisely no way it's going into 2.2.x.

----------------------------------------------------------------------

Comment By: Cesar Douady (douady)
Date: 2002-03-30 00:08

Message:
Logged In: YES
user_id=428521

To install this patch from Python revision 2.2, follow these steps:
- get the python.diff file from this page
- cd Python-2.2
- run "patch -p1 < python.diff"

----------------------------------------------------------------------

Patches item #511219, was opened at 2002-01-31 15:55
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=511219&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Cesar Douady (douady)
Assigned to: Nobody/Anonymous (nobody)
Summary: suppress type restrictions on locals()

Initial Comment:
This patch suppresses the restriction that global and local dictionaries do not access overloaded __getitem__ and __setitem__ if passed an object derived from class dict. An exception is made for the builtin insertion and reference in the global dict, to make sure this object exists and to suppress the need for the derived class to take care of this implementation-dependent detail.
The behavior of eval and exec has been updated for code objects which have the CO_NEWLOCALS flag set: if explicitly passed a local dict, a new local dict is not generated. This allows one to pass an explicit local dict to the code object of a function (which otherwise cannot be achieved). If this cannot be done because of backward compatibility problems, then an alternative would consist in using the "new" module to create a code object from a function with CO_NEWLOCALS reset, but it seems logical to me to use the information explicitly provided. Free and cell variables are not managed in this version. If the patch is accepted, I am willing to finish the job and implement free and cell variables, but this requires a serious rework of the Cell object: free variables should be accessed using the methods of the dict in which they reside, and today this dict is not accessible from the Cell object. Robustness: Currently, the plain test suite passes (with a modification of test_descrtut which precisely verifies that the suppressed restriction is enforced). I have introduced a new test (test_subdict.py) which verifies the new behavior. For performance, the plain case (when the local dict is a plain dict) is optimized so that differences in performance are not measurable (within 1%) when run on the test suite (i.e. I timed make test).
----------------------------------------------------------------------
>Comment By: Cesar Douady (douady) Date: 2002-04-02 20:08 Message: Logged In: YES user_id=428521
Well, I think I am in sync now.
1/ I took your initial comment as meaning the patch could not be applied to 2.2.x.
2/ I decided to generate a new patch to be applied to 2.2.1c2.
3/ I realized that the patch could be applied as is.
4/ I was lost.
5/ I realized the meaning of the group was the one you just mentioned.
6/ I decided to post the result of my trial anyway so people could confidently apply the patch to the latest release (especially because patch outputs some warnings).
7/ I did not understand that this place could actually be used as a forum (i.e. reply to previous post rather than general info).
Let me apologize for my previous misunderstandings.
About compatibility: I did not find a way to make it backward binary compatible; however, my intent is to make it source compatible for extensions.
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh) Date: 2002-04-02 16:26 Message: Logged In: YES user_id=6656
So what? Maybe you misunderstand me. This patch was in the group "Python 2.2.x", which is the group we use for patches that are under consideration for being put into a 2.2.x release of Python (or in other words, a bugfix release of Python 2.2). This patch is not going to go into a bugfix release of Python 2.2 for at least two reasons: (1) it adds what is arguably a new feature, and (2) it's big and complicated and so might cause bugs. And now that I've actually looked at the patch, it has even less chance: it would break binary compatibility of extensions. So while I'm not against the patch in general (it looks good, from an eyeballing), it doesn't belong in the 2.2.x group.
----------------------------------------------------------------------
Comment By: Cesar Douady (douady) Date: 2002-04-02 13:21 Message: Logged In: YES user_id=428521
I successfully applied the patch as is to revision 2.2.1c2 with the following output (and then the same procedure as mentioned for patching revision 2.2):
patching file Include/dictobject.h
patching file Include/frameobject.h
patching file Include/object.h
patching file Lib/test/test_descrtut.py
patching file Lib/test/test_subdict.py
patching file Modules/cPickle.c
patching file Objects/classobject.c
patching file Objects/frameobject.c
patching file Python/ceval.c
Hunk #2 succeeded at 1534 (offset 3 lines).
Hunk #4 succeeded at 1613 (offset 3 lines).
Hunk #6 succeeded at 1655 (offset 3 lines).
Hunk #8 succeeded at 1860 (offset 3 lines).
Hunk #10 succeeded at 1889 (offset 3 lines).
Hunk #12 succeeded at 2635 (offset 3 lines).
Hunk #14 succeeded at 2893 (offset 3 lines).
Hunk #16 succeeded at 3038 (offset 3 lines).
Hunk #18 succeeded at 3657 (offset 3 lines).
Hunk #20 succeeded at 3722 (offset 3 lines).
patching file Python/compile.c
Hunk #1 succeeded at 2916 (offset 12 lines).
patching file Python/import.c
Hunk #1 succeeded at 1668 (offset -4 lines).
Hunk #3 succeeded at 1716 (offset -4 lines).
patching file Python/sysmodule.c
Hunk #1 succeeded at 238 (offset -4 lines).
----------------------------------------------------------------------
Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:27 Message: Logged In: YES user_id=6656
And there's precisely no way it's going into 2.2.x.
----------------------------------------------------------------------
Comment By: Cesar Douady (douady) Date: 2002-03-30 01:08 Message: Logged In: YES user_id=428521
To install this patch against Python revision 2.2, follow these steps:
- get the python.diff file from this page
- cd Python-2.2
- run "patch -p1 < python.diff"
----------------------------------------------------------------------
Patches item #537536, was opened at 2002-03-31 18:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470 Category: Core (C code) Group: Python 2.2.x >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Phillip J. Eby (pje) Assigned to: Guido van Rossum (gvanrossum) Summary: bug 535444 super() broken w/classmethods Initial Comment: This patch fixes bug #535444. It is against the current CVS version of Python, and addresses the problem by adding a 'starttype' variable to 'super_getattro', which works the same as 'starttype' in the pure-Python version of super in the descriptor tutorial. This variable is then passed to the descriptor __get__ function, a la 'descriptor.__get__(self.__obj__, starttype)'.
This patch does not correct the pure-Python version of 'super' in the descriptor tutorial; I don't know where that file is or how to submit a patch for it. This patch also does not include a regression test for the bug. I do not know what would be considered the appropriate test script to place this in. Thanks. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-02 14:04 Message: Logged In: YES user_id=6380 Committed to the trunk. I'll leave it to Michael to commit it to 2.2.1. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-01 23:11 Message: Logged In: YES user_id=6380 Accepted, also as bugfix for 2.2.1 (assuming it works there, not tested). I can check this in in the morning. Thanks all! ---------------------------------------------------------------------- Comment By: Phillip J. Eby (pje) Date: 2002-04-01 14:41 Message: Logged In: YES user_id=56214 Here's the regression test. It asserts 6 things, 5 of which will fail without the typeobject.c patch to super() in place. Thanks. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-01 06:55 Message: Logged In: YES user_id=21627 Please put tests for this stuff into test_descr.py. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-04-01 04:38 Message: Logged In: YES user_id=6656 Guido gets the fix too. 
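For reference, the behavior this patch fixes can be illustrated with a small example (shown in modern Python 3 spelling; at the time the same call would have been written as super(B, cls).create()). The point is that super's attribute lookup must hand the starting type to the descriptor's __get__, so an inherited classmethod still sees the subclass:

```python
class A:
    @classmethod
    def create(cls):
        # cls should be the class the call started from, not necessarily A
        return cls.__name__

class B(A):
    @classmethod
    def create(cls):
        # Before the fix, super's lookup did not pass the starting type
        # to the descriptor, so the classmethod was bound to the wrong class.
        return "B+" + super().create()

print(B.create())  # "B+B": A.create runs with cls == B
```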
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=537536&group_id=5470
From noreply@sourceforge.net Tue Apr 2 19:24:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 02 Apr 2002 11:24:39 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 10:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
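The recursive-__getattr__ part of the change can be sketched as follows (illustrative Python with hypothetical class names, not the actual aetools code):

```python
class AEBase:
    """Minimal stand-in for a generated suite class."""
    def __getattr__(self, name):
        # Walk the class and its superclasses, consulting each class's
        # own _propdict -- the superclass search the patch adds.
        for klass in type(self).__mro__:
            propdict = klass.__dict__.get('_propdict', {})
            if name in propdict:
                return propdict[name]
        raise AttributeError(name)

class Document(AEBase):
    _propdict = {'name': 'name property'}

class StyledDocument(Document):
    _propdict = {'style': 'style property'}

d = StyledDocument()
print(d.style)  # found in the class's own _propdict
print(d.name)   # found in the superclass's _propdict
```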
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470
From noreply@sourceforge.net Tue Apr 2 21:47:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 02 Apr 2002 13:47:13 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 21:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
----------------------------------------------------------------------
>Comment By: Jack Jansen (jackjansen) Date: 2002-04-02 23:47 Message: Logged In: YES user_id=45365
Donovan, I love the functionality of your patch, but I would humbly request you make a couple of changes.
Alternatively I'll make them, but that will delay the patch (as I have to find the time to do them).
First: please make it a context diff (cvs diff -c), as straight diffs are too error-prone for moving targets. There are also mods I can't judge this way (such as why you moved the 'utxt' support in aepack.py to a different place. Or is this a whitespace mismatch?)
Second: you've diffed against a different version than the one you patched. See gensuitemodule, for instance: it appears as if you've modified 1.22 but diffed against 1.21. Maybe you applied my 1.21->1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this.
Third: the passing of modules by name (to the decoding routines) seems error-prone and not too elegant. Can't you pass the modules themselves instead of their names? It would also save extra imports in the decoders.
Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain
import othersuite.superfoo
import Foo_Suite.foo
class foo(Foo_Suite.foo, othersuite.superfoo): pass
Fifth, and least important: you're manually iterating over the base classes for lookup. Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like
_propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict)
and similar for elemdict.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470
From noreply@sourceforge.net Wed Apr 3 06:50:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 02 Apr 2002 22:50:29 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 10:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
----------------------------------------------------------------------
>Comment By: Donovan Preston (dsposx) Date: 2002-04-02 21:50 Message: Logged In: YES user_id=111050
Jack: Thanks a lot for your comments! You're right on target on most of them.
Not sure why I didn't make a context diff this time -- last time it was screwed up, I thought it was because of that, but it was really just the tabs-vs-spaces issue. cvs is still very new and ugly to me. I did indeed manually apply your patches to my tree, because I was afraid of what an update would do to the production code that my boss would kill me for breaking... I'll do an update on another machine and reapply the patches to that checkout. Is there any way I can get a log of what the update has done to my files, so I can check them manually?
Hmm. I hadn't thought about passing the module itself; how would I get a reference to a package from inside of that package's __init__.py? From aetools, I can get away with saying __import__(modulename), but inside of __init__.py, what do I use to get a reference to the module that __init__.py is initializing?
Finally, after thinking about it a bit, the fourth and fifth points may be better solved by a construct like this (AppleScript type bar inherits from foo):
bar._elemdict = copy(foo._elemdict)
bar._elemdict.update({ dict of new keyword/class mappings })
This has the advantage of flattening out the inheritance tree again, since all elements and properties each class needs to know about are in its own copy of _elemdict and _propdict, and therefore no Python inheritance relationship needs to be made. I had wanted to build the inheritance hierarchy properly, and dynamically look through bases, because it was "cool", but in retrospect, speed is more important :-)
Donovan
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-04-02 12:47 Message: Logged In: YES user_id=45365
Donovan, I love the functionality of your patch, but I would humbly request you make a couple of changes. Alternatively I'll make them, but that will delay the patch (as I have to find the time to do them).
First: please make it a context diff (cvs diff -c), as straight diffs are too error-prone for moving targets. There are also mods I can't judge this way (such as why you moved the 'utxt' support in aepack.py to a different place. Or is this a whitespace mismatch?)
Second: you've diffed against a different version than the one you patched. See gensuitemodule, for instance: it appears as if you've modified 1.22 but diffed against 1.21. Maybe you applied my 1.21->1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this.
Third: the passing of modules by name (to the decoding routines) seems error-prone and not too elegant. Can't you pass the modules themselves instead of their names? It would also save extra imports in the decoders.
Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain
import othersuite.superfoo
import Foo_Suite.foo
class foo(Foo_Suite.foo, othersuite.superfoo): pass
Fifth, and least important: you're manually iterating over the base classes for lookup. Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like
_propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict)
and similar for elemdict.
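Donovan's copy-and-update construct from the comment above can be sketched with hypothetical suite classes (foo, bar, and the dict values here are stand-ins, not real generated code):

```python
import copy

class foo:
    _elemdict = {'paragraph': 'paragraph-class'}

class bar(foo):
    # Flatten the AppleScript inheritance tree at generation time: bar
    # carries its own complete _elemdict, so attribute lookup never has
    # to search base classes.
    _elemdict = copy.copy(foo._elemdict)
    _elemdict.update({'word': 'word-class'})

print(bar._elemdict)  # contains both the inherited and the new entries
print(foo._elemdict)  # the parent's dict is left untouched
```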
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470
From noreply@sourceforge.net Wed Apr 3 07:03:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 02 Apr 2002 23:03:21 -0800 Subject: [Patches] [ python-Patches-536661 ] splitext performances improvement Message-ID: Patches item #536661, was opened at 2002-03-29 09:06 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Sebastien Keim (s_keim) Assigned to: Nobody/Anonymous (nobody) Summary: splitext performances improvement Initial Comment: After more thought, I must admit that the behavior change in splitext I proposed with patch 536120 is not acceptable. So I would instead propose this one, which should only improve performance without modifying behavior. The benchmark below says that the patched splitext is between 2x (for l1) and 25x (for l2) faster than the original one. The diff also patches test_posixpath.py to check the pitfall described by Tim's comments on the patch 536120 page.
def splitext(p):
    root, ext = '', ''
    for c in p:
        if c == '/':
            root, ext = root + ext + c, ''
        elif c == '.':
            if ext:
                root, ext = root + ext, c
            else:
                ext = c
        elif ext:
            ext = ext + c
        else:
            root = root + c
    return root, ext

def splitext2(p):
    i = p.rfind('.')
    if i <= p.rfind('/'):
        return p, ''
    else:
        return p[:i], p[i:]

l1 = ('t', '.t', 'a.b/', 'a.b', '/a.b', 'a.b/.c', 'a.b/c.d')
l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt',
)

for i in l1 + l2:
    assert splitext2(i) == splitext(i)

import time
def test(f, args):
    t = time.clock()
    for p in args:
        for i in range(1000):
            f(p)
    return time.clock() - t

def f(p): pass

a = test(splitext, l1)
b = test(splitext2, l1)
c = test(f, l1)
print a, b, c, (a-c)/(b-c)
a = test(splitext, l2)
b = test(splitext2, l2)
c = test(f, l2)
print a, b, c, (a-c)/(b-c)
----------------------------------------------------------------------
>Comment By: Sebastien Keim (s_keim) Date: 2002-04-03 09:03 Message: Logged In: YES user_id=498191
xxxpath.dif contains the splitext patch for posixpath, ntpath, dospath and macpath, and the corresponding test files (I have added a test file for macpath). I thought it better not to attempt to modify riscospath.py, since I don't know this platform. Anyway, it already uses an rfind strategy.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-04-02 11:24 Message: Logged In: YES user_id=21627
Sharing code is a good thing. However, it would be critical how exactly this is done, since os is such a central module.
If you start now, and don't get agreement immediately, it may well be that you cannot complete it until Python 2.3.
----------------------------------------------------------------------
Comment By: Sebastien Keim (s_keim) Date: 2002-04-02 09:28 Message: Logged In: YES user_id=498191
I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?
----------------------------------------------------------------------
Comment By: Sebastien Keim (s_keim) Date: 2002-04-02 09:15 Message: Logged In: YES user_id=498191
I have taken a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion if I tried to do a little refactoring on this?
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2002-03-29 19:56 Message: Logged In: YES user_id=31435
I like it fine so far as it goes, but I'd like it a lot more if it also patched the splitext and test implementations for other platforms. It's not good that, e.g., posixpath.py and ntpath.py get more and more out of synch over time, and that their test suites also diverge.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-03-29 10:49 Message: Logged In: YES user_id=21627
The patch looks good to me.
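The code-sharing refactoring under discussion could start from a separator-parameterized version of the rfind implementation. This is only a sketch with a hypothetical helper name, and it keeps the semantics of the era, where a leading dot still counts as starting an extension (the special-casing of dotfiles came to the path modules later):

```python
def generic_splitext(p, sep, extsep='.'):
    # The extension starts at the last extsep that comes after the last
    # occurrence of the platform's directory separator; each platform
    # module would call this with its own sep.
    i = p.rfind(extsep)
    if i <= p.rfind(sep):
        return p, ''
    return p[:i], p[i:]

print(generic_splitext('/a.b/c.d', '/'))      # ('/a.b/c', '.d')
print(generic_splitext('a.b/', '/'))          # ('a.b/', '')
print(generic_splitext('disk:doc.txt', ':'))  # ('disk:doc', '.txt')
```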
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536661&group_id=5470
From noreply@sourceforge.net Wed Apr 3 09:29:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 01:29:44 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 21:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis) Date: 2002-04-03 11:29 Message: Logged In: YES user_id=21627
cvs update will keep a copy of the original file (the one you edited) if it has to merge changes; it will name it .#filename.revision.
So in no case will cvs destroy your changes. Normally, merging works quite well. If it finds a conflict, it will print a 'C' on update, and put a conflict marker in the file. The stuff above the ===== is your code, the one below is the CVS code. If you want to find out what cvs would do, use 'cvs status'. If you don't want cvs to do merging, the following procedure will work:
cvs diff -u > patches
patch -p0 -R < patches
...
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen)
...you've modified 1.22 but diffed against 1.21. Maybe you applied my 1.21->1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this.
Third: the passing of modules by name (to the decoding routines) seems error-prone and not too elegant. Can't you pass the modules themselves instead of their names? It would also save extra imports in the decoders.
Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain
import othersuite.superfoo
import Foo_Suite.foo
class foo(Foo_Suite.foo, othersuite.superfoo): pass
Fifth, and least important: you're manually iterating over the base classes for lookup. Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like
_propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict)
and similar for elemdict.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470
From noreply@sourceforge.net Wed Apr 3 20:43:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 12:43:47 -0800 Subject: [Patches] [ python-Patches-528022 ] PEP 285 - Adding a bool type Message-ID: Patches item #528022, was opened at 2002-03-10 00:45 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Guido van Rossum (gvanrossum) Assigned to: Guido van Rossum (gvanrossum) Summary: PEP 285 - Adding a bool type Initial Comment: Here's a preliminary implementation of the PEP, including unittests checking the promises made in the PEP (test_bool.py) and (some) documentation. With this, 12 tests fail for me (on Linux); I'll look into these later. They appear shallow (mostly doctests dying on True or False where 1 or 0 was expected). Note: the presence of this patch does not mean that the PEP is accepted -- it just means that a sample implementation exists in case someone wants to explore the effects of the PEP on their code.
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-03 15:43 Message: Logged In: YES user_id=6380
I've attached a new patch, booldiff3.txt, that solves the two remaining problems:
- pickle, cPickle and marshal round-trip
- the test suite succeeds (a total of 12 tests had to be fixed, all because of True/False vs. 1/0 in printed output)
I'm ready to check this in, but I'll first update the PEP.
----------------------------------------------------------------------
Comment By: Martin v.
Löwis (loewis) Date: 2002-04-01 06:44 Message: Logged In: YES user_id=21627 This patch does not support pickling of bools (the PEP should probably spell out how they are pickled). marshalling of bools does not round-trip (you get back an int). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-31 21:46 Message: Logged In: YES user_id=6380 Here's an updated diff (booldiff2.txt). It fixes a refcount bug in bool_repr(), and works with current CVS. With this patch set, 10 standard tests fail for shallow reasons having to do with str() or repr() returning False or True instead of 0 or 1. Here are the failed tests: test_descr test_descrtut test_difflib test_doctest test_extcall test_generators test_gettext test_richcmp test_richcompare test_unicode ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470 From noreply@sourceforge.net Wed Apr 3 21:16:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 13:16:42 -0800 Subject: [Patches] [ python-Patches-538395 ] ae* modules: handle type inheritance Message-ID: Patches item #538395, was opened at 2002-04-02 21:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 Category: Macintosh Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Donovan Preston (dsposx) Assigned to: Jack Jansen (jackjansen) Summary: ae* modules: handle type inheritance Initial Comment: The gensuitemodule script creates Python classes out of AppleScript types. It keeps track of properties in _propdict and elements in _elemdict. 
However, gensuitemodule does not accurately replicate the AppleScript inheritance hierarchy, and __getattr__ only looks in self._propdict and self._elemdict, therefore not finding elements and properties defined in superclasses. Attached is a patch which: 1) Correctly identifies an AppleScript type's superclasses, and defines the Python classes with these superclasses. Since not all names may be defined by the time a new class is defined, this is accomplished by setting a new class' __bases__ after all names are defined. 2) Changes __getattr__ to recurse superclasses while looking through _propdict and _elemdict. It also contains small usability enhancements which will automatically look for a .want or .which property when you are creating specifiers.
----------------------------------------------------------------------
>Comment By: Jack Jansen (jackjansen) Date: 2002-04-03 23:16 Message: Logged In: YES user_id=45365
Donovan, two comments on your comments:
- You're absolutely right about the module names. Pickle also uses names, and it's probably the only way to do it.
- You're also absolutely right about how to update the _elemdict and _propdict. Or, as Jean-Luc Picard would say: "Make it so!" :-)
Oh yes, on the production code/merging problem: aside from Martin's comments, here's another tip: make a copy of the subtree that contains the conflict section (why not the whole Mac subtree in your case) and make sure you keep the CVS directories. Start hacking in this copy. Once you're satisfied, do a commit from there. As long as you keep the CVS directory with the files there's little that can go wrong.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-04-03 11:29 Message: Logged In: YES user_id=21627
cvs update will keep a copy of the original file (the one you edited) if it has to merge changes; it will name it .#filename.revision. So in no case will cvs destroy your changes. Normally, merging works quite well.
If it finds a conflict, it will print a 'C' on update, and put a conflict marker in the file. The stuff above the ===== is your code, the one below is the CVS code. If you want to find out what cvs would do, use 'cvs status'. If you don't want cvs to do merging, the following procedure will work: cvs diff -u >patches; patch -p0 -R <patches; cvs update; patch -p0 <patches. 1.22 by hand without doing a cvs update? I think a cvs update (plus some manual work;-) should solve this. Third: the passing of modules by name (to the decoding routines) seems error-prone and not too elegant. Can't you pass the modules themselves instead of their names? It would also save extra imports in the decoders. Fourth: assigning to __bases__ seems like rather a big hack. Can't we generate the classes with a helper class, similarly to the event helper class in the __init__.py modules: FooApp/Foo_Suite.py would contain the class foo sketched above, and FooApp.__init__.py would contain import othersuite.superfoo import Foo_Suite.foo class foo(Foo_Suite.foo, othersuite.superfoo): pass Fifth, and least important: you're manually iterating over the base classes for lookup. Couldn't we statically combine the _propdict and _elemdict's of the base classes during class declaration, so that at lookup time we'd need only a single dictionary lookup? The "class foo" body in __init__.py would then become something like _propdict = aetools.flatten_dicts( foosuite.foo._propdict, othersuite.superfoo._propdict) and similar for elemdict.
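Martin's fifth point lends itself to a short sketch. `flatten_dicts` is the hypothetical helper name from his own example; the precedence rule (entries from the derived class shadow those of its bases) is an assumption about the intended semantics:

```python
def flatten_dicts(*dicts):
    """Combine per-class _propdict/_elemdict tables into one dict at
    class-definition time, so attribute lookup needs a single dictionary
    probe instead of walking the base classes on every __getattr__."""
    merged = {}
    for d in reversed(dicts):   # apply base-class tables first ...
        merged.update(d)        # ... so derived-class entries override them
    return merged

# e.g. the "class foo" body would compute its table once:
base_propdict = {"name": "base.name", "color": "base.color"}
foo_propdict = flatten_dicts({"name": "foo.name"}, base_propdict)
```

The trade-off is the classic one: lookup becomes a single dict probe, at the cost of recomputing the merged tables if a base class's table ever changes after class definition.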
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=538395&group_id=5470 From noreply@sourceforge.net Wed Apr 3 23:04:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 15:04:29 -0800 Subject: [Patches] [ python-Patches-528022 ] PEP 285 - Adding a bool type Message-ID: Patches item #528022, was opened at 2002-03-10 00:45 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Guido van Rossum (gvanrossum) Assigned to: Guido van Rossum (gvanrossum) Summary: PEP 285 - Adding a bool type Initial Comment: Here's a preliminary implementation of the PEP, including unittests checking the promises made in the PEP (test_bool.py) and (some) documentation. With this 12 tests fail for me (on Linux); I'll look into these later. They appear shallow (mostly doctests dying on True or False where 1 or 0 was expected). Note: the presence of this patch does not mean that the PEP is accepted -- it just means that a sample implementation exists in case someone wants to explore the effects of the PEP on their code. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-03 18:04 Message: Logged In: YES user_id=6380 Here's a new version of booldiff.txt that includes the new files boolobject.[ch] and test_bool.py. Sorry. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-03 15:43 Message: Logged In: YES user_id=6380 I've attached a new patch, booldiff3.txt, that solves the two remaining problems: - pickle, cPickle and marshal round-trip - the test suite succeeds (a total of 12 tests had to be fixed, all because of True/False vs.
1/0 in printed output) I'm ready to check this in, but I'll first update the PEP. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-01 06:44 Message: Logged In: YES user_id=21627 This patch does not support pickling of bools (the PEP should probably spell out how they are pickled). marshalling of bools does not round-trip (you get back an int). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-31 21:46 Message: Logged In: YES user_id=6380 Here's an updated diff (booldiff2.txt). It fixes a refcount bug in bool_repr(), and works with current CVS. With this patch set, 10 standard tests fail for shallow reasons having to do with str() or repr() returning False or True instead of 0 or 1. Here are the failed tests: test_descr test_descrtut test_difflib test_doctest test_extcall test_generators test_gettext test_richcmp test_richcompare test_unicode ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=528022&group_id=5470 From noreply@sourceforge.net Wed Apr 3 23:48:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 15:48:42 -0800 Subject: [Patches] [ python-Patches-539005 ] error in RawPen-class (line 262) Message-ID: Patches item #539005, was opened at 2002-04-04 01:48 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539005&group_id=5470 Category: Tkinter Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Gregor Lingl (glingl) Assigned to: Nobody/Anonymous (nobody) Summary: error in RawPen-class (line 262) Initial Comment: line 262 uses the global variable _canvas instead of the instance-variable self._canvas created in the RawPen - Constructor. 
This certainly is a *very* old bug and it seems strange that it could remain undetected that long. For the patch, look at lines 262-264. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539005&group_id=5470 From noreply@sourceforge.net Thu Apr 4 00:15:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 16:15:53 -0800 Subject: [Patches] [ python-Patches-539005 ] error in RawPen-class (line 262) Message-ID: Patches item #539005, was opened at 2002-04-03 18:48 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539005&group_id=5470 Category: Tkinter Group: Python 2.2.x >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Gregor Lingl (glingl) Assigned to: Nobody/Anonymous (nobody) Summary: error in RawPen-class (line 262) Initial Comment: line 262 uses the global variable _canvas instead of the instance-variable self._canvas created in the RawPen - Constructor. This certainly is a *very* old bug and it seems strange that it could remain undetected that long. For the patch, look at lines 262-264. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-03 19:15 Message: Logged In: YES user_id=33168 This is a duplicate of #538991, https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=538991 The fix has been committed on the main branch, but not in the 2.2 branch yet. I'm not sure the fix will go in 2.2.1. It will be in 2.2.2.
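The bug itself fits in a one-line sketch; the class below is a stripped-down illustration, not the actual RawPen code from the library:

```python
_canvas = None  # module-level global that the buggy line reached for

class RawPen:
    def __init__(self, canvas):
        self._canvas = canvas  # instance variable set in the constructor

    def canvas(self):
        # Buggy form:  return _canvas      # global: shared, often None
        # Fixed form uses the instance attribute created in __init__:
        return self._canvas
```

With the global, every pen silently shares one canvas (or hits None), which is exactly the kind of bug that stays hidden as long as only one pen exists.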
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539005&group_id=5470 From noreply@sourceforge.net Thu Apr 4 01:28:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 03 Apr 2002 17:28:35 -0800 Subject: [Patches] [ python-Patches-539043 ] Support PyChecker in IDLE Message-ID: Patches item #539043, was opened at 2002-04-03 20:28 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Nobody/Anonymous (nobody) Summary: Support PyChecker in IDLE Initial Comment: This patch adds SIMPLE support for pychecker in IDLE. It is not complete. It pops up a window, you can enter filenames (not even a file dialog!), and run pychecker. You cannot change examples. If someone wants to really integrate this, they should add the user interface in pychecker (pychecker/options.py), use a file dialog to enter files, and handle file modifications. Since pychecker imports the files, they need to be removed from sys.modules, so modifications will be seen. 
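The sys.modules issue Neal describes can be sketched as a small helper; the function name and the prefix-matching policy are assumptions, not code from the patch:

```python
import sys

def forget_modules(top_level_names):
    """Evict the given top-level modules (and their submodules) from
    sys.modules, so that the next import re-reads the files from disk --
    the behaviour a PyChecker rerun inside a long-lived IDLE process
    would need to pick up on-disk edits."""
    for loaded in list(sys.modules):
        if loaded.split(".", 1)[0] in top_level_names:
            del sys.modules[loaded]
```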
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:12:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:12:50 -0800 Subject: [Patches] [ python-Patches-534862 ] help asyncore recover from repr() probs Message-ID: Patches item #534862, was opened at 2002-03-25 16:12 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=534862&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) >Assigned to: Jeremy Hylton (jhylton) Summary: help asyncore recover from repr() probs Initial Comment: I've had this patch in my copy of asyncore.py for quite a while. It works for me as a way to recover from repr() bogosities, though I'm unfamiliar enough with repr/str issues and asyncore to know if this is the right way to make it more bulletproof (or if it should even be made more bulletproof). Skip ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 12:12 Message: Logged In: YES user_id=6380 Jeremy, what do you think of this? Looks harmless to me...
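The patch file itself isn't shown here, but the usual shape of such hardening is a guarded repr() used wherever asyncore composes log and error strings. A sketch with names of my choosing, not Skip's actual diff:

```python
def safe_repr(obj):
    """Return repr(obj), degrading gracefully when __repr__ itself
    raises, so logging a broken dispatcher can't crash the event loop."""
    try:
        return repr(obj)
    except Exception:
        # Built only from pieces that cannot fail for ordinary objects.
        return "<%s at %#x (repr failed)>" % (type(obj).__name__, id(obj))

class BrokenRepr:
    def __repr__(self):
        raise RuntimeError("bogus __repr__")
```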
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=534862&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:31:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:31:35 -0800 Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass Message-ID: Patches item #536883, was opened at 2002-03-29 14:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: Fredrik Lundh (effbot) Summary: SimpleXMLRPCServer auto-docing subclass Initial Comment: This SimpleXMLRPCServer subclass automatically serves HTML documentation, generated using pydoc, in response to an HTTP GET request (XML-RPC always uses POST). Here are some examples: http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 12:31 Message: Logged In: YES user_id=6380 Looks cute to me. Fredrik, any problem if I just check this in? 
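In sketch form, the trick is that pydoc can render the service class as an HTML page, and the request handler returns that page for GET while POST still goes to the XML-RPC dispatcher. The class and function names below are illustrative, not the patch's:

```python
import pydoc

class ExampleService:
    """Demo service; its docstrings become the documentation page."""
    def add(self, a, b):
        "Add two numbers and return the sum."
        return a + b

def html_for_get(service):
    # Body a do_GET() override would send back; do_POST() -- the actual
    # XML-RPC entry point -- is left untouched.
    return pydoc.html.document(type(service))
```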
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:51:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:51:46 -0800 Subject: [Patches] [ python-Patches-536407 ] Comprehensibility patch (typeobject.c) Message-ID: Patches item #536407, was opened at 2002-03-28 13:56 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536407&group_id=5470 Category: Core (C code) Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: David Abrahams (david_abrahams) >Assigned to: Guido van Rossum (gvanrossum) Summary: Comprehensibility patch (typeobject.c) Initial Comment: --- typeobject.c Mon Dec 17 12:14:22 2001 +++ typeobject.c.new Thu Mar 28 13:46:03 2002 @@ -1186,8 +1186,8 @@ type_getattro(PyTypeObject *type, PyObject *name) { PyTypeObject *metatype = type->ob_type; - PyObject *descr, *res; - descrgetfunc f; + PyObject *meta_attribute, *attribute; + descrgetfunc meta_get; /* Initialize this type (we'll assume the metatype is initialized) */ if (type->tp_dict == NULL) { @@ -1195,34 +1195,50 @@ return NULL; } - /* Get a descriptor from the metatype */ - descr = _PyType_Lookup(metatype, name); - f = NULL; - if (descr != NULL) { - f = descr->ob_type->tp_descr_get; - if (f != NULL && PyDescr_IsData (descr)) - return f(descr, - (PyObject *)type, (PyObject *)metatype); - } + /* No readable descriptor found yet */ + meta_get = NULL; + + /* Look for the attribute in the metatype */ + meta_attribute = _PyType_Lookup(metatype, name); - /* Look in tp_dict of this type and its bases */ - res = _PyType_Lookup(type, name); - if (res != NULL) { - f = res->ob_type->tp_descr_get; - if (f != NULL) - return f(res, (PyObject *) NULL, (PyObject *)type); - Py_INCREF(res); - return res; + if (meta_attribute != NULL) { + meta_get = 
meta_attribute->ob_type->tp_descr_get; + + if (meta_get != NULL && PyDescr_IsData(meta_attribute)) { + /* Data descriptors implement tp_descr_set to intercept + * writes. Assume the attribute is not overridden in + * type's tp_dict (and bases): call the descriptor now. + */ + return meta_get(meta_attribute, + (PyObject *)type, (PyObject *)metatype); + } } - /* Use the descriptor from the metatype */ - if (f != NULL) { - res = f(descr, (PyObject *)type, (PyObject *)metatype); - return res; + /* No data descriptor found on metatype. Look in tp_dict of this + * type and its bases */ + attribute = _PyType_Lookup(type, name); + if (attribute != NULL) { + /* Implement descriptor functionality, if any */ + descrgetfunc local_get = attribute->ob_type->tp_descr_get; + if (local_get != NULL) { + /* NULL 2nd argument indicates the descriptor was found on + * the target object itself (or a base) */ + return local_get(attribute, (PyObject *)NULL, (PyObject *)type); + } + + Py_INCREF(attribute); + return attribute; } - if (descr != NULL) { - Py_INCREF(descr); - return descr; + + /* No attribute found in local __dict__ (or bases): use the + * descriptor from the metatype, if any */ + if (meta_get != NULL) + return meta_get(meta_attribute, (PyObject *)type, (PyObject *)metatype); + + /* If an ordinary attribute was found on the metatype, return it now. */ + if (meta_attribute != NULL) { + Py_INCREF(meta_attribute); + return meta_attribute; } /* Give up */ ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 12:51 Message: Logged In: YES user_id=6380 Thanks, applied (after folding some long lines). Next time, please don't call the patch "patch". Call it something like "typeobject.patch". ---------------------------------------------------------------------- Comment By: David Abrahams (david_abrahams) Date: 2002-03-29 17:30 Message: Logged In: YES user_id=52572 Thanks, Neil, I think I got the picture already (see Python-Dev). ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-03-29 17:27 Message: Logged In: YES user_id=35752 Don't paste the patch in the comment box. ---------------------------------------------------------------------- Comment By: David Abrahams (david_abrahams) Date: 2002-03-29 16:22 Message: Logged In: YES user_id=52572 I have updated the patch so that it is made against the current sources. ------- --- typeobject.c Thu Mar 28 00:33:33 2002 +++ typeobject.c.new Fri Mar 29 16:20:12 2002 @@ -1237,8 +1237,8 @@ type_getattro(PyTypeObject *type, PyObject *name) { PyTypeObject *metatype = type->ob_type; - PyObject *descr, *res; - descrgetfunc f; + PyObject *meta_attribute, *attribute; + descrgetfunc meta_get; /* Initialize this type (we'll assume the metatype is initialized) */ if (type->tp_dict == NULL) { @@ -1246,40 +1246,56 @@ return NULL; } - /* Get a descriptor from the metatype */ - descr = _PyType_Lookup(metatype, name); - f = NULL; - if (descr != NULL) { - f = descr->ob_type->tp_descr_get; - if (f != NULL && PyDescr_IsData(descr)) - return f(descr, - (PyObject *)type, (PyObject *)metatype); - } + /* No readable descriptor found yet */ + meta_get = NULL; + + /* Look for the attribute in the metatype */ + meta_attribute = _PyType_Lookup(metatype, name); - /* Look in tp_dict of this type and its bases */ - res = _PyType_Lookup(type, name); - if (res != NULL) { - f = res->ob_type->tp_descr_get; - if (f != NULL) - return f(res, (PyObject *)NULL, (PyObject *)type); - Py_INCREF(res); - return res; + if (meta_attribute != NULL) { + meta_get = meta_attribute->ob_type->tp_descr_get; + + if (meta_get != NULL && PyDescr_IsData(meta_attribute)) { + /* Data descriptors implement tp_descr_set to intercept + * writes. Assume the attribute is not overridden in + * type's tp_dict (and bases): call the descriptor now. + */ + return meta_get(meta_attribute, + (PyObject *)type, (PyObject *)metatype); + } } - /* Use the descriptor from the metatype */ - if (f != NULL) { - res = f(descr, (PyObject *)type, (PyObject *)metatype); - return res; + /* No data descriptor found on metatype. Look in tp_dict of this + * type and its bases */ + attribute = _PyType_Lookup(type, name); + if (attribute != NULL) { + /* Implement descriptor functionality, if any */ + descrgetfunc local_get = attribute->ob_type->tp_descr_get; + if (local_get != NULL) { + /* NULL 2nd argument indicates the descriptor was found on + * the target object itself (or a base) */ + return local_get(attribute, (PyObject *)NULL, (PyObject *)type); + } + + Py_INCREF(attribute); + return attribute; } - if (descr != NULL) { - Py_INCREF(descr); - return descr; + + /* No attribute found in local __dict__ (or bases): use the + * descriptor from the metatype, if any */ + if (meta_get != NULL) + return meta_get(meta_attribute, (PyObject *)type, (PyObject *)metatype); + + /* If an ordinary attribute was found on the metatype, return it now.
*/ + if (meta_attribute != NULL) { + Py_INCREF(meta_attribute); + return meta_attribute; } /* Give up */ PyErr_Format(PyExc_AttributeError, - "type object '%.50s' has no attribute '%.400s'", - type->tp_name, PyString_AS_STRING (name)); + "type object '%.50s' has no attribute '%.400s'", + type->tp_name, PyString_AS_STRING (name)); return NULL; } ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536407&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:52:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:52:47 -0800 Subject: [Patches] [ python-Patches-539360 ] Webbrowser.py and konqueror Message-ID: Patches item #539360, was opened at 2002-04-04 09:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539360&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Andy McKay (zopezen) Assigned to: Nobody/Anonymous (nobody) Summary: Webbrowser.py and konqueror Initial Comment: The open function for konqueror would always fail on the assert. The assert would check the action did not contain a single quote. The url passed through in the open function would always contain a single quote. The assert should check the incoming url for a single quote. If its properly quoted then you can pass on to _remote. Secondly since the _remote url is now correctly quoted, there is no need for a second set of quotes on the kfmclient. Tested on Kde 2.2. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539360&group_id=5470 From noreply@sourceforge.net Thu Apr 4 17:55:14 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 09:55:14 -0800 Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass Message-ID: Patches item #536883, was opened at 2002-03-29 11:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: Fredrik Lundh (effbot) Summary: SimpleXMLRPCServer auto-docing subclass Initial Comment: This SimpleXMLRPCServer subclass automatically serves HTML documentation, generated using pydoc, in response to an HTTP GET request (XML-RPC always uses POST). Here are some examples: http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py ---------------------------------------------------------------------- >Comment By: Brian Quinlan (bquinlan) Date: 2002-04-04 09:55 Message: Logged In: YES user_id=108973 Sorry, I was sloppy about the description: This patch is dependent on patch 473586: [473586] SimpleXMLRPCServer - fixes and CGI So please don't check this in until that patch is accepted. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 09:31 Message: Logged In: YES user_id=6380 Looks cute to me. Fredrik, any problem if I just check this in?
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 From noreply@sourceforge.net Thu Apr 4 18:59:23 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 10:59:23 -0800 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Nobody/Anonymous (nobody) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS from 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. Setlocale will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding thus the example will work with non-ASCII characters in the filename, e.g. with umlauts in it. It bombed on them before. Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. encode(enc) is only needed for terminal output, thus we enhanced the example to show the two uses of the returned filename string separately. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk.
Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard Bernhard ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Thu Apr 4 19:26:09 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 11:26:09 -0800 Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass Message-ID: Patches item #536883, was opened at 2002-03-29 11:52 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Brian Quinlan (bquinlan) Assigned to: Fredrik Lundh (effbot) Summary: SimpleXMLRPCServer auto-docing subclass Initial Comment: This SimpleXMLRPCServer subclass automatically serves HTML documentation, generated using pydoc, in response to an HTTP GET request (XML-RPC always uses POST). Here are some examples: http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py ---------------------------------------------------------------------- >Comment By: Brian Quinlan (bquinlan) Date: 2002-04-04 11:26 Message: Logged In: YES user_id=108973 Sorry, I was sloppy about the description: This patch is dependent on patch 473586: [473586] SimpleXMLRPCServer - fixes and CGI So please don't check this in until that patch is accepted. ---------------------------------------------------------------------- Comment By: Brian Quinlan (bquinlan) Date: 2002-04-04 09:55 Message: Logged In: YES user_id=108973 Sorry, I was sloppy about the description: This patch is dependent on patch 473586: [473586] SimpleXMLRPCServer - fixes and CGI So please don't check this in until that patch is accepted.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 09:31 Message: Logged In: YES user_id=6380 Looks cute to me. Fredrik, any problem if I just check this in? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470 From noreply@sourceforge.net Thu Apr 4 19:57:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 11:57:40 -0800 Subject: [Patches] [ python-Patches-533008 ] specifying headers for extensions Message-ID: Patches item #533008, was opened at 2002-03-21 06:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=533008&group_id=5470 Category: Distutils and setup.py Group: Python 2.3 Status: Open Resolution: None Priority: 7 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: specifying headers for extensions Initial Comment: This patch makes it possible to specify that C header files are part of source files for dependency checking. The 'sources' list in Extension instances can be simple filenames as before, but they can also be SourceFile instances created by SourceFile("myfile.c", headers=["inc1.h", "inc2.h"]). Unfortunately, not only did changes to command.build_ext and command.build_clib have to be made; all the ccompiler (sub)classes also had to be changed, because the ccompiler does the actual dependency checking. I updated all the ccompiler subclasses except mwerkscompiler.py, but only msvccompiler has actually been tested. The argument list which dep_util.newer_pairwise() now accepts has changed: the first arg must now be a sequence of SourceFile instances. This may be problematic; it would IMO be better to move this function (with a new name?) into ccompiler. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr.
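The dependency rule being added can be stated compactly: recompile when the object file is missing, or when the source or any declared header is newer. A simplified model (the function name is mine; the real patch extends dep_util.newer_pairwise and the ccompiler classes):

```python
import os

def needs_rebuild(source, headers, obj):
    """True when `obj` must be recompiled: it does not exist yet, or
    `source` or any file in `headers` has a newer modification time."""
    if not os.path.exists(obj):
        return True
    obj_mtime = os.path.getmtime(obj)
    return any(os.path.getmtime(dep) > obj_mtime
               for dep in [source, *headers])
```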
(fdrake) Date: 2002-04-04 14:57 Message: Logged In: YES user_id=3066 Wow! That's certainly more patch than I'd expected, but the approach looks about right to me. I'd like to take another look at it in a few days (mail me if I don't take action soon) before we accept, just to make sure I understand it better. Thanks! ---------------------------------------------------------------------- Comment By: Thomas Heller (theller) Date: 2002-03-25 04:03 Message: Logged In: YES user_id=11105 Fred requested it this way: http://mail.python.org/pipermail/distutils-sig/2002-March/002806.html ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-24 17:05 Message: Logged In: YES user_id=6380 Why is this priority 7?????? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=533008&group_id=5470 From noreply@sourceforge.net Thu Apr 4 20:16:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 12:16:38 -0800 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Nobody/Anonymous (nobody) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS from 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. Setlocale will set it. Thus we fixed the example and set the locale to the user defaults.
Now "enc" will have a useful encoding thus the example will work with non-ASCII characters in the filename, e.g. with umlauts in it. It bombed on them before. Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. encode(enc) is only needed for terminal output, thus we enhanced the example to show the two uses of the returned filename string separately. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk. Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard Bernhard ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a conscious decision. BTW, how does Tcl come up with the names in the first place?
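For reference, the application-level opt-in that Martin argues for looks like this. A sketch, not the patch's code; the fallback exists because setlocale can fail when the environment names an unknown locale:

```python
import locale

def adopt_user_locale():
    """Adopt the user's default locale -- an application decision, not
    the library's -- and return the encoding to use when *printing* file
    names; open() should receive the returned name unencoded."""
    try:
        locale.setlocale(locale.LC_ALL, "")  # opt in to user defaults
    except locale.Error:
        pass                                 # e.g. unknown LANG value
    return locale.getpreferredencoding(False) or "ascii"
```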
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Thu Apr 4 20:51:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 12:51:56 -0800 Subject: [Patches] [ python-Patches-523415 ] Explict proxies for urllib.urlopen() Message-ID: Patches item #523415, was opened at 2002-02-27 09:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=523415&group_id=5470 Category: Library (Lib) Group: Python 2.2.x >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Andy Gimblett (gimbo) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Explict proxies for urllib.urlopen() Initial Comment: This patch extends urllib.urlopen() so that proxies may be specified explicitly. This is achieved by adding an optional "proxies" parameter. If this parameter is omitted, urlopen() acts exactly as before, ie gets proxy settings from the environment. This is useful if you want to tell urlopen() not to use the proxy: just pass an empty dictionary. Also included is a patch to the urllib documentation explaining the new parameter. Apologies if patch format is not exactly as required: this is my first submission. All feedback appreciated. :-) ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-04 15:51 Message: Logged In: YES user_id=3066 I've checked this in, with some changes to the code for urlopen(). When a proxy configuration is supplied, the version I checked in does not save the opener if there isn't one; it always discards it. If you really want to use a specific proxy configuration with the simple functions, create the opener and assign it to urllib._urlopener. 
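Fred's note boils down to a three-way decision on the `proxies` argument; a simplified model of those semantics (the helper name is mine, not urllib's):

```python
import os

def effective_proxies(proxies=None):
    """None -> read *_proxy environment variables (the old behaviour);
    {}   -> explicitly no proxies, i.e. a direct connection;
    dict -> use exactly the mapping supplied by the caller."""
    if proxies is None:
        return {name[:-len("_proxy")].lower(): url
                for name, url in os.environ.items()
                if name.lower().endswith("_proxy") and url}
    return dict(proxies)
```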
---------------------------------------------------------------------- Comment By: Andy Gimblett (gimbo) Date: 2002-03-21 06:08 Message: Logged In: YES user_id=262849 OK, have updated docs as suggested by aimacintyre, attached as urllib_proxies_docs.cdiff I also added an example for explicit proxy specification, since it illustrates how the proxies dictionary should be structured. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-03-10 00:31 Message: Logged In: YES user_id=250749 I think expanding the docs is the go here. In looking at the 2.2 docs (11.4 urllib), the bits that I think could usefully be improved include:- - the paragraph describing the proxy environment variables should note that on Windows, browser (at least for InternetExplorer - I don't know about Netscape) registry settings for proxies will be used when available; - a short para noting that proxies can be overridden using URLopener/FancyURLopener class instances, documented further down the page, placed just before the note about not supporting authenticating proxies; - adding a description of the "proxies" parameter to the URLopener class definition; - adding an example of bypassing proxies to the examples subsection (11.4.2). If/when you upload a doc patch, I suggest that you assign it to Fred Drake, who is the chief docs person. ---------------------------------------------------------------------- Comment By: Andy Gimblett (gimbo) Date: 2002-03-04 04:33 Message: Logged In: YES user_id=262849 Thanks for feedback re: diffs. Have now found out about context diffs and attached new version - hope this is better. Regarding the patch itself, this arose out of a newbie question on c.l.py and I was reminded that this was an issue I'd come across in my early days too. Personally I'd never picked up the hint that you should use FancyURLopener directly. If preferred, I could have a go at patching the docs to make that clearer? 
---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-03-02 22:34 Message: Logged In: YES user_id=250749 BTW, the patch guidelines indicate a strong preference for context diffs with unified diffs a poor second. ---------------------------------------------------------------------- Comment By: Andrew I MacIntyre (aimacintyre) Date: 2002-03-02 22:32 Message: Logged In: YES user_id=250749 Having just looked at this myself, I can understand where you're coming from, however my reading between the lines of the docs is that if you care about the proxies then you are supposed to use urllib.FancyURLopener (or urllib.URLopener) directly. If this is the intent, the docs could be a little clearer about this. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=523415&group_id=5470 From noreply@sourceforge.net Thu Apr 4 22:25:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 14:25:53 -0800 Subject: [Patches] [ python-Patches-539486 ] build info docs from sources Message-ID: Patches item #539486, was opened at 2002-04-04 22:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539486&group_id=5470 Category: Documentation Group: Python 2.1.2 Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539486&group_id=5470 From noreply@sourceforge.net Thu Apr 4 22:26:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 14:26:45 -0800 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 22:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Thu Apr 4 23:45:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 04 Apr 2002 15:45:59 -0800 Subject: [Patches] [ python-Patches-514662 ] On the update_slot() behavior Message-ID: Patches item #514662, was opened at 2002-02-07 23:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=514662&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 6 Submitted By: Naofumi Honda (naofumi-h) Assigned to: Guido van Rossum (gvanrossum) Summary: On the update_slot() behavior Initial Comment: Inherited method __getitem__ of list type in the new subclass is unexpectedly slow. 
For example:

    x = list([1,2,3])
    r = xrange(1, 1000000)
    for i in r: x[1] = 2
    ==> execution time: real 0m2.390s

    class nlist(list): pass
    x = nlist([1,2,3])
    r = xrange(1, 1000000)
    for i in r: x[1] = 2
    ==> execution time: real 0m7.040s

about 3 times slower!!! The reason is: for the __getitem__ attribute, there are two slotdefs in typeobject.c (one for the mapping type, and the other for the sequence type). In the creation of a new subtype of the list type, the fixup_slot_dispatchers() and update_slot() functions in typeobject.c allocate the functions to both the sq_item and mp_subscript slots (the mp_subscript slot originally had no function, because the list type is a sequence type), and it's an unexpected allocation for the mapping slot since the descriptor type of __getitem__ is now WrapperType for the sequence operations. If you trace x[1] using gdb, you will find that in PyObject_GetItem() m->mp_subscript = slot_mp_subscript is called instead of a sequence operation, because the mp_subscript slot was allocated by fixup_slot_dispatchers(). In slot_mp_subscript(), call_method(self, "__getitem__", ...) is invoked, which turns out to call a wrapper descriptor for sq_item. As a result, the method of the list type is finally called, but it needs many unexpected function calls. I will fix the behavior of fixup_slot_dispatchers() and update_slot() as follows: only in the case where *) two or more slotdefs have the same attribute name where at most one corresponding slot has a non-null pointer, and *) the descriptor type of the attribute is WrapperType, will these functions allocate just one function, to the appropriate slot. In the other cases, the behavior is not changed, to keep compatibility! (in particular, considering the case where user-overridden methods exist!) The following patch also includes speed-up routines to find the slotdef duplications, but it's not essential! 
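[Editor's note: the benchmark above can be re-created roughly on modern Python (range replaces xrange); the class name NList is illustrative. Absolute timings vary by machine, and the gap has narrowed considerably since fixes like this patch landed.]

```python
import timeit

class NList(list):
    pass  # trivial subclass, as in the report's "class nlist(list): pass"

# Time repeated item access on a plain list versus the trivial subclass.
plain = timeit.timeit("x[1]", globals={"x": [1, 2, 3]}, number=200_000)
sub = timeit.timeit("x[1]", globals={"x": NList([1, 2, 3])}, number=200_000)

# Behavior is identical; only the dispatch path (and hence speed) differs.
assert NList([1, 2, 3])[1] == [1, 2, 3][1] == 2
print(f"plain list: {plain:.4f}s  subclass: {sub:.4f}s")
```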
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-04 18:45 Message: Logged In: YES user_id=6380 Thanks! Checked in, with much refactoring. ---------------------------------------------------------------------- Comment By: Naofumi Honda (naofumi-h) Date: 2002-03-23 03:40 Message: Logged In: YES user_id=452575 Yes. slot-1.dif is a new version. At least, I purged ifdef ... as you want. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-22 22:47 Message: Logged In: YES user_id=6380 Is slot-1.dif the promised new patch? ---------------------------------------------------------------------- Comment By: Naofumi Honda (naofumi-h) Date: 2002-03-11 21:49 Message: Logged In: YES user_id=452575 I will post a new patch containing an essential part of the previous one (i.e. without ifdef and almost all speed up routines). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-10 17:14 Message: Logged In: YES user_id=6380 Thanks for the analysis! Would you mind submitting a new patch without the #ifdef ORIGINAL_CODE stuff? Just delete/replace old code as needed -- cvs diff will show me the original code. The ORIGINAL_CODE stuff makes it harder for me to get the point of the diff. Also, maybe you could leave the speedup code out, to show the absolutely minimal amount of code needed. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=514662&group_id=5470 From noreply@sourceforge.net Fri Apr 5 15:18:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 07:18:20 -0800 Subject: [Patches] [ python-Patches-536578 ] patch for bug 462783 mmap bus error Message-ID: Patches item #536578, was opened at 2002-03-28 22:02 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536578&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Greg Green (gpgreen) >Assigned to: A.M. Kuchling (akuchling) Summary: patch for bug 462783 mmap bus error Initial Comment: This patch fixes SF 462783. The problem was that an mmap'ed file caused a bus error when reading data from the file. The root cause is that the file wasn't flushed following a write. The patched module will throw an OSError exception if the mmap object was created without being flushed, fseek'ed, or closed, following a write. This patch only applies to unix systems. Windows seems to handle the condition ok. The problem with the patch is that existing code can be broken. On some systems, (FreeBSD, irix), as long as the file was flushed before attempting to read from the mmap object, it would work with no bus error. Linux gets a bus error no matter what. So existing code that did flush (or fseek) before a read will now get an OSError exception during mmap creation instead. I tried this on the cvs version of python 2.3, on linux redhat 7.2, FreeBSD 4.5, irix 6.5 n32, and windows 2000. 
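[Editor's note: the platform guard that comes up later in this thread hinges on a classic Python pitfall: ('win32') is a parenthesized string, not a one-element tuple. A small self-contained illustration:]

```python
import sys

# ('win32') is just a string; the trailing comma makes a tuple.
assert ('win32') == 'win32'
assert ('win32',) != 'win32'

# Against a bare string, `in` performs substring matching:
assert 'win' in ('win32')       # True for the wrong reason
assert 'win' not in ('win32',)  # tuple membership, as intended

# The corrected guard — run the Unix-only check everywhere but win32:
if sys.platform not in ('win32',):
    pass  # the mmap flush check discussed above would run here
```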
-- Greg Green ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 10:18 Message: Logged In: YES user_id=6380 One comment from Andrew Dalke (who submitted bug 462783) about the patch: There's a small typo in the patch to test_mmap.py. Line 277 says ... not in ('win32'): when it should say ... not in ('win32', ): (Personally, I'd write ... != 'win32' or ... not in ['win32'] --GvR) Assigning to AMK since it's his module. ---------------------------------------------------------------------- Comment By: Greg Green (gpgreen) Date: 2002-03-29 13:49 Message: Logged In: YES user_id=499627 my email is gregory.p.green@boeing.com ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536578&group_id=5470 From noreply@sourceforge.net Fri Apr 5 18:59:55 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 10:59:55 -0800 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) >Assigned to: Martin v. Löwis (loewis) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS from 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. setlocale() will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding, so the example will work with non-ASCII characters in the filename, e.g. 
with umlauts in it. It bombed on them before:

    Traceback (most recent call last):
      File "tkFileDialog.py", line 105, in ?
        print "open", askopenfilename(filetypes=[("all filez", "*")])
    UnicodeError: ASCII encoding error: ordinal not in range(128)

open() will work with the string directly now. encode(enc) is only needed for terminal output, so we enhanced the example to show the two uses of the returned filename string separately. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in Tk: Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to Tk. 4.4.2002 Bernhard ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a conscious decision. BTW, how does Tcl come up with the names in the first place? 
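[Editor's note: the locale step at the heart of this patch can be sketched in a few lines of modern standard-library Python. The try/except is a portability hedge for minimal environments without the user's locale configured.]

```python
import locale

# As in the patch: switch from the default "C" locale to the user's
# settings, so the encoding Python derives can handle non-ASCII names.
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    pass  # minimal environments may lack a configured user locale

enc = locale.getpreferredencoding(False)
print("preferred encoding:", enc)
```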
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Fri Apr 5 19:08:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 11:08:46 -0800 Subject: [Patches] [ python-Patches-539043 ] Support PyChecker in IDLE Message-ID: Patches item #539043, was opened at 2002-04-04 03:28 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Nobody/Anonymous (nobody) Summary: Support PyChecker in IDLE Initial Comment: This patch adds SIMPLE support for pychecker in IDLE. It is not complete. It pops up a window, you can enter filenames (not even a file dialog!), and run pychecker. You cannot change examples. If someone wants to really integrate this, they should add the user interface in pychecker (pychecker/options.py), use a file dialog to enter files, and handle file modifications. Since pychecker imports the files, they need to be removed from sys.modules, so modifications will be seen. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-05 21:08 Message: Logged In: YES user_id=21627 I'm concerned about the copyright notice. "All rights reserved" means "you cannot copy it". Could you consider licensing this under the PSF license? 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 From noreply@sourceforge.net Fri Apr 5 19:24:43 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 11:24:43 -0800 Subject: [Patches] [ python-Patches-539043 ] Support PyChecker in IDLE Message-ID: Patches item #539043, was opened at 2002-04-03 20:28 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Neal Norwitz (nnorwitz) Assigned to: Nobody/Anonymous (nobody) Summary: Support PyChecker in IDLE Initial Comment: This patch adds SIMPLE support for pychecker in IDLE. It is not complete. It pops up a window, you can enter filenames (not even a file dialog!), and run pychecker. You cannot change examples. If someone wants to really integrate this, they should add the user interface in pychecker (pychecker/options.py), use a file dialog to enter files, and handle file modifications. Since pychecker imports the files, they need to be removed from sys.modules, so modifications will be seen. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-05 14:24 Message: Logged In: YES user_id=33168 No need to worry. Really, just want to have MetaSlash mentioned. But note, this patch is still incomplete and more work needs to be done before this should be accepted. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-05 14:08 Message: Logged In: YES user_id=21627 I'm concerned about the copyright notice. "All rights reserved" means "you cannot copy it". Could you consider licensing this under the PSF license? 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539043&group_id=5470 From noreply@sourceforge.net Fri Apr 5 19:38:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 11:38:46 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
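[Editor's note: the "usual Python code" quoted in the initial comment can be sketched as a standalone function; the name popitem_with_key is illustrative, not from the patch. Note that modern dicts offer d.pop(key), which serves the same use case but returns only the value.]

```python
def popitem_with_key(d, key):
    """Pure-Python equivalent of the proposed dict.popitem(key):
    fetch and delete the pair in one call (two hash lookups here,
    versus the single lookup of the C implementation)."""
    value = d[key]
    del d[key]
    return (key, value)

d = {"a": 1, "b": 2}
assert popitem_with_key(d, "a") == ("a", 1)
assert d == {"b": 2}

# dict.pop(key) covers the same need, returning just the value:
d2 = {"a": 1, "b": 2}
assert d2.pop("a") == 1
```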
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 20:11:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 12:11:12 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 21:10:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 13:10:35 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. 
Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 21:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 20:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 21:26:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 13:26:07 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. 
By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 21:38:23 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 13:38:23 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open >Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) >Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 5 21:47:18 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 05 Apr 2002 13:47:18 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? 
func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. 
----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 16:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 16:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 15:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Fri Apr 5 21:50:14 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 05 Apr 2002 13:50:14 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 14:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2002-04-05 16:50
Message: Logged In: YES user_id=31435

Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much of a hack.

As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key.

Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 16:47
Message: Logged In: YES user_id=6380

FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 16:38
Message: Logged In: YES user_id=6380

I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 16:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 16:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 15:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Sat Apr 6 17:23:23 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 09:23:23 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 19:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-06 17:23
Message: Logged In: YES user_id=80475

Q: Does the new function signature slow the existing no-argument case?
A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed.

Q: Is _, v = d.popitem(k) slower than v = d.popvalue(k)?
A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot.

Q: Are there cases where (k, v) is needed?
A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder = [u.setdefault(k, k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent". This pattern is used in validating user input keys for deletion.

Q: Where is the unittest and doc patch?
A: Coming this weekend.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-04-05 21:50
Message: Logged In: YES user_id=31435

Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much of a hack.

As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key.

Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:47
Message: Logged In: YES user_id=6380

FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res.
Repeating the PyTuple_New(2) call seemed the lesser evil.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:38
Message: Logged In: YES user_id=6380

I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 21:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 20:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:07:00 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:07:00 -0800
Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols
Message-ID:

Patches item #540394, was opened at 2002-04-07 01:07
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Tim Peters (tim_one)
Summary: Remove PyMalloc_* symbols

Initial Comment:
This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free.

Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:07:38 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:07:38 -0800
Subject: [Patches] [ python-Patches-536909 ] pymalloc for types and other cleanups
Message-ID:

Patches item #536909, was opened at 2002-03-29 21:11
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536909&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Deleted
>Resolution: Out of Date
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Neil Schemenauer (nascheme)
Summary: pymalloc for types and other cleanups

Initial Comment:
This patch changes typeobject to use pymalloc for managing the memory of subclassable types. It also fixes a bug that caused an interpreter built without GC to crash.

Testing this patch was a bitch. There are three knobs related to MM now (with-cycle-gc, with-pymalloc, and PYMALLOC_DEBUG). I think I found different bugs when testing with each possible combination.

There's one bit of ugliness in this patch. Extension module writers have to use _PyMalloc_Del to initialize the tp_free pointer. There should be a "public" function for that.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-31 07:11
Message: Logged In: YES user_id=31435

Neil, I appreciate the work! I'm afraid I screwed you at the same time. How do you want to proceed? I think "the plan" now is that we go back to the PyObject_XXX interface, and when pymalloc is enabled map most flavors of "free memory" (Py{Mem, Object}_{Del, DEL, Free, FREE}) to the pymalloc free. You're not required to work on this, but if you've got some spare energy I could sure use the help.

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-03-29 23:09
Message: Logged In: YES user_id=35752

I'm counting on Tim to finish PyMem_NukeIt.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-29 22:47
Message: Logged In: YES user_id=21627

I see another memory allocation family here: what function should objects allocated through PyType_GenericAlloc be released with? If you change the behaviour of PyType_GenericAlloc, all types in extensions written for 2.2 that use PyType_GenericAlloc will break, since they will still have PyObject_Del in their tp_free slot. I believe "families" should always be complete, so along with PyType_GenericAlloc goes PyType_GenericFree. If you want it fully backwards compatible, you need to introduce PyType_PyMallocAlloc...

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536909&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:08:38 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:08:38 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 19:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.
----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-07 01:08
Message: Logged In: YES user_id=80475

The tests and documentation patches have been added.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-06 17:23
Message: Logged In: YES user_id=80475

Q: Does the new function signature slow the existing no-argument case?
A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed.

Q: Is _, v = d.popitem(k) slower than v = d.popvalue(k)?
A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot.

Q: Are there cases where (k, v) is needed?
A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder = [u.setdefault(k, k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent". This pattern is used in validating user input keys for deletion.

Q: Where is the unittest and doc patch?
A: Coming this weekend.
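[Editorial sketch: the setdefault idiom Raymond mentions runs unchanged in ordinary Python and illustrates the "key used inside an expression" pattern; `uniq_in_order` is just his one-liner wrapped in a hypothetical function for testing.]

```python
def uniq_in_order(alist):
    # Keep the first occurrence of each item, preserving input order.
    # The dict u records items already seen; setdefault(k, k) returns
    # the stored key (identical to k here) while also inserting it.
    # The `if k not in u` test runs before setdefault for each k.
    u = {}
    return [u.setdefault(k, k) for k in alist if k not in u]

result = uniq_in_order(['b', 'a', 'b', 'c', 'a'])   # ['b', 'a', 'c']
```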
----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-04-05 21:50
Message: Logged In: YES user_id=31435

Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much of a hack.

As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key.

Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:47
Message: Logged In: YES user_id=6380

FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:38
Message: Logged In: YES user_id=6380

I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 21:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 20:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:14:55 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:14:55 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 19:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.

----------------------------------------------------------------------

>Comment By: Neil Schemenauer (nascheme)
Date: 2002-04-07 01:14
Message: Logged In: YES user_id=35752

I think this should be implemented as pop() instead:

    D.pop([key]) -> value -- remove and return value by key (default a random value)

It makes no sense to return the key when you already have it. pop() also matches well with list pop():

    L.pop([index]) -> item -- remove and return item at index (default last)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-07 01:08
Message: Logged In: YES user_id=80475

The tests and documentation patches have been added.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-06 17:23
Message: Logged In: YES user_id=80475

Q: Does the new function signature slow the existing no-argument case?
A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed.

Q: Is _, v = d.popitem(k) slower than v = d.popvalue(k)?
A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot.

Q: Are there cases where (k, v) is needed?
A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder = [u.setdefault(k, k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent". This pattern is used in validating user input keys for deletion.

Q: Where is the unittest and doc patch?
A: Coming this weekend.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-04-05 21:50
Message: Logged In: YES user_id=31435

Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much of a hack.

As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key.

Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:47
Message: Logged In: YES user_id=6380

FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:38
Message: Logged In: YES user_id=6380

I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 21:26
Message: Logged In: YES user_id=6380

Now, if you could also upload a unittest and a doc patch, that would be great!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-05 21:10
Message: Logged In: YES user_id=80475

Context diff uploaded as poppatch.c below.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-05 20:11
Message: Logged In: YES user_id=6380

Please upload a context or unified diff.

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

From noreply@sourceforge.net Sun Apr 7 01:16:55 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 06 Apr 2002 17:16:55 -0800
Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None)
Message-ID:

Patches item #539949, was opened at 2002-04-05 14:38
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
>Resolution: Remind
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Guido van Rossum (gvanrossum)
Summary: dict.popitem(key=None)

Initial Comment:
This patch implements the feature request at
http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470
which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key.

The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty.

The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-06 20:16
Message: Logged In: YES user_id=6380

Not a bad idea, Neil! Care to work the code around to implement that?

----------------------------------------------------------------------

Comment By: Neil Schemenauer (nascheme)
Date: 2002-04-06 20:14
Message: Logged In: YES user_id=35752

I think this should be implemented as pop() instead:

    D.pop([key]) -> value -- remove and return value by key (default a random value)

It makes no sense to return the key when you already have it. pop() also matches well with list pop():

    L.pop([index]) -> item -- remove and return item at index (default last)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-04-06 20:08
Message: Logged In: YES user_id=80475

The tests and documentation patches have been added.
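[Editorial sketch: Neil's proposed signature, quoted in the comments above, can likewise be modeled as a pure-Python helper. `dict_pop` is hypothetical; at the time of this thread no dict.pop method existed.]

```python
def dict_pop(d, *args):
    """Sketch of the proposed D.pop([key]) -> value.

    With a key: remove it and return its value, raising KeyError if
    the key is absent.  With no argument: remove and return an
    arbitrary value, mirroring popitem() but discarding the key.
    """
    if not args:
        return d.popitem()[1]
    (key,) = args
    value = d[key]      # raises KeyError if the key is missing
    del d[key]
    return value

d = {'a': 1, 'b': 2}
v = dict_pop(d, 'b')   # 2; d is now {'a': 1}
```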
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 12:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no argument case? A: Yes. The function is already so fast, that the small overhead of PyArg_ParseTuple is measurable. My timing shows a 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existing strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 16:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. 
So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. 
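The signature under debate above can be sketched in pure Python. This is illustrative only (the actual patch does this in C inside dict_popitem, with a single hash lookup rather than two); the helper name is hypothetical:

```python
def popitem_with_key(d, key=None):
    """Pure-Python model of the proposed dict.popitem([key]).

    With no argument it behaves like today's popitem(), removing and
    returning an arbitrary (key, value) pair.  With a key argument it
    removes and returns that specific pair, raising KeyError if the
    key is absent.
    """
    if key is None:
        return d.popitem()   # arbitrary pair, unchanged behavior
    value = d[key]           # raises KeyError if the key is missing
    del d[key]
    return (key, value)
```

For example, `popitem_with_key({"a": 1, "b": 2}, "a")` returns `("a", 1)` and leaves only `"b"` in the dict, which is exactly the `value = d[key]; del d[key]; return (key, value)` idiom from the initial comment collapsed into one call.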
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Sun Apr 7 01:51:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 17:51:45 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This ensures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-07 01:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last) ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 01:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 17:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no-argument case? A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. 
Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 21:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 21:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 20:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Sun Apr 7 03:40:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 18:40:56 -0800 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Sun Apr 7 03:41:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 18:41:27 -0800 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Sun Apr 7 03:59:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 18:59:13 -0800 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Guido van Rossum (gvanrossum) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). 
I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Sun Apr 7 04:00:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 06 Apr 2002 19:00:08 -0800 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This ensures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. 
---------------------------------------------------------------------- >Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 03:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version: since k isn't being returned, an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is off of 2.123. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-07 01:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last) ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 01:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. 
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 17:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no-argument case? A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 21:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. 
So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 21:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 20:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. 
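Neil's counter-proposal, and Raymond's fleshed-out version with a required key, can be modelled in a few lines of pure Python. This is only a sketch of the semantics under discussion (the actual patch is C inside dictobject.c); note that the dict.pop() that modern Python eventually grew also accepts an optional default, a refinement not part of this thread:

```python
def dict_pop(d, key):
    """Model of the D.pop(key) being discussed: remove key from d and
    return its value, raising KeyError if the key is absent.  The key
    argument is required, per Raymond's version: since the key is not
    returned, popping an arbitrary pair makes no sense here.
    """
    value = d[key]   # KeyError propagates if the key is missing
    del d[key]
    return value
```

Compared with popitem(key), this avoids building a 2-tuple that merely hands the caller back a key it already has, which was Tim's main objection to the (k, v) return.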
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Sun Apr 7 15:36:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 07:36:40 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to open the Python Docs in HTML Help format if they become part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired. - I assume that the .chm file resides in the same directory as the Python executable. - I'll try to upload a similar patch on idlefork. 
Regards, -Hernan ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 7 17:15:02 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 09:15:02 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 10:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) >Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to open the Python Docs in HTML Help format if they become part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired. - I assume that the .chm file resides in the same directory as the Python executable. - I'll try to upload a similar patch on idlefork. 
Regards, -Hernan ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 7 18:06:54 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 10:06:54 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to open the Python Docs in HTML Help format if they become part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired. - I assume that the .chm file resides in the same directory as the Python executable. - I'll try to upload a similar patch on idlefork. Regards, -Hernan ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-07 19:06 Message: Logged In: YES user_id=21627 IMO, it would be good if it fell back to HTML help if the chm file is not found. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 7 18:10:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 10:10:56 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to open the Python Docs in HTML Help format if they become part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired. - I assume that the .chm file resides in the same directory as the Python executable. - I'll try to upload a similar patch on idlefork. Regards, -Hernan ---------------------------------------------------------------------- >Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-07 19:10 Message: Logged In: YES user_id=112690 Ok. I can add the fallback. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-07 19:06 Message: Logged In: YES user_id=21627 IMO, it would be good if it fell back to HTML help if the chm file is not found. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 7 19:20:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 11:20:20 -0700 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Martin v. Löwis (loewis) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS from 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. setlocale() will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding, so the example will work with non-ASCII characters in the filename, e.g. with umlauts in it. It bombed on them before: Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. encode(enc) is only needed for terminal output, so we enhanced the example to show the two uses of the returned filename string separately. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk. 
Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard ---------------------------------------------------------------------- >Comment By: Bernhard Reiter (ber) Date: 2002-04-07 20:20 Message: Logged In: YES user_id=113859 I agree with your analysis that the application has to set the locale if it wants to support non-ASCII filenames. This is why we fixed the _test_ code to demonstrate exactly this. The code of the modules itself is untouched. If you do not fix the _test_ code it will bomb on non-ASCII file names. Our code also demonstrates that there might be a difference between the file system encoding (suitable for open) and the terminal encoding (suitable for printing). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a conscious decision. BTW, how does Tcl come up with the names in the first place? 
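The pattern both sides agree on, that the application (not the library) calls setlocale() so the encoding comes from the user's environment, looks roughly like this. This is an illustrative sketch in modern Python, not the Python 2.2-era test code from the patch; the filename is a made-up example:

```python
import locale

# The application opts in: honour the user's LANG/LC_* settings so the
# codeset is taken from the environment instead of defaulting to ASCII.
locale.setlocale(locale.LC_ALL, "")

# Codeset suitable for terminal output (e.g. "UTF-8" on most systems).
enc = locale.getpreferredencoding(False)

# A non-ASCII filename of the kind that used to bomb with
# "UnicodeError: ASCII encoding error".
filename = "\u00fcbung.txt"

# open() can take the string directly; encoding is only needed when the
# name must be rendered as bytes, e.g. for display on a legacy terminal.
shown = filename.encode(enc, "replace")
```

The separation mirrors Bernhard's point: the filesystem encoding (used implicitly by open()) and the terminal encoding (used explicitly via encode(enc)) are two distinct things.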
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Sun Apr 7 22:54:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 07 Apr 2002 14:54:17 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.
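[Editor's note] The equivalence claimed in the initial comment can be modeled in pure Python. This is a sketch only -- the C patch does the removal with a single lookup, while this model needs two -- and `popitem_with_key` is an illustrative name, not part of the patch:

```python
def popitem_with_key(d, key):
    """Model of the proposed d.popitem(key): remove key from d and
    return the (key, value) pair, raising KeyError if key is absent."""
    value = d[key]   # first lookup (raises KeyError if missing)
    del d[key]       # second lookup -- the C patch needs only one
    return (key, value)

d = {"a": 1, "b": 2}
assert popitem_with_key(d, "a") == ("a", 1)
assert d == {"b": 2}
```

The pair form matters in idioms such as `dict([d.popitem(k) for k in xferlist])`, which Raymond's later Q&A in this thread gives as a use case.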
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one) Date: 2002-04-07 17:54 Message: Logged In: YES user_id=31435 I like Raymond's new pop(). Problems: + "speficied" is misspelled in the docstring. + Should be declared METH_O, not METH_VARARGS (mimic how, e.g., dict_update is set up). + The decrefs have to be reworked: a decref can trigger calls back into arbitrary Python code, due to __del__ methods getting invoked. This means you can never leave any live object in an insane or inconsistent state *during* a decref. What you need to do instead is first capture the key and value into local variables, plug dummy and NULL into the dict slot, and decrement the used count. This leaves the dict in a consistent state again. Only then is it safe to decref the key and value.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 22:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version: since k isn't being returned, an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is against 2.123.
----------------------------------------------------------------------
Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-)
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-06 20:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that?
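[Editor's note] Tim's decref-ordering point can be demonstrated from pure Python: a `__del__` hook runs arbitrary code in the middle of a deletion and can re-enter the dict, which is why the C code must restore the dict to a consistent state *before* decref'ing. This sketch illustrates the hazard (it assumes CPython's immediate refcounting semantics), not the patch's C code:

```python
log = []

class Noisy:
    """An object whose __del__ re-enters the dict it lives in."""
    def __init__(self, d):
        self.d = d
    def __del__(self):
        # Arbitrary Python running *during* the delete; the dict
        # must already be in a consistent state at this point.
        log.append(sorted(self.d))

d = {}
d["a"] = Noisy(d)
d["b"] = 1
del d["a"]               # the decref fires __del__, which reads d
assert log == [["b"]]    # "a" was already removed before the decref ran
```

CPython's delitem follows exactly the order Tim describes: capture the value, remove the entry, then decref -- which is why `__del__` here sees a dict that no longer contains "a".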
----------------------------------------------------------------------
Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last)
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 20:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 12:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no-argument case? A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existent strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression.
I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 16:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. 
I'll check it in as soon as you submit a unittest and doc patch.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great!
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Mon Apr 8 15:14:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 07:14:46 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None)
----------------------------------------------------------------------
>Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-08 14:14 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() incorporating Tim's ideas: + Docstring spelling fixed + Switched to METH_O instead of METH_VARARGS + Delayed decref until the dict entry is in a consistent state + Removed unused int i=0 variable + Tabs replaced with spaces
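[Editor's note] Neil's proposed `D.pop([key]) -> value` semantics from this thread (essentially the `dict.pop` that landed in Python 2.3) can be modeled with a subclass. `PopDict` and `_missing` are illustrative names only, not from the patch:

```python
class PopDict(dict):
    """Pure-Python model of the proposed D.pop([key]) -> value."""
    _missing = object()  # sentinel so None can be a real key

    def pop(self, key=_missing):
        if key is self._missing:       # no key given: like popitem(),
            k, v = dict.popitem(self)  # but return only the value
            return v
        value = self[key]              # KeyError if absent
        del self[key]
        return value

d = PopDict(a=1, b=2)
assert d.pop("a") == 1       # value only -- the caller already has the key
assert d.pop() == 2          # only "b" is left, so its value comes back
assert d == {}
```

Returning just the value, as Neil argues, mirrors `list.pop([index])` and avoids allocating a 2-tuple that repeats a key the caller supplied.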
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Mon Apr 8 15:25:42 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 07:25:42 -0700 Subject: [Patches] [ python-Patches-541031 ] context sensitive help/keyword search Message-ID: Patches item #541031, was opened at 2002-04-08 16:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541031&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Nobody/Anonymous (nobody) Summary: context sensitive help/keyword search Initial Comment: This script/module looks up keywords in the Python manuals. It is usable as a CGI script - a version is online at http://starship.python.net/crew/theller/cgi-bin/pyhelp.cgi It can also be run from the command line: python pyhelp.py keyword It can also be used to implement context-sensitive help in IDLE or Xemacs (for example) by simply selecting a word and pressing F1. It can use the online version of the manuals at www.python.org/doc/, or it can use locally installed HTML pages. The script/module scans the index pages of the docs for hyperlinks, and pickles the results to disk.
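[Editor's note] The scan-and-pickle approach the comment describes might look like the sketch below. It uses modern stdlib modules and invented names (`IndexScanner`, the sample page); the real pyhelp.py is not reproduced here:

```python
import pickle
from html.parser import HTMLParser

class IndexScanner(HTMLParser):
    """Collect keyword -> target-URL pairs from an index page's links."""
    def __init__(self):
        super().__init__()
        self.links = {}      # keyword text -> href target
        self._href = None    # href of the <a> tag currently open

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # Text inside an open <a ...> becomes the lookup keyword.
        if self._href and data.strip():
            self.links[data.strip()] = self._href
            self._href = None

page = '<a href="lib/keyword.html">keyword</a> <a href="ref/print.html">print</a>'
scanner = IndexScanner()
scanner.feed(page)
blob = pickle.dumps(scanner.links)   # the real script caches this to disk
assert pickle.loads(blob)["print"] == "ref/print.html"
```

A command-line front end would then be a pickle load plus a single dict lookup, which is what makes the F1-style context help fast.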
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541031&group_id=5470 From noreply@sourceforge.net Mon Apr 8 16:00:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 08:00:56 -0700 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None Status: Open Resolution: None Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Martin v. Löwis (loewis) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS form 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. Setlocate will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding thus the example will work with a non ascii characters in the filename, e.g. with umlauts in it. It bombed on them before. Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. encode(enc) is only needed for terminal output, thus we enchanced the example to show the two uses of the returned filename string separatly. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk. 
Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard Bernhard ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-08 17:00 Message: Logged In: YES user_id=21627 Sorry, I misinterpreted your patch first. I agree with your distinction of a file system encoding, and a terminal encoding; I still hope to enhance Python to expose an estimate of both - then leaving it to the application to make use of either as appropriate (the file system encoding would be used implicitly as is done today). As for the flaw in Tk: it turns out that Tcl has a different notion of the default encoding than Python - Tcl always uses a locale-aware default encoding, whereas Python has a system-wide fixed default encoding (usually ASCII). It is a good thing that Tkinter manages to represent file names correctly (i.e. as Unicode strings) in most cases - if you want to get the file name in the encoding in which the file system gave it to you, you need to establish the value of Tcl's "encoding system" command. Committed as tkFileDialog.py 1.7. ---------------------------------------------------------------------- Comment By: Bernhard Reiter (ber) Date: 2002-04-07 20:20 Message: Logged In: YES user_id=113859 I agree with your analysis that the appplication has to set the locale, if it wants to support non-ASCII filenames. This is why we fixed the _test_ code to demonstrate exactly this. The code of the modules itself is untouched. If you do not fix the _test_ code it will bomb on non-ascii file names. Our code also demonstrates that there might be a difference in the file system encoding (suitable for open) and the terminal encoding (suitable for printing). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. 
If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a concious decision. BTW, how does Tcl come up with the names in the first place? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Mon Apr 8 16:01:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 08:01:40 -0700 Subject: [Patches] [ python-Patches-539392 ] Unicode fix for test in tkFileDialog.py Message-ID: Patches item #539392, was opened at 2002-04-04 20:59 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 Category: Tkinter Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Bernhard Reiter (ber) Assigned to: Martin v. Löwis (loewis) Summary: Unicode fix for test in tkFileDialog.py Initial Comment: Patch is against current CVS form 20020404. It also gives pointers to the problem described in http://mail.python.org/pipermail/python-list/2001-June/048787.html Python's open() uses the Py_FileSystemDefaultEncoding. Py_FileSystemDefaultEncoding is NULL (bltinmodule.c) for most systems. Setlocate will set it. Thus we fixed the example and set the locale to the user defaults. Now "enc" will have a useful encoding thus the example will work with a non ascii characters in the filename, e.g. with umlauts in it. It bombed on them before. Traceback (most recent call last): File "tkFileDialog.py", line 105, in ? print "open", askopenfilename(filetypes=[("all filez", "*")]) UnicodeError: ASCII encoding error: ordinal not in range(128) open() will work with the string directly now. 
encode(enc) is only needed for terminal output, thus we enchanced the example to show the two uses of the returned filename string separatly. (It might be interesting to drop a note about this in the right part of the user documentation.) If you comment out the setlocale() you can see that open fails, which illustrates what seems to be a design flaw in tk. Tk should be able to give you a string in exactly the encoding in which the filesystem gave it to tk. 4.4.2002 Bernhard Bernhard ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-08 17:00 Message: Logged In: YES user_id=21627 Sorry, I misinterpreted your patch first. I agree with your distinction of a file system encoding, and a terminal encoding; I still hope to enhance Python to expose an estimate of both - then leaving it to the application to make use of either as appropriate (the file system encoding would be used implicitly as is done today). As for the flaw in Tk: it turns out that Tcl has a different notion of the default encoding than Python - Tcl always uses a locale-aware default encoding, whereas Python has a system-wide fixed default encoding (usually ASCII). It is a good thing that Tkinter manages to represent file names correctly (i.e. as Unicode strings) in most cases - if you want to get the file name in the encoding in which the file system gave it to you, you need to establish the value of Tcl's "encoding system" command. Committed as tkFileDialog.py 1.7. ---------------------------------------------------------------------- Comment By: Bernhard Reiter (ber) Date: 2002-04-07 20:20 Message: Logged In: YES user_id=113859 I agree with your analysis that the appplication has to set the locale, if it wants to support non-ASCII filenames. This is why we fixed the _test_ code to demonstrate exactly this. The code of the modules itself is untouched. If you do not fix the _test_ code it will bomb on non-ascii file names. 
Our code also demonstrates that there might be a difference in the file system encoding (suitable for open) and the terminal encoding (suitable for printing). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-04 22:16 Message: Logged In: YES user_id=21627 I think this patch is not acceptable. If the application wants to support non-ASCII file names, it must invoke setlocale(); it is not the library's responsibility to make this decision behind the application's back. People question the validity of using CODESET in the file system, so each developer needs to make a concious decision. BTW, how does Tcl come up with the names in the first place? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539392&group_id=5470 From noreply@sourceforge.net Mon Apr 8 17:46:45 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 09:46:45 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php? func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove and particular key/value pair from a dictionary. 
By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3. Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-08 12:46 Message: Logged In: YES user_id=31435 Getting closer! Two more questions: + Why switch from tabs to spaces? The rest of this file uses hard tabs, and that's what Guido prefers in C source. + Think hard about whether we really want to decref the value -- I doubt we do, as we're *transferring* ownership of the value from the dict to the caller. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-08 10:14 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() incorporating Tim's ideas: + Docstring spelling fixed + Switched to METH_O instead of METH_VARARGS + Delayed decref until dict entry in consistent state + Removed unused int i=0 variable + Tabs replaced with spaces ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 17:54 Message: Logged In: YES user_id=31435 I like Raymond's new pop(). Problems: + "speficied" is misspelled in the docstring. + Should be declared METH_O, not METH_VARARGS (mimic how, e.g., dict_update is set up). 
+ The decrefs have to be reworked: a decref can trigger calls back into arbitrary Python code, due to __del__ methods getting invoked. This means you can never leave any live object in an insane or inconsistent state *during* a decref. What you need to do instead is first capture the key and value into local vrbls, plug dummy and NULL in to the dict slot, and decrement the used count. This leaves the dict in a consistent state again. Only then is it safe to decref the key and value. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 22:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version, since k isn't being returned, then an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is off of 2.123. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promply deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-06 20:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that? 
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead:
D.pop([key]) -> value -- remove and return value by key (default a random value)
It makes no sense to return the key when you already have it. pop() also matches well with list pop():
L.pop([index]) -> item -- remove and return item at index (default last)
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 20:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 12:23 Message: Logged In: YES user_id=80475
Q: Does the new function signature slow the existing no-argument case?
A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed.
Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)?
A: Yes. Though popvalue is a non-existing strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot.
Q: Are there cases where (k,v) is needed?
A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression.
I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 16:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. 
I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Mon Apr 8 19:47:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 11:47:11 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Guido van Rossum (gvanrossum) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me.
The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm.
I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake . Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Mon Apr 8 20:18:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 12:18:07 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Guido van Rossum (gvanrossum) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. 
The next phase would be to cleanup the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 15:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh? "destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. 
> That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... 
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatability" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatability has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatability than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake . Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. I expect TMTOWTDI here forever. 
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Mon Apr 8 21:35:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 13:35:24 -0700 Subject: [Patches] [ python-Patches-541210 ] build info docs from tex sources Message-ID: Patches item #541210, was opened at 2002-04-08 20:35 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541210&group_id=5470 Category: Documentation Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch (same as for 2.2) adds Milan Zamazal's conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541210&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:08:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:08:38 -0700 Subject: [Patches] [ python-Patches-523424 ] Finding "home" in "user.py" for Windows Message-ID: Patches item #523424, was opened at 2002-02-27 16:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=523424&group_id=5470 Category: Modules Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Gilles Lenfant (glenfant) Assigned to: Nobody/Anonymous (nobody) >Summary: Finding "home" in "user.py" for Windows Initial Comment: On my win2k French box + python 2.1.2:
>>> import user
>>> user.home
'C:\'
This isn't a great issue, but it means that all users of this win2k box will share the same ".pythonrc.py". The code provided by Jeff Bauer can be changed easily because the standard Python distro now has a "_winreg" module. This patch gives the real user's $HOME-like folder for any user on any Windows localization:
>>> import user
>>> user.home
u'C:\Documents and Settings\MyWindowsUsername\Mes documents'
This has been successfully tested with Win98 and Win2000. This should be tested on XP, NT4, and 95 but I can't. Sorry for the "context or unified diffs" (dunno what it means) but the module is short and my patch is clearly emphasized. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-18 09:42 Message: Logged In: YES user_id=21627 If there are no further comments in favour of accepting this patch, it will be rejected. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-02-27 23:13 Message: Logged In: YES user_id=21627 If it returns "My Documents", it is definitely *not* the home directory of the user; \Documents and Settings\username would be the home directory. Furthermore, on many installations, HOME *is* set, and it is the Administrator's choice where that points to; the typical installation (in a domain) indeed is to assign HOMEDRIVE. So I'm not in favour of that change. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=523424&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:16:57 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:16:57 -0700 Subject: [Patches] [ python-Patches-541210 ] build info docs from tex sources Message-ID: Patches item #541210, was opened at 2002-04-08 16:35 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541210&group_id=5470 Category: Documentation Group: Python 2.3 >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch (same as for 2.2) adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:16 Message: Logged In: YES user_id=3066 This is a duplicate patch; a duplicate report is not needed. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541210&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:19:31 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:19:31 -0700 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 17:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:19 Message: Logged In: YES user_id=3066 I'll add a note here just in case: This patch applies to the 2.3 development as well as 2.2 maintenance tree. This still seems to suffer the problems that all versions of this conversion have suffered; it isn't portable between FSF Emacs and XEmacs. I'll see about installing FSF Emacs to see if it'll work for me there. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:24:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:24:11 -0700 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 17:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:24 Message: Logged In: YES user_id=3066 For the record, here's the specific errors I get when using XEmacs with this patch on the current release22-maint branch (hopefully SF won't munge them too badly): grendel(.../r22-maint/Doc); make EMACS=xemacs info cd info && make make[1]: Entering directory `/home/fdrake/projects/python/r22-maint/Doc/info' ../tools/mkinfo ../api/api.tex python-api.info xemacs -batch -q --no-site-file -l /home/fdrake/projects/python/r22-maint/Doc/tools/py2texi.el --eval (setq py2texi-dirs '("./" "../texinputs/" "/home/fdrake/projects/python/r22-maint/Doc/api")) --eval (py2texi "/home/fdrake/projects/python/r22-maint/Doc/api/api.tex") -f kill-emacs Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/aspell-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/mew-init.el... 
Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/psgml-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/xemacs-po-mode-init.el... Mark set Args out of range: 72, 132 xemacs exiting. make[1]: *** [python-api.info] Error 255 make[1]: Leaving directory `/home/fdrake/projects/python/r22-maint/Doc/info' make: *** [info] Error 2 ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:19 Message: Logged In: YES user_id=3066 I'll add a note here just in case: This patch applies to the 2.3 development as well as 2.2 maintenance tree. This still seems to suffer the problems that all versions of this conversion have suffered; it isn't portable between FSF Emacs and XEmacs. I'll see about installing FSF Emacs to see if it'll work for me there. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Mon Apr 8 22:30:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 08 Apr 2002 14:30:06 -0700 Subject: [Patches] [ python-Patches-512005 ] getrusage() returns struct-like object. Message-ID: Patches item #512005, was opened at 2002-02-02 03:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=512005&group_id=5470 Category: Modules Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Kirill Simonov (kirill_simonov) Assigned to: Nobody/Anonymous (nobody) Summary: getrusage() returns struct-like object. Initial Comment: The function resource.getrusage() now returns struct-like object (cf. os.stat() and time.gmtime()). This is my first patch for Python so please don't scorch me if something is wrong ;). ---------------------------------------------------------------------- >Comment By: Martin v. 
Löwis (loewis) Date: 2002-04-08 23:30 Message: Logged In: YES user_id=21627 Thanks for the patch, applied as libresource.tex 1.17 ACKS 1.165 NEWS 1.382 resource.c 2.24 ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=512005&group_id=5470 From noreply@sourceforge.net Tue Apr 9 09:05:28 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 01:05:28 -0700 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 22:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazal's conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Matthias Klose (doko) Date: 2002-04-09 08:05 Message: Logged In: YES user_id=60903 Yes, I forgot to mention that Milan said it only works with Emacs. I built the info docs using emacs-21.2. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr.
(fdrake) Date: 2002-04-08 21:24 Message: Logged In: YES user_id=3066 For the record, here's the specific errors I get when using XEmacs with this patch on the current release22-maint branch (hopefully SF won't munge them too badly): grendel(.../r22-maint/Doc); make EMACS=xemacs info cd info && make make[1]: Entering directory `/home/fdrake/projects/python/r22-maint/Doc/info' ../tools/mkinfo ../api/api.tex python-api.info xemacs -batch -q --no-site-file -l /home/fdrake/projects/python/r22-maint/Doc/tools/py2texi.el --eval (setq py2texi-dirs '("./" "../texinputs/" "/home/fdrake/projects/python/r22-maint/Doc/api")) --eval (py2texi "/home/fdrake/projects/python/r22-maint/Doc/api/api.tex") -f kill-emacs Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/aspell-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/mew-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/psgml-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/xemacs-po-mode-init.el... Mark set Args out of range: 72, 132 xemacs exiting. make[1]: *** [python-api.info] Error 255 make[1]: Leaving directory `/home/fdrake/projects/python/r22-maint/Doc/info' make: *** [info] Error 2 ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 21:19 Message: Logged In: YES user_id=3066 I'll add a note here just in case: This patch applies to the 2.3 development as well as 2.2 maintenance tree. This still seems to suffer the problems that all versions of this conversion have suffered; it isn't portable between FSF Emacs and XEmacs. I'll see about installing FSF Emacs to see if it'll work for me there. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Tue Apr 9 14:39:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 06:39:08 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 19:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: Remind Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code:

    value = d[key]
    del d[key]
    return (key, value)

which now becomes:

    return d.popitem(key)

There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This ensures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections:
1. Replacement code for dict_popitem in dictobject.c
2. Replacement docstring for popitem in dictobject.c
3. Replacement registration line for popitem in dictobject.c
4. Sample Python test code.
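[Editorial note: the two-lookup idiom in the initial comment and the single popitem(key) call it collapses to can be modeled in pure Python. This is an illustrative stand-in for the C patch, not the patch itself -- the real dict_popitem does both lookups as one:

```python
def popitem_with_key(d, key):
    """Model of the proposed d.popitem(key): remove key, return (key, value)."""
    value = d[key]    # lookup #1
    del d[key]        # lookup #2 -- the C patch collapses both into one
    return (key, value)

d = {'x': 10, 'y': 20}
assert popitem_with_key(d, 'x') == ('x', 10)
assert d == {'y': 20}
```

The returned (key, value) pair feeds directly into d.__setitem__ or a dict(itemlist) constructor, per the use cases discussed below.]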
---------------------------------------------------------------------- >Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-09 13:39 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() with hard tabs and corrected reference counts. In a DEBUG build, I validated the ref counts against the equivalent steps: vv=d[k]; del d[k]. And, after Tim's suggestions, the code is fast and light. In addition to d.pop(k), GvR's patch for d.popitem(k) should also go in. The (k,v) return value feeds directly into d.__setitem__ or a dict(itemlist) constructor (see the code fragments in the 4/6/02 post). The only downside is the time to process METH_VARARGS. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-08 16:46 Message: Logged In: YES user_id=31435 Getting closer! Two more questions:
+ Why switch from tabs to spaces? The rest of this file uses hard tabs, and that's what Guido prefers in C source.
+ Think hard about whether we really want to decref the value -- I doubt we do, as we're *transferring* ownership of the value from the dict to the caller.
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-08 14:14 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() incorporating Tim's ideas:
+ Docstring spelling fixed
+ Switched to METH_O instead of METH_VARARGS
+ Delayed decref until the dict entry is in a consistent state
+ Removed unused int i=0 variable
+ Tabs replaced with spaces
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 21:54 Message: Logged In: YES user_id=31435 I like Raymond's new pop(). Problems:
+ "speficied" is misspelled in the docstring.
+ Should be declared METH_O, not METH_VARARGS (mimic how, e.g., dict_update is set up).
+ The decrefs have to be reworked: a decref can trigger calls back into arbitrary Python code, due to __del__ methods getting invoked. This means you can never leave any live object in an insane or inconsistent state *during* a decref. What you need to do instead is first capture the key and value into local variables, plug dummy and NULL into the dict slot, and decrement the used count. This leaves the dict in a consistent state again. Only then is it safe to decref the key and value. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 03:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version: since k isn't being returned, an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is off of 2.123. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-07 01:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that?
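Tim's warning about decref ordering can be felt even from pure Python: in CPython, a __del__ hook runs as soon as the last reference disappears, so it can observe the dict while the deletion is in progress. A small demonstration (CPython-specific, since it relies on immediate refcount-based finalization):

```python
observed = []

class Peeker:
    def __init__(self, d):
        self.d = d
    def __del__(self):
        # Arbitrary Python code runs here, during the decref of the
        # value being removed; it can inspect the dict being modified.
        observed.append(len(self.d))

d = {}
d['k'] = Peeker(d)
del d['k']   # the entry is removed *before* the value is decref'ed,
             # so __del__ sees a consistent, already-empty dict
print(observed)   # [0]
```

This is exactly the invariant Tim describes: the C code must put the dict back into a sane state (dummy key, NULL value, decremented used count) before the decrefs, because those decrefs can re-enter arbitrary Python.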
---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-07 01:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last) ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-07 01:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 17:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no argument case? A: Yes. The function is already so fast, that the small overhead of PyArg_ParseTuple is measurable. My timing shows a 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existing strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc. Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. 
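Neil's D.pop([key]) proposal above is essentially what later shipped as dict.pop, except that the adopted API takes an optional *default* rather than falling back to an arbitrary value; the list.pop parallel he draws holds as well:

```python
L = [10, 20, 30]
assert L.pop() == 30          # default: last item
assert L.pop(0) == 10         # or by index

d = {'a': 1, 'b': 2}
assert d.pop('a') == 1        # remove and return the value for 'a'
assert 'a' not in d
assert d.pop('z', 99) == 99   # as adopted: a default, not a random value
```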
I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 21:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. 
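Raymond's uniqInOrder idiom above, written out: the setdefault call both records the key in u and supplies the list element, so each key survives only on its first appearance:

```python
alist = ['b', 'a', 'b', 'c', 'a']
u = {}
uniq_in_order = [u.setdefault(k, k) for k in alist if k not in u]
print(uniq_in_order)   # ['b', 'a', 'c']
```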
I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 21:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 21:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 20:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Tue Apr 9 15:17:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 07:17:47 -0700 Subject: [Patches] [ python-Patches-539487 ] build info docs from tex sources Message-ID: Patches item #539487, was opened at 2002-04-04 17:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 Category: Documentation Group: Python 2.2.x >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from tex sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. 
(fdrake) Date: 2002-04-09 10:17 Message: Logged In: YES user_id=3066 I just installed emacs 20.7 'cause those are the RPMs that came with the distro I have on this box (RedHat 7.2), and that produced a similar error. I'll have to ask that a more robust patch be available before I can spend more time on it; this one will be marked as rejected. Until then, I'm glad to publish contributed GNU info versions provided by community members. For the record, here's the specific error output I got and the FSF Emacs version info: grendel(.../r22-maint/Doc); make info cd info && make make[1]: Entering directory `/home/fdrake/projects/python/r22-maint/Doc/info' ../tools/mkinfo ../api/api.tex python-api.info emacs -batch -q --no-site-file -l /home/fdrake/projects/python/r22-maint/Doc/tools/py2texi.el --eval (setq py2texi-dirs '("./" "../texinputs/" "/home/fdrake/projects/python/r22-maint/Doc/api")) --eval (py2texi "/home/fdrake/projects/python/r22-maint/Doc/api/api.tex") -f kill-emacs Mark set Args out of range: 27914, 27916 make[1]: *** [python-api.info] Error 255 make[1]: Leaving directory `/home/fdrake/projects/python/r22-maint/Doc/info' make: *** [info] Error 2 [2] grendel(.../r22-maint/Doc); emacs --version GNU Emacs 20.7.1 Copyright (C) 1999 Free Software Foundation, Inc. GNU Emacs comes with ABSOLUTELY NO WARRANTY. You may redistribute copies of Emacs under the terms of the GNU General Public License. For more information about these matters, see the file named COPYING. ---------------------------------------------------------------------- Comment By: Matthias Klose (doko) Date: 2002-04-09 04:05 Message: Logged In: YES user_id=60903 Yes, I forgot to mention that Milan said it only works with FSF Emacs. I built the info docs using emacs-21.2. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr.
(fdrake) Date: 2002-04-08 17:24 Message: Logged In: YES user_id=3066 For the record, here's the specific errors I get when using XEmacs with this patch on the current release22-maint branch (hopefully SF won't munge them too badly): grendel(.../r22-maint/Doc); make EMACS=xemacs info cd info && make make[1]: Entering directory `/home/fdrake/projects/python/r22-maint/Doc/info' ../tools/mkinfo ../api/api.tex python-api.info xemacs -batch -q --no-site-file -l /home/fdrake/projects/python/r22-maint/Doc/tools/py2texi.el --eval (setq py2texi-dirs '("./" "../texinputs/" "/home/fdrake/projects/python/r22-maint/Doc/api")) --eval (py2texi "/home/fdrake/projects/python/r22-maint/Doc/api/api.tex") -f kill-emacs Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/aspell-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/mew-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/psgml-init.el... Loading /usr/lib/xemacs/xemacs-packages/lisp/site-start.d/xemacs-po-mode-init.el... Mark set Args out of range: 72, 132 xemacs exiting. make[1]: *** [python-api.info] Error 255 make[1]: Leaving directory `/home/fdrake/projects/python/r22-maint/Doc/info' make: *** [info] Error 2 ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-08 17:19 Message: Logged In: YES user_id=3066 I'll add a note here just in case: This patch applies to the 2.3 development as well as 2.2 maintenance tree. This still seems to suffer the problems that all versions of this conversion have suffered; it isn't portable between FSF Emacs and XEmacs. I'll see about installing FSF Emacs to see if it'll work for me there. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539487&group_id=5470 From noreply@sourceforge.net Tue Apr 9 15:20:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 07:20:24 -0700 Subject: [Patches] [ python-Patches-539486 ] build info docs from sources Message-ID: Patches item #539486, was opened at 2002-04-04 17:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539486&group_id=5470 Category: Documentation Group: Python 2.1.2 >Status: Pending Resolution: None Priority: 5 Submitted By: Matthias Klose (doko) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: build info docs from sources Initial Comment: This patch adds Milan Zamazals conversion script and modifies the mkinfo script to build the info doc files from the latex sources. Currently, the mac, doc and inst tex files are not handled. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-09 10:20 Message: Logged In: YES user_id=3066 Is this essentially the same patch as the 2.2.x version, with the differences being in the comprehension of the generated HTML? If so, I'll reject this on the same grounds (insufficiently robust to spend time on). If this is less fragile, I'll consider it for the 2.1.x tree independently. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539486&group_id=5470 From noreply@sourceforge.net Tue Apr 9 17:27:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 09:27:21 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 12:27 Message: Logged In: YES user_id=6380 I've not fully read Tim's response in email, but instead I've reviewed and discussed the patch with Tim. I think the only thing to which I object at this point is the removal of the entry point _PyObject_Del.
I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #undef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 15:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh?
"destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged.
I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Tue Apr 9 17:30:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 09:30:06 -0700 Subject: [Patches] [ python-Patches-536278 ] force gzip to open files with 'b' Message-ID: Patches item #536278, was opened at 2002-03-28 08:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536278&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Nobody/Anonymous (nobody) Summary: force gzip to open files with 'b' Initial Comment: It doesn't make sense that the gzip module should try to open a file in text mode. The attached patch forces a 'b' into the file open mode if it wasn't given. I also modified the test slightly to try and tickle this code, but I can't test it very effectively, because I don't do Windows... :-) ---------------------------------------------------------------------- >Comment By: Skip Montanaro (montanaro) Date: 2002-04-09 11:30 Message: Logged In: YES user_id=44345 good point. updated patch. 
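The fix being discussed is tiny; one way to express the guard (a sketch of the idea, matching the refinement Tim suggests below, not necessarily the literal patch) is:

```python
def force_binary(mode):
    """Force 'b' into a non-empty gzip open mode.

    Empty or None modes are left alone so the caller's defaults
    still apply -- the "and mode" guard handles Neal's '' case.
    """
    if mode and 'b' not in mode:
        mode += 'b'
    return mode

assert force_binary('r') == 'rb'
assert force_binary('wb') == 'wb'
assert force_binary('') == ''       # left for the default handling
assert force_binary(None) is None
```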
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-03-30 22:12 Message: Logged In: YES user_id=31435 I suggest fixing this via changing the test to if mode and 'b' not in mode: Then mode=None and mode='' will be left alone (as Neal says, the code already does the right thing for those). ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-03-28 09:04 Message: Logged In: YES user_id=33168 There is a problem (sorry, I have an evil mind). :-) If '' is passed as the mode, before the patch, this would have been converted to 'rb'. After the patch, mode will become 'b' and that will raise an exception: >>> open('/dev/null', 'b') IOError: [Errno 22] Invalid argument: b If you add an (and mode) condition, that should be fine. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536278&group_id=5470 From noreply@sourceforge.net Tue Apr 9 19:47:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 11:47:39 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems?
I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-09 14:47 Message: Logged In: YES user_id=31435 Clarifying or just repeating Guido here: + Binary compatibility is important. It's better on Unix than it appears -- while you'll get a warning if you run an old 1.5.2 extension with 2.2 today and without recompiling, it will almost certainly work anyway. So in the case of macros that expanded to a private API function before, that private API function must still exist, but the macro needn't expand to that anymore (nor even *be* a macro anymore). _PyObject_Del is a particular problem cuz it's even documented in the C API manual -- there simply wasn't a public API function before that did the same thing and could be used as a function designator. You're making life better for future generations. + Casts on tp_free slots are par for the course, because "destructor" has an impractical signature. I'm afraid that can't change either, so the casts stay. + Fred and I agreed to add PyObject_Del to the "minimal recommended API", so, for the next round of this, feel wholly righteous in leaving existing PyObject_Del calls alone. If anything's unclear, hit me.
I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #undef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 15:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh?
"destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged.
I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Tue Apr 9 20:15:40 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 12:15:40 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Nobody/Anonymous (nobody) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. 
So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 9 20:16:46 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 12:16:46 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) >Assigned to: Skip Montanaro (montanaro) Summary: whichdb unittest ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 9 21:29:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 13:29:15 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-07 01:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-04-09 20:29 Message: Logged In: YES user_id=35752 It might be a day or two before I get to this.
Regarding the type of tp_free, could we change it to be something like:

    typedef void (*freefunc)(void *);
    ...
    freefunc tp_free;

and leave the type of tp_dealloc alone. Maybe it's too late now that 2.2 is out and uses 'destructor'. I don't see how this relates to binary compatibility though. Why does it matter if the function takes a PyObject pointer or a void pointer? The worst I see happening is that people could get warnings when they compile their extension modules. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 18:47 Message: Logged In: YES user_id=31435 Clarifying or just repeating Guido here: + Binary compatibility is important. It's better on Unix than it appears -- while you'll get a warning if you run an old 1.5.2 extension with 2.2 today and without recompiling, it will almost certainly work anyway. So in the case of macros that expanded to a private API function before, that private API function must still exist, but the macro needn't expand to that anymore (nor even *be* a macro anymore). _PyObject_Del is a particular problem cuz it's even documented in the C API manual -- there simply wasn't a public API function before that did the same thing and could be used as a function designator. You're making life better for future generations. + Casts on tp_free slots are par for the course, because "destructor" has an impractical signature. I'm afraid that can't change either, so the casts stay. + Fred and I agreed to add PyObject_Del to the "minimal recommended API", so, for the next round of this, feel wholly righteous in leaving existing PyObject_Del calls alone. If anything's unclear, hit me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 16:27 Message: Logged In: YES user_id=6380 I've not fully read Tim's response in email, but instead I've reviewed and discussed the patch with Tim.
I think the only thing to which I object at this point is the removal of the entry point _PyObject_Del. I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #undef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 19:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh?
"destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work:

    #define PyObject_Del ((destructor)PyObject_Free)

Or maybe it *does* work???
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Tue Apr 9 21:43:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 13:43:44 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Guido van Rossum (gvanrossum) Summary: Remove PyMalloc_* symbols ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-09 16:43 Message: Logged In: YES user_id=31435 It'll be a day or two before PLabs can get back to Python work anyway. Reassigning to Guido -- I'm not even going to try to channel him on backwards compatibility, or the feasibility of introducing possible warnings. If I were you I'd check in the patch with the casts in; they can be taken out again later if Guido is agreeable.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Wed Apr 10 01:53:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 09 Apr 2002 17:53:05 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-06 20:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) >Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 20:53 Message: Logged In: YES user_id=6380 The binary compatibility issue is extensions compiled for 2.2 that have references to _PyObject_Del compiled into them and aren't recompiled for 2.3. I think that should work (even if they get a warning). To make it work, the _PyObject_Del entry point must continue to exist. Back to Neil, I think my instructions are clear enough.
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 16:43 Message: Logged In: YES user_id=31435 It'll be a day or two before PLabs can get back to Python work anyway. Reassigning to Guido -- I'm not even going to try to channel him on backwards compatibility, or the feasibility of introducing possible warnings. If I were you I'd check in the patch with the casts in; they can be taken out again later if Guido is agreeable. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-09 16:29 Message: Logged In: YES user_id=35752 It might be a day or two before I get to this. Regarding the type of tp_free, could we change it to be something like: typedef void (*freefunc)(void *); ... freefunc tp_free; and leave the type of tp_dealloc alone. Maybe it's too late now that 2.2 is out and uses 'destructor'. I don't see how this relates to binary compatibility though. Why does it matter if the function takes a PyObject pointer or a void pointer? The worse I see happening is that people could get warnings when they compile their extension modules. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 14:47 Message: Logged In: YES user_id=31435 Clarifying or just repeating Guido here: + Binary compatibility is important. It's better on Unix than it appears -- while you'll get a warning if you run an old 1.5.2 extension with 2.2 today and without recompiling, it will almost certainly work anyway. So in the case of macros that expanded to a private API function before, that private API function must still exist, but the macro needn't expand to that anymore (nor even *be* a macro anymore). 
_PyObject_Del is a particular problem cuz it's even documented in the C API manual -- there simply wasn't a public API function before that did the same thing and could be used as a function designator. You're making life better for future generations. + Casts on tp_free slots are par for the course, because "destructor" has an impractical signature. I'm afraid that can't change either, so the casts stay. + Fred and I agreed to add PyObject_Del to the "minimal recommended API", so, for the next round of this, feel wholly righteous in leaving existing PyObject_Del calls alone. If anything's unclear, hit me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 12:27 Message: Logged In: YES user_id=6380 I've not fully read Tim's response in email, but instead I've reviewed and discussed the patch with Tim. I think the only thing to which I object at this point is the removal of the entry point _PyObject_Del. I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #indef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 15:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) 
> Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh? "destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. 
> Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 14:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatability" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatability has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. 
If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-06 21:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks!
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Wed Apr 10 10:09:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 10 Apr 2002 02:09:33 -0700 Subject: [Patches] [ python-Patches-541924 ] this.py too verbose Message-ID: Patches item #541924, was opened at 2002-04-10 09:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Duncan Booth (duncanb) Assigned to: Nobody/Anonymous (nobody) Summary: this.py too verbose Initial Comment: The 'Easter Egg' file this.py might be regarded as something of an advert for Python, but its implementation is excessively verbose as it rolls its own rot13 decoding code when Python already has perfectly usable rot13 coding built in. The attached context diff replaces the 5 lines currently used to decode the Zen string with a single line: print s.decode('rot13') ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 From noreply@sourceforge.net Wed Apr 10 19:11:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 10 Apr 2002 11:11:15 -0700 Subject: [Patches] [ python-Patches-541924 ] this.py too verbose Message-ID: Patches item #541924, was opened at 2002-04-10 11:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Duncan Booth (duncanb) Assigned to: Nobody/Anonymous (nobody) Summary: this.py too verbose Initial Comment: The 'Easter Egg' file this.py might be regarded as something of an advert for Python, but its implementation is 
excessively verbose as it rolls its own rot13 decoding code when Python already has perfectly usable rot13 coding built in. The attached context diff replaces the 5 lines currently used to decode the Zen string with a single line: print s.decode('rot13') ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-10 20:11 Message: Logged In: YES user_id=21627 There's no uploaded file! You have to check the checkbox labeled "Check to Upload & Attach File" when you upload a file. Please try again. (This is a SourceForge annoyance that we can do nothing about. :-( ) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 From noreply@sourceforge.net Wed Apr 10 22:31:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 10 Apr 2002 14:31:29 -0700 Subject: [Patches] [ python-Patches-541924 ] this.py too verbose Message-ID: Patches item #541924, was opened at 2002-04-10 05:09 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 Category: None Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Duncan Booth (duncanb) >Assigned to: Tim Peters (tim_one) Summary: this.py too verbose Initial Comment: The 'Easter Egg' file this.py might be regarded as something of an advert for Python, but its implementation is excessively verbose as it rolls its own rot13 decoding code when Python already has perfectly usable rot13 coding built in. The attached context diff replaces the 5 lines currently used to decode the Zen string with a single line: print s.decode('rot13') ---------------------------------------------------------------------- >Comment By: Tim Peters (tim_one) Date: 2002-04-10 17:31 Message: Logged In: YES user_id=31435 Sorry, Guido deliberately refused to use rot13. 
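(For reference, the round-trip property behind the proposed one-liner is easy to demonstrate. This sketch uses the modern codecs-module spelling, since in Python 3 rot13 is a str-to-str codec invoked via codecs.encode/decode rather than s.decode('rot13'):)

```python
import codecs

# rot13 shifts each letter 13 places along the alphabet, so applying it
# twice round-trips. In Python 3 it must go through codecs.encode and
# codecs.decode instead of the old s.decode('rot13') spelling.
s = "The Zen of Python"
obscured = codecs.encode(s, "rot13")
print(obscured)                          # Gur Mra bs Clguba
assert codecs.decode(obscured, "rot13") == s
```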
He wants it to be obscure. I told him that I instantly recognized the by-hand implementation of rot13, but had no idea what .decode('rot13') might do. He wasn't swayed . ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-10 14:11 Message: Logged In: YES user_id=21627 There's no uploaded file! You have to check the checkbox labeled "Check to Upload & Attach File" when you upload a file. Please try again. (This is a SourceForge annoyance that we can do nothing about. :-( ) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541924&group_id=5470 From noreply@sourceforge.net Thu Apr 11 17:23:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 09:23:29 -0700 Subject: [Patches] [ python-Patches-526840 ] PEP 263 Implementation Message-ID: Patches item #526840, was opened at 2002-03-07 08:55 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=526840&group_id=5470 Category: Parser/Compiler Group: Python 2.3 Status: Open Resolution: None Priority: 7 Submitted By: Martin v. Löwis (loewis) Assigned to: M.-A. Lemburg (lemburg) Summary: PEP 263 Implementation Initial Comment: The attached patch implements PEP 263. The following differences to the PEP (rev. 1.8) are known: - The implementation interprets "ASCII compatible" as meaning "bytes below 128 always denote ASCII characters", although this property is only used for ",', and \. There have been other readings of "ASCII compatible", so this should probably be elaborated in the PEP. - The check whether all bytes follow the declared or system encoding (including comments and string literals) is only performed if the encoding is "ascii". ---------------------------------------------------------------------- >Comment By: M.-A. 
Lemburg (lemburg) Date: 2002-04-11 16:23 Message: Logged In: YES user_id=38388 Apart from the codec changes, the patch looks ok. I would still like two APIs for the two different codec tasks, though. I don't expect anything much to change in the codecs, so maintenance is not an issue. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-21 10:25 Message: Logged In: YES user_id=21627 Version 2 of this patch implements revision 1.11 of the PEP (phase 1). The check of the complete source file for compliance with the declared encoding is implemented by decoding the input line-by-line; I believe that for all supported encodings, this is not different compared to decoding the entire source file at once. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-07 18:24 Message: Logged In: YES user_id=21627 Changing the decoding functions will not result in one additional function, but in two of them: you'll also get PyUnicode_DecodeRawUnicodeEscapeFromUnicode. That seems quite unmaintainable to me: any change now needs to propagate into four functions. OTOH, I don't think that the code that allows parsing variable-sized strings is overly complicated. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-07 18:01 Message: Logged In: YES user_id=38388 Ok, I've had a look at the patch. It looks good except for the overly complicated implementation of the unicode-escape codec. Even though there's a bit of code duplication, I'd prefer to have two separate functions here: one for the standard char* pointer type and another one for Py_UNICODE*, ie. PyUnicode_DecodeUnicodeEscape(char*...) and PyUnicode_DecodeUnicodeEscapeFromUnicode(Py_UNICODE*...) This is easier to support and gives better performance since the compiler can optimize the two functions making different assumptions.
You'll also need to include a name mangling at the top of the header for the new API. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-07 14:06 Message: Logged In: YES user_id=6380 I've set the group to Python 2.3 so the priority has some context (I'd rather you move the priority down to 5 but I understand this is your personal priority). I haven't accepted the PEP yet (although I expect I will), so please don't check this in yet (if you feel it needs to be saved in CVS, use a branch). ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-07 11:06 Message: Logged In: YES user_id=38388 Thank you! I'll add a note to the PEP about the way the first two lines are processed (removing the ASCII mention...). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-07 09:11 Message: Logged In: YES user_id=21627 A note on the implementation strategy: it turned out that communicating the encoding into the abstract syntax was the biggest challenge. To solve this, I introduced an encoding_decl pseudo node: it is an unused non-terminal whose STR() is the encoding, and whose only child is the true root of the syntax tree. As such, it is the only non-terminal which has a STR value.
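(The encoding_decl machinery carries the declaration inward; the declaration itself is recognized textually. PEP 263 specifies that the first or second line must match the regular expression "coding[:=]\s*([-\w.]+)", which a short sketch can reproduce directly:)

```python
import re

# PEP 263: a comment on line 1 or 2 matching this pattern declares the
# source encoding (the regex is the one given in the PEP).
coding_re = re.compile(r"coding[:=]\s*([-\w.]+)")

source = "#!/usr/bin/env python\n# -*- coding: iso-8859-1 -*-\nprint('hi')\n"
for line in source.splitlines()[:2]:
    m = coding_re.search(line)
    if m:
        print("declared encoding:", m.group(1))   # iso-8859-1
```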
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=526840&group_id=5470 From noreply@sourceforge.net Thu Apr 11 17:34:41 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 09:34:41 -0700 Subject: [Patches] [ python-Patches-542562 ] clean up trace.py Message-ID: Patches item #542562, was opened at 2002-04-11 16:34 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542562&group_id=5470 Category: Demos and tools Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zooko O'Whielacronx (zooko) Assigned to: Nobody/Anonymous (nobody) Summary: clean up trace.py Initial Comment: moderately interesting changes: * bugfix: remove "feature" of ignoring files in the tmpdir, as I was trying to run it on a file in the tmpdir and couldn't figure out why it gave no answer! I think the original motivation for that feature (spurious "/tmp/" filenames for builtin functions??) has gone away, but I'm not sure.
* add more usage docs and warning about common mistake pretty mundane changes: * remove unnecessary checks for backwards compatibility with a version that never escaped from my (Zooko's) laptop * add a future-compatible check: if the interpreter offers an attribute called `sys.optimized', and it is "true", and the user is trying to do something that can't be done with an optimizing interpreter, then error out ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542562&group_id=5470 From noreply@sourceforge.net Thu Apr 11 17:59:31 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 09:59:31 -0700 Subject: [Patches] [ python-Patches-531901 ] binary packagers Message-ID: Patches item #531901, was opened at 2002-03-19 15:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: binary packagers Initial Comment: zip file with updated Solaris and HP-UX packagers. Replaces 415226, 415227, 415228. Changes made to take advantage of new PEP241 changes in the Distribution class. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-11 16:59 Message: Logged In: YES user_id=38388 Mark, could you reupload the ZIP file ? I cannot download it from the SF page (the file is mostly empty). Also, is the documentation already included in the ZIP file ? If not, it would be nice if you could add them as well. I don't require a special PEP for these changes, BTW, but I do require you to maintain them. Thanks. 
---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 19:55 Message: Logged In: YES user_id=12810 OK, the PEP seems to me to mean most of this is done. These additions are not library modules, they are Distutils "commands". So the way I read it, the Distutils-SIG (where I've been hanging around for some time) are the Maintainers. The documentation will be 2 new chapters for the Distutils manual "Creating Solaris packages" and "Creating HP-UX packages" each looking a whole lot like "Creating RPM packages". Does that clarify anything, or am I still missing a clue? p.s. Thanks for cleaning up the extra uploads! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 15:35 Message: Logged In: YES user_id=21627 You volunteering as the maintainer is part of the prerequisites of accepting new modules, when following PEP 2, see http://python.sourceforge.net/peps/pep-0002.html It says: "developers ... will first form a group of maintainers. Then, this group shall produce a PEP called a library PEP." So existence of a PEP describing these library extensions would be a prerequisite for accepting them. If MAL wants to waive this requirement, it would be fine with me. However, such a PEP could also share text with the documentation, so it might not be wasted effort. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 14:49 Message: Logged In: YES user_id=12810 Any of the three (they're all the same). SourceForge hiccuped during the upload, and I don't have permission to delete the duplicates. I don't exactly understand what you mean by applying PEP 2. I uploaded this per Marc Lemburg's request for the latest versions of patches 41522[6-8]. He's acting as the integrator in this case (see http://mail.python.org/pipermail/distutils-sig/2001-December/002659.html).
I let him know about the duplicate uploads, so hopefully he'll correct it. If you can and want, feel free to delete the 2 of your choice. I agree they need to be documented. As soon as I can, I'll submit changes to the Distutils documentation. Finally, yes, I'll act as maintainer. I'm on the Distutils-sig and as soon as some other poor soul who has to deal with Solaris or HP-UX tries them, I'm there to work out issues. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 07:35 Message: Logged In: YES user_id=21627 Which of the three attached files is the right one (19633, 19634, or 19635)? Unless they are all needed, we should delete the extra copies. I recommend applying PEP 2 to this patch: A library PEP is needed (which could be quite short), documentation, perhaps test cases. Most importantly, there must be an identified maintainer of these modules. Are you willing to act as the maintainer? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 From noreply@sourceforge.net Thu Apr 11 18:01:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 10:01:50 -0700 Subject: [Patches] [ python-Patches-542569 ] tp_print tp_repr tp_str in test_bool.py Message-ID: Patches item #542569, was opened at 2002-04-11 19:01 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: tp_print tp_repr tp_str in test_bool.py Initial Comment: Those slots are not being tested by test_bool.py when it is run standalone. bool_print() does not run during the complete regression test suite.
I was using Neal's tools and chose boolobject.c (because it's an easy module :-) to get in touch with the internals. I don't know if this patch would be useful to you because I didn't see similar checks done for other types. Ie: the eval(repr(x))==x property, or the tp_print slot (I found only one for dicts.) Hope it helps, -Hernan ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 From noreply@sourceforge.net Thu Apr 11 21:31:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 13:31:11 -0700 Subject: [Patches] [ python-Patches-476814 ] foreign-platform newline support Message-ID: Patches item #476814, was opened at 2001-10-31 17:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: foreign-platform newline support Initial Comment: This patch enables Python to interpret all known newline conventions, CR, LF or CRLF, on all platforms. This support is enabled by configuring with --with-universal-newlines (so by default it is off, and everything should behave as usual). With universal newline support enabled two things happen: - When importing or otherwise parsing .py files any newline convention is accepted. - Python code can pass a new "t" mode parameter to open() which reads files with any newline convention. "t" cannot be combined with any other mode flags like "w" or "+", for obvious reasons. File objects have a new attribute "newlines" which contains the type of newlines encountered in the file (or None when no newline has been seen, or "mixed" if there were various types of newlines). Also included is a test script which tests both file I/O and parsing.
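(The behavior the patch proposes -- every newline convention translated on input, with the file object remembering which conventions it saw -- later became standard Python behavior. A sketch of the effect, using modern text-mode defaults rather than the patch's "t" flag:)

```python
import os
import tempfile

# Write a file that mixes all three newline conventions, then read it
# back in text mode, where universal-newline translation is the default.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"mac line\rwindows line\r\nunix line\n")

with open(path) as f:
    lines = f.readlines()
    seen = f.newlines   # the conventions actually seen while reading

os.remove(path)
print(lines)   # every convention comes back as '\n'
print(seen)
```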
---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-11 22:31 Message: Logged In: YES user_id=21627 What is the rationale for making this a compile-time option? It seems to complicate things, with no apparent advantage. If this is for backwards compatibility, don't make it an option: nobody will rebuild Python just to work around a compatibility problem. Apart from that, the patch looks good. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-03-29 00:17 Message: Logged In: YES user_id=45365 New doc patch, and new version of the patch that mainly allows the U to be specified (no-op) in non-univ-newline-builds. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-25 22:07 Message: Logged In: YES user_id=6380 Thanks! But there's no documentation. Could I twist your arm for a separate doc patch? I'm tempted to give this a +1, but I'd like to hear from MvL and MAL to see if they foresee any interaction with their PEP 262 implementation. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-03-13 23:44 Message: Logged In: YES user_id=45365 A new version of the patch. Main differences are that U is now the mode character to trigger universal newline input and --with-universal-newlines is default on. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-01-16 23:47 Message: Logged In: YES user_id=45365 This version of the patch addresses the bug in Py_UniversalNewlineFread and fixes up some minor details. Tim's other issues are addressed (at least: I think they are:-) in a forthcoming PEP.
---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-12-14 00:57 Message: Logged In: YES user_id=31435 Back to Jack -- and sorry for sitting on it so long. Clearly this isn't making it into 2.2 in the core. As I said on Python-Dev, I believe this needs a PEP: the design decisions are debatable, so *should* be debated outside the Mac community too. Note, though, that I can't stop you from adding it to the 2.2 Mac distribution (if you want it badly enough there). If a PEP won't be written, I suggest finding someone else to review it again; maybe Guido. Note that the patch needs doc changes too. The patch to regrtest.py doesn't belong here (I assume it just slipped in). There seems to be a lot of code in support of the f_newlinetypes member, and the value of that member isn't clear -- I can't imagine a good use for it (maybe it's a Mac thing?). The implementation of Py_UniversalNewlineFread appears incorrect to me: it reads n bytes *every* time around the outer loop, no matter how few characters are still required, and n doesn't change inside the loop. The business about the GIL may be due to the lack of docs: are, or are not, people supposed to release the GIL themselves around calls to these guys? It's not documented, and it appears your intent differed from my guess. Finally, it would be better to call ferror() after calling fread() instead of before it. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2001-11-14 16:13 Message: Logged In: YES user_id=45365 Here's a new version of the patch. To address your issues one by one: - get_line and Py_UniversalNewlineFgets are too difficult to integrate, at least, I don't see how I could do it. The storage management of get_line gets in the way. - The global lock comment I don't understand. The Universal...
routines are replacements for fgets() and fread(), so have nothing to do with the interpreter lock. - The logic of all three routines (get_line too) has changed and I've put comments in. I hope this addresses some of the points. - If universal_newline is false for a certain PyFileObject we now immediately take a quick exit via fgets() or fread(). There's also a new test script that tests some more border cases (like lines longer than 100 characters, and a lone CR just before end of file). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-11-05 09:16 Message: Logged In: YES user_id=31435 It would be better if get_line just called Py_UniversalNewlineFgets (when appropriate) instead of duplicating its logic inline. Py_UniversalNewlineFgets and Py_UniversalNewlineFread should deal with releasing the global lock themselves -- the correct granularity for lock release/reacquire is around the C-level input routines (esp. for fread). The new routines never check for I/O errors! Why not? It seems essential. The new Fgets checks for EOF at the end of the loop instead of the top. This is surprising, and I stared a long time in vain trying to guess why. Setting newlinetypes |= NEWLINE_CR; immediately after seeing an '\r' would be as fast (instead of waiting to see EOF and then inferring the prior existence of '\r' indirectly from the state of the skipnextlf flag). Speaking of which, the fobj tests in the inner loop waste cycles. Set the local flag vrbls whether or not fobj is NULL. When you're *out* of the inner loop you can simply decline to store the new masks when fobj is NULL (and you're already doing the latter anyway). A test and branch inside the loop is much more expensive than or'ing in a flag bit inside the loop, ditto harder to understand.
Floating the univ_newline test out of the loop (and duplicating the loop body, one way for univ_newline true and the other for it false) would also save a test and branch on every character. Doing fread one character at a time is very inefficient. Since you know you need to obtain n characters in the end, and that these transformations require reading at least n characters, you could very profitably read n characters in one gulp at the start, then switch to k at a time where k is the number of \r\n pairs seen since the last fread call. This is easier to code than it sounds . It would be fine by me if you included (and initialized) the new file-object fields all the time, whether or not universal newlines are configured. I'd rather waste a few bytes in a file object than see #ifdefs spread thru the code. I'll be damned if I can think of a quick way to do this stuff on Windows -- native Windows fgets() is still the only Windows handle we have on avoiding crushing thread overhead inside MS's C library. I'll think some more about it (the thrust still being to eliminate the 't' mode flag, as whined about on Python-Dev). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-10-31 18:38 Message: Logged In: YES user_id=6380 Tim, can you review this or pass it on to someone else who has time? Jack developed this patch after a discussion in which I was involved in some of the design, but I won't have time to look at it until December. 
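(Tim's last point -- read the full n characters in one gulp and then only top up by what the CR/LF translation collapsed -- can be sketched in Python. The helper name and byte-string approach here are hypothetical; the real patch does this work in C inside Py_UniversalNewlineFread:)

```python
import io

def read_universal(f, n):
    # Hypothetical sketch of the chunked strategy: one big read up front,
    # then small top-ups for however many bytes the translation collapsed.
    buf = b""
    carry = b""   # a trailing '\r' that may be half of a split '\r\n'
    while len(buf) < n:
        data = f.read(n - len(buf))
        chunk = carry + data
        carry = b""
        if not data:
            # EOF: a held-back '\r' still counts as one newline
            buf += chunk.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
            break
        if chunk.endswith(b"\r"):
            chunk, carry = chunk[:-1], b"\r"
        buf += chunk.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return buf[:n]

print(read_universal(io.BytesIO(b"a\r\nb\rc\n"), 6))   # b'a\nb\nc\n'
```

The carry byte handles the boundary case Tim alludes to: a CRLF pair split across two reads must still collapse to a single newline.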
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 From noreply@sourceforge.net Thu Apr 11 21:38:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 13:38:20 -0700 Subject: [Patches] [ python-Patches-542659 ] PyCode_New NULL parameters cleanup Message-ID: Patches item #542659, was opened at 2002-04-11 22:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Olivier Dormond (odormond) Assigned to: Nobody/Anonymous (nobody) Summary: PyCode_New NULL parameters cleanup Initial Comment: This patch removes the creation of an empty tuple for freevars or cellvars if they are equal to NULL, because this case is handled earlier (at the same time all the other parameters are checked) by raising a PyErr_BadInternalCall. It's almost a one-liner.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 From noreply@sourceforge.net Thu Apr 11 21:46:26 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 11 Apr 2002 13:46:26 -0700 Subject: [Patches] [ python-Patches-542659 ] PyCode_New NULL parameters cleanup Message-ID: Patches item #542659, was opened at 2002-04-11 22:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None >Priority: 1 Submitted By: Olivier Dormond (odormond) Assigned to: Nobody/Anonymous (nobody) Summary: PyCode_New NULL parameters cleanup Initial Comment: This patch removes the creation of an empty tuple for freevars or cellvars if they are equal to NULL, because this case is handled earlier (at the same time all the other parameters are checked) by raising a PyErr_BadInternalCall. It's almost a one-liner. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 From noreply@sourceforge.net Fri Apr 12 15:51:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 07:51:06 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Fixed Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: A.M. Kuchling (akuchling) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses).
Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed.
Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Fri Apr 12 16:12:15 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 08:12:15 -0700 Subject: [Patches] [ python-Patches-539949 ] dict.popitem(key=None) Message-ID: Patches item #539949, was opened at 2002-04-05 14:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: dict.popitem(key=None) Initial Comment: This patch implements the feature request at http://sourceforge.net/tracker/index.php?func=detail&aid=495086&group_id=5470&atid=355470 which asks for an optional argument to popitem so that it returns a key/value pair for a specified key or, if not specified, an arbitrary key. The benefit is in providing a fast, explicit way to retrieve and remove a particular key/value pair from a dictionary. By using only a single lookup, it is faster than the usual Python code: value = d[key] del d[key] return (key, value) which now becomes: return d.popitem(key) There is no magic or new code in the implementation -- it uses a few lines each from getitem, delitem, and popitem. If an argument is specified, the new code is run; otherwise, the existing code is run. This assures that the patch does not cause a performance penalty. The diff is about -3 lines and +25 lines. There are four sections: 1. Replacement code for dict_popitem in dictobject.c 2. Replacement docstring for popitem in dictobject.c 3.
Replacement registration line for popitem in dictobject.c 4. Sample Python test code. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 11:12 Message: Logged In: YES user_id=6380 Thanks! Accepted and checked in. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-09 09:39 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() with hard tabs and corrected reference counts. In a DEBUG build, I validated the ref counts against equivalent steps: vv=d[k]; del d[k]. And, after Tim's suggestions, the code is fast and light. In addition to d.pop(k), GvR's patch for d.popitem(k) should also go in. The (k,v) return value feeds directly into d.__setitem__ or a dict(itemlist) constructor (see the code fragments in the 4/6/02 post). The only downside is the time to process METH_VARARGS. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-08 12:46 Message: Logged In: YES user_id=31435 Getting closer! Two more questions: + Why switch from tabs to spaces? The rest of this file uses hard tabs, and that's what Guido prefers in C source. + Think hard about whether we really want to decref the value -- I doubt we do, as we're *transferring* ownership of the value from the dict to the caller. 
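For readers following the thread, the behavior under discussion can be sketched in pure Python. This is an illustration only (the actual patch is C code in dictobject.c, and the helper names below are made up for the sketch); it shows both the popitem(key) form Raymond proposes and the pop(key) variant Neil suggests:

```python
# Illustrative sketch only: the semantics of the proposed
# d.popitem(key) and Neil's d.pop(key) variant, in pure Python.
# The real patch implements this in C with a single hash lookup.

def popitem_with_key(d, key):
    """Remove key from d and return the (key, value) pair."""
    value = d[key]      # raises KeyError if the key is absent
    del d[key]
    return (key, value)

def pop_value(d, key):
    """Neil's variant: return just the value, without the redundant key."""
    value = d[key]
    del d[key]
    return value

d = {"a": 1, "b": 2}
assert popitem_with_key(d, "a") == ("a", 1)
assert pop_value(d, "b") == 2
assert d == {}
```

The pure-Python version needs two lookups (one for `d[key]`, one for `del d[key]`); doing it in C with one lookup is the performance point made in the initial comment.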
---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-08 10:14 Message: Logged In: YES user_id=80475 Here is a revised patch for D.pop() incorporating Tim's ideas: + Docstring spelling fixed + Switched to METH_O instead of METH_VARARGS + Delayed decref until dict entry in consistent state + Removed unused int i=0 variable + Tabs replaced with spaces ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 17:54 Message: Logged In: YES user_id=31435 I like Raymond's new pop(). Problems: + "speficied" is misspelled in the docstring. + Should be declared METH_O, not METH_VARARGS (mimic how, e.g., dict_update is set up). + The decrefs have to be reworked: a decref can trigger calls back into arbitrary Python code, due to __del__ methods getting invoked. This means you can never leave any live object in an insane or inconsistent state *during* a decref. What you need to do instead is first capture the key and value into local vrbls, plug dummy and NULL in to the dict slot, and decrement the used count. This leaves the dict in a consistent state again. Only then is it safe to decref the key and value. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 22:00 Message: Logged In: YES user_id=80475 Here's a more fleshed-out implementation of D.pop(). It doesn't rely on popitem(), doesn't malloc a tuple, and the refcounts should be correct. One change from Neil's version, since k isn't being returned, then an arbitrary pair doesn't make sense, so the key argument to pop is required rather than optional. The diff is off of 2.123. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:51 Message: Logged In: YES user_id=35752 Here's a quick implementation. 
D.pop() is not as efficient as it could be (it uses popitem and then promptly deallocates the item tuple). I'm not sure it matters though. Someone should probably check the refcounts. I always screw them up. :-) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-06 20:16 Message: Logged In: YES user_id=6380 Not a bad idea, Neil! Care to work the code around to implement that? ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-06 20:14 Message: Logged In: YES user_id=35752 I think this should be implemented as pop() instead: D.pop([key]) -> value -- remove and return value by key (default a random value) It makes no sense to return the key when you already have it. pop() also matches well with list pop(): L.pop([index]) -> item -- remove and return item at index (default last) ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 20:08 Message: Logged In: YES user_id=80475 The tests and documentation patches have been added. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-06 12:23 Message: Logged In: YES user_id=80475 Q: Does the new function signature slow the existing no argument case? A: Yes. The function is already so fast that the small overhead of PyArg_ParseTuple is measurable. My timing shows an 8% drop in speed. Q: Is _,v=d.popitem(k) slower than v=d.popvalue(k)? A: Yes. Though popvalue is a non-existing strawman, it would be quicker: it would cost two calls to Py_DECREF while saving a call to PyTuple_New and two calls to PyTuple_SET_ITEM. Still, the running time for popvalue would be dominated by the rest of the function and not the single malloc.
Also, I think it unlikely that the dictionary interface would ever be expanded for popvalue, so the comparison is moot. Q: Are there cases where (k,v) is needed? A: Yes. One common case is where the tuple still needs to be formed to help build another dictionary: dict([d.popitem(k) for k in xferlist]) or [n.__setitem__(d.popitem(k)) for k in xferlist]. Also, it is useful when the key is computed by a function and then needs to be used in an expression. I often do something like that with setdefault: uniqInOrder= [u.setdefault(k,k) for k in alist if k not in u]. Also, when the key is computed by a function, it may need to be saved only when .popitem succeeds but not when the key is missing: "get and remove key if present; trigger exception if absent" This pattern is used in validating user input keys for deletion. Q: Where is the unittest and doc patch? A: Coming this weekend. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-05 16:50 Message: Logged In: YES user_id=31435 Are there examples of concrete use cases? The idea that dict.popitem(k) returns (k, dict[k]) seems kinda goofy, since you necessarily already have k. So the question is whether this is the function signature that's really desired, or whether it's too much a hack. As is, it slows down popitem() without an argument because it requires using a fancier calling sequence, and because it now defers that case to a taken branch; it's also much slower than a function that just returned v could be, due to the need to allocate a 2-tuple to hold a redundant copy of the key. Perhaps there are use cases of the form k, v = dict.popitem(f(x, y, z)) where the key is known only implicitly? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:47 Message: Logged In: YES user_id=6380 FYI, I'm uploading my version of the patch, with code cleanup, as popdict2.txt. 
I've moved the popitem-with-arg code before the allocation of res, because there were several places where this code returned NULL without DECREF'ing res. Repeating the PyTuple_New(2) call seemed the lesser evil. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:38 Message: Logged In: YES user_id=6380 I've reviewed the patch and see only cosmetic things that need to be changed. I'll check it in as soon as you submit a unittest and doc patch. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 16:26 Message: Logged In: YES user_id=6380 Now, if you could also upload a unittest and a doc patch, that would be great! ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2002-04-05 16:10 Message: Logged In: YES user_id=80475 Context diff uploaded at poppatch.c below. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-05 15:11 Message: Logged In: YES user_id=6380 Please upload a context or unified diff. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=539949&group_id=5470 From noreply@sourceforge.net Fri Apr 12 18:08:19 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 10:08:19 -0700 Subject: [Patches] [ python-Patches-543098 ] start docs for PyEval_ function Message-ID: Patches item #543098, was opened at 2002-04-12 19:08 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. 
(fdrake) Summary: start docs for PyEval_ function Initial Comment: The start of a new (sub)section for the api manual. Should this go into api/utilities? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 From noreply@sourceforge.net Fri Apr 12 18:13:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 10:13:49 -0700 Subject: [Patches] [ python-Patches-462936 ] Improved modulefinder Message-ID: Patches item #462936, was opened at 2001-09-19 19:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462936&group_id=5470 Category: Modules Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Nobody/Anonymous (nobody) Summary: Improved modulefinder Initial Comment: This patch adds two improvements to freeze/modulefinder. 1. ModuleFinder now keeps track of which module is imported by whom. 2. ModuleFinder, when instantiated with the new scan_extdeps=1 argument, tries to track dependencies of builtin and extension modules. ---------------------------------------------------------------------- >Comment By: Thomas Heller (theller) Date: 2002-04-12 19:13 Message: Logged In: YES user_id=11105 Closed as rejected. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-09-23 23:50 Message: Logged In: YES user_id=21627 I would still prefer the inelegant approach. Your approach appears to be quite dangerous: the packager would essentially run arbitrary C code... If you absolutely have to use such a feature, I think you can do better than analysing the python -v output: watch sys.modules before and after the import. 
As for extending the hard-coded knowledge: I was suggesting that the packaging tool that uses modulefinder has a mechanism to extend the hard-coded knowledge by other hard-coded knowledge (which lives in the packaging tool, instead of living in modulefinder). If the packaging tool absolutely wants to, it also could run the Python interpreter through a pipe and put the gathered output into modulefinder :-) ---------------------------------------------------------------------- Comment By: Thomas Heller (theller) Date: 2001-09-20 11:42 Message: Logged In: YES user_id=11105 The use case is to find as many dependencies as possible. Sure, you cannot assume that importing an extension module finds all dependencies - only those which are executed inside the initmodule function. OTOH, this covers a *lot* of problematic cases, pygame and numpy for example. The situation is (somewhat) similar to finding dependencies of python modules - only those done with normal import statements are found; __import__, eval, or exec is not handled. A possible solution would be to run the script in 'profiling mode', where the script is actually run, and all imports are monitored. This is however far beyond ModuleFinder's scope. Hardcoding the knowledge about dependencies into ModuleFinder for the core modules would be possible although inelegant IMO. An API for non-standard modules would be possible, but how should this be used without executing any code? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-09-19 22:28 Message: Logged In: YES user_id=21627 I dislike the chunk on finding external dependencies. What is the typical use case (i.e. what module has what external dependencies)? It seems easier to hard-code knowledge about external dependencies into ModuleFinder; this hard-coded knowledge should cover all core modules. In addition, there should be an API to extend this knowledge for non-standard modules.
Furthermore, by executing an import, you cannot be sure that you really find all dependencies - some may only show up when a certain function is used. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462936&group_id=5470 From noreply@sourceforge.net Fri Apr 12 18:14:49 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 10:14:49 -0700 Subject: [Patches] [ python-Patches-542569 ] tp_print tp_repr tp_str in test_bool.py Message-ID: Patches item #542569, was opened at 2002-04-11 19:01 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: tp_print tp_repr tp_str in test_bool.py Initial Comment: Those slots are not being tested by test_bool.py if it is run standalone. bool_print() does not run during the complete regression test suite. I was using Neal's tools and chose boolobject.c (because it's an easy module :-) to get in touch with the internals. I don't know if this patch would be useful to you because I didn't see similar checks done for other types. Ie: the eval(repr(x))==x property, or the tp_print slot (I found only one for dicts.) Hope it helps, -Hernan ---------------------------------------------------------------------- >Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-12 19:14 Message: Logged In: YES user_id=112690 The patch file "21067: test_bool.diff" is the good one.
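The kind of checks Hernan describes can be sketched as follows. This is only an illustration of the properties being tested (repr/str of the bool singletons and the eval(repr(x)) == x round-trip), not the attached test_bool.diff itself:

```python
# Illustrative sketch of the checks described above; not the
# attached test_bool.diff itself.
for x in (True, False):
    assert repr(x) in ("True", "False")
    assert str(x) == repr(x)            # bool defines no tp_str, so str falls back to repr
    assert eval(repr(x)) == x           # the eval(repr(x)) == x property
    assert isinstance(eval(repr(x)), bool)
```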
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 From noreply@sourceforge.net Fri Apr 12 18:21:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 10:21:44 -0700 Subject: [Patches] [ python-Patches-543098 ] start docs for PyEval_* functions Message-ID: Patches item #543098, was opened at 2002-04-12 19:08 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. (fdrake) >Summary: start docs for PyEval_* functions Initial Comment: The start of a new (sub)section for the api manual. Should this go into api/utilities? ---------------------------------------------------------------------- >Comment By: Thomas Heller (theller) Date: 2002-04-12 19:21 Message: Logged In: YES user_id=11105 Typo in the summary. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 From noreply@sourceforge.net Fri Apr 12 19:37:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 11:37:50 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Fixed Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: A.M. Kuchling (akuchling) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). 
Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method?
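As a rough pure-Python illustration of the fixed behavior discussed in this thread (pad the string itself, keeping any leading sign in front of the zeros, rather than padding repr(x)) — this standalone helper is only a sketch of the logic, not the C implementation in stringobject.c/unicodeobject.c:

```python
# Rough sketch of the intended zfill logic (not the patch itself):
# pad the string, not its repr(), and keep a leading sign in front
# of the zero padding.
def zfill(x, width):
    if not isinstance(x, str):
        x = str(x)                 # the thread settled on str(), not repr()
    sign = ""
    if x and x[0] in "+-":
        sign, x = x[0], x[1:]
    return sign + x.rjust(width - len(sign), "0")

assert zfill("123", 10) == "0000000123"
assert zfill(42, 5) == "00042"
assert zfill("-12", 5) == "-0012"
```

Modern Python exposes this directly as the built-in str.zfill method, which is effectively what the patch proposes.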
---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Fri Apr 12 20:26:37 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 12:26:37 -0700 Subject: [Patches] [ python-Patches-543098 ] start docs for PyEval_* functions Message-ID: Patches item #543098, was opened at 2002-04-12 13:08 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) >Assigned to: Thomas Heller (theller) Summary: start docs for PyEval_* functions Initial Comment: The start of a new (sub)section for the api manual. Should this go into api/utilities? ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-12 15:26 Message: Logged In: YES user_id=3066 The section needs a better heading. ;) (The Utilities chapter is fine; it can go at the end.) I'd also like to see more content in the section before it gets added (though that's just as easily fixed once the boilerplate is checked in).
It would be good to review the material in "Documenting Python"; this is part of the standard documentation. PyEval_SetProfile() and PyEval_SetTrace() are already documented. Please continue with this! ---------------------------------------------------------------------- Comment By: Thomas Heller (theller) Date: 2002-04-12 13:21 Message: Logged In: YES user_id=11105 Typo in the summary. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 From noreply@sourceforge.net Fri Apr 12 20:32:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 12:32:47 -0700 Subject: [Patches] [ python-Patches-543098 ] start docs for PyEval_* functions Message-ID: Patches item #543098, was opened at 2002-04-12 19:08 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Thomas Heller (theller) Summary: start docs for PyEval_* functions Initial Comment: The start of a new (sub)section for the api manual. Should this go into api/utilities? ---------------------------------------------------------------------- >Comment By: Thomas Heller (theller) Date: 2002-04-12 21:32 Message: Logged In: YES user_id=11105 > The section needs a better heading. ;) This is where I need your help. These functions are in ceval.c, and I don't even know why. The reason would probably make a good header. Suggestions? Since I currently have no internet access except email and http, I will work on a local version. I'm also relying on you checking and fixing the markup ;-) ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-12 21:26 Message: Logged In: YES user_id=3066 The section needs a better heading.
;) (The Utilities chapter is fine; it can go at the end.) I'd also like to see more content in the section before it gets added (though that's just as easily fixed once the boilerplate is checked in). It would be good to review the material in "Documenting Python"; this is part of the standard documentation. PyEval_SetProfile() and PyEval_SetTrace() are already documented. Please continue with this! ---------------------------------------------------------------------- Comment By: Thomas Heller (theller) Date: 2002-04-12 19:21 Message: Logged In: YES user_id=11105 Typo in the summary. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543098&group_id=5470 From noreply@sourceforge.net Fri Apr 12 21:43:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 13:43:36 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) >Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. 
It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Sat Apr 13 02:00:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 18:00:17 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open >Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) >Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in!
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. 
First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470
CQkJCSAgPEEgSFJFRj0ibWFpbHRvOnRyYWZmaWNAYWxsaWVkbWFya2V0aW5nLm5ldCI+dHJhZmZp Y0BhbGxpZWRtYXJrZXRpbmcubmV0PC9BPjwvQj48L1NQQU4+PC9URD4NCgkJCQkJPC9UUj4NCgkJ CQkJPFRSPg0KCQkJCQkgPFREPg0KCQkJCQkgICA8SFI+DQoJCQkJCSA8L1REPg0KCQkJCQk8L1RS Pg0KCQkJCQk8VFI+DQoJCQkJCSA8VEQ+Jm5ic3A7PC9URD4NCgkJCQkJPC9UUj4NCgkJCQkJPFRS Pg0KCQkJCQkgPFREPjxCPjxTUEFOIGNsYXNzPSJnZW5lcmFsIj48QklHPjxCSUc+PEJJRz5GdXR1 cmVIaXRzIFRyYWZmaWMgLSA0DQoJCQkJCSAgRlJFRSE8L0JJRz48L0JJRz48L0JJRz48L1NQQU4+ PC9CPjwvVEQ+DQoJCQkJCTwvVFI+DQoJCQkJCTxUUj4NCgkJCQkJIDxURD48U1BBTiBjbGFzcz0i Z2VuZXJhbCI+RnV0dXJlSGl0cyBpcyBhIHNvZnR3YXJlIHByb2dyYW0gaG9zdGVkIG9uIG91cg0K CQkJCQkgIHNlcnZlciB0aGF0Jm5ic3A7cGxhY2VzIGEgJm5ic3A7SUNPTiBvbiB5b3VyIHVzZXJz IGRlc2t0b3AsIGZhdm9yaXRlcywNCgkJCQkJICBsaW5rcywmbmJzcDtzdGFydCBidXR0b24gYW5k IGNoYW5nZXMgdGhlaXIgaG9tZXBhZ2UuIFRoaXMgaXMgZG9uZSBlaXRoZXINCgkJCQkJICBhdXRv bWF0aWNhbGx5IG9yIHdpdGggYWxlcnQuIFRoaXMgcHJvZHVjdCBpcyBmcmVlIGFuZCBpcyBsb2Nh dGVkIG9uIG91cg0KCQkJCQkgIDxBIEhSRUY9Imh0dHA6Ly93d3cuYWxsaWVkbWFya2V0aW5nLm5l dCI+Y29ycG9yYXRlIHdlYiBzaXRlPC9BPiBvciBvbg0KCQkJCQkgIDxBIEhSRUY9Imh0dHA6Ly9m dXR1cmVoaXRzLmFsbGllZG1hcmtldGluZy5uZXQiPmh0dHA6Ly9mdXR1cmVoaXRzLmFsbGllZG1h cmtldGluZy5uZXQ8L0E+PC9TUEFOPjwvVEQ+DQoJCQkJCTwvVFI+DQoJCQkJCTxUUj4NCgkJCQkJ IDxURD4mbmJzcDs8L1REPg0KCQkJCQk8L1RSPg0KCQkJCSAgICAgIDwvVEFCTEU+DQoJCQkJICAg IDwvVEQ+DQoJCQkJICA8L1RSPg0KCQkJCTwvVEFCTEU+DQoJCQkgICAgICA8L1REPg0KCQkJICAg IDwvVFI+DQoJCQkgICAgPFRSPg0KCQkJICAgICAgPFREPg0KCQkJCSAgPEhSPg0KCQkJICAgICAg PC9URD4NCgkJCSAgICA8L1RSPg0KCQkJICAgIDxUUj4NCgkJCSAgICAgIDxURD48U1BBTiBjbGFz cz0iZ2VuZXJhbCI+VG8gcmVtb3ZlIHlvdXIgZW1haWwgYWRkcmVzcyBmcm9tIHRoaXMgbGlzdCBh bmQNCgkJCQlhbnkgb3RoZXIgbGlzdHMgYXNzb2NpYXRlZCB0byBUaGUtRW1haWwtSW5mb3JtYXRv cnkNCgkJCQk8QSBIUkVGPSJodHRwOi8vYWRzZXJ2ZXIuY3liZXJzdWJzY3JpYmVyLmNvbS9yZW1v dmUuaHRtbCI+Q0xJQ0sNCgkJCQlIRVJFPC9BPjwvU1BBTj48L1REPg0KCQkJICAgIDwvVFI+DQoJ CQkgIDwvVEFCTEU+DQoJCQk8L1REPg0KCQkgICAgICA8L1RSPg0KCQkgICAgPC9UQUJMRT4NCgkJ 
ICAgIDxQPg0KCQkgIDwvVEQ+DQoJCTwvVFI+DQoJICAgICAgPC9UQUJMRT4NCgkgICAgPC9URD4N CgkgIDwvVFI+DQoJPC9UQUJMRT4NCiAgICAgIDwvVEQ+DQogICAgPC9UUj4NCiAgPC9UQUJMRT4N CjwvQ0VOVEVSPg0KPFA+DQo8L0JPRFk+PC9IVE1MPg0K From noreply@sourceforge.net Sat Apr 13 05:31:24 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 21:31:24 -0700 Subject: [Patches] [ python-Patches-543316 ] UserDict.pop(key) Message-ID: Patches item #543316, was opened at 2002-04-13 04:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Nobody/Anonymous (nobody) Summary: UserDict.pop(key) Initial Comment: This two line patch modifies UserDict.py to match the new dictionary behavior, d.pop(k). I originally omitted this patch on the theory that UserDict is headed toward deprecation, but I checked the docs and they promise that UserDict implements all of the methods for dictionaries. This patch makes that statement true once again. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 From noreply@sourceforge.net Sat Apr 13 05:35:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 12 Apr 2002 21:35:56 -0700 Subject: [Patches] [ python-Patches-543316 ] UserDict.pop(key) Message-ID: Patches item #543316, was opened at 2002-04-13 04:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Raymond Hettinger (rhettinger) >Assigned to: Guido van Rossum (gvanrossum) Summary: UserDict.pop(key) Initial Comment: This two line patch modifies UserDict.py to match the new dictionary behavior, d.pop(k). 
I originally omitted this patch on the theory that UserDict is headed toward deprecation, but I checked the docs and they promise that UserDict implements all of the methods for dictionaries. This patch makes that statement true once again. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 From noreply@sourceforge.net Sat Apr 13 15:03:54 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 13 Apr 2002 07:03:54 -0700 Subject: [Patches] [ python-Patches-543316 ] UserDict.pop(key) Message-ID: Patches item #543316, was opened at 2002-04-13 00:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 Category: Library (Lib) Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Raymond Hettinger (rhettinger) Assigned to: Guido van Rossum (gvanrossum) Summary: UserDict.pop(key) Initial Comment: This two line patch modifies UserDict.py to match the new dictionary behavior, d.pop(k). I originally omitted this patch on the theory that UserDict is headed toward deprecation, but I checked the docs and they promise that UserDict implements all of the methods for dictionaries. This patch makes that statement true once again. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 10:03 Message: Logged In: YES user_id=6380 Thanks! Added. 
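The two-line patch itself is not included in the archive; as a rough illustration, a dict-compatible pop() on a UserDict-style class (which stores its contents in a .data dict) might look like the following. This is a hypothetical sketch, not the actual patch:

```python
class SimpleUserDict:
    """Minimal stand-in for a UserDict-style class: wraps a real dict in .data."""

    def __init__(self, initial=None):
        self.data = dict(initial or {})

    def pop(self, key):
        # Delegate to the wrapped dict, matching the new d.pop(k)
        # behavior: return the value and remove the key, or raise
        # KeyError if the key is absent.
        return self.data.pop(key)


d = SimpleUserDict({"a": 1, "b": 2})
print(d.pop("a"))  # 1
```

Delegating to the wrapped dict is what keeps the promise quoted above: UserDict then implements every method the built-in dictionary does.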
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543316&group_id=5470 From noreply@sourceforge.net Sat Apr 13 18:41:48 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 13 Apr 2002 10:41:48 -0700 Subject: [Patches] [ python-Patches-543447 ] Inclusion of mknod() in posixmodule Message-ID: Patches item #543447, was opened at 2002-04-13 17:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543447&group_id=5470 Category: Modules Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: Inclusion of mknod() in posixmodule Initial Comment: As discussed, here is a patch implementing mknod() in posixmodule.c. As a side note, this patch also renames the "file" parameter of mkfifo() to "filename", to better reflect its meaning. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543447&group_id=5470 From noreply@sourceforge.net Sat Apr 13 23:07:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 13 Apr 2002 15:07:10 -0700 Subject: [Patches] [ python-Patches-543498 ] s/Copyright/License/ in bdist_rpm.py Message-ID: Patches item #543498, was opened at 2002-04-13 22:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 Category: Distutils and setup.py Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: s/Copyright/License/ in bdist_rpm.py Initial Comment: The "Copyright" field in RPM spec files is obsolete. "License" should be used instead. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 From noreply@sourceforge.net Sun Apr 14 00:31:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sat, 13 Apr 2002 16:31:44 -0700 Subject: [Patches] [ python-Patches-476814 ] foreign-platform newline support Message-ID: Patches item #476814, was opened at 2001-10-31 17:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jack Jansen (jackjansen) Assigned to: Jack Jansen (jackjansen) Summary: foreign-platform newline support Initial Comment: This patch enables Python to interpret all known newline conventions, CR, LF or CRLF, on all platforms. This support is enabled by configuring with --with-universal-newlines (so by default it is off, and everything should behave as usual). With universal newline support enabled two things happen: - When importing or otherwise parsing .py files any newline convention is accepted. - Python code can pass a new "t" mode parameter to open() which reads files with any newline convention. "t" cannot be combined with any other mode flags like "w" or "+", for obvious reasons. File objects have a new attribute "newlines" which contains the type of newlines encountered in the file (or None when no newline has been seen, or "mixed" if there were various types of newlines). Also included is a test script which tests both file I/O and parsing. ---------------------------------------------------------------------- >Comment By: Jack Jansen (jackjansen) Date: 2002-04-14 01:31 Message: Logged In: YES user_id=45365 A final tweak: return a tuple of newline values instead of 'mixed'. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-04-11 22:31 Message: Logged In: YES user_id=21627 What is the rationale for making this a compile-time option? It seems to complicate things, with no apparent advantage. If this is for backwards compatibility, don't make it an option: nobody will rebuild Python just to work around a compatibility problem. Apart from that, the patch looks good. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-03-29 00:17 Message: Logged In: YES user_id=45365 New doc patch, and new version of the patch that mainly allows the U to be specified (no-op) in non-univ-newline-builds. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-25 22:07 Message: Logged In: YES user_id=6380 Thanks! But there's no documentation. Could I twist your arm for a separate doc patch? I'm tempted to give this a +1, but I'd like to hear from MvL and MAL to see if they foresee any interaction with their PEP 262 implementation. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-03-13 23:44 Message: Logged In: YES user_id=45365 A new version of the patch. Main differences are that U is now the mode character to trigger universal newline input and --with-universal-newlines is default on. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2002-01-16 23:47 Message: Logged In: YES user_id=45365 This version of the patch addresses the bug in Py_UniversalNewlineFread and fixes up some minor details. Tim's other issues are addressed (at least: I think they are:-) in a forthcoming PEP. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-12-14 00:57 Message: Logged In: YES user_id=31435 Back to Jack -- and sorry for sitting on it so long.
Clearly this isn't making it into 2.2 in the core. As I said on Python-Dev, I believe this needs a PEP: the design decisions are debatable, so *should* be debated outside the Mac community too. Note, though, that I can't stop you from adding it to the 2.2 Mac distribution (if you want it badly enough there). If a PEP won't be written, I suggest finding someone else to review it again; maybe Guido. Note that the patch needs doc changes too. The patch to regrtest.py doesn't belong here (I assume it just slipped in). There seems to be a lot of code in support of the f_newlinetypes member, and the value of that member isn't clear -- I can't imagine a good use for it (maybe it's a Mac thing?). The implementation of Py_UniversalNewlineFread appears incorrect to me: it reads n bytes *every* time around the outer loop, no matter how few characters are still required, and n doesn't change inside the loop. The business about the GIL may be due to the lack of docs: are, or are not, people supposed to release the GIL themselves around calls to these guys? It's not documented, and it appears your intent differed from my guess. Finally, it would be better to call ferror() after calling fread() instead of before it. ---------------------------------------------------------------------- Comment By: Jack Jansen (jackjansen) Date: 2001-11-14 16:13 Message: Logged In: YES user_id=45365 Here's a new version of the patch. To address your issues one by one: - get_line and Py_UniversalNewlineFgets are too difficult to integrate, at least, I don't see how I could do it. The storage management of get_line gets in the way. - The global lock comment I don't understand. The Universal... routines are replacements for fgets() and fread(), so have nothing to do with the interpreter lock. - The logic of all three routines (get_line too) has changed and I've put comments in. I hope this addresses some of the points.
- If universal_newline is false for a certain PyFileObject we now immediately take a quick exit via fgets() or fread(). There's also a new test script that tests some more border cases (like lines longer than 100 characters, and a lone CR just before end of file). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-11-05 09:16 Message: Logged In: YES user_id=31435 It would be better if get_line just called Py_UniversalNewlineFgets (when appropriate) instead of duplicating its logic inline. Py_UniversalNewlineFgets and Py_UniversalNewlineFread should deal with releasing the global lock themselves -- the correct granularity for lock release/reacquire is around the C-level input routines (esp. for fread). The new routines never check for I/O errors! Why not? It seems essential. The new Fgets checks for EOF at the end of the loop instead of the top. This is surprising, and I stared a long time in vain trying to guess why. Setting newlinetypes |= NEWLINE_CR; immediately after seeing an '\r' would be as fast (instead of waiting to see EOF and then inferring the prior existence of '\r' indirectly from the state of the skipnextlf flag). Speaking of which, the fobj tests in the inner loop waste cycles. Set the local flag vrbls whether or not fobj is NULL. When you're *out* of the inner loop you can simply decline to store the new masks when fobj is NULL (and you're already doing the latter anyway). A test and branch inside the loop is much more expensive than or'ing in a flag bit inside the loop, ditto harder to understand. Floating the univ_newline test out of the loop (and duplicating the loop body, one way for univ_newline true and the other for it false) would also save a test and branch on every character. Doing fread one character at a time is very inefficient.
Since you know you need to obtain n characters in the end, and that these transformations require reading at least n characters, you could very profitably read n characters in one gulp at the start, then switch to k at a time where k is the number of \r\n pairs seen since the last fread call. This is easier to code than it sounds. It would be fine by me if you included (and initialized) the new file-object fields all the time, whether or not universal newlines are configured. I'd rather waste a few bytes in a file object than see #ifdefs spread thru the code. I'll be damned if I can think of a quick way to do this stuff on Windows -- native Windows fgets() is still the only Windows handle we have on avoiding crushing thread overhead inside MS's C library. I'll think some more about it (the thrust still being to eliminate the 't' mode flag, as whined about on Python-Dev). ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-10-31 18:38 Message: Logged In: YES user_id=6380 Tim, can you review this or pass it on to someone else who has time? Jack developed this patch after a discussion in which I was involved in some of the design, but I won't have time to look at it until December.
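The translation this thread describes can be modelled in a few lines of Python. The following is an illustrative sketch of the universal-newline semantics discussed above (map any of CR, LF, or CRLF to '\n' and record which conventions were seen, returning a tuple instead of 'mixed' per Jack's final tweak); it is not the C implementation itself:

```python
def universal_newlines(data):
    """Translate CR, LF and CRLF line endings in a string to '\n'.

    Returns (text, newlines), where newlines mimics the file attribute
    described in the patch: None if no newline was seen, a single
    convention string, or a tuple of all conventions encountered.
    """
    seen = set()
    out = []
    i = 0
    while i < len(data):
        c = data[i]
        if c == "\r":
            if i + 1 < len(data) and data[i + 1] == "\n":
                seen.add("\r\n")
                i += 1  # consume the LF half of the CRLF pair
            else:
                seen.add("\r")
            out.append("\n")
        else:
            if c == "\n":
                seen.add("\n")
            out.append(c)
        i += 1
    if not seen:
        newlines = None
    elif len(seen) == 1:
        newlines = seen.pop()
    else:
        newlines = tuple(sorted(seen))
    return "".join(out), newlines


text, kinds = universal_newlines("a\r\nb\rc\n")
# text is "a\nb\nc\n"; kinds is a tuple because the endings were mixed
```

Note this one-pass structure is also why Tim's lone-CR-before-EOF border case matters: a trailing '\r' must still produce a '\n' even though no following character ever arrives.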
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 From noreply@sourceforge.net Sun Apr 14 10:54:16 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 02:54:16 -0700 Subject: [Patches] [ python-Patches-542659 ] PyCode_New NULL parameters cleanup Message-ID: Patches item #542659, was opened at 2002-04-11 22:38 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 1 Submitted By: Olivier Dormond (odormond) Assigned to: Nobody/Anonymous (nobody) Summary: PyCode_New NULL parameters cleanup Initial Comment: This patch removes the creation of an empty tuple for freevars or cellvars if they are equal to NULL, because this case is handled earlier (at the same time all the other parameters are checked) by raising a PyErr_BadInternalCall. It's almost a one-liner. ---------------------------------------------------------------------- >Comment By: Martin v.
Löwis (loewis) Date: 2002-04-14 11:54 Message: Logged In: YES user_id=21627 Thanks for the patch, applied as compile.c 2.240 ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542659&group_id=5470 From noreply@sourceforge.net Sun Apr 14 10:58:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 02:58:17 -0700 Subject: [Patches] [ python-Patches-543498 ] s/Copyright/License/ in bdist_rpm.py Message-ID: Patches item #543498, was opened at 2002-04-14 00:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 Category: Distutils and setup.py Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: s/Copyright/License/ in bdist_rpm.py Initial Comment: The "Copyright" field in RPM spec files is obsolete. "License" should be used instead. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 11:58 Message: Logged In: YES user_id=21627 Can you provide a pointer that shows this obsoletion? http://www.rpm.org/RPM-HOWTO/build.html#SPEC-FILE still says Copyright.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:21:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:21:10 -0700 Subject: [Patches] [ python-Patches-543447 ] Inclusion of mknod() in posixmodule Message-ID: Patches item #543447, was opened at 2002-04-13 19:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543447&group_id=5470 Category: Modules Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: Inclusion of mknod() in posixmodule Initial Comment: As discussed, here is a patch implementing mknod() in posixmodule.c. As a side note, this patch also renames the "file" parameter of mkfifo() to "filename", to better reflect its meaning. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:21 Message: Logged In: YES user_id=21627 Thanks for the patch. 
Committed as configure 1.298 configure.in 1.308 pyconfig.h.in 1.29 libos.tex 1.79 NEWS 1.386 posixmodule.c 2.228 ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543447&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:24:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:24:34 -0700 Subject: [Patches] [ python-Patches-542569 ] tp_print tp_repr tp_str in test_bool.py Message-ID: Patches item #542569, was opened at 2002-04-11 19:01 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 Category: Tests Group: Python 2.3 >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: tp_print tp_repr tp_str in test_bool.py Initial Comment: Those slots are not being tested by test_bool.py if it was run standalone. bool_print() does not run during the complete regression test suite. I was using Neal's tools and chose boolobject.c (because it's an easy module :-) to get in touch with the internals. I don't know if this patch would be useful to you because I didn't see similar checks done for other types. Ie: the eval(repr(x))==x property, or the tp_print slot (I found only one for dicts.) Hope it helps, -Hernan ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:24 Message: Logged In: YES user_id=21627 As a principle, there should be a test for each line of code, so yes, this patch is useful; I've applied it as test_bool.py 1.4. Feel free to contribute more of those. I'm not so sure tp_print is useful in the first place: the fall-back would have worked just as fine for bool.
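The properties mentioned above (the str()/repr() behaviour of the bool singletons and the eval(repr(x)) == x round-trip) can be checked in a few lines. A sketch of the kind of assertions such a test might contain, not the actual test_bool.py patch:

```python
# Check str()/repr() of the bool singletons and the round-trip
# property mentioned above: eval(repr(x)) == x, with bool preserved.
for value, name in [(True, "True"), (False, "False")]:
    assert repr(value) == name
    assert str(value) == name
    assert eval(repr(value)) == value
    assert type(eval(repr(value))) is bool

print("bool repr/str checks passed")
```

The last assertion is the interesting one: it catches an implementation that round-trips to the equal integer 1 or 0 instead of the bool singleton.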
---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-12 19:14 Message: Logged In: YES user_id=112690 The patch file "21067: test_bool.diff" is the good one. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542569&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:27:16 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:27:16 -0700 Subject: [Patches] [ python-Patches-542562 ] clean up trace.py Message-ID: Patches item #542562, was opened at 2002-04-11 18:34 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542562&group_id=5470 Category: Demos and tools Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Zooko O'Whielacronx (zooko) Assigned to: Nobody/Anonymous (nobody) Summary: clean up trace.py Initial Comment: moderately interesting changes: * bugfix: remove "feature" of ignoring files in the tmpdir, as I was trying to run it on a file in the tmpdir and couldn't figure out why it gave no answer! I think the original motivation for that feature (spurious "/tmp/" filenames for builtin functions??) has gone away, but I'm not sure. * add more usage docs and warning about common mistake pretty mundane changes: * remove unnecessary checks for backwards compatibility with a version that never escaped from my (Zooko's) laptop * add a future-compatible check: if the interpreter offers an attribute called `sys.optimized', and it is "true", and the user is trying to do something that can't be done with an optimizing interpreter, then error out ---------------------------------------------------------------------- >Comment By: Martin v.
Löwis (loewis) Date: 2002-04-14 12:27 Message: Logged In: YES user_id=21627 Can you also provide the other cleanup that Guido requested (change of license, removal of change logs, etc)? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=542562&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:31:22 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:31:22 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to call the Python Docs in HTML Help format if it becomes part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired - I assume that the .chm file resides in the same directory as the python exec. - I'll try to upload a similar patch on idlefork. Regards, -Hernan ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:31 Message: Logged In: YES user_id=21627 Thanks for the patch. Applied as EditorWindow.py 1.41. ---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-07 19:10 Message: Logged In: YES user_id=112690 Ok. I can add the fallback. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-04-07 19:06 Message: Logged In: YES user_id=21627 IMO, it would be good if it would fall back to HTML help if the chm file is not found. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:31:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:31:35 -0700 Subject: [Patches] [ python-Patches-540583 ] IDLE calls MS HTML Help Python Docs Message-ID: Patches item #540583, was opened at 2002-04-07 16:36 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 Category: IDLE Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Guido van Rossum (gvanrossum) Summary: IDLE calls MS HTML Help Python Docs Initial Comment: A little patch to enable IDLE to call the Python Docs in HTML Help format if it becomes part of the standard Windows distribution. A few things: - The patch uses os.startfile() instead of webbrowser.open() because the default browser may not be IExplorer. - The name of the .chm file is hardwired - I assume that the .chm file resides in the same directory as the python exec. - I'll try to upload a similar patch on idlefork. Regards, -Hernan ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:31 Message: Logged In: YES user_id=21627 Thanks for the patch. Applied as EditorWindow.py 1.41. ---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-07 19:10 Message: Logged In: YES user_id=112690 Ok. I can add the fallback. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-04-07 19:06 Message: Logged In: YES user_id=21627 IMO, it would be good if it would fall back to HTML help if the chm file is not found. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540583&group_id=5470 From noreply@sourceforge.net Sun Apr 14 11:35:02 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:35:02 -0700 Subject: [Patches] [ python-Patches-403972 ] threaded profiler. Message-ID: Patches item #403972, was opened at 2001-02-23 16:21 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403972&group_id=5470 Category: Demos and tools Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Amila Fernando (amila) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: threaded profiler. Initial Comment: Basically a profiler that can handle threaded programs and generate profiling snapshots. It does however have some situations it cannot handle well (see included README for details). ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:35 Message: Logged In: YES user_id=21627 Since there have been no further comments on this issue, I reject this patch. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-09 11:47 Message: Logged In: YES user_id=21627 I recommend to reject this patch. Since it is pure-Python, it is probably more suited as a stand-alone package. For inclusion into Python, trying to hook into thread creation is a hack, IMO, there are certainly ways to cheat that technique. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr.
(fdrake) Date: 2001-07-04 06:27 Message: Logged In: YES user_id=3066 Assigned to me since I've been digging into the profiling support lately.
----------------------------------------------------------------------
Comment By: Jeremy Hylton (jhylton) Date: 2001-05-09 18:11 Message: Logged In: YES user_id=31392 Perhaps you could share this on comp.lang.python and see if people can help you fix the situations it doesn't handle well.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403972&group_id=5470
From noreply@sourceforge.net Sun Apr 14 11:38:26 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 03:38:26 -0700 Subject: [Patches] [ python-Patches-418465 ] patches for python-mode.el V4.1 Message-ID:
Patches item #418465, was opened at 2001-04-24 07:11 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=418465&group_id=5470 Category: Demos and tools Group: None >Status: Closed >Resolution: Rejected Priority: 5 Submitted By: Bob Weiner (bwcto) Assigned to: Barry Warsaw (bwarsaw) Summary: patches for python-mode.el V4.1
Initial Comment: This patch fixes a number of issues with python-mode.el (the fixes are documented within the patch) and also extends python-mode.el so that it works with the Emacs interface to pydoc, pydoc.el, which was just released. Please let me know if you decide to apply all of these changes and put this into a production release of Python, at which point I will stop distributing the modified python-mode.el with the pydoc.el package. Thanks, Bob
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 12:38 Message: Logged In: YES user_id=21627 Since there is apparently no interest in this patch anymore, I reject it.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-03-09 11:55 Message: Logged In: YES user_id=21627 The patch fails completely when applied to python-mode.el. This is certainly not the fault of the submitter, but due to the fact that it has been sitting around for such a long time. Bob, are you still interested in this patch, and willing to provide an updated version?
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=418465&group_id=5470
From noreply@sourceforge.net Sun Apr 14 17:46:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 09:46:27 -0700 Subject: [Patches] [ python-Patches-543498 ] s/Copyright/License/ in bdist_rpm.py Message-ID:
Patches item #543498, was opened at 2002-04-13 22:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470 Category: Distutils and setup.py Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: s/Copyright/License/ in bdist_rpm.py
Initial Comment: The "Copyright" field in RPM spec files is obsolete. "License" should be used instead.
----------------------------------------------------------------------
>Comment By: Gustavo Niemeyer (niemeyer) Date: 2002-04-14 16:46 Message: Logged In: YES user_id=7887 The rpm.org site is much more obsolete than this tag. Here is an excerpt from a message by Jeff Johnson on rpm-list (subject is "Re: three questions about building rpms"): ---- [...] This is historical legacy. Originally rpm had Copyright: GPL but everyone said GPL is not a copyright. So, rpm changed the tag name to License:, and, for backward compatibility, used the same numeric value as RPMTAG_COPYRIGHT.
Now, everyone gets to ask the next question "Which is it, Copyright: or License:?" and the answer is :-) ---- Every distribution working with rpms, including Red Hat, has changed (or is changing) the tag to License. Copyright, as Jeff himself said, is a misnomer for that field.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-04-14 09:58 Message: Logged In: YES user_id=21627 Can you provide a pointer that shows this obsoletion? http://www.rpm.org/RPM-HOWTO/build.html#SPEC-FILE still says Copyright.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543498&group_id=5470
From noreply@sourceforge.net Sun Apr 14 21:15:59 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 13:15:59 -0700 Subject: [Patches] [ python-Patches-476814 ] foreign-platform newline support Message-ID:
Patches item #476814, was opened at 2001-10-31 17:41 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Jack Jansen (jackjansen) >Assigned to: Barry Warsaw (bwarsaw) Summary: foreign-platform newline support
Initial Comment: This patch enables Python to interpret all known newline conventions, CR, LF or CRLF, on all platforms. This support is enabled by configuring with --with-universal-newlines (so by default it is off, and everything should behave as usual). With universal newline support enabled two things happen: - When importing or otherwise parsing .py files any newline convention is accepted. - Python code can pass a new "t" mode parameter to open() which reads files with any newline convention. "t" cannot be combined with any other mode flags like "w" or "+", for obvious reasons.
File objects have a new attribute "newlines" which contains the type of newlines encountered in the file (or None when no newline has been seen, or "mixed" if there were various types of newlines). Also included is a test script which tests both file I/O and parsing.
----------------------------------------------------------------------
>Comment By: Jack Jansen (jackjansen) Date: 2002-04-14 22:15 Message: Logged In: YES user_id=45365 Barry, I've checked in the code fixes, but I don't dare check in the documentation fixes myself, as I have no TeX and hence no way to test them. Could you do this, please, and also check that I'm following all the relevant guidelines? Thanks!
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-04-14 01:31 Message: Logged In: YES user_id=45365 A final tweak: return a tuple of newline values instead of 'mixed'.
----------------------------------------------------------------------
Comment By: Martin v. Löwis (loewis) Date: 2002-04-11 22:31 Message: Logged In: YES user_id=21627 What is the rationale for making this a compile-time option? It seems to complicate things, with no apparent advantage. If this is for backwards compatibility, don't make it an option: nobody will rebuild Python just to work around a compatibility problem. Apart from that, the patch looks good.
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-03-29 00:17 Message: Logged In: YES user_id=45365 New doc patch, and new version of the patch that mainly allows the U to be specified (no-op) in non-univ-newline builds.
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2002-03-25 22:07 Message: Logged In: YES user_id=6380 Thanks! But there's no documentation. Could I twist your arm for a separate doc patch?
I'm tempted to give this a +1, but I'd like to hear from MvL and MAL to see if they foresee any interaction with their PEP 262 implementation.
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-03-13 23:44 Message: Logged In: YES user_id=45365 A new version of the patch. Main differences are that U is now the mode character to trigger universal newline input and --with-universal-newlines is on by default.
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2002-01-16 23:47 Message: Logged In: YES user_id=45365 This version of the patch addresses the bug in Py_UniversalNewlineFread and fixes up some minor details. Tim's other issues are addressed (at least: I think they are:-) in a forthcoming PEP.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2001-12-14 00:57 Message: Logged In: YES user_id=31435 Back to Jack -- and sorry for sitting on it so long. Clearly this isn't making it into 2.2 in the core. As I said on Python-Dev, I believe this needs a PEP: the design decisions are debatable, so *should* be debated outside the Mac community too. Note, though, that I can't stop you from adding it to the 2.2 Mac distribution (if you want it badly enough there). If a PEP won't be written, I suggest finding someone else to review it again; maybe Guido. Note that the patch needs doc changes too. The patch to regrtest.py doesn't belong here (I assume it just slipped in). There seems to be a lot of code in support of the f_newlinetypes member, and the value of that member isn't clear -- I can't imagine a good use for it (maybe it's a Mac thing?). The implementation of Py_UniversalNewlineFread appears incorrect to me: it reads n bytes *every* time around the outer loop, no matter how few characters are still required, and n doesn't change inside the loop.
The business about the GIL may be due to the lack of docs: are, or are not, people supposed to release the GIL themselves around calls to these guys? It's not documented, and it appears your intent differed from my guess. Finally, it would be better to call ferror() after calling fread() instead of before it.
----------------------------------------------------------------------
Comment By: Jack Jansen (jackjansen) Date: 2001-11-14 16:13 Message: Logged In: YES user_id=45365 Here's a new version of the patch. To address your issues one by one: - get_line and Py_UniversalNewlineFgets are too difficult to integrate; at least, I don't see how I could do it. The storage management of get_line gets in the way. - The global lock comment I don't understand. The Universal... routines are replacements for fgets() and fread(), so have nothing to do with the interpreter lock. - The logic of all three routines (get_line too) has changed and I've put comments in. I hope this addresses some of the points. - If universal_newline is false for a certain PyFileObject we now immediately take a quick exit via fgets() or fread(). There's also a new test script that tests some more border cases (like lines longer than 100 characters, and a lone CR just before end of file).
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2001-11-05 09:16 Message: Logged In: YES user_id=31435 It would be better if get_line just called Py_UniversalNewlineFgets (when appropriate) instead of duplicating its logic inline. Py_UniversalNewlineFgets and Py_UniversalNewlineFread should deal with releasing the global lock themselves -- the correct granularity for lock release/reacquire is around the C-level input routines (esp. for fread). The new routines never check for I/O errors! Why not? It seems essential. The new Fgets checks for EOF at the end of the loop instead of the top.
This is surprising, and I stared a long time in vain trying to guess why. Setting newlinetypes |= NEWLINE_CR; immediately after seeing an '\r' would be as fast (instead of waiting to see EOF and then inferring the prior existence of '\r' indirectly from the state of the skipnextlf flag). Speaking of which, the fobj tests in the inner loop waste cycles. Set the local flag vrbls whether or not fobj is NULL. When you're *out* of the inner loop you can simply decline to store the new masks when fobj is NULL (and you're already doing the latter anyway). A test and branch inside the loop is much more expensive than or'ing in a flag bit inside the loop, ditto harder to understand. Floating the univ_newline test out of the loop (and duplicating the loop body, one way for univ_newline true and the other for it false) would also save a test and branch on every character. Doing fread one character at a time is very inefficient. Since you know you need to obtain n characters in the end, and that these transformations require reading at least n characters, you could very profitably read n characters in one gulp at the start, then switch to k at a time where k is the number of \r\n pairs seen since the last fread call. This is easier to code than it sounds. It would be fine by me if you included (and initialized) the new file-object fields all the time, whether or not universal newlines are configured. I'd rather waste a few bytes in a file object than see #ifdefs spread thru the code. I'll be damned if I can think of a quick way to do this stuff on Windows -- native Windows fgets() is still the only Windows handle we have on avoiding crushing thread overhead inside MS's C library. I'll think some more about it (the thrust still being to eliminate the 't' mode flag, as whined about on Python-Dev).
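[Editor's note: the input behavior this thread proposes -- CR, LF, or CRLF all accepted, with the file object recording which conventions it saw -- eventually became default Python behavior. A minimal sketch in modern Python, with a made-up file name, illustrating the "newlines" attribute discussed above:]

```python
import os
import tempfile

# Write a file containing all three newline conventions.
path = os.path.join(tempfile.mkdtemp(), "mixed.txt")
with open(path, "wb") as f:
    f.write(b"mac line\rdos line\r\nunix line\n")

# In universal-newline mode (newline=None, the default) every
# convention is translated to '\n' on input, and the file object
# records which conventions were actually encountered.
with open(path, "r", newline=None) as f:
    text = f.read()
    seen = f.newlines   # a str, or a tuple when several were seen

print(text.splitlines())  # ['mac line', 'dos line', 'unix line']
print(sorted(seen))       # all three conventions were encountered
```

Note that modern Python returns a tuple of the newline strings seen rather than the string "mixed", exactly the "final tweak" Jack describes above.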
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum) Date: 2001-10-31 18:38 Message: Logged In: YES user_id=6380 Tim, can you review this or pass it on to someone else who has time? Jack developed this patch after a discussion in which I was involved in some of the design, but I won't have time to look at it until December.
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=476814&group_id=5470
From noreply@sourceforge.net Sun Apr 14 23:03:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 15:03:04 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-15 00:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
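[Editor's note: the floor-division half of this fix -- clearing the nb_floor_divide slot so the operation raises instead of computing a bogus result -- is the behavior Python has kept ever since. A quick sketch of the post-patch behavior:]

```python
# PEP 238: floor division is not meaningful for complex numbers,
# so the patch removes the slot rather than returning a fake result.
try:
    (1 + 2j) // (1 + 0j)
except TypeError as exc:
    print("complex // complex raises:", exc)

# True division remains defined for complex operands.
print((1 + 2j) / (1 + 0j))  # (1+2j)
```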
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470
From noreply@sourceforge.net Sun Apr 14 23:16:38 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 15:16:38 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-14 18:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) >Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
----------------------------------------------------------------------
>Comment By: Tim Peters (tim_one) Date: 2002-04-14 18:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470
From noreply@sourceforge.net Sun Apr 14 23:18:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 15:18:13 -0700 Subject: [Patches] [ python-Patches-543867 ] test for patch #543865 & others Message-ID:
Patches item #543867, was opened at 2002-04-15 00:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: test for patch #543865 & others
Initial Comment: Here are 3 patches for: - test_complex.py: . add several checks to force execution of unvisited parts of complexobject.c code. . add a test for complex floor division corresponding to bug #543387 and fix #543865 - test_complex_future.py . add test for "future" true division. (actually this is not a patch but the whole file) - test_b1.py .
add test for bug #543840 and its fix in patch #543865 Regards, -Hernan
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470
From noreply@sourceforge.net Mon Apr 15 00:47:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 16:47:08 -0700 Subject: [Patches] [ python-Patches-543867 ] test for patch #543865 & others Message-ID:
Patches item #543867, was opened at 2002-04-15 00:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: test for patch #543865 & others
Initial Comment: Here are 3 patches for: - test_complex.py: . add several checks to force execution of unvisited parts of complexobject.c code. . add a test for complex floor division corresponding to bug #543387 and fix #543865 - test_complex_future.py . add test for "future" true division. (actually this is not a patch but the whole file) - test_b1.py .
add test for bug #543840 and its fix in patch #543865 Regards, -Hernan
----------------------------------------------------------------------
>Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 01:47 Message: Logged In: YES user_id=112690
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470
From noreply@sourceforge.net Mon Apr 15 00:48:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 16:48:33 -0700 Subject: [Patches] [ python-Patches-543867 ] test for patch #543865 & others Message-ID:
Patches item #543867, was opened at 2002-04-15 00:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: test for patch #543865 & others
Initial Comment: Here are 3 patches for: - test_complex.py: . add several checks to force execution of unvisited parts of complexobject.c code. . add a test for complex floor division corresponding to bug #543387 and fix #543865 - test_complex_future.py . add test for "future" true division. (actually this is not a patch but the whole file) - test_b1.py . add test for bug #543840 and its fix in patch #543865 Regards, -Hernan
----------------------------------------------------------------------
>Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 01:48 Message: Logged In: YES user_id=112690 Following Tim's advice to group together bug/fix/test, I'll leave this patch entry for improvements in the tests of complex numbers.
Then the valid files are: 21173: test_complex_future.py and 21180: test_complex.diff3
----------------------------------------------------------------------
Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 01:47 Message: Logged In: YES user_id=112690
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543867&group_id=5470
From noreply@sourceforge.net Mon Apr 15 01:03:17 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 17:03:17 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-15 00:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
----------------------------------------------------------------------
>Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 02:03 Message: Logged In: YES user_id=112690 Ok, done! Bug report #543387 has patch and test suite. Pure enhancements to the complex numbers tests stay at patch submission #543847, but now they don't include code related to the reported bugs. Sorry for the mess.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2002-04-15 00:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470
From noreply@sourceforge.net Mon Apr 15 01:34:19 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 17:34:19 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-14 18:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
----------------------------------------------------------------------
>Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-14 20:34 Message: Logged In: YES user_id=33168 Should this patch be closed since #543867 was entered?
----------------------------------------------------------------------
Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-14 20:03 Message: Logged In: YES user_id=112690 Ok, done! Bug report #543387 has patch and test suite. Pure enhancements to the complex numbers tests stay at patch submission #543847, but now they don't include code related to the reported bugs. Sorry for the mess.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2002-04-14 18:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470
From noreply@sourceforge.net Mon Apr 15 02:06:53 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 18:06:53 -0700 Subject: [Patches] [ python-Patches-541031 ] context sensitive help/keyword search Message-ID:
Patches item #541031, was opened at 2002-04-08 10:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541031&group_id=5470 >Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) >Assigned to: Fred L. Drake, Jr.
(fdrake) Summary: context sensitive help/keyword search
Initial Comment: This script/module looks up keywords in the Python manuals. It is usable as a CGI script - a version is online at http://starship.python.net/crew/theller/cgi-bin/pyhelp.cgi It can also be used from the command line: python pyhelp.py keyword It can also be used to implement context-sensitive help in IDLE or XEmacs (for example) by simply selecting a word and pressing F1. It can use the online version of the manuals at www.python.org/doc/, or it can use locally installed HTML pages. The script/module scans the index pages of the docs for hyperlinks, and pickles the results to disk.
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:06 Message: Logged In: YES user_id=6380 Maybe Fred finds this interesting?
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541031&group_id=5470
From noreply@sourceforge.net Mon Apr 15 02:10:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 14 Apr 2002 18:10:12 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID:
Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest
Initial Comment: The attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly.
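[Editor's note: the kind of check such a unit test has to make can be sketched directly. A minimal illustration in modern spelling -- where whichdb lives in the dbm package and dbm.dumb is the one always-available backend; the file path is made up for the example:]

```python
import os
import tempfile
import dbm
import dbm.dumb

# Create a database with a known backend, then check that whichdb
# can identify that backend purely from the files left on disk.
path = os.path.join(tempfile.mkdtemp(), "testdb")
db = dbm.dumb.open(path, "c")
db[b"key"] = b"value"
db.close()

print(dbm.whichdb(path))  # 'dbm.dumb'
```

A full test, as described above, would repeat this round trip for every backend module present in the installation.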
Unfortunately it crashes on my box (Redhat 6.2), and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it.
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb?
----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470
From noreply@sourceforge.net Mon Apr 15 10:43:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 02:43:27 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID:
Patches item #543865, was opened at 2002-04-15 00:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c
Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 indicates) is included here.
For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop. For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a different submission.
----------------------------------------------------------------------
Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 11:43 Message: Logged In: YES user_id=112690 Yes. I think this entry should be closed, as its targets are/were taken care of in bug entries #543387 / #543840.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-15 02:34 Message: Logged In: YES user_id=33168 Should this patch be closed since #543867 was entered?
----------------------------------------------------------------------
Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 02:03 Message: Logged In: YES user_id=112690 Ok, done! Bug report #543387 has patch and test suite. Pure enhancements to the complex numbers tests stay at patch submission #543847, but now they don't include code related to the reported bugs. Sorry for the mess.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one) Date: 2002-04-15 00:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 From noreply@sourceforge.net Mon Apr 15 12:42:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 04:42:34 -0700 Subject: [Patches] [ python-Patches-544113 ] merging sorted sequences Message-ID: Patches item #544113, was opened at 2002-04-15 13:42 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544113&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Sebastien Keim (s_keim) Assigned to: Nobody/Anonymous (nobody) Summary: merging sorted sequences Initial Comment: This patch is intended to add to the bisect module a function which permits merging several sorted sequences into an ordered list. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544113&group_id=5470 From noreply@sourceforge.net Mon Apr 15 12:49:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 04:49:44 -0700 Subject: [Patches] [ python-Patches-543865 ] bugfixes on complexobject.c Message-ID: Patches item #543865, was opened at 2002-04-14 18:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: None Priority: 5 Submitted By: Hernan Martinez Foffani (hfoffani) Assigned to: Nobody/Anonymous (nobody) Summary: bugfixes on complexobject.c Initial Comment: A patch that fixes bugs #543840 (complex() constructor doesn't fail in certain cases) and #543387 (floor division doesn't raise an exception as PEP 238 requires) is included here. For the first bug, I moved a block of C code that checks for the presence of '\0' outside the loop.
For the other one, I just cleared the nb_floor_divide entry in the table. Also deleted the complex_int_div() function. I'm also uploading tests for this patch, but they go in a separate submission. ---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-15 05:43 Message: Logged In: YES user_id=112690 Yes. I think this entry should be closed as its targets are/were taken care of in bug entries #543387 / #543840. ---------------------------------------------------------------------- Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-14 20:34 Message: Logged In: YES user_id=33168 Should this patch be closed since #543867 was entered? ---------------------------------------------------------------------- Comment By: Hernan Martinez Foffani (hfoffani) Date: 2002-04-14 20:03 Message: Logged In: YES user_id=112690 Ok, done! Bug report #543387 has the patch and test suite. Pure enhancements to the complex number tests stay in patch submission #543847, but they no longer include code related to the reported bugs. Sorry for the mess. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-14 18:16 Message: Logged In: YES user_id=31435 Note that 543840 got fixed before you uploaded this patch, so please take that out of this patch (one bug == one patch is an excellent idea, and you can attach a patch to the bug report instead). Please combine the remaining patch with the test-suite change too -- opening lots of distinct tracker items makes more work for everyone (including you!).
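The bisect addition proposed in patch #544113 above (merging several sorted sequences into one ordered list) can be sketched with a heap-based k-way merge. The name merge_sorted is hypothetical, and this leans on heapq.merge, which the stdlib later grew for exactly this job; it is not the patch's actual code:

```python
import heapq


def merge_sorted(*sequences):
    """Merge several already-sorted sequences into one ordered list.

    A sketch of the behavior patch #544113 proposes for the bisect
    module; heapq.merge performs the k-way merge lazily under the hood.
    """
    return list(heapq.merge(*sequences))


print(merge_sorted([1, 4, 7], [2, 5], [3, 6, 8]))
# [1, 2, 3, 4, 5, 6, 7, 8]
```

Because the inputs are each already sorted, the heap only ever holds one candidate per sequence, so the merge runs in O(n log k) for n total items across k sequences.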
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=543865&group_id=5470 From noreply@sourceforge.net Mon Apr 15 14:41:47 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 06:41:47 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in!
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. 
First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 14:53:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 06:53:21 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 09:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings?
(The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 09:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense.
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 15:43:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 07:43:07 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 16:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs.
Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 15:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 15:47:13 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 07:47:13 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'.
Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 10:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 10:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message.
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 09:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 09:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense.
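The zfill semantics this thread settles on (convert non-strings with str() rather than repr(), then pad on the left with zeros, keeping any leading sign in front of the padding) can be sketched in pure Python. zfill_compat is a hypothetical helper for illustration, not the code that was checked in:

```python
def zfill_compat(x, width):
    """Sketch of the agreed string.zfill behavior: non-strings go
    through str() (not repr()), and the result is zero-padded on the
    left, with a leading '+' or '-' kept ahead of the padding."""
    if not isinstance(x, str):
        x = str(x)  # the thread's fix: str(), not repr()
    sign = ""
    if x[:1] in ("+", "-"):
        sign, x = x[0], x[1:]
    return sign + x.rjust(width - len(sign), "0")

print(zfill_compat("123", 10))  # '0000000123', matching "123".zfill(10)
print(zfill_compat(-7, 5))      # '-0007'
```

If the input is already at least `width` characters long, rjust leaves it unchanged, mirroring the real method's behavior.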
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From novaluz@sp.mailbr.com.br Mon Apr 15 16:51:06 2002 From: novaluz@sp.mailbr.com.br (NovaLuz) Date: Mon, 15 Apr 2002 12:51:06 -0300 Subject: [Patches] Concorra a uma Luminaria - Novaluz - GRÁTIS!! Message-ID: <1100074-22002411515516810@sp.mailbr.com.br>

NovaLuz is raffling off an emergency-lighting unit!

NovaLuz is raffling off a Normal and Emergency Light Fixture!!!

To enter, just visit the website http://www.novaluz.com.br/, fill out the contest form, and recommend this site to your friends.

Drawing: April 30, 2002

Normal and Emergency Light Fixture -- Model NL 3x9 NE

A 2-in-1 unit. It has three 9-watt PL lamps: two provide normal room lighting, and the third switches on automatically when the electricity fails.

Emergency-light battery life: 4 hours

Good luck!!!

Elétrica NovaLuz Ltda
http://www.anovaluz.com.br/
Tel: (11) 222 2699 * Fax: (11) 3331 3033
Emergency Lights & Occupancy Sensors
f4SLjHQeNYCNPzw6kpY/E0QeS4GUOjYyMjFYl6VDKpxlOTIuWKSmsEUqNTk7PGw6rCZMKT+7sZIX FxQUHh6zWzw2rUUovcCBFD/SRRHEL6llOjFGKNCBHEUG2N/l5ufo6err7O3u7/Dx8vN0OS70VBMr kj1a+FX6GNn7x2VCQDo98BAsY3CfqjIt5OX4QqOiRYtkqCQkMmIEk0GxwqXrke2OnUhVSnoUwmJe jxcmc6Csd2jIIyE5EAnROaSkKUOSRV5g4zFT4AsLMIMOsfBhyC1gL38IzSHT3FMlOi30mRSrB0mZ RdtF3foj7CKvC3uuLJu2XFS2bcvlYGE27iW0poIAACH5BAEPAD8ALAkABABhACEABQb/wJ9wSCwa j0hhziRLOp/QaC9KdfJMQleuyuX2ejga7dst/3iuotbMnoZXKqEKxzbHktt60v1bWYwrdHpcOk2D UDw/ND9xToGHkD8oJZAeVS95hzw6OpGeRBQURS9dmzo5NjIyMS4xWJ+Hok4UpFQ2Lia5sLtcFGI1 NTk5PIlJPDm4r0UoKCkpvNBHoaEWFjTFRTq3ykIo0Z4cP7JIoZhUhWnf6kMUFubr8E4c40MqmfH4 +fr7/P3+/wADChy4CxvBKtYOdToYZcG4CY/M9IAxhNuQCQyF0BMyoYKgKsiEkMjIZoKFj04mkoxS QcwKOCpUVJg5oeZJJ/ecHPgxDNYIYUr9eixKAsNGjilsRgxROsQSkj8/nH7j0cJI0aOebDAdUsug EaTQetgQ0gLTUbC7XvyxUKPIUCEeoKKF1SPY2XzXjvRgUWTEXE9kSLYdwneluh613hr+lmOE4sXR enj9FAQAIfkEAQ8APwAsBwADAGYAIgAFBv/An3BILBqPSCJPlmw6n9CodOr6uXTTrHbLPTKFJlh3 bOztemSuzlQc2dJTM462WllWaHh2LUShiG56Rjs/ND92ZYJaMUYUJYpoOT8vHk55iphdLD+bUpd6 PDxYPzZvYyiPmUefUKI6OTk2MjIxMS4muKq6RaxOsbdsu8JOKi8vNMg0NcuwoTxHOja3SCgpKcPY RxYWFC+S0DLTRtfZYxwcT9vG31A64cHlwxTd7FtV8aoD9PX4/f7/AAMKHEiwoMGDCLH1+JIQyoQK hAT1uNewCbohE1TggMOvIhIKRiZYMMSlhykhqTxKmTBhhRYe90aM+EFCZREVKrZVYMkTY8txjU9M PmHwY8aPXphAptmxA8ecZHUsvTjiAoaNHEijnEQ5U0inHxYqDexRz2oOHlm5UCriocbRqUIsfAD7 Q2y8HjxonO2RlmPXus+G5LA75Gs2vn1V0bBgmAjcISRtYuPxt6PkXTlmJr4MicZmzpCwBQEAIfkE AQ8APwAsBgADAGoAIQAFBv/An3BILBqPyKLMpUs6n9CodEoV6kw/k6zK7Xq/RuzQlQObib3zOXaE qam9Hg5Ho6XfXxsevpv/VipDKjh7YGKFRjw/NT8vFkgehIiTlJBlkzw8Ok0/eluVahNCjFU5pjk2 NjKrMS4uJrCgshOkXDauh1UoslwURR7AHirDL8V1dTXJNaeKQjw5uEi7vNROFNcvl0Y5S7nVhRxR FuPZzVE63d/fFB7lYNrqZuFG7Nnx912++Pv8/f7/AAMKHEgQFLSCUiYEogQPYRSFkt70cONkREEO 84pMsLBCDQ8XQkg49ALRSw89QywmaSEwArAJMJ1spFHlZBIEDIbc2eNhT59jYzQAqRgHc8KKiE56 sBzSAoaNHDx2dnlBpASKEilV3tux40mcF09zxJF6JodWIjSLsOvZqN9YspNesP1hIa0zFkMeLRqJ iEfPukdyzP0Bl68aGrWM9Eg7orBhXj2KPebnmFIQACH5BAEPAD8ALAQAAgBvACEABQb/wJ9wSCwa 
j8ijTZdsOp/QqHSK5JlMMqp2y+0+Y0NXzksum5tMIgl2bru9JiGKOOK9ob0erncvZ39zQnFCL31C eT84NCsrFj8rfIaSbXw5NI+ORis7k4aDZC8eUIWTPDw6qEI2NjJ/nWYvY1w1NDQwMDIuui5XV69m okkqsl20IyMkXIG/XRMUzxQe0tIq1SwsL9m12zU1RDw2Lj+fRSgpKczpSc8qsVU5ueRDy+rMFhYv NMRROuHy9X0yUbDgIda+Ljb+ATzzLNvBhRCjUIhIsaLFixgzatzIsSNAOx6hTFjxi4e4kEg4FJmg AockkyiTbDgywcImNz1sDEkWc8oEfQ+XyuTc2YTNmWBeKEyIMmGkSy49jBKxwICBmQaSCu3Yse1H tQoVmjYd8jPolJy3bOTIkSfSmxFHLHz4MdfQDkW1VlRb8fRJW7dudA4pMSQYDaSZfjzsaYZFEaQ/ epAiAphxGxqZPICkPHmy5Tc5pG02ksPx4s9nODlhuzAIACH+c0NyZWF0ZWQgYnkgRWNsaXBzZSBE aWdpdGFsIEltYWdpbmcgqTE5OTggIEFsbCBSaWdodHMgUmVzZXJ2ZWQuIA0KQ29tbWVyY2lhbCBs aWNlbnNlIGF2YWlsYWJsZSBhdCB3d3cuZWNsaXBzZWQuY29tDQoAOw== ------=_NextPart_94915C5ABAF209EF376268C8 Content-Type: application/octet-stream; name="image003.gif" Content-Description: image003.gif Content-Id: <188792-2200241151539523806@sp.mailbr.com.br> R0lGODlhtQA+APMHABwXFVhVVISCe7W0tMa9vcbGvcfHx/39/cfHxwAAAAAAAAAAAAAAAAAAAAAA AAAAACH5BAEAAAgAIf4IR2lmIEx1YmUALAAAAAC1AD4AwxwXFVhVVISCe7W0tMa9vcbGvcfHx/39 /cfHxwAAAAAAAAAAAAAAAAAAAAAAAAAAAAT+EMlJq7046827/2AocoFhEMRpFOtZpCzxtjLs2jU9 x/ut470cD0j8GYdHH3KpbAqdwVmOEJiUVASAdsvtmqS0sCHQTbViYZwKrW6H0YbB9ooypdzstD7P x+79cGt+d1USdCddiQAlZHNfLnYmjVtmk3MDApiallcFiSVgKmNbApGWXGZdAytqbImrdWiWq3EB mgO3uJiJZr2FCHRaB8PExcOni8UGiQLKlMbLigHFcloGxsjE0VvXxMLG3MXIit3E4wDN5opc08VZ n3a/wQDG9QfnXOX37cPbA9Dr6BGrBkDfvWTDCCI8lqheOHUBH0L8JE7Rv4oR6ciz883exIj+7ABq ueixJEGDDCUeeCfNobWS9RqhhPmRAE17WTROAADnziBAgvrYQEOUlQlBrZAalaXqTa83fqIq7Rlp hqg1d3b4ZDH1atIUA+SBynqnhtkXZ1GgXau2bdq3bOG6jUt3rl25eOvmvau3L986ywhYicQyICyl Q0WdMpMHD9CtjxH/gUxZslDLkR1DvhwDwGAsIF8tVToKlQldqFOrxtVLtRrCkhzNWA0jNQyfuE+4 BnQHtU/dq4MPaOPZEOHQV+I0vVGpTGznKHIoTM6M1YvOWw4XVgSjaY3r3xE1BT9j1gmzZE93CdCL zUbxN/HZHEjKmZb52gJSU3lQC7/t7nD+4VJBGIFk0Dn89NfSfusE89k78YW2hUf+iKQIgwR6NIsx AnTooS0D6oNPPtg0WOB9JSLH0WdWseDiGUZVZkZmWP2U241R4bhdCU+Fh9ttsOmY43lx3UWakEgi MsEwOQWQyZO3QClllLhMaSWVV2apyZS5bFmll1qGCeaYV35J5pdiZhnAmt0suQx7R8Up55x01mnn 
nXjmqeeefPbp559htfMgZWsxN4QMhiZ6lqKFLupoo5Cqxaikj1IaKaKVYnrpoZxOiqkZYT0InHA9 TgYJAbZVReNlmmEmo6mtxgrrrI1BRcVgmEp4SFGvLfbIZkEFm9iwMGIaz5qHvUisssX+Nsvss1AV xautPW2Uq4T3HTXjts9RIgpVojwVbiTjHtltJ9yKqy6565bb7rvkknZqPLhyBNIpPGorhq+lYeuI CfBMy8IkPA7MXbcAHJZZdZqmIAAXh424TnsnWGtvRggDACQP/Wbbscd13MHOUfjOm0p2igXUHMqR YcdFKbkZ8DDLP4J3sbfVPthRSaeIqEg63qBo34X0vXSiRGWIE+LR6xy4DtAKRmThv7caBx9N56So BdQVDt1Q0Rl+JNAB2+y8zdJiN631ekyHhOF69FoNIdYZ2cd1dlN3gaHTCxWWoEJjBx121AHxTVHa IxkDONwV6xy4hhJCbSFJ/egH9oH+/Fgi+U0HqCSx0YgvRLjGazcYNzD2Rugvfvklbkx0sKewN+TP tH54MfzdeRBKda7NOtl5yqGT1X/6mZWcU9goBaJxfmfDUYzbGfuMg4BRFvLTZx/dVc37cGMoNGzE rLvjfmUrUnmYzLEbxoK63G+lNtYTG+uLoSmRQFavg8l20H9qcQjIwrbSVxlwtcooXTFf/eA3vf5h QWDoa5lkcDSu6+yvVhBMjHIOsCSykWFNIAyhCEdIwhKa8IQoTKEKV8jCFrrQhSSZwDJoUbwa2vCG OMyhDvu0Jg5KQGP6+o2+ukfEIRoxXkjknhKpt0TkObGISWSiFJ94xCZC0YpDdJj++H4lFSF60Stg TFcQxwibMnYxjGQ8oxjN+MU1qjGNbTxiFj7TgjxZz0ZSkdMdk8RHPPbxj34MJCAH+ZPGGYcr24EH F9nHFV/lhlW0WpUkXwXJSk7SVZlxDx3FAxLG6AhIjpygIEeZJD0S8pSkJCWQYlY11GkLW7Q4HqVs EEpZ2vJUt/wULgtJMG3lMga/PM8udUlMYA4TmMgkEiQM6cqb3esL10OLUXwlzWhaU5jLi9Eqb4MD iGkrCNvUpjizSc5rxsKc3GTFUxD5Hmc+E5fGNJhpbsi8OdmyJ/NkYhX3d7966pF/TvQnn+DXTk6u zg7pLI9zPoateYwsob2Yp0H+vVW2hEFTDxO1qFTM487QePE9LjMM4HpB0hbw63PSgF4XSjFM5VA0 ox4zj45COhIyzqymLvVXtpjXyuTAFG7bgRlCp7lQZNTJbyp92RhZoJDjTAxhtNijO2MpCJnqCXD5 kgIAfZpIRXbMp3I6KS/ohFSArbROvTzK4ubJ0TslkoZh9aaeTsFS5L2HCuypVTwW8cHm8TWEi2je mjT6yhEugivKWRMTPwhYGjJ2EbAbTuw+SD3GnpB6VPgrCCG7VxQSVhKNPd08VDeH0qFjaJQjm+US wh9kkIQT+0FbSgxk2q21LWlMG54r5waTcbAOq6jNm4AuV7rXvgwjsh0dOWr+e9rQtTalzBwt3RZU ufrYLrUVLS1xT4SfDU0kuSg1HNucO7jOmS66qZuu2iByN9fZbrysBd1su4uKE4FXQuJ122wTYRCM oXcFO6OdYfbT3oQJ1z+zw0g5vIA7vYFDvuFlboKy5jXo9jS9vUWOMijXtffq9wAnKdGCSaSNOd03 NPnV7n4nVGELgzTA9kAp6ezR4ereDsT8WYWNYcw5z+FXwkdLkGpV9F/e8sxfQrYxdleLY/l6TcgU bvDgInxbqME2QLoSLYaPrFOD1Jhz8S0vkz6sXNENgz8wicZMSDtjMI9Zt3QYzk0KMJw6o+rOdq6z jt2BizXTJA5yrgegUTDDwjzjOdB83jPnUDUcGL3o0WnQs59LEgdfiEp7sROkVNtnvU4/Un84cqQQ 
gcKG9JjaRWVB7J5u9D1TP9DScsPC9UI2GlbXyA0JPF9lSMOvVd4agRvL9Vbaxa4zPFKBXfmvMvtZ q+8JEzfJhPYg9bojB8ryJwvcNDxPVb/ypQFJ2o4uLMCgFT38aJzWLKmrAWk9oPJGnTHCdvJwVG6s +Prdw1aekMYwgCXJAa47DLjAB07wHZLhGiNIuMIXzvCGO/zhEogAADs= ------=_NextPart_94915C5ABAF209EF376268C8-- From noreply@sourceforge.net Mon Apr 15 19:23:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:23:39 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str und unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it, if it's a real str or unicode instance? (as it was done lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 16:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 16:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux):
Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format':
Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3)
Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4)
libpython2.3.a(posixmodule.o): In function `posix_tmpnam':
Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp'
libpython2.3.a(posixmodule.o): In function `posix_tempnam':
Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp'
Modules/pwdmodule.c: In function `initpwd':
Modules/pwdmodule.c:161: warning: unused variable `d'
Modules/readline.c: In function `set_completer_delims':
Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type
Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used
Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message.
I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as:
Doc/lib/libstdtypes.tex 1.88
Lib/UserString.py 1.12
Lib/string.py 1.63
test/string_tests.py 1.13
test/test_unicode.py 1.54
Misc/NEWS 1.388
Objects/stringobject.c 2.157
Objects/unicodeobject.c 2.138
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change.
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense.
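The behaviour the thread is debating is easy to see in a small sketch. This is an illustrative pure-Python model of the fixed padding logic (modern Python 3 spelling; the function name and structure are mine, the actual patch is C code in stringobject.c/unicodeobject.c): the pre-patch string.zfill ran non-str inputs through repr(), which is exactly what turned u"123" into "0000u'123'".

```python
def zfill(s, width):
    # Pad `s` on the left with zeros to reach `width`, keeping any sign
    # character in front of the zeros. The broken version effectively
    # padded repr(s) for unicode inputs instead of s itself.
    if len(s) >= width:
        return s
    sign = s[0] if s[:1] in ('+', '-') else ''
    return sign + '0' * (width - len(s)) + s[len(sign):]

print(zfill("123", 10))   # 0000000123, matching str.zfill
print(zfill("-3", 5))     # -0003
```

The sign handling mirrors the built-in method: the fill count is computed from the full string (sign included), so the result has exactly `width` characters.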
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 19:29:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:29:11 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None >Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now.
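The rule Guido confirms here (return the original only for a real str/unicode instance) survives into today's CPython: string methods never hand a subclass instance back to the caller, even on the no-op path where an exact str may be returned unchanged. A quick illustration in Python 3 spelling (the subclass name is mine):

```python
class MyStr(str):
    """Trivial str subclass, used only to observe method return types."""

s = MyStr("123")

# Exact str instances may come back unchanged when no fill is needed,
# but a subclass instance is copied into a plain str:
assert type("123".zfill(2)) is str
assert type(s.zfill(2)) is str       # copy, not the MyStr original
assert type(s.zfill(10)) is str
assert s.zfill(10) == "0000000123"
```

The design rationale is that a subclass may override methods or carry extra state, so handing the original object to code that asked for a plain string result would leak subclass behaviour.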
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 19:47:05 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:47:05 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:47 Message: Logged In: YES user_id=89016 Checked in as:
Objects/stringobject.c 2.159
Objects/unicodeobject.c 2.139
Maybe we could add a test to Lib/test/test_unicode.py and Lib/test/test_string.py that makes sure that no method returns a str/unicode subinstance even when called for a str/unicode subinstance?
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 19:48:27 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:48:27 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:48 Message: Logged In: YES user_id=6380 If you want to be thorough, yes, that's a good test to add!
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 14:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it if it's a real str or unicode instance? (as was done for lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 10:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 10:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. 
Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 09:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 09:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Mon Apr 15 19:54:10 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 11:54:10 -0700 Subject: [Patches] [ python-Patches-531901 ] binary packagers Message-ID: Patches item #531901, was opened at 2002-03-19 15:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: binary packagers Initial Comment: zip file with updated Solaris and HP-UX packagers. Replaces 415226, 415227, 415228. Changes made to take advantage of new PEP241 changes in the Distribution class. 
---------------------------------------------------------------------- >Comment By: Mark Alexander (mwa) Date: 2002-04-15 18:54 Message: Logged In: YES user_id=12810 New file submitted. No documentation yet, but I am committed to maintaining them. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-11 16:59 Message: Logged In: YES user_id=38388 Mark, could you reupload the ZIP file ? I cannot download it from the SF page (the file is mostly empty). Also, is the documentation already included in the ZIP file ? If not, it would be nice if you could add them as well. I don't require a special PEP for these changes, BTW, but I do require you to maintain them. Thanks. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 19:55 Message: Logged In: YES user_id=12810 OK, the PEP seems to me to mean most of this is done. These additions are not library modules, they are Distutils "commands". So the way I read it, the Distutils-SIG (where I've been hanging around for some time) are the Maintainers. The documentation will be 2 new chapters for the Distutils manual "Creating Solaris packages" and "Creating HP-UX packages" each looking a whole lot like "Creating RPM packages". Does that clarify anything, or am I still missing a clue? p.s. Thanks for cleaning up the extra uploads! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 15:35 Message: Logged In: YES user_id=21627 You volunteering as the maintainer is part of the prerequisites of accepting new modules, when following PEP 2, see http://python.sourceforge.net/peps/pep-0002.html It says: "developers ... will first form a group of maintainers. Then, this group shall produce a PEP called a library PEP." So existence of a PEP describing these library extensions would be a prerequisite for accepting them. 
If MAL wants to waive this requirement, it would be fine with me. However, such a PEP could also share text with the documentation, so it might not be wasted effort. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 14:49 Message: Logged In: YES user_id=12810 Any of the three (they're all the same). SourceForge hiccuped during the upload, and I don't have permission to delete the duplicates. I don't exactly understand what you mean by applying PEP 2. I uploaded this per Marc Lemburg's request for the latest versions of patches 41522[6-8]. He's acting as the integrator in this case (see http://mail.python.org/pipermail/distutils-sig/2001-December/002659.html). I let him know about the duplicate uploads, so hopefully he'll correct it. If you can and want, feel free to delete the 2 of your choice. I agree they need to be documented. As soon as I can, I'll submit changes to the Distutils documentation. Finally, yes, I'll act as maintainer. I'm on the Distutils-sig and as soon as some other poor soul who has to deal with Solaris or HP-UX tries them, I'm there to work out issues. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 07:35 Message: Logged In: YES user_id=21627 Which of the three attached files is the right one (19633, 19634, or 19635)? Unless they are all needed, we should delete the extra copies. I recommend to apply PEP 2 to this patch: A library PEP is needed (which could be quite short), documentation, perhaps test cases. Most importantly, there must be an identified maintainer of these modules. Are you willing to act as the maintainer? 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 From noreply@sourceforge.net Mon Apr 15 20:11:31 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 12:11:31 -0700 Subject: [Patches] [ python-Patches-544330 ] docs for PyObject_Call + PyObject_Length Message-ID: Patches item #544330, was opened at 2002-04-15 21:11 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544330&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: docs for PyObject_Call + PyObject_Length Initial Comment: Summary says all.. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544330&group_id=5470 From noreply@sourceforge.net Mon Apr 15 20:55:33 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 12:55:33 -0700 Subject: [Patches] [ python-Patches-496705 ] Additions & corrections to libmacui.tex Message-ID: Patches item #496705, was opened at 2001-12-25 21:19 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=496705&group_id=5470 Category: Documentation Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Dean Draayer (draayer) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Additions & corrections to libmacui.tex Initial Comment: Includes a thorough description of the relatively new GetArgv function. Greatly expanded the description of the ProgressBar class, as well as updating the description to reflect recent changes to this class. Numerous minor changes - mostly grammatical - made throughout the document. 
---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-15 15:55 Message: Logged In: YES user_id=3066 Checked in as Doc/mac/libmacui.tex 1.17 and 1.16.24.1 for the 2.2 maintenance branch (sorry for missing 2.2.1!) and the trunk. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-01-10 18:33 Message: Logged In: YES user_id=3066 Checked in the GetArgv() docs for Python 2.1.2 (Doc/mac/libmacui.tex revision 1.16.6.1); there just wasn't time to worry about making sure I had the ProgressBar docs right (due to the 2.2 addition of the indeterminate version), so I punted on that for 2.1.2. I still need to do the "right thing" for 2.2.* and the trunk. It shouldn't take long, but I can't right now. ---------------------------------------------------------------------- Comment By: Dean Draayer (draayer) Date: 2002-01-08 11:42 Message: Logged In: YES user_id=307112 I don't know which version introduced GetArgv(). I think it was 2.0, but since it wasn't documented I never saw it until 2.1. As for the barber-pole style progress bars, those will be new in 2.2. So you may want to annotate the appropriate material with a version number there as well. ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-01-04 22:53 Message: Logged In: YES user_id=3066 Attached a revised version of the patch (minor changes only, plus fix one markup error). ---------------------------------------------------------------------- Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-01-04 22:41 Message: Logged In: YES user_id=3066 Excellent! What version of MacPython introduced the GetArgv() function? I'd like to add a version annotation and back-port the portions of the patch that belong in the Python 2.1.2 and 2.2.1 documentation. Thanks for the contribution! 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=496705&group_id=5470 From noreply@sourceforge.net Mon Apr 15 21:52:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 13:52:06 -0700 Subject: [Patches] [ python-Patches-544330 ] docs for PyObject_Call + PyObject_Length Message-ID: Patches item #544330, was opened at 2002-04-15 15:11 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544330&group_id=5470 Category: Documentation Group: None >Status: Closed >Resolution: Accepted Priority: 5 Submitted By: Thomas Heller (theller) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: docs for PyObject_Call + PyObject_Length Initial Comment: Summary says all.. ---------------------------------------------------------------------- >Comment By: Fred L. Drake, Jr. (fdrake) Date: 2002-04-15 16:52 Message: Logged In: YES user_id=3066 Checked in a variant that uses new markup in Doc/api/abstract.tex revision 1.12. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544330&group_id=5470 From noreply@sourceforge.net Tue Apr 16 04:41:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 20:41:50 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. 
I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- >Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:41 Message: Logged In: YES user_id=11365 I get two failures... First, using the dbm module as the engine, whichdb fails to identify the type. This is apparently a platform problem... whichdb.py has the comment # Check for dbm first -- this has a .pag and a .dir file but on my system the dbm module creates a .db file. The 'file' utility says duh.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 1, Last Freed 0, Max Bucket 1, High Mask 0x1, Low Mask 0x0, Fill Factor 40, Number of Keys 0) Now, a very simple patch would be to look for .db files and call them 'dbm'. I have no idea though whether there might be other database formats which use this extension. So the thing to do might be to look for .db files and test their magic. 
Actually, the .db files are identified as "dbhash" databases if named explicitly to whichdb... But the dbhash module isn't available due to missing bsddb! I'm not sure what to make of all this. I could just assume .db files with dbhash magic are always of kind dbm... sound reasonable? Secondly, dumbdbm doesn't work either, if the database is empty... f.read(1) in ["'", '"'] doesn't turn out to be true, since the .dir file is empty. Ok, I've attached a naive patch. Note I'm not even looking at testing dbhash or gdbm since they're not built on my system. On the other hand since anydbm tries these first, maybe they are effectively tested by test_anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 04:49:06 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 15 Apr 2002 20:49:06 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. 
The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- >Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:49 Message: Logged In: YES user_id=11365 More detail... the failure mode of test_anydbm is that a database freshly created with anydbm.open() can't be reopened using the 'r' mode. Since whichdb returns None we wind up at raise error, "need 'c' or 'n' flag to open new db" Of course, whichdb is to blame for this. ---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:41 Message: Logged In: YES user_id=11365 I get two failures... First, using the dbm module as the engine, whichdb fails to identify the type. This is apparently a platform problem... whichdb.py has the comment # Check for dbm first -- this has a .pag and a .dir file but on my system the dbm module creates a .db file. 
The 'file' utility says duh.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 1, Last Freed 0, Max Bucket 1, High Mask 0x1, Low Mask 0x0, Fill Factor 40, Number of Keys 0) Now, a very simple patch would be to look for .db files and call them 'dbm'. I have no idea though whether there might be other database formats which use this extension. So the thing to do might be to look for .db files and test their magic. Actually, the .db files are identified as "dbhash" databases if named explicitly to whichdb... But the dbhash module isn't available due to missing bsddb! I'm not sure what to make of all this. I could just assume .db files with dbhash magic are always of kind dbm... sound reasonable? Secondly, dumbdbm doesn't work either, if the database is empty... f.read(1) in ["'", '"'] doesn't turn out to be true, since the .dir file is empty. Ok, I've attached a naive patch. Note I'm not even looking at testing dbhash or gdbm since they're not built on my system. On the other hand since anydbm tries these first, maybe they are effectively tested by test_anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb? 
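The "look for .db files and test their magic" route floated in the thread above can be sketched as follows. This is a hypothetical illustration, not the patch that was attached: the constant is the well-known Berkeley DB hash magic number (0x00061561), read at byte offset 12 of the metadata page in either byte order; treat the constant, the offset, and `db_file_magic` itself as assumptions for the sketch.

```python
# Hypothetical sketch: classify a .db file by its Berkeley DB hash magic
# instead of trusting the extension alone.
import struct
import tempfile

BDB_HASH_MAGIC = 0x00061561  # assumed Berkeley DB hash magic

def db_file_magic(path):
    """Return 'dbhash' if the file carries a Berkeley DB hash magic, else None."""
    with open(path, "rb") as f:
        header = f.read(16)
    if len(header) < 16:
        return None
    magic_le = struct.unpack("<I", header[12:16])[0]
    magic_be = struct.unpack(">I", header[12:16])[0]
    return "dbhash" if BDB_HASH_MAGIC in (magic_le, magic_be) else None

# Synthetic 16-byte header standing in for greg_ball's duh.db file.
with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as f:
    f.write(b"\x00" * 12 + struct.pack("<I", BDB_HASH_MAGIC))
    fake_db = f.name

print(db_file_magic(fake_db))   # dbhash
```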
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 13:46:21 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 05:46:21 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. 
---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-16 08:46 Message: Logged In: YES user_id=6380 Greg, you assigned this to Neal Norwitz. Why? Usually bug reports stay unassigned until a developer shows interest. I doubt that this is Neal's kind of bug: he's not commented on the bug report, nor does this match the other bugs that he's interested in. ---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:49 Message: Logged In: YES user_id=11365 More detail... the failure mode of test_anydbm is that a database freshly created with anydbm.open() can't be reopened using the 'r' mode. Since whichdb returns None we wind up at raise error, "need 'c' or 'n' flag to open new db" Of course, whichdb is to blame for this. ---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:41 Message: Logged In: YES user_id=11365 I get two failures... First, using the dbm module as the engine, whichdb fails to identify the type. This is apparently a platform problem... whichdb.py has the comment # Check for dbm first -- this has a .pag and a .dir file but on my system the dbm module creates a .db file. The 'file' utility says duh.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 1, Last Freed 0, Max Bucket 1, High Mask 0x1, Low Mask 0x0, Fill Factor 40, Number of Keys 0) Now, a very simple patch would be to look for .db files and call them 'dbm'. I have no idea though whether there might be other database formats which use this extension. So the thing to do might be to look for .db files and test their magic. Actually, the .db files are identified as "dbhash" databases if named explicitly to whichdb... But the dbhash module isn't available due to missing bsddb! 
I'm not sure what to make of all this. I could just assume .db files with dbhash magic are always of kind dbm... sound reasonable? Secondly, dumbdbm doesn't work either, if the database is empty... f.read(1) in ["'", '"'] doesn't turn out to be true, since the .dir file is empty. Ok, I've attached a naive patch. Note I'm not even looking at testing dbhash or gdbm since they're not built on my system. On the other hand since anydbm tries these first, maybe they are effectively tested by test_anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 13:52:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 05:52:39 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. 
Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- >Comment By: Neal Norwitz (nnorwitz) Date: 2002-04-16 08:52 Message: Logged In: YES user_id=33168 I have reviewed it, but others have stayed on top of this and I didn't have anything to contribute. I will be glad to check the patch in when it is in the proper state. But I don't know much about anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-16 08:46 Message: Logged In: YES user_id=6380 Greg, you assigned this to Neal Norwitz. Why? Usually bug reports stay unassigned until a developer shows interest. I doubt that this is Neal's kind of bug: he's not commented on the bug report, nor does this match the other bugs that he's interested in. ---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:49 Message: Logged In: YES user_id=11365 More detail... the failure mode of test_anydbm is that a database freshly created with anydbm.open() can't be reopened using the 'r' mode. Since whichdb returns None we wind up at raise error, "need 'c' or 'n' flag to open new db" Of course, whichdb is to blame for this. 
---------------------------------------------------------------------- Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-15 23:41 Message: Logged In: YES user_id=11365 I get two failures... First, using the dbm module as the engine, whichdb fails to identify the type. This is apparently a platform problem... whichdb.py has the comment # Check for dbm first -- this has a .pag and a .dir file but on my system the dbm module creates a .db file. The 'file' utility says duh.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 1, Last Freed 0, Max Bucket 1, High Mask 0x1, Low Mask 0x0, Fill Factor 40, Number of Keys 0) Now, a very simple patch would be to look for .db files and call them 'dbm'. I have no idea though whether there might be other database formats which use this extension. So the thing to do might be to look for .db files and test their magic. Actually, the .db files are identified as "dbhash" databases if named explicitly to whichdb... But the dbhash module isn't available due to missing bsddb! I'm not sure what to make of all this. I could just assume .db files with dbhash magic are always of kind dbm... sound reasonable? Secondly, dumbdbm doesn't work either, if the database is empty... f.read(1) in ["'", '"'] doesn't turn out to be true, since the .dir file is empty. Ok, I've attached a naive patch. Note I'm not even looking at testing dbhash or gdbm since they're not built on my system. On the other hand since anydbm tries these first, maybe they are effectively tested by test_anydbm. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-14 21:10 Message: Logged In: YES user_id=6380 What kind of crash do you experience? Do you have a patch that fixes whichdb? 
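The dumbdbm corner case described in the thread above (an empty .dir file defeating the first-byte check) can be sketched like this. It is a minimal stand-in for the detection logic, not the actual whichdb code or the attached patch; `looks_like_dumbdbm` is a hypothetical helper:

```python
# dumbdbm's .dir index holds entries that start with a quoted key, so a
# check that reads only the first byte misclassifies a freshly created,
# still-empty database whose .dir file contains no bytes at all.
import os
import tempfile

def looks_like_dumbdbm(dir_path):
    with open(dir_path, "rb") as f:
        first = f.read(1)
    if first in (b"'", b'"'):    # the original first-byte check
        return True
    # naive fix in the spirit of the attached patch: an empty .dir qualifies too
    return os.path.getsize(dir_path) == 0

empty_dir = os.path.join(tempfile.mkdtemp(), "duh.dir")
open(empty_dir, "wb").close()          # freshly created, empty index
print(looks_like_dumbdbm(empty_dir))   # True with the fix, False without
```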
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 16:48:52 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 08:48:52 -0700 Subject: [Patches] [ python-Patches-544733 ] Cygwin test_mmap fix for Python 2.2.1 Message-ID: Patches item #544733, was opened at 2002-04-16 07:48 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544733&group_id=5470 Category: Tests Group: Python 2.2.x Status: Open Resolution: None Priority: 5 Submitted By: Jason Tishler (jlt63) Assigned to: Nobody/Anonymous (nobody) Summary: Cygwin test_mmap fix for Python 2.2.1 Initial Comment: Due to the changes in test_mmap for Python 2.2.1, this test now fails under Cygwin for the following two reasons: o since the test file is left open in the second to last test it causes the last test to fail due to the standard way that Windows "deals" with open files o the last test fails because Windows appears to need the backing file to be flushed before the mmap operation will succeed This patch corrects the above problems. I have also tried this patch under Red Hat Linux 7.1 without any ill effects. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544733&group_id=5470 From noreply@sourceforge.net Tue Apr 16 17:24:36 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 09:24:36 -0700 Subject: [Patches] [ python-Patches-541694 ] whichdb unittest Message-ID: Patches item #541694, was opened at 2002-04-09 15:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 Category: Tests Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gregory H. 
Ball (greg_ball) Assigned to: Neal Norwitz (nnorwitz) Summary: whichdb unittest Initial Comment: Attached patch is a first crack at a unit test for whichdb. I think that all functionality required for use by the anydbm module is tested, but only for the database modules found in a given installation. The test case is built up at runtime to cover all the available modules, so it is a bit introspective, but I think it is obvious that it should run correctly. Unfortunately it crashes on my box (Redhat 6.2) and this seems to be a real problem with whichdb: it assumes things about the dbm format which turn out to be wrong sometimes. I only discovered this because test_anydbm was crashing, when whichdb failed to work on dbm files. It would not have crashed if dbhash was available... and dbhash was not available because bsddb was not built correctly. So I think there is a build bug there, but I have little idea how to solve that one at this point. Would I be correct in thinking that if this test really uncovers bugs in whichdb, it can't be checked in until they are fixed? Unfortunately I don't know much about the various databases, but I'll try to work with someone on it. ---------------------------------------------------------------------- >Comment By: Gregory H. Ball (greg_ball) Date: 2002-04-16 12:24 Message: Logged In: YES user_id=11365 Neal posted a list to python-dev of standard library modules without unit tests. (<3CB3093C.B7A22727@metaslash.com>, 1 week ago, subject Re: Stability and change) That prompted me to address the breakage that I was seeing in test_anydbm due to whichdb. So I thought he might be interested. If this interrupts the workflow I'll refrain from making assignments with so little justification... 
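The runtime-built test case described in the initial comment can be sketched as follows. The module names here are the modern dbm package names (the 2002 patch would have probed dbm, gdbm, dbhash and dumbdbm instead), so treat them as illustrative:

```python
import unittest

# One test method is generated per database engine that actually
# imported on this machine, mirroring the introspective approach
# described above. Engine names are the modern dbm.* spellings.
CANDIDATE_ENGINES = ("dbm.ndbm", "dbm.gnu", "dbm.dumb")

def _make_test(mod_name):
    def test(self):
        mod = __import__(mod_name, fromlist=["open"])
        self.assertTrue(callable(mod.open))
    return test

class WhichDBTest(unittest.TestCase):
    pass

for _name in CANDIDATE_ENGINES:
    try:
        __import__(_name, fromlist=["open"])
    except ImportError:
        continue  # engine not built on this system -- skip it
    setattr(WhichDBTest, "test_" + _name.replace(".", "_"),
            _make_test(_name))
```

The test suite thus adapts itself to whatever was built, which is exactly why its coverage varies from box to box, as the discussion notes.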
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=541694&group_id=5470 From noreply@sourceforge.net Tue Apr 16 18:00:22 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 10:00:22 -0700 Subject: [Patches] [ python-Patches-500311 ] Work around for buggy https servers Message-ID: Patches item #500311, was opened at 2002-01-07 08:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Van den Bergh (vdbergh) Assigned to: Martin v. Löwis (loewis) Summary: Work around for buggy https servers Initial Comment: Python 2.2. Tested on RH 7.1. This is a workaround for http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=494762 The problem is that some https servers close an ssl connection without properly resetting it first. In the above bug description it is suggested that this only occurs for IIS but apparently some (modified) Apache servers also suffer from it (see telemeter.telenet.be). One of the suggested workarounds is to modify httplib.py so as to ignore the combination of err[0]==SSL_ERROR_SYSCALL and err[1]=="EOF occurred in violation of protocol". However I think one should never compare error strings since in principle they may depend on language etc... So I decided to modify _socket.c slightly so that it becomes possible to return error codes which are not in ssl.h. When an ssl-connection is closed without reset I now return the error code SSL_ERROR_EOF. Then I ignore this (apparently benign) error in httplib.py. In addition I fixed what I think was an error in PySSL_SetError(SSL *ssl, int ret) in socketmodule.c.
Originally there was: case SSL_ERROR_SSL: { unsigned long e = ERR_get_error(); if (e == 0) { /* an EOF was observed that violates the protocol */ errstr = "EOF occurred in violation of protocol"; etc... but if I understand the documentation for SSL_get_error then the test should be: e==0 && ret==0. A similar error occurs a few lines later. ---------------------------------------------------------------------- Comment By: Jon Ribbens (jribbens) Date: 2002-04-16 18:00 Message: Logged In: YES user_id=76089 py23ssl.txt works fine for me when applied to latest CVS, and fixes the problem. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-11 06:34 Message: Logged In: YES user_id=21627 Unfortunately, your patch appears to be incorrect. Performing the script in #494762, I get an empty string as the result, whereas the content of the resource is 'HTTPS Test' In case you want to experiment with the CVS version I'll attach a patch for that. ---------------------------------------------------------------------- Comment By: Michel Van den Bergh (vdbergh) Date: 2002-01-09 10:25 Message: Logged In: YES user_id=10252 Due to some problems with sourceforge and incompetence on my part I submitted this several times. Please see patch 500311. 
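The point about never matching on error strings is, incidentally, how this case ended up being handled in the modern ssl module: the condition got its own exception type. A minimal illustration using today's ssl module (not the 2002 patch):

```python
import ssl

# Dispatch on the error type, never on the message text, which may
# be translated or reworded between library versions.
def is_benign_eof(exc):
    """True for the 'EOF occurred in violation of protocol' case."""
    return isinstance(exc, ssl.SSLEOFError)

try:
    # Simulate the condition; a real client would hit this while
    # reading from a server that drops the connection without a
    # proper SSL shutdown.
    raise ssl.SSLEOFError("EOF occurred in violation of protocol")
except ssl.SSLError as exc:
    benign = is_benign_eof(exc)
```

This is the same design the patch proposes: surface a distinct code (SSL_ERROR_EOF) instead of comparing strings.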
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 From noreply@sourceforge.net Wed Apr 17 00:11:32 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 16:11:32 -0700 Subject: [Patches] [ python-Patches-544909 ] addition of cmath.arg function Message-ID: Patches item #544909, was opened at 2002-04-16 18:11 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544909&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: John Williams (johnw42) Assigned to: Nobody/Anonymous (nobody) Summary: addition of cmath.arg function Initial Comment: This patch adds the familiar "Arg" function from complex analysis to the cmath module, though it's called "arg" here for consistency with the other names. Along with the built-in abs function this makes polar/rectangular coordinate conversions trivial: z = complex(x,y) r, theta = abs(z), arg(z) z = r * exp(1j * theta) x, y = z.real, z.imag ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=544909&group_id=5470 From noreply@sourceforge.net Wed Apr 17 06:07:34 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 16 Apr 2002 22:07:34 -0700 Subject: [Patches] [ python-Patches-462754 ] no '_d' ending for mingw32 Message-ID: Patches item #462754, was opened at 2001-09-19 05:29 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462754&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Gerhard Häring (ghaering) Assigned to: Nobody/Anonymous (nobody) Summary: no '_d' ending for mingw32 Initial Comment: This patch prevents distutils from naming the extension modules _d.pyd when 
compiled with mingw32 on Windows in debug mode. Instead, the extension modules will get the normal name .pyd. Technically, the patch doesn't prevent the behaviour for mingw32, but only adds the _d for MS Visual C++ and Borland compilers (though I don't know about the Borland case). The reason for this? Adding "_d" doesn't make any sense for GNU compilers. I think it's just a MS Visual C++ madness. If you want to debug an extension module that was compiled with gcc, you have to use gdb anyway, because the debugging symbols of MSVC++ and gcc are incompatible. So you normally use a release Python version (from the python.org binary download) and compile your extensions with mingw32. To put it shortly: The current state is that you do a "setup.py build --compiler=mingw32 --debug" and then rename the extension modules, removing the _d. Then fire up gdb to debug your module. With this patch, the renaming isn't necessary anymore. ---------------------------------------------------------------------- >Comment By: Gerhard Häring (ghaering) Date: 2002-04-17 07:07 Message: Logged In: YES user_id=163326 If python.exe is compiled --with-pydebug, then this is true. But the point is that I want to compile debug versions of my extension modules and use them with the standard python.exe (*not* python_d.exe). So yes, the patch does work, at least it did when I submitted it . ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-09 12:44 Message: Logged In: YES user_id=21627 Does the patch actually work? It seems to me that, if compiled with-pydebug, import will automatically search for the _d version, and complain if it is not found. ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-01-04 12:52 Message: Logged In: YES user_id=21627 The rationale for using the debugging version of MSVCRT is not the debugging information alone, but also the additional functionality, like heap consistency checks and other assertions. So it is not obvious that you do not want to use the debugging version of this library in a debug build. ---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2002-01-04 03:50 Message: Logged In: YES user_id=163326 mingw links with msvcrt.dll. I have plans to add mingw32 support to the autoconf build process (hopefully soon enough for 2.3). The GNU and MS debugger symbols are incompatible, though, so I think that mingw32 shouldn't link to the debug version of msvcrt (gdb doesn't understand the Microsoft debugger symbols; and the Visual Studio debugger has no idea what the debugging symbols of gcc are all about; isn't cross-platform and cross-compiler programming fun?). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-12-30 14:13 Message: Logged In: YES user_id=21627 How does the mingw port interact with the debugging libraries? With MSVC, the debug build will link to the debug versions of the CRT. What C library will mingw link with (I hope it won't use crtdll.dll)? ---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2001-09-28 23:28 Message: Logged In: YES user_id=163326 Yes. But mingw32 isn't emulating Unix under Windows (that would be Cygwin). It's just a version of gcc and friends that targets native win32. It links against msvcrt (not a Posix emulation library like Cygwin does). This is a bit hypothetical because I didn't yet hack the autoconf build process for native win32 with mingw32.
Currently, you cannot build a complete Python with mingw32, but you *can* build extension modules against an existing Python (compiled with M$ VC++). ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-09-28 22:43 Message: Logged In: YES user_id=31435 All else being equal, a system emulating Unix under Windows should strive to make life comfortable for Unix folks. The question is thus whether all else is in fact equal . ---------------------------------------------------------------------- Comment By: Gerhard Häring (ghaering) Date: 2001-09-28 20:37 Message: Logged In: YES user_id=163326 Hmm. I don't like the _d endings at all. But if the policy on win32 is that debug executables and libraries get a "_d" ending, then I'm unsure whether this patch should be applied. I have plans to hack the autoconf madness to build a native win32 Python with mingw32. But that won't be ready by tomorrow. And I don't think that I'll add "_d" endings there for debugging, because that would be inconsistent with the normal autoconf builds on Unix. I'm glad that *I* don't have to decide whether this patch is a Good Thing. Being consistent with the Python win32 build or with GNU (gcc/autoconf). Take your pick :-) ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2001-09-19 05:46 Message: Logged In: YES user_id=31435 FYI, MSVC never adds _d on its own -- Mark Hammond and/or Guido forced it to do that. I don't remember why, but one of them explained it to me long ago and it made good sense at the time . MSVC normally compiles debug and release builds into distinct subdirectories, and uses the same names in both. But *our* MSVC setup forces it to compile both flavors of build directly into the PCbuild directory, so it has to give the resulting DLLs and executables different names (else the second build would overwrite the results of the first build).
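The rule the patch implements can be stated in a few lines. The helper below is hypothetical, purely an illustration of the policy and not the actual distutils code: the '_d' suffix is a convention of the MSVC-style toolchains, so gcc/mingw32 debug builds keep the plain name.

```python
def debug_ext_filename(modname, debug, compiler):
    """Illustrative only: pick an extension-module filename.

    Mirrors the patch's policy: only MSVC-style compilers ('msvc',
    'bcpp') get the '_d' debug suffix; mingw32/gcc builds do not,
    since gdb neither needs nor understands the MSVC convention.
    """
    suffix = "_d" if debug and compiler in ("msvc", "bcpp") else ""
    return modname + suffix + ".pyd"
```

With this rule, "setup.py build --compiler=mingw32 --debug" would produce spam.pyd directly, making the manual rename step described above unnecessary.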
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=462754&group_id=5470 From noreply@sourceforge.net Wed Apr 17 11:21:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 03:21:35 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was opened at 2001-06-12 15:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Postponed Priority: 6 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 12:21 Message: Logged In: YES user_id=89016 Another note: the patch will change the meaning of charmap encoding slightly: currently "replace" will put a ? into the output, even if ? is not in the mapping, i.e. codecs.charmap_encode(u"c", "replace", {ord("a"): ord("b")}) will return ('?', 1). With the patch the above example will raise an exception.
Of course with the patch many more replace characters can appear, so it is vital that the mapping is applied to the replacement string, too. Is this semantic change OK? (I guess all of the existing codecs have a mapping ord("?")->ord("?")) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 18:19 Message: Logged In: YES user_id=89016 So this means that the encoder can collect illegal characters and pass them to the callback. "replace" will replace this with (end-start)*u"?". Decoders don't collect all illegal byte sequences, but call the callback once for every byte sequence that has been found illegal, and "replace" will replace it with u"?". Does this make sense? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 18:06 Message: Logged In: YES user_id=89016 For encoding it's always (end-start)*u"?": >>> u"ää".encode("ascii", "replace") '??' But for decoding, it is neither of these: >>> "\Ux\U".decode("unicode-escape", "replace") u'\ufffd\ufffd' i.e. a sequence of 5 illegal characters was replaced by two replacement characters. This might mean that decoders can't collect all the illegal characters and call the callback once. They might have to call the callback for every single illegal byte sequence to get the old behaviour. (It seems that this patch would be much, much simpler, if we only change the encoders) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 19:36 Message: Logged In: YES user_id=38388 Hmm, whatever it takes to maintain backwards compatibility. Do you have an example ? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 18:31 Message: Logged In: YES user_id=89016 What should replace do: Return u"?" or (end-start)*u"?"
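The encoding half of that question is easy to check against the behaviour that was eventually shipped (and is still CPython's today): on encoding, "replace" substitutes one '?' per unencodable character, i.e. (end-start)*u"?":

```python
# One replacement character per unencodable input character.
s = u"\xe4\xe4"                      # two unencodable characters ("ää")
out = s.encode("ascii", "replace")
assert out == b"??"
assert len(out) == len(s)
```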
---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 16:15 Message: Logged In: YES user_id=38388 Sounds like a good idea. Please keep the encoder and decoder APIs symmetric, though, ie. add the slice information to both APIs. The slice should use the same format as Python's standard slices, that is left inclusive, right exclusive. I like the highlighting feature ! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 00:09 Message: Logged In: YES user_id=89016 I'm thinking about extending the API a little bit: Consider the following example: >>> "\u1".decode("unicode-escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 2: truncated \uXXXX escape The error message is a lie: the problem is not the '1' in position 2, but the complete truncated sequence '\u1'. For this the decoder should pass a start and an end position to the handler. For encoding this would be useful too: Suppose I want to have an encoder that colors the unencodable character via ANSI escape sequences. Then I could do the following: >>> import codecs >>> def color(enc, uni, pos, why, sta): ... return (u"\033[1m<%d>\033[0m" % ord(uni[pos]), pos+1) ... >>> codecs.register_unicodeencodeerrorhandler("color", color) >>> u"aäüöo".encode("ascii", "color") 'a\x1b[1m<228>\x1b[0m\x1b[1m<252>\x1b[0m\x1b[1m<246>\x1b[0mo' But here the sequences "\x1b[0m\x1b[1m" are not needed. To fix this problem the encoder could collect as many unencodable characters as possible and pass those to the error callback in one go (passing a start and end+1 position). This fixes the above problem and reduces the number of calls to the callback, so it should speed up the algorithms in case of custom encoding names. (And it makes the implementation very interesting ;)) What do you think?
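The start/end extension proposed here is essentially what the final API (PEP 293, today's codecs.register_error) provides: the handler receives the exception object, whose .start and .end delimit the whole run of unencodable characters, and returns a replacement plus the position to resume at. A present-day version of the XML character reference handler, with a hypothetical registry name:

```python
import codecs

def xmlreplace(exc):
    # exc.object[exc.start:exc.end] spans the entire contiguous run
    # of unencodable characters, not just a single position.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    refs = u"".join(u"&#%d;" % ord(ch)
                    for ch in exc.object[exc.start:exc.end])
    return (refs, exc.end)

codecs.register_error("demo-xmlreplace", xmlreplace)

encoded = u"a\xe4\xfco".encode("ascii", "demo-xmlreplace")
# encoded == b"a&#228;&#252;o"
```

Because the handler sees the whole run, adjacent unencodable characters need only one callback invocation, which is exactly the speed argument made above.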
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-07 02:29 Message: Logged In: YES user_id=89016 I started from scratch, and the current state is this: Encoding mostly works (except that I haven't changed TranslateCharmap and EncodeDecimal yet) and most of the decoding stuff works (DecodeASCII and DecodeCharmap are still unchanged) and the decoding callback helper isn't optimized for the "builtin" names yet (i.e. it still calls the handler). For encoding the callback helper knows how to handle "strict", "replace", "ignore" and "xmlcharrefreplace" itself and won't call the callback. This should make the encoder fast enough. As callback name string comparison results are cached it might even be faster than the original. The patch so far didn't require any changes to unicodeobject.h, stringobject.h or stringobject.c ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-05 17:49 Message: Logged In: YES user_id=38388 Walter, are you making any progress on the new scheme we discussed on the mailing list (adding an error handler registry much like the codec registry itself instead of trying to redo the complete codec API) ? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-09-20 12:38 Message: Logged In: YES user_id=38388 I am postponing this patch until the PEP process has started. This feature won't make it into Python 2.2. Walter, you may want to reference this patch in the PEP. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-08-16 12:53 Message: Logged In: YES user_id=38388 I think we ought to summarize these changes in a PEP to get some more feedback and testing from others as well. I'll look into this after I'm back from vacation on the 10.09. 
Given the release schedule I am not sure whether this feature will make it into 2.2. The size of the patch is huge and probably needs a lot of testing first. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-27 05:55 Message: Logged In: YES user_id=89016 Changing the decoding API is done now. There are new functions codec.register_unicodedecodeerrorhandler and codec.lookup_unicodedecodeerrorhandler. Only the standard handlers for 'strict', 'ignore' and 'replace' are preregistered. There may be many reasons for decoding errors in the byte string, so I added an additional argument to the decoding API: reason, which gives the reason for the failure, e.g.: >>> "\U1111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 8: truncated \UXXXXXXXX escape >>> "\U11111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 9: illegal Unicode character For symmetry I added this to the encoding API too: >>> u"\xff".encode("ascii") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'ascii' can't decode byte 0xff in position 0: ordinal not in range(128) The parameters passed to the callbacks now are: encoding, unicode, position, reason, state. The encoding and decoding API for strings has been adapted too, so now the new API should be usable everywhere: >>> unicode("a\xffb\xffc", "ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' >>> "a\xffb\xffc".decode("ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' I had a problem with the decoding API: all the functions in _codecsmodule.c used the t# format specifier. I changed that to O! 
with &PyString_Type, because otherwise we would have the problem that the decoding API would have to pass buffer objects around instead of strings, and the callback would have to call str() on the buffer anyway to access a specific character, so this wouldn't be any faster than calling str() on the buffer before decoding. It seems that buffers aren't used anyway. I changed all the old functions to call the new ones so bugfixes don't have to be done in two places. There are two exceptions: I didn't change PyString_AsEncodedString and PyString_AsDecodedString because they are documented as deprecated anyway (although they are called in a few spots). This means that I duplicated part of their functionality in PyString_AsEncodedObjectEx and PyString_AsDecodedObjectEx. There are still a few spots that call the old API: e.g. PyString_Format still calls PyUnicode_Decode (but with strict decoding) because it passes the rest of the format string to PyUnicode_Format when it encounters a Unicode object. Should we switch to the new API everywhere even if strict encoding/decoding is used? The size of this patch begins to scare me. I guess we need an extensive test script for all the new features and documentation. I hope you have time to do that, as I'll be busy with other projects in the next weeks. (BTW, I haven't touched PyUnicode_TranslateCharmap yet.) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-23 19:03 Message: Logged In: YES user_id=89016 New version of the patch with the error handling callback registry. > > OK, done, now there's a > > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > > codecs.escapereplace_unicodeencode_errors > > that uses \u (or \U if x>0xffff (with a wide build > > of Python)). > > Great! Now PyCodec_EscapeReplaceUnicodeEncodeErrors uses \x in addition to \u and \U where appropriate. > > [...]
> > But for special one-shot error handlers, it might still be > > useful to pass the error handler directly, so maybe we > > should leave error as PyObject *, but implement the > > registry anyway? > > Good idea ! > > One minor nit: codecs.registerError() should be named > codecs.register_errorhandler() to be more in line with > the Python coding style guide. OK, but these functions are specific to unicode encoding, so now the functions are called: codecs.register_unicodeencodeerrorhandler codecs.lookup_unicodeencodeerrorhandler Now all callbacks (including the new ones: "xmlcharrefreplace" and "escapereplace") are registered in the codecs.c/_PyCodecRegistry_Init so using them is really simple: u"gürk".encode("ascii", "xmlcharrefreplace") ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-07-13 13:26 Message: Logged In: YES user_id=38388 > > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > > with \uxxxx replacement callback. > > > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > > I'd rather leave the special encoder in place, > > > > since it is being used a lot in Python and > > > > probably some applications too. > > > > > > It would be a slowdown. But callbacks open many > > > possibilities. > > > > True, but in this case I believe that we should stick with > > the native implementation for "unicode-escape". Having > > a standard callback error handler which does the \uXXXX > > replacement would be nice to have though, since this would > > also be usable with lots of other codecs (e.g. all the > > code page ones). > > OK, done, now there's a > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > codecs.escapereplace_unicodeencode_errors > that uses \u (or \U if x>0xffff (with a wide build > of Python)). Great ! > > [...]
> > > Should the old TranslateCharmap map to the new > > > TranslateCharmapEx and inherit the > > > "multicharacter replacement" feature, > > > or should I leave it as it is? > > > > If possible, please also add the multichar replacement > > to the old API. I think it is very useful and since the > > old APIs work on raw buffers it would be a benefit to have > > the functionality in the old implementation too. > > OK! I will try to find the time to implement that in the > next days. Good. > > [Decoding error callbacks] > > > > About the return value: > > > > I'd suggest to always use the same tuple interface, e.g. > > > > callback(encoding, input_data, input_position, > state) -> > > (output_to_be_appended, new_input_position) > > > > (I think it's better to use absolute values for the > > position rather than offsets.) > > > > Perhaps the encoding callbacks should use the same > > interface... what do you think ? > > This would make the callback feature hypergeneric and a > little slower, because tuples have to be created, but it > (almost) unifies the encoding and decoding API. ("almost" > because, for the encoder output_to_be_appended will be > reencoded, for the decoder it will simply be appended.), > so I'm for it. That's the point. Note that I don't think the tuple creation will hurt much (see the make_tuple() API in codecs.c) since small tuples are cached by Python internally. > I implemented this and changed the encoders to only > lookup the error handler on the first error. The UCS1 > encoder now no longer uses the two-item stack strategy. > (This strategy only makes sense for those encoder where > the encoding itself is much more complicated than the > looping/callback etc.) So now memory overflow tests are > only done, when an unencodable error occurs, so now the > UCS1 encoder should be as fast as it was without > error callbacks. > > Do we want to enforce new_input_position>input_position, > or should jumping back be allowed? 
No; moving backwards should be allowed (this may be useful in order to resynchronize with the input data).

> Here is the current todo list:
> 1. implement a new TranslateCharmap and fix the old.
> 2. New encoding API for string objects too.
> 3. Decoding
> 4. Documentation
> 5. Test cases
>
> I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001-July/001262.html)
>
> We could have an error handler registry, which maps names to error handlers; then it would be possible to keep the errors argument as "const char *" instead of "PyObject *". Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type
>     u"...".encode("...", "strict")
> instead of
>     import codecs
>     u"...".encode("...", codecs.raise_encode_errors)
>
> But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?) Python code would look like this:
> ---
> def xmlreplace(encoding, uni, pos, state):
>     return (u"&#%d;" % ord(uni[pos]), pos+1)
>
> import codecs
>
> codecs.registerError("xmlreplace", xmlreplace)
> ---
> and then the following call can be made:
>     u"äöü".encode("ascii", "xmlreplace")
> As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *error" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself.
> > But for special one-shot error handlers, it might still be > useful to pass the error handler directly, so maybe we > should leave error as PyObject *, but implement the > registry anyway? Good idea ! One minor nit: codecs.registerError() should be named codecs.register_errorhandler() to be more inline with the Python coding style guide. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-12 13:03 Message: Logged In: YES user_id=89016 > > [...] > > so I guess we could change the replace handler > > to always return u'?'. This would make the > > implementation a little bit simpler, but the > > explanation of the callback feature *a lot* > > simpler. > > Go for it. OK, done! > [...] > > > Could you add these docs to the Misc/unicode.txt > > > file ? I will eventually take that file and turn > > > it into a PEP which will then serve as general > > > documentation for these things. > > > > I could, but first we should work out how the > > decoding callback API will work. > > Ok. BTW, Barry Warsaw already did the work of converting > the unicode.txt to PEP 100, so the docs should eventually > go there. OK. I guess it would be best to do this when everything is finished. > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > with \uxxxx replacement callback. > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > I'd rather leave the special encoder in place, > > > since it is being used a lot in Python and > > > probably some applications too. > > > > It would be a slowdown. But callbacks open many > > possiblities. > > True, but in this case I believe that we should stick with > the native implementation for "unicode-escape". Having > a standard callback error handler which does the \uXXXX > replacement would be nice to have though, since this would > also be usable with lots of other codecs (e.g. all the > code page ones). 
OK, done, now there's a PyCodec_EscapeReplaceUnicodeEncodeErrors/ codecs.escapereplace_unicodeencode_errors that uses \u (or \U if x>0xffff (with a wide build of Python)). > > For example: > > > > Why can't I print u"gürk"? > > > > is probably one of the most frequently asked > > questions in comp.lang.python. For printing > > Unicode stuff, print could be extended the use an > > error handling callback for Unicode strings (or > > objects where __str__ or tp_str returns a Unicode > > object) instead of using str() which always > > returns an 8bit string and uses strict encoding. > > There might even be a > > sys.setprintencodehandler()/sys.getprintencodehandler () > > There already is a print callback in Python (forgot the > name of the hook though), so this should be possible by > providing the encoding logic in the hook. True: sys.displayhook > [...] > > Should the old TranslateCharmap map to the new > > TranslateCharmapEx and inherit the > > "multicharacter replacement" feature, > > or should I leave it as it is? > > If possible, please also add the multichar replacement > to the old API. I think it is very useful and since the > old APIs work on raw buffers it would be a benefit to have > the functionality in the old implementation too. OK! I will try to find the time to implement that in the next days. > [Decoding error callbacks] > > About the return value: > > I'd suggest to always use the same tuple interface, e.g. > > callback(encoding, input_data, input_position, state) -> > (output_to_be_appended, new_input_position) > > (I think it's better to use absolute values for the > position rather than offsets.) > > Perhaps the encoding callbacks should use the same > interface... what do you think ? This would make the callback feature hypergeneric and a little slower, because tuples have to be created, but it (almost) unifies the encoding and decoding API. 
("almost" because, for the encoder, output_to_be_appended will be reencoded, while for the decoder it will simply be appended.), so I'm for it.

I implemented this and changed the encoders to look up the error handler only on the first error. The UCS1 encoder now no longer uses the two-item stack strategy. (This strategy only makes sense for those encoders where the encoding itself is much more complicated than the looping/callback etc.) So memory overflow tests are now only done when an unencodable error occurs, and the UCS1 encoder should be as fast as it was without error callbacks.

Do we want to enforce new_input_position>input_position, or should jumping back be allowed?

> > > > One additional note: It is vital that errors is an assignable attribute of the StreamWriter.
> > >
> > > It is already !
> >
> > I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API.
> >
> > Misc/unicode.txt is not clear on that:
> > """
> > It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types.
> > """
>
> Good point. I'll add that to the PEP 100.

OK.

Here is the current todo list:
1. implement a new TranslateCharmap and fix the old.
2. New encoding API for string objects too.
3. Decoding
4. Documentation
5. Test cases

I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001-July/001262.html)

We could have an error handler registry, which maps names to error handlers; then it would be possible to keep the errors argument as "const char *" instead of "PyObject *".
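For reference, a registry mapping names to error handlers is exactly what later shipped (PEP 293): codecs.register_error() stores a handler and codecs.lookup_error() is the official lookup function, with the builtin names going through the same mechanism. A minimal sketch of the lookup side on a modern Python:

```python
import codecs

# Builtin handlers are reachable by name through the registry, so callers
# need no special-casing for "strict", "ignore" and "replace".
replace = codecs.lookup_error("replace")

# Handlers take the exception object and return (replacement, resume_pos).
exc = UnicodeEncodeError("ascii", "gürk", 1, 2, "ordinal not in range(128)")
print(replace(exc))  # ('?', 2): replacement text and resume position
```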
Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type

    u"...".encode("...", "strict")

instead of

    import codecs
    u"...".encode("...", codecs.raise_encode_errors)

But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?) Python code would look like this:

---
def xmlreplace(encoding, uni, pos, state):
    return (u"&#%d;" % ord(uni[pos]), pos+1)

import codecs

codecs.registerError("xmlreplace", xmlreplace)
---

and then the following call can be made:

    u"äöü".encode("ascii", "xmlreplace")

As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *error" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself.

But for special one-shot error handlers, it might still be useful to pass the error handler directly, so maybe we should leave error as PyObject *, but implement the registry anyway?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-07-10 14:29

Message:
Logged In: YES
user_id=38388

Ok, here we go...

> > > raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases).
> >
> > Nice.
>
> But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character.
But as > far as I can tell none of the codecs uses anything other > than '?', True. > so I guess we could change the replace handler > to always return u'?'. This would make the implementation a > little bit simpler, but the explanation of the callback > feature *a lot* simpler. Go for it. > And if you still want to handle > an unencodable U+FFFD, you can write a special callback for > that, e.g. > > def FFFDreplace(enc, uni, pos): > if uni[pos] == "\ufffd": > return u"?" > else: > raise UnicodeError(...) > > > ...docs... > > > > Could you add these docs to the Misc/unicode.txt file ? I > > will eventually take that file and turn it into a PEP > which > > will then serve as general documentation for these things. > > I could, but first we should work out how the decoding > callback API will work. Ok. BTW, Barry Warsaw already did the work of converting the unicode.txt to PEP 100, so the docs should eventually go there. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > > replacement callback. > > > > Hmm, wouldn't that result in a slowdown ? If so, I'd > rather > > leave the special encoder in place, since it is being > used a > > lot in Python and probably some applications too. > > It would be a slowdown. But callbacks open many > possiblities. True, but in this case I believe that we should stick with the native implementation for "unicode-escape". Having a standard callback error handler which does the \uXXXX replacement would be nice to have though, since this would also be usable with lots of other codecs (e.g. all the code page ones). > For example: > > Why can't I print u"gürk"? > > is probably one of the most frequently asked questions in > comp.lang.python. 
For printing Unicode stuff, print could be > extended the use an error handling callback for Unicode > strings (or objects where __str__ or tp_str returns a > Unicode object) instead of using str() which always returns > an 8bit string and uses strict encoding. There might even > be a > sys.setprintencodehandler()/sys.getprintencodehandler() There already is a print callback in Python (forgot the name of the hook though), so this should be possible by providing the encoding logic in the hook. > > > I have not touched PyUnicode_TranslateCharmap yet, > > > should this function also support error callbacks? Why > > > would one want the insert None into the mapping to > call > > > the callback? > > > > 1. Yes. > > 2. The user may want to e.g. restrict usage of certain > > character ranges. In this case the codec would be used to > > verify the input and an exception would indeed be useful > > (e.g. say you want to restrict input to Hangul + ASCII). > > OK, do we want TranslateCharmap to work exactly like > encoding, > i.e. in case of an error should the returned replacement > string again be mapped through the translation mapping or > should it be copied to the output directly? The former would > be more in line with encoding, but IMHO the latter would > be much more useful. It's better to take the second approach (copy the callback output directly to the output string) to avoid endless recursion and other pitfalls. I suppose this will also simplify the implementation somewhat. > BTW, when I implement it I can implement patch #403100 > ("Multicharacter replacements in > PyUnicode_TranslateCharmap") > along the way. I've seen it; will comment on it later. > Should the old TranslateCharmap map to the new > TranslateCharmapEx > and inherit the "multicharacter replacement" feature, > or > should I leave it as it is? If possible, please also add the multichar replacement to the old API. 
I think it is very useful and since the old APIs work on raw buffers it would be a benefit to have the functionality in the old implementation too. [Decoding error callbacks] > > > A remaining problem is how to implement decoding error > > > callbacks. In Python 2.1 encoding and decoding errors > are > > > handled in the same way with a string value. But with > > > callbacks it doesn't make sense to use the same > callback > > > for encoding and decoding (like > codecs.StreamReaderWriter > > > and codecs.StreamRecoder do). Decoding callbacks have > a > > > different API. Which arguments should be passed to the > > > decoding callback, and what is the decoding callback > > > supposed to do? > > > > I'd suggest adding another set of PyCodec_UnicodeDecode... > () > > APIs for this. We'd then have to augment the base classes > of > > the StreamCodecs to provide two attributes for .errors > with > > a fallback solution for the string case (i.s. "strict" > can > > still be used for both directions). > > Sounds good. Now what is the decoding callback supposed to > do? > I guess it will be called in the same way as the encoding > callback, i.e. with encoding name, original string and > position of the error. It might returns a Unicode string > (i.e. an object of the decoding target type), that will be > emitted from the codec instead of the one offending byte. Or > it might return a tuple with replacement Unicode object and > a resynchronisation offset, i.e. returning (u"?", 1) > means > emit a '?' and skip the offending character. But to make > the offset really useful the callback has to know something > about the encoding, perhaps the codec should be allowed to > pass an additional state object to the callback? > > Maybe the same should be added to the encoding callbacks to? 
> Maybe the encoding callback should be able to tell the > encoder if the replacement returned should be reencoded > (in which case it's a Unicode object), or directly emitted > (in which case it's an 8bit string)? I like the idea of having an optional state object (basically this should be a codec-defined arbitrary Python object) which then allow the callback to apply additional tricks. The object should be documented to be modifyable in place (simplifies the interface). About the return value: I'd suggest to always use the same tuple interface, e.g. callback(encoding, input_data, input_position, state) -> (output_to_be_appended, new_input_position) (I think it's better to use absolute values for the position rather than offsets.) Perhaps the encoding callbacks should use the same interface... what do you think ? > > > One additional note: It is vital that errors is an > > > assignable attribute of the StreamWriter. > > > > It is already ! > > I know, but IMHO it should be documented that an assignable > errors attribute must be supported as part of the official > codec API. > > Misc/unicode.txt is not clear on that: > """ > It is not required by the Unicode implementation to use > these base classes, only the interfaces must match; this > allows writing Codecs as extension types. > """ Good point. I'll add that to the PEP 100. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-22 22:51 Message: Logged In: YES user_id=38388 Sorry to keep you waiting, Walter. I will look into this again next week -- this week was way too busy... ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 19:00 Message: Logged In: YES user_id=38388 On your comment about the non-Unicode codecs: let's keep this separated from the current patch. Don't have much time today. I'll comment on the other things tomorrow. 
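The tuple interface agreed on above -- callback(encoding, input_data, input_position, state) -> (output_to_be_appended, new_input_position) -- can be illustrated with a pure-Python toy ASCII encoder. This is a sketch of the proposed semantics only, not the C implementation; the function names are illustrative:

```python
def xmlreplace(encoding, text, pos, state):
    # Absolute input position in, (replacement, new absolute position) out.
    return "&#%d;" % ord(text[pos]), pos + 1

def toy_ascii_encode(text, callback, state=None):
    out, pos = [], 0
    while pos < len(text):
        if ord(text[pos]) < 128:
            out.append(text[pos])
            pos += 1
        else:
            # Delegate the unencodable position to the callback, which also
            # decides where to resume (moving backwards is permitted).
            replacement, pos = callback("ascii", text, pos, state)
            out.append(replacement)  # the real encoder re-encodes this
    return "".join(out).encode("ascii")

print(toy_ascii_encode("gürk", xmlreplace))  # b'g&#252;rk'
```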
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 17:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 15:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g. 
def FFFDreplace(enc, uni, pos):
    if uni[pos] == u"\ufffd":
        return u"?"
    else:
        raise UnicodeError(...)

> > The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished.
>
> Very elegant solution !

I'll put it as a comment in the source.

> > (I hope that's enough explanation of the API and implementation)
>
> Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things.

I could, but first we should work out how the decoding callback API will work.

> > I have renamed the static ...121 function to all lowercase names.
>
> Ok.
>
> > BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback.
>
> Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too.

It would be a slowdown. But callbacks open many possibilities.

For example:

Why can't I print u"gürk"?
is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended the use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler() > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want the insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks. 
In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value. But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode... () > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.s. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might returns a Unicode string (i.e. an object of the decoding target type), that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding, perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks to? Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API. 
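The assignable errors attribute argued for here did become part of the stream codec API. The XML-flavored sketch below switches handlers mid-stream, as in the DOM-writing example from this thread; the buffer contents and handler choice are illustrative:

```python
import codecs
import io

buf = io.BytesIO()
writer = codecs.getwriter("ascii")(buf, errors="xmlcharrefreplace")

writer.write("text: gürk")   # unencodable characters become charrefs
writer.errors = "strict"     # switch handling, e.g. before a comment
try:
    writer.write("<!-- gürk -->")
except UnicodeEncodeError:
    # nothing is written: encoding fails before the stream is touched
    print("unencodable character inside a comment")

print(buf.getvalue())  # b'text: g&#252;rk'
```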
Misc/unicode.txt is not clear on that:
"""
It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types.
"""

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-06-13 10:05

Message:
Logged In: YES
user_id=38388

> How the callbacks work:
>
> A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks: PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character, to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position, and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases).

Nice.

> The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string.
> If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished.

Very elegant solution !

> (I hope that's enough explanation of the API and implementation)

Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things.

> I have renamed the static ...121 function to all lowercase names.

Ok.

> BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback.

Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too.

> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen if we implement the same mechanism for the decoding API)

I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more.
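The two-item stack strategy praised above can be sketched in pure Python. This is a toy ASCII encoder illustrating only the control flow (encode the replacement through the same loop; fail hard if the replacement itself is unencodable), not the unicodeobject.c code; names are illustrative:

```python
def stack_encode(text, handler):
    # At most two entries ever: the original string and, while one is
    # being consumed, the replacement string returned by the handler.
    stack = [(text, 0)]
    out = []
    while stack:
        s, pos = stack.pop()
        while pos < len(s):
            ch = s[pos]
            if ord(ch) < 128:
                out.append(ch)
                pos += 1
                continue
            if stack:
                # Second error while the original is still pending: the
                # replacement string itself is unencodable -> hard failure.
                raise UnicodeError("unencodable replacement %r" % ch)
            stack.append((s, pos + 1))   # resume the original string later
            s, pos = handler(s, pos), 0  # encode the replacement first
        # Falling off the end pops back to the original string (or finishes).
    return "".join(out).encode("ascii")

print(stack_encode("gürk", lambda s, pos: "&#%d;" % ord(s[pos])))  # b'g&#252;rk'
```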
One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want the insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.s. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used. 
When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > a HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 21:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment") BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than a HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 20:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may by NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. 
PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names.
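[Editor's note: the two-entry stack loop described in the comment above can be sketched in pure Python. This is an illustration of the algorithm, not the patch itself; `can_encode`, `encode_char`, the hard-coded "ascii" encoding name, and the assumption that the callback replaces a single character are all simplifications introduced here.]

```python
def encode_with_callback(text, can_encode, callback, encode_char):
    # Sketch of the stack-based loop: at most two (string, position)
    # entries -- the original string and, while one is being consumed,
    # the replacement string returned by the error callback.
    out = []
    stack = [(text, 0)]
    while stack:
        s, pos = stack[-1]
        if pos >= len(s):
            stack.pop()                 # current string exhausted
            if stack:                   # it was the replacement string:
                orig, opos = stack.pop()
                stack.append((orig, opos + 1))  # skip the bad character
            continue
        ch = s[pos]
        if can_encode(ch):
            out.append(encode_char(ch))
            stack[-1] = (s, pos + 1)
        elif len(stack) == 1:
            # Error in the original string: ask the callback for a
            # replacement and continue encoding from that string.
            stack.append((callback("ascii", s, pos), 0))
        else:
            # Error inside the replacement string itself.
            raise UnicodeError("unencodable character in replacement string")
    return "".join(out)
```

For example, with an XML-charref callback, `encode_with_callback("aä", lambda c: ord(c) < 128, lambda enc, s, pos: "&#x%x;" % ord(s[pos]), lambda c: c)` yields "a&#xe4;", while a callback that returns another unencodable character triggers the two-entry error branch.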
BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 20:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names.
If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the time). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? Encode one-to-one; it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones?
I introduced a new function for every old one that had a "const char *errors" argument, and a few new ones in codecs.h; of those PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used, and having two (PyObject *)s (the original and the replacement string), instead of UNICODE*/int and PyObject *, made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 16:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ?
* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as a Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Apr 17 11:24:37 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 03:24:37 -0700 Subject: [Patches] [ python-Patches-545096 ] Janitoring in ConfigParser Message-ID: Patches item #545096, was opened at 2002-04-17 10:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545096&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: Janitoring in ConfigParser Initial Comment: The first patch fixes a bug, implements some speed improvements, some memory consumption improvements, enforces the usage of the already available global variables, and extends the allowed chars in option names to be very permissive. The second one, if used, is supposed to be applied over the first one, and implements a walk() generator method for walking through the options of a section.
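[Editor's note: the walk() generator itself is not shown in this archive. As a hypothetical sketch of the shape such a method might take -- written here against the modern configparser module, whereas the patch targeted the 2.x ConfigParser -- it could look like this:]

```python
from configparser import ConfigParser

class WalkingParser(ConfigParser):
    # Hypothetical walk() in the spirit of the patch description:
    # lazily yield (option, value) pairs for one section instead of
    # building a full list up front.
    def walk(self, section):
        for option in self.options(section):
            yield option, self.get(section, option)
```

Usage would then be `for option, value in parser.walk("mysection"): ...`, with the generator pulling options one at a time.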
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545096&group_id=5470 From noreply@sourceforge.net Wed Apr 17 13:40:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 05:40:04 -0700 Subject: [Patches] [ python-Patches-545150 ] {a,b} in fnmatch.translate Message-ID: Patches item #545150, was opened at 2002-04-17 12:40 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545150&group_id=5470 Category: Library (Lib) Group: Python 2.3 Status: Open Resolution: None Priority: 5 Submitted By: Gustavo Niemeyer (niemeyer) Assigned to: Nobody/Anonymous (nobody) Summary: {a,b} in fnmatch.translate Initial Comment: This patch adds support for {a,b} expansion constructs in fnmatch.translate. That is, file{a,b}.txt will match both filea.txt and fileb.txt, like usual shell expansions. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545150&group_id=5470 From noreply@sourceforge.net Wed Apr 17 15:16:20 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 07:16:20 -0700 Subject: [Patches] [ python-Patches-500311 ] Work around for buggy https servers Message-ID: Patches item #500311, was opened at 2002-01-07 09:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Van den Bergh (vdbergh) Assigned to: Martin v. Löwis (loewis) Summary: Work around for buggy https servers Initial Comment: Python 2.2. Tested on RH 7.1. This is a workaround for http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=494762 The problem is that some https servers close an ssl connection without properly resetting it first.
In the above bug description it is suggested that this only occurs for IIS but apparently some (modified) Apache servers also suffer from it (see telemeter.telenet.be). One of the suggested workarounds is to modify httplib.py so as to ignore the combination of err[0]==SSL_ERROR_SYSCALL and err[1]=="EOF occurred in violation of protocol". However I think one should never compare error strings since in principle they may depend on language etc... So I decided to modify _socket.c slightly so that it becomes possible to return error codes which are not in ssl.h. When an ssl-connection is closed without reset I now return the error code SSL_ERROR_EOF. Then I ignore this (apparently benign) error in httplib.py. In addition I fixed what I think was an error in PySSL_SetError(SSL *ssl, int ret) in socketmodule.c. Originally there was: case SSL_ERROR_SSL: { unsigned long e = ERR_get_error(); if (e == 0) { /* an EOF was observed that violates the protocol */ errstr = "EOF occurred in violation of protocol"; etc... but if I understand the documentation for SSL_get_error then the test should be: e==0 && ret==0. A similar error occurs a few lines later. ---------------------------------------------------------------------- >Comment By: Martin v. Löwis (loewis) Date: 2002-04-17 16:16 Message: Logged In: YES user_id=21627 jribbens: Even when running the test from 494762, i.e. import os,urllib2 os.environ["http_proxy"]='' f = urllib2.urlopen("https://wwws.task.com.br/i.htm") print f.read() This gives an empty response for me... ---------------------------------------------------------------------- Comment By: Jon Ribbens (jribbens) Date: 2002-04-16 19:00 Message: Logged In: YES user_id=76089 py23ssl.txt works fine for me when applied to latest CVS, and fixes the problem. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-03-11 07:34 Message: Logged In: YES user_id=21627 Unfortunately, your patch appears to be incorrect. Performing the script in #494762, I get an empty string as the result, whereas the content of the resource is 'HTTPS Test' In case you want to experiment with the CVS version I'll attach a patch for that. ---------------------------------------------------------------------- Comment By: Michel Van den Bergh (vdbergh) Date: 2002-01-09 11:25 Message: Logged In: YES user_id=10252 Due to some problems with sourceforge and incompetence on my part I submitted this several times. Please see patch 500311. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 From noreply@sourceforge.net Wed Apr 17 15:44:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 07:44:07 -0700 Subject: [Patches] [ python-Patches-500311 ] Work around for buggy https servers Message-ID: Patches item #500311, was opened at 2002-01-07 08:49 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 Category: Modules Group: None Status: Open Resolution: None Priority: 5 Submitted By: Michel Van den Bergh (vdbergh) Assigned to: Martin v. Löwis (loewis) Summary: Work around for buggy https servers Initial Comment: Python 2.2. Tested on RH 7.1. This is a workaround for http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=494762 The problem is that some https servers close an ssl connection without properly resetting it first. In the above bug description it is suggested that this only occurs for IIS but apparently some (modified) Apache servers also suffer from it (see telemeter.telenet.be).
One of the suggested workarounds is to modify httplib.py so as to ignore the combination of err[0]==SSL_ERROR_SYSCALL and err[1]=="EOF occurred in violation of protocol". However I think one should never compare error strings since in principle they may depend on language etc... So I decided to modify _socket.c slightly so that it becomes possible to return error codes which are not in ssl.h. When an ssl-connection is closed without reset I now return the error code SSL_ERROR_EOF. Then I ignore this (apparently benign) error in httplib.py. In addition I fixed what I think was an error in PySSL_SetError(SSL *ssl, int ret) in socketmodule.c. Originally there was: case SSL_ERROR_SSL: { unsigned long e = ERR_get_error(); if (e == 0) { /* an EOF was observed that violates the protocol */ errstr = "EOF occurred in violation of protocol"; etc... but if I understand the documentation for SSL_get_error then the test should be: e==0 && ret==0. A similar error occurs a few lines later. ---------------------------------------------------------------------- Comment By: Jon Ribbens (jribbens) Date: 2002-04-17 15:44 Message: Logged In: YES user_id=76089 Yes, that test works fine. The patch looks correct to me by inspection also. Michel's comments about SSL_get_error are correct according to the OpenSSL documentation, i.e. the existing code is incorrect (this being a separate issue to whether or not "EOF occurred" should be ignored, which is a work-around for other peoples' bugs). ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-17 15:16 Message: Logged In: YES user_id=21627 jribbens: Even when running the test from 494762, i.e. import os,urllib2 os.environ["http_proxy"]='' f = urllib2.urlopen("https://wwws.task.com.br/i.htm") print f.read() This gives an empty response for me...
---------------------------------------------------------------------- Comment By: Jon Ribbens (jribbens) Date: 2002-04-16 18:00 Message: Logged In: YES user_id=76089 py23ssl.txt works fine for me when applied to latest CVS, and fixes the problem. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-11 06:34 Message: Logged In: YES user_id=21627 Unfortunately, your patch appears to be incorrect. Performing the script in #494762, I get an empty string as the result, whereas the content of the resource is 'HTTPS Test' In case you want to experiment with the CVS version I'll attach a patch for that. ---------------------------------------------------------------------- Comment By: Michel Van den Bergh (vdbergh) Date: 2002-01-09 10:25 Message: Logged In: YES user_id=10252 Due to some problems with sourceforge and incompetence on my part I submitted this several times. Please see patch 500311. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=500311&group_id=5470 From noreply@sourceforge.net Wed Apr 17 19:16:12 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 11:16:12 -0700 Subject: [Patches] [ python-Patches-545300 ] sgmllib support for additional tag forms Message-ID: Patches item #545300, was opened at 2002-04-17 14:16 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470 Category: Library (Lib) Group: Python 2.1.2 Status: Open Resolution: None Priority: 5 Submitted By: Steven F. Lott (slott56) Assigned to: Nobody/Anonymous (nobody) Summary: sgmllib support for additional tag forms Initial Comment: MS-word generated HTML includes declaration tags of the form:   scattered throughout the body of an HTML document. 
The current sgmllib parse_declaration routine rejects these as invalid syntax, whereas browsers tolerate these embedded declarations. This patch accepts these declaration forms. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470 From noreply@sourceforge.net Wed Apr 17 19:55:04 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 11:55:04 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 20:55 Message: Logged In: YES user_id=89016 Diff3.txt adds these tests to Lib/test/test_unicode.py and Lib/test/test_string.py. All tests pass (except that currently test_unicode.py fails the unicode_internal roundtripping test with --enable-unicode=ucs4) and when I change zfill back to always return self they properly fail.
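[Editor's note: for reference, the behaviour the zfill patch is after can be written as a short pure-Python sketch. This mirrors what string.zfill does, using Python 3's single str type; it is an illustration, not the checked-in C code.]

```python
def zfill(x, width):
    # Pad a numeric string on the left with zeros, keeping any sign
    # in front of the padding. Non-strings are converted with str()
    # (the patched library used str() rather than repr(), which is
    # what produced the "0000u'123'" bug being fixed here).
    if not isinstance(x, str):
        x = str(x)
    has_sign = x[:1] in ("+", "-")
    body = x[1:] if has_sign else x
    pad = "0" * max(0, width - len(x))
    return (x[:1] if has_sign else "") + pad + body
```

So `zfill("123", 10)` gives "0000000123", matching the u'0000000123' result the patch promises for the unicode case.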
I don't know whether the fail message should be made better, and how this would interact with "make test" and whether the "Prefer string methods over string module functions" part in test_string.py might pose problems. And maybe the code could be simplified to always use the subclasses without first trying str and unicode? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 20:48 Message: Logged In: YES user_id=6380 If you want to be thorough, yes, that's a good test to add! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:47 Message: Logged In: YES user_id=89016 Checked in as: Objects/stringobject.c 2.159 Objects/unicodeobject.c 2.139 Maybe we could add a test to Lib/test/test_unicode.py and Lib/test/test_string.py that makes sure that no method returns a str/unicode subinstance even when called for a str/unicode subinstance? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 20:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it if it's a real str or unicode instance? (as was done for lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 16:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix).
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 16:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 15:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). 
I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. 
---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:40:44 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:40:44 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was opened at 2001-06-12 13:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Postponed Priority: 6 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A.
Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:40 Message: Logged In: YES user_id=38388 Sorry for the late response. About the difference between encoding and decoding: you shouldn't just look at the case where you work with Unicode and strings, e.g. take the rot-13 codec which works on strings only or other codecs which translate objects into strings and vice-versa. Error handling has to be flexible enough to handle all these situations. Since the codecs know best how to handle the situations, I'd make this an implementation detail of the codec and leave the behaviour undefined in the general case. For the existing codecs, backward compatibility should be maintained, if at all possible. If the patch gets overly complicated because of this, we may have to provide a downgrade solution for this particular problem (I don't think replace is used in any computational context, though, since you can never be sure how many replacement characters do get inserted, so the case may not be that realistic). Raising an exception for the charmap codec is the right way to go, IMHO. I would consider the current behaviour a bug.
For new codecs, I think we should suggest that replace tries to collect as much illegal data as possible before invoking the error handler. The handler should be aware of the fact that it won't necessarily get all the broken data in one call. About the codec error handling registry: You seem to be using a Unicode specific approach here. I'd rather like to see a generic approach which uses the API we discussed earlier. Would that be possible ? In that case, the codec API should probably be called codecs.register_error('myhandler', myhandler). Does that make sense ? BTW, the patch which uses the callback registry does not seem to be available on this SF page (the last patch still converts the errors argument to a PyObject, which shouldn't be needed anymore with the new approach). Can you please upload your latest version ? Note that the highlighting codec would make a nice example for the new feature. Thanks. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 10:21 Message: Logged In: YES user_id=89016 Another note: the patch will change the meaning of charmap encoding slightly: currently "replace" will put a ? into the output, even if ? is not in the mapping, i.e. codecs.charmap_encode(u"c", "replace", {ord("a"): ord("b")}) will return ('?', 1). With the patch the above example will raise an exception. Of course with the patch many more replace characters can appear, so it is vital that the mapping is applied to the replacement string. Is this semantic change OK? (I guess all of the existing codecs have a mapping ord("?")->ord("?")) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 17:19 Message: Logged In: YES user_id=89016 So this means that the encoder can collect illegal characters and pass them to the callback. "replace" will replace this with (end-start)*u"?".
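[Editor's note: the codecs.register_error() registry discussed above is essentially the interface that later Python versions shipped: the handler receives the UnicodeEncodeError, whose .start and .end attributes carry exactly the slice information debated here, and returns a (replacement, resume-position) pair. The XML character reference example from the initial comment looks like this in that style:]

```python
import codecs

def charref(exc):
    # exc.object[exc.start:exc.end] is the unencodable run (the encoder
    # may collect several adjacent bad characters into one call);
    # exc.end tells the encoder where to resume.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    refs = "".join("&#x%x;" % ord(c) for c in exc.object[exc.start:exc.end])
    return refs, exc.end

codecs.register_error("charref", charref)
```

With this registered, `"a\xe4o\xf6u".encode("ascii", "charref")` yields `b'a&#xe4;o&#xf6;u'`; later Pythons also ship a built-in "xmlcharrefreplace" handler that does the same with decimal references.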
Decoders don't collect all illegal byte sequences, but call the callback once for every byte sequence that has been found illegal, and "replace" will replace it with u"?". Does this make sense? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 17:06 Message: Logged In: YES user_id=89016 For encoding it's always (end-start)*u"?": >>> u"ää".encode("ascii", "replace") '??' But for decoding it is neither of the two: >>> "\Ux\U".decode("unicode-escape", "replace") u'\ufffd\ufffd' i.e. a sequence of 5 illegal characters was replaced by two replacement characters. This might mean that decoders can't collect all the illegal characters and call the callback once. They might have to call the callback for every single illegal byte sequence to get the old behaviour. (It seems that this patch would be much, much simpler, if we only changed the encoders) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 18:36 Message: Logged In: YES user_id=38388 Hmm, whatever it takes to maintain backwards compatibility. Do you have an example ? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 17:31 Message: Logged In: YES user_id=89016 What should replace do: Return u"?" or (end-start)*u"?" ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 15:15 Message: Logged In: YES user_id=38388 Sounds like a good idea. Please keep the encoder and decoder APIs symmetric, though, i.e. add the slice information to both APIs. The slice should use the same format as Python's standard slices, that is left inclusive, right exclusive. I like the highlighting feature ! 
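The replace asymmetry debated above is the behaviour Python ultimately settled on, and it can be checked directly (a Python 3 sketch; the u-prefixed 2.x literals above become plain str/bytes):

```python
# Encoding: "replace" substitutes one '?' per unencodable character,
# i.e. the (end-start)*u"?" option discussed above.
assert "ää".encode("ascii", "replace") == b"??"

# Decoding: the ascii codec reports each illegal byte separately, and
# "replace" substitutes one U+FFFD per reported sequence.
assert b"a\xff\xfeb".decode("ascii", "replace") == "a\ufffd\ufffdb"

print("replace semantics confirmed")
```
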
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-07 23:09 Message: Logged In: YES user_id=89016 I'm thinking about extending the API a little bit: Consider the following example: >>> "\u1".decode("unicode-escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 2: truncated \uXXXX escape The error message is a lie: the problem is not the '1' in position 2, but the complete truncated sequence '\u1'. For this the decoder should pass a start and an end position to the handler. For encoding this would be useful too: Suppose I want to have an encoder that colors the unencodable character via ANSI escape sequences. Then I could do the following: >>> import codecs >>> def color(enc, uni, pos, why, sta): ... return (u"\033[1m<%d>\033[0m" % ord(uni[pos]), pos+1) ... >>> codecs.register_unicodeencodeerrorhandler("color", color) >>> u"aäüöo".encode("ascii", "color") 'a\x1b[1m<228>\x1b[0m\x1b[1m<252>\x1b[0m\x1b[1m<246>\x1b [0mo' But here the sequences "\x1b[0m\x1b[1m" are not needed. To fix this problem the encoder could collect as many unencodable characters as possible and pass those to the error callback in one go (passing a start and end+1 position). This fixes the above problem and reduces the number of calls to the callback, so it should speed up the algorithms in the case of custom error handlers. (And it makes the implementation very interesting ;)) What do you think? 
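The run-collection idea proposed above is exactly what the final interface enables: the handler receives an exception object carrying exc.start and exc.end, which for the ascii encoder span the whole run of unencodable characters, so one escape pair can bracket the entire run with no redundant "\x1b[0m\x1b[1m" in the middle (handler name and markup are illustrative):

```python
import codecs

def color(exc):
    # Wrap the whole unencodable run exc.object[exc.start:exc.end]
    # in a single ANSI bold on/off pair, then resume at exc.end.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    body = "".join("<%d>" % ord(c) for c in exc.object[exc.start:exc.end])
    return ("\033[1m%s\033[0m" % body, exc.end)

codecs.register_error("color", color)
print("aäüöo".encode("ascii", "color"))
```
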
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-07 01:29 Message: Logged In: YES user_id=89016 I started from scratch, and the current state is this: Encoding mostly works (except that I haven't changed TranslateCharmap and EncodeDecimal yet) and most of the decoding stuff works (DecodeASCII and DecodeCharmap are still unchanged) and the decoding callback helper isn't optimized for the "builtin" names yet (i.e. it still calls the handler). For encoding the callback helper knows how to handle "strict", "replace", "ignore" and "xmlcharrefreplace" itself and won't call the callback. This should make the encoder fast enough. As callback name string comparison results are cached it might even be faster than the original. The patch so far didn't require any changes to unicodeobject.h, stringobject.h or stringobject.c ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-05 16:49 Message: Logged In: YES user_id=38388 Walter, are you making any progress on the new scheme we discussed on the mailing list (adding an error handler registry much like the codec registry itself instead of trying to redo the complete codec API) ? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-09-20 10:38 Message: Logged In: YES user_id=38388 I am postponing this patch until the PEP process has started. This feature won't make it into Python 2.2. Walter, you may want to reference this patch in the PEP. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-08-16 10:53 Message: Logged In: YES user_id=38388 I think we ought to summarize these changes in a PEP to get some more feedback and testing from others as well. I'll look into this after I'm back from vacation on the 10.09. 
Given the release schedule I am not sure whether this feature will make it into 2.2. The size of the patch is huge and probably needs a lot of testing first. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-27 03:55 Message: Logged In: YES user_id=89016 Changing the decoding API is done now. There are new functions codec.register_unicodedecodeerrorhandler and codec.lookup_unicodedecodeerrorhandler. Only the standard handlers for 'strict', 'ignore' and 'replace' are preregistered. There may be many reasons for decoding errors in the byte string, so I added an additional argument to the decoding API: reason, which gives the reason for the failure, e.g.: >>> "\U1111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 8: truncated \UXXXXXXXX escape >>> "\U11111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 9: illegal Unicode character For symmetry I added this to the encoding API too: >>> u"\xff".encode("ascii") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'ascii' can't decode byte 0xff in position 0: ordinal not in range(128) The parameters passed to the callbacks now are: encoding, unicode, position, reason, state. The encoding and decoding API for strings has been adapted too, so now the new API should be usable everywhere: >>> unicode("a\xffb\xffc", "ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' >>> "a\xffb\xffc".decode("ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' I had a problem with the decoding API: all the functions in _codecsmodule.c used the t# format specifier. I changed that to O! 
with &PyString_Type, because otherwise we would have the problem that the decoding API would have to pass buffer objects around instead of strings, and the callback would have to call str() on the buffer anyway to access a specific character, so this wouldn't be any faster than calling str() on the buffer before decoding. It seems that buffers aren't used anyway. I changed all the old functions to call the new ones so bugfixes don't have to be done in two places. There are two exceptions: I didn't change PyString_AsEncodedString and PyString_AsDecodedString because they are documented as deprecated anyway (although they are called in a few spots). This means that I duplicated part of their functionality in PyString_AsEncodedObjectEx and PyString_AsDecodedObjectEx. There are still a few spots that call the old API: e.g. PyString_Format still calls PyUnicode_Decode (but with strict decoding) because it passes the rest of the format string to PyUnicode_Format when it encounters a Unicode object. Should we switch to the new API everywhere even if strict encoding/decoding is used? The size of this patch begins to scare me. I guess we need an extensive test script for all the new features and documentation. I hope you have time to do that, as I'll be busy with other projects in the next weeks. (BTW, I haven't touched PyUnicode_TranslateCharmap yet.) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-23 17:03 Message: Logged In: YES user_id=89016 New version of the patch with the error handling callback registry. > > OK, done, now there's a > > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > > codecs.escapereplace_unicodeencode_errors > > that uses \u (or \U if x>0xffff (with a wide build > > of Python)). > > Great! Now PyCodec_EscapeReplaceUnicodeEncodeErrors uses \x in addition to \u and \U where appropriate. > > [...] 
> > But for special one-shot error handlers, it might still be > > useful to pass the error handler directly, so maybe we > > should leave error as PyObject *, but implement the > > registry anyway? > > Good idea ! > > One minor nit: codecs.registerError() should be named > codecs.register_errorhandler() to be more in line with > the Python coding style guide. OK, but these functions are specific to unicode encoding, so now the functions are called: codecs.register_unicodeencodeerrorhandler codecs.lookup_unicodeencodeerrorhandler Now all callbacks (including the new ones: "xmlcharrefreplace" and "escapereplace") are registered in codecs.c/_PyCodecRegistry_Init so using them is really simple: u"gürk".encode("ascii", "xmlcharrefreplace") ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-07-13 11:26 Message: Logged In: YES user_id=38388 > > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > > with \uxxxx replacement callback. > > > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > > I'd rather leave the special encoder in place, > > > > since it is being used a lot in Python and > > > > probably some applications too. > > > > > > It would be a slowdown. But callbacks open many > > > possibilities. > > > > True, but in this case I believe that we should stick with > > the native implementation for "unicode-escape". Having > > a standard callback error handler which does the \uXXXX > > replacement would be nice to have though, since this would > > also be usable with lots of other codecs (e.g. all the > > code page ones). > > OK, done, now there's a > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > codecs.escapereplace_unicodeencode_errors > that uses \u (or \U if x>0xffff (with a wide build > of Python)). Great ! > > [...] 
> > > Should the old TranslateCharmap map to the new > > > TranslateCharmapEx and inherit the > > > "multicharacter replacement" feature, > > > or should I leave it as it is? > > > > If possible, please also add the multichar replacement > > to the old API. I think it is very useful and since the > > old APIs work on raw buffers it would be a benefit to have > > the functionality in the old implementation too. > > OK! I will try to find the time to implement that in the > next days. Good. > > [Decoding error callbacks] > > > > About the return value: > > > > I'd suggest to always use the same tuple interface, e.g. > > > > callback(encoding, input_data, input_position, > state) -> > > (output_to_be_appended, new_input_position) > > > > (I think it's better to use absolute values for the > > position rather than offsets.) > > > > Perhaps the encoding callbacks should use the same > > interface... what do you think ? > > This would make the callback feature hypergeneric and a > little slower, because tuples have to be created, but it > (almost) unifies the encoding and decoding API. ("almost" > because, for the encoder output_to_be_appended will be > reencoded, for the decoder it will simply be appended.), > so I'm for it. That's the point. Note that I don't think the tuple creation will hurt much (see the make_tuple() API in codecs.c) since small tuples are cached by Python internally. > I implemented this and changed the encoders to only > lookup the error handler on the first error. The UCS1 > encoder now no longer uses the two-item stack strategy. > (This strategy only makes sense for those encoder where > the encoding itself is much more complicated than the > looping/callback etc.) So now memory overflow tests are > only done, when an unencodable error occurs, so now the > UCS1 encoder should be as fast as it was without > error callbacks. > > Do we want to enforce new_input_position>input_position, > or should jumping back be allowed? 
No; moving backwards should be allowed (this may be useful in order to resynchronize with the input data). > Here's is the current todo list: > 1. implement a new TranslateCharmap and fix the old. > 2. New encoding API for string objects too. > 3. Decoding > 4. Documentation > 5. Test cases > > I'm thinking about a different strategy for implementing > callbacks > (see http://mail.python.org/pipermail/i18n-sig/2001- > July/001262.html) > > We coould have a error handler registry, which maps names > to error handlers, then it would be possible to keep the > errors argument as "const char *" instead of "PyObject *". > Currently PyCodec_UnicodeEncodeHandlerForObject is a > backwards compatibility hack that will never go away, > because > it's always more convenient to type > u"...".encode("...", "strict") > instead of > import codecs > u"...".encode("...", codecs.raise_encode_errors) > > But with an error handler registry this function would > become the official lookup method for error handlers. > (PyCodec_LookupUnicodeEncodeErrorHandler?) > Python code would look like this: > --- > def xmlreplace(encoding, unicode, pos, state): > return (u"&#%d;" % ord(uni[pos]), pos+1) > > import codec > > codec.registerError("xmlreplace",xmlreplace) > --- > and then the following call can be made: > u"äöü".encode("ascii", "xmlreplace") > As soon as the first error is encountered, the encoder uses > its builtin error handling method if it recognizes the name > ("strict", "replace" or "ignore") or looks up the error > handling function in the registry if it doesn't. In this way > the speed for the backwards compatible features is the same > as before and "const char *error" can be kept as the > parameter to all encoding functions. For speed common error > handling names could even be implemented in the encoder > itself. 
> > But for special one-shot error handlers, it might still be > useful to pass the error handler directly, so maybe we > should leave error as PyObject *, but implement the > registry anyway? Good idea ! One minor nit: codecs.registerError() should be named codecs.register_errorhandler() to be more in line with the Python coding style guide. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-12 11:03 Message: Logged In: YES user_id=89016 > > [...] > > so I guess we could change the replace handler > > to always return u'?'. This would make the > > implementation a little bit simpler, but the > > explanation of the callback feature *a lot* > > simpler. > > Go for it. OK, done! > [...] > > > Could you add these docs to the Misc/unicode.txt > > > file ? I will eventually take that file and turn > > > it into a PEP which will then serve as general > > > documentation for these things. > > > > I could, but first we should work out how the > > decoding callback API will work. > > Ok. BTW, Barry Warsaw already did the work of converting > the unicode.txt to PEP 100, so the docs should eventually > go there. OK. I guess it would be best to do this when everything is finished. > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > with \uxxxx replacement callback. > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > I'd rather leave the special encoder in place, > > > since it is being used a lot in Python and > > > probably some applications too. > > > > It would be a slowdown. But callbacks open many > > possibilities. > > True, but in this case I believe that we should stick with > the native implementation for "unicode-escape". Having > a standard callback error handler which does the \uXXXX > replacement would be nice to have though, since this would > also be usable with lots of other codecs (e.g. all the > code page ones). 
OK, done, now there's a PyCodec_EscapeReplaceUnicodeEncodeErrors/ codecs.escapereplace_unicodeencode_errors that uses \u (or \U if x>0xffff (with a wide build of Python)). > > For example: > > > > Why can't I print u"gürk"? > > > > is probably one of the most frequently asked > > questions in comp.lang.python. For printing > > Unicode stuff, print could be extended the use an > > error handling callback for Unicode strings (or > > objects where __str__ or tp_str returns a Unicode > > object) instead of using str() which always > > returns an 8bit string and uses strict encoding. > > There might even be a > > sys.setprintencodehandler()/sys.getprintencodehandler () > > There already is a print callback in Python (forgot the > name of the hook though), so this should be possible by > providing the encoding logic in the hook. True: sys.displayhook > [...] > > Should the old TranslateCharmap map to the new > > TranslateCharmapEx and inherit the > > "multicharacter replacement" feature, > > or should I leave it as it is? > > If possible, please also add the multichar replacement > to the old API. I think it is very useful and since the > old APIs work on raw buffers it would be a benefit to have > the functionality in the old implementation too. OK! I will try to find the time to implement that in the next days. > [Decoding error callbacks] > > About the return value: > > I'd suggest to always use the same tuple interface, e.g. > > callback(encoding, input_data, input_position, state) -> > (output_to_be_appended, new_input_position) > > (I think it's better to use absolute values for the > position rather than offsets.) > > Perhaps the encoding callbacks should use the same > interface... what do you think ? This would make the callback feature hypergeneric and a little slower, because tuples have to be created, but it (almost) unifies the encoding and decoding API. 
("almost" because, for the encoder output_to_be_appended will be reencoded, for the decoder it will simply be appended.), so I'm for it. I implemented this and changed the encoders to only lookup the error handler on the first error. The UCS1 encoder now no longer uses the two-item stack strategy. (This strategy only makes sense for those encoder where the encoding itself is much more complicated than the looping/callback etc.) So now memory overflow tests are only done, when an unencodable error occurs, so now the UCS1 encoder should be as fast as it was without error callbacks. Do we want to enforce new_input_position>input_position, or should jumping back be allowed? > > > > One additional note: It is vital that errors > > > > is an assignable attribute of the StreamWriter. > > > > > > It is already ! > > > > I know, but IMHO it should be documented that an > > assignable errors attribute must be supported > > as part of the official codec API. > > > > Misc/unicode.txt is not clear on that: > > """ > > It is not required by the Unicode implementation > > to use these base classes, only the interfaces must > > match; this allows writing Codecs as extension types. > > """ > > Good point. I'll add that to the PEP 100. OK. Here's is the current todo list: 1. implement a new TranslateCharmap and fix the old. 2. New encoding API for string objects too. 3. Decoding 4. Documentation 5. Test cases I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001- July/001262.html) We coould have a error handler registry, which maps names to error handlers, then it would be possible to keep the errors argument as "const char *" instead of "PyObject *". 
Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type u"...".encode("...", "strict") instead of import codecs u"...".encode("...", codecs.raise_encode_errors) But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?) Python code would look like this: --- def xmlreplace(encoding, unicode, pos, state): return (u"&#%d;" % ord(unicode[pos]), pos+1) import codecs codecs.registerError("xmlreplace",xmlreplace) --- and then the following call can be made: u"äöü".encode("ascii", "xmlreplace") As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *error" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself. But for special one-shot error handlers, it might still be useful to pass the error handler directly, so maybe we should leave error as PyObject *, but implement the registry anyway? 
But as > far as I can tell none of the codecs uses anything other > than '?', True. > so I guess we could change the replace handler > to always return u'?'. This would make the implementation a > little bit simpler, but the explanation of the callback > feature *a lot* simpler. Go for it. > And if you still want to handle > an unencodable U+FFFD, you can write a special callback for > that, e.g. > > def FFFDreplace(enc, uni, pos): > if uni[pos] == "\ufffd": > return u"?" > else: > raise UnicodeError(...) > > > ...docs... > > > > Could you add these docs to the Misc/unicode.txt file ? I > > will eventually take that file and turn it into a PEP > which > > will then serve as general documentation for these things. > > I could, but first we should work out how the decoding > callback API will work. Ok. BTW, Barry Warsaw already did the work of converting the unicode.txt to PEP 100, so the docs should eventually go there. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > > replacement callback. > > > > Hmm, wouldn't that result in a slowdown ? If so, I'd > rather > > leave the special encoder in place, since it is being > used a > > lot in Python and probably some applications too. > > It would be a slowdown. But callbacks open many > possiblities. True, but in this case I believe that we should stick with the native implementation for "unicode-escape". Having a standard callback error handler which does the \uXXXX replacement would be nice to have though, since this would also be usable with lots of other codecs (e.g. all the code page ones). > For example: > > Why can't I print u"gürk"? > > is probably one of the most frequently asked questions in > comp.lang.python. 
For printing Unicode stuff, print could be > extended the use an error handling callback for Unicode > strings (or objects where __str__ or tp_str returns a > Unicode object) instead of using str() which always returns > an 8bit string and uses strict encoding. There might even > be a > sys.setprintencodehandler()/sys.getprintencodehandler() There already is a print callback in Python (forgot the name of the hook though), so this should be possible by providing the encoding logic in the hook. > > > I have not touched PyUnicode_TranslateCharmap yet, > > > should this function also support error callbacks? Why > > > would one want the insert None into the mapping to > call > > > the callback? > > > > 1. Yes. > > 2. The user may want to e.g. restrict usage of certain > > character ranges. In this case the codec would be used to > > verify the input and an exception would indeed be useful > > (e.g. say you want to restrict input to Hangul + ASCII). > > OK, do we want TranslateCharmap to work exactly like > encoding, > i.e. in case of an error should the returned replacement > string again be mapped through the translation mapping or > should it be copied to the output directly? The former would > be more in line with encoding, but IMHO the latter would > be much more useful. It's better to take the second approach (copy the callback output directly to the output string) to avoid endless recursion and other pitfalls. I suppose this will also simplify the implementation somewhat. > BTW, when I implement it I can implement patch #403100 > ("Multicharacter replacements in > PyUnicode_TranslateCharmap") > along the way. I've seen it; will comment on it later. > Should the old TranslateCharmap map to the new > TranslateCharmapEx > and inherit the "multicharacter replacement" feature, > or > should I leave it as it is? If possible, please also add the multichar replacement to the old API. 
I think it is very useful and since the old APIs work on raw buffers it would be a benefit to have the functionality in the old implementation too. [Decoding error callbacks] > > > A remaining problem is how to implement decoding error > > > callbacks. In Python 2.1 encoding and decoding errors > are > > > handled in the same way with a string value. But with > > > callbacks it doesn't make sense to use the same > callback > > > for encoding and decoding (like > codecs.StreamReaderWriter > > > and codecs.StreamRecoder do). Decoding callbacks have > a > > > different API. Which arguments should be passed to the > > > decoding callback, and what is the decoding callback > > > supposed to do? > > > > I'd suggest adding another set of PyCodec_UnicodeDecode... > () > > APIs for this. We'd then have to augment the base classes > of > > the StreamCodecs to provide two attributes for .errors > with > > a fallback solution for the string case (i.s. "strict" > can > > still be used for both directions). > > Sounds good. Now what is the decoding callback supposed to > do? > I guess it will be called in the same way as the encoding > callback, i.e. with encoding name, original string and > position of the error. It might returns a Unicode string > (i.e. an object of the decoding target type), that will be > emitted from the codec instead of the one offending byte. Or > it might return a tuple with replacement Unicode object and > a resynchronisation offset, i.e. returning (u"?", 1) > means > emit a '?' and skip the offending character. But to make > the offset really useful the callback has to know something > about the encoding, perhaps the codec should be allowed to > pass an additional state object to the callback? > > Maybe the same should be added to the encoding callbacks to? 
> Maybe the encoding callback should be able to tell the > encoder if the replacement returned should be reencoded > (in which case it's a Unicode object), or directly emitted > (in which case it's an 8bit string)? I like the idea of having an optional state object (basically this should be a codec-defined arbitrary Python object) which then allows the callback to apply additional tricks. The object should be documented to be modifiable in place (simplifies the interface). About the return value: I'd suggest to always use the same tuple interface, e.g. callback(encoding, input_data, input_position, state) -> (output_to_be_appended, new_input_position) (I think it's better to use absolute values for the position rather than offsets.) Perhaps the encoding callbacks should use the same interface... what do you think ? > > > One additional note: It is vital that errors is an > > > assignable attribute of the StreamWriter. > > > > It is already ! > > I know, but IMHO it should be documented that an assignable > errors attribute must be supported as part of the official > codec API. > > Misc/unicode.txt is not clear on that: > """ > It is not required by the Unicode implementation to use > these base classes, only the interfaces must match; this > allows writing Codecs as extension types. > """ Good point. I'll add that to the PEP 100. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-22 20:51 Message: Logged In: YES user_id=38388 Sorry to keep you waiting, Walter. I will look into this again next week -- this week was way too busy... ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 17:00 Message: Logged In: YES user_id=38388 On your comment about the non-Unicode codecs: let's keep this separated from the current patch. Don't have much time today. I'll comment on the other things tomorrow. 
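The tuple interface proposed in this exchange — callback(...) -> (output_to_be_appended, new_input_position) with absolute positions — is essentially what later shipped, except the per-call arguments and the codec-defined state object were folded into the exception passed to the handler. A hedged sketch in that final form; the explicit state object never made it into the API, so a function attribute (or a closure) stands in for it here, and the handler name is illustrative:

```python
import codecs

def counting(exc):
    # Count unencodable characters across calls -- a stand-in for the
    # codec-defined, modified-in-place state object discussed above --
    # and return the (replacement, new_position) tuple.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    counting.count += exc.end - exc.start
    return ("?" * (exc.end - exc.start), exc.end)

counting.count = 0
codecs.register_error("counting", counting)
print("aäüb".encode("ascii", "counting"))
print(counting.count)
```
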
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 15:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 13:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g. 
def FFFDreplace(enc, uni, pos): if uni[pos] == "\ufffd": return u"?" else: raise UnicodeError(...) > > The implementation of the loop through the string is done > > in the following way. A stack with two strings is kept > > and the loop always encodes a character from the string > > at the stacktop. If an error is encountered and the stack > > has only one entry (during encoding of the original string) > > the callback is called and the unicode object returned is > > pushed on the stack, so the encoding continues with the > > replacement string. If the stack has two entries when an > > error is encountered, the replacement string itself has > > an unencodable character and a normal exception is raised. > > When the encoder has reached the end of its current string > > there are two possibilities: when the stack contains two > > entries, this was the replacement string, so the replacement > > string will be popped from the stack and encoding continues > > with the next character from the original string. If the > > stack had only one entry, encoding is finished. > > Very elegant solution ! I'll put it as a comment in the source. > > (I hope that's enough explanation of the API and > implementation) > > Could you add these docs to the Misc/unicode.txt file ? I > will eventually take that file and turn it into a PEP which > will then serve as general documentation for these things. I could, but first we should work out how the decoding callback API will work. > > I have renamed the static ...121 function to all lowercase > > names. > > Ok. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > replacement callback. > > Hmm, wouldn't that result in a slowdown ? If so, I'd rather > leave the special encoder in place, since it is being used a > lot in Python and probably some applications too. It would be a slowdown. But callbacks open many possibilities. For example: Why can't I print u"gürk"? 
is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended to use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler(). > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want to insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks.
In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value. But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode... () > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.e. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might return a Unicode string (i.e. an object of the decoding target type) that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding, perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks too? Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API.
Misc/unicode.txt is not clear on that: """ It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types. """ ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 08:05 Message: Logged In: YES user_id=38388 > How the callbacks work: > > A PyObject * named errors is passed in. This may be NULL, > Py_None, 'strict', u'strict', 'ignore', u'ignore', > 'replace', u'replace' or a callable object. > PyCodec_EncodeHandlerForObject maps all of these objects to > one of the three builtin error callbacks > PyCodec_RaiseEncodeErrors (raises an exception), > PyCodec_IgnoreEncodeErrors (returns an empty replacement > string, in effect ignoring the error), > PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode > replacement character to signify to the encoder that it > should choose a suitable replacement character) or directly > returns errors if it is a callable object. When an > unencodable character is encountered the error handling > callback will be called with the encoding name, the original > unicode object and the error position and must return a > unicode object that will be encoded instead of the offending > character (or the callback may of course raise an > exception). U+FFFD characters in the replacement string will > be replaced with a character that the encoder chooses ('?' > in all cases). Nice. > The implementation of the loop through the string is done in > the following way. A stack with two strings is kept and the > loop always encodes a character from the string at the > stacktop. If an error is encountered and the stack has only > one entry (during encoding of the original string) the > callback is called and the unicode object returned is pushed > on the stack, so the encoding continues with the replacement > string.
If the stack has two entries when an error is > encountered, the replacement string itself has an > unencodable character and a normal exception is raised. When > the encoder has reached the end of its current string there > are two possibilities: when the stack contains two entries, > this was the replacement string, so the replacement string > will be popped from the stack and encoding continues with > the next character from the original string. If the stack > had only one entry, encoding is finished. Very elegant solution ! > (I hope that's enough explanation of the API and implementation) Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things. > I have renamed the static ...121 function to all lowercase > names. Ok. > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > replacement callback. Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too. > PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, > PyCodec_ReplaceEncodeErrors are globally visible because > they have to be available in _codecsmodule.c to wrap them as > Python function objects, but they can't be implemented in > _codecsmodule, because they need to be available to the > encoders in unicodeobject.c (through > PyCodec_EncodeHandlerForObject), but importing the codecs > module might result in an endless recursion, because > importing a module requires unpickling of the bytecode, > which might require decoding utf8, which ... (but this will > only happen, if we implement the same mechanism for the > decoding API) I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more.
One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want to insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.e. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used.
When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > an HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 19:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment") BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than an HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object.
PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names.
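The two-string stack loop described in this comment can be sketched in pure Python (written in modern syntax for clarity; the names and the per-character encoder interface here are illustrative, not the patch's actual C-level API):

```python
def encode_with_callback(text, can_encode, callback):
    """Sketch of the patch's stack-based encoding loop.

    can_encode(ch) returns the encoded form of ch, or None if ch is
    unencodable; callback(text, pos) returns the replacement string for
    the character at pos.  Both are hypothetical helpers for this sketch.
    """
    out = []
    stack = [(text, 0)]                 # bottom entry: the original string
    while stack:
        cur, pos = stack[-1]
        if pos >= len(cur):             # reached the end of the current string
            stack.pop()                 # replacement finished (or all done)
            continue
        encoded = can_encode(cur[pos])
        if encoded is not None:
            out.append(encoded)
            stack[-1] = (cur, pos + 1)
        elif len(stack) == 1:           # error in the original: ask the callback
            replacement = callback(cur, pos)
            stack[-1] = (cur, pos + 1)  # skip the offending character
            stack.append((replacement, 0))
        else:                           # error inside the replacement itself
            raise UnicodeError("unencodable character in replacement")
    return "".join(out)
```

With an ASCII-only can_encode and a callback that returns "?", the loop replaces each unencodable character, and a replacement that is itself unencodable raises an exception, exactly as the prose describes.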
BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen, if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 18:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names.
If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the time). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 16:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 16:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? Encode one-to-one; it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones?
I introduced a new function for every old one that had a "const char *errors" argument, and a few new ones in codecs.h; of those PyCodec_EncodeHandlerForObject is vital, because it is used to map the old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used, and having two (PyObject *)s (the original and the replacement string) instead of UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 14:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate ?
* module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:42:48 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:42:48 -0700 Subject: [Patches] [ python-Patches-415227 ] Solaris pkgtool bdist command Message-ID: Patches item #415227, was opened at 2001-04-10 19:54 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415227&group_id=5470 Category: Distutils and setup.py Group: None >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: Solaris pkgtool bdist command Initial Comment: The bdist_pkgtool command is based on bdist_packager and provides support for the Solaris pkgadd and pkgrm commands. In most cases, no additional options beyond the PEP 241 options are required. An exception is if the package name is >9 characters: a --pkg-abrev option is then required, because that's all pkgtool will handle. It makes listing the packages on the system a pain, but the actual package files produced do match name-version-revision-pyvers.pkg format.
By default, bdist_pkgtool provides request, postinstall, preremove, and postremove scripts that will properly relocate modules to the site-packages directory and recompile all .py modules on the target machine. An author can provide a custom request script and either have it auto-relocate by merging the scripts, or inhibit auto-relocation with --no-autorelocate. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:42 Message: Logged In: YES user_id=38388 Replaced by 531901. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-02-24 18:25 Message: Logged In: YES user_id=38388 The code looks OK, but I can't test it... I'm sure the user base will, though, once it's in CVS. Please also write up some documentation which we can add to the distutils TeX docs and add them to the patch. I will then add it to CVS. Thanks. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-11-20 14:00 Message: Logged In: YES user_id=38388 Hijacking this patch to take load off of Andrew. This patch should be reviewed after the Python 2.2 feature freeze. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 05:39 Message: Logged In: YES user_id=21627 Should there also be some Makefile machinery to create a Solaris package for python itself? There is a 1.6a2 package on sunfreeware; it would surely help if building Solaris packages was supported by the Python core itself. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415227&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:43:11 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:43:11 -0700 Subject: [Patches] [ python-Patches-415228 ] HP-UX packaging command Message-ID: Patches item #415228, was opened at 2001-04-10 19:56 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415228&group_id=5470 Category: Distutils and setup.py Group: None >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: HP-UX packaging command Initial Comment: The bdist_sdux (SD-UX is HP's packager) command is based on bdist_packager and provides the same functionality as the bdist_pkgtool command, except the resulting packages cannot auto-relocate. Instead, a checkinstall script is included by default that determines if the target machine's Python installation matches that of the creating machine. If not, it bails out and provides the installer with the correct version of the swinstall command to place it in the proper directory. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:43 Message: Logged In: YES user_id=38388 Replaced by 531901. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-02-24 18:27 Message: Logged In: YES user_id=38388 The code looks OK, but I can't test it... I'm sure the user base will, though, once it's in CVS. Please also write up some documentation which we can add to the distutils TeX docs and add them to the patch. I will then add it to CVS. Thanks. ---------------------------------------------------------------------- Comment By: M.-A.
Lemburg (lemburg) Date: 2001-11-20 14:01 Message: Logged In: YES user_id=38388 Hijacking this patch to take load off of Andrew. This patch should be reviewed after the Python 2.2 feature freeze. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2001-04-10 21:18 Message: Logged In: YES user_id=6380 Please select the proper category when submitting patches! This is clearly a distutils thing. Assigned to Andrew. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415228&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:43:52 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:43:52 -0700 Subject: [Patches] [ python-Patches-415226 ] new base class for binary packaging Message-ID: Patches item #415226, was opened at 2001-04-10 19:51 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415226&group_id=5470 Category: Distutils and setup.py Group: None >Status: Closed >Resolution: Duplicate Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: new base class for binary packaging Initial Comment: bdist_packager.py provides an abstract base class for bdist commands. It provides easy access to all the PEP 241 metadata fields, plus "revision" for the package revision and installation scripts for preinstall, postinstall, preremove, and postremove. That covers the base characteristics of all the package managers that I'm familiar with. If anyone can think of any others, let me know, otherwise additional extensions would be implemented in the specific packager's commands. I would, however, discourage _requiring_ any additional fields. It would be nice if by simply supplying the PEP241 metadata under the [bdist_packager] section all subclassed packagers worked with no further effort.
It also has rudimentary relocation support by including a --no-autorelocate option. The bdist_packager is also where I see creating separate binary packages for sub-packages supported. My need for that is much less than my desire for it right now, so I didn't give it much thought as I wrote it. I'd be delighted to hear any comments and suggestions on how to approach sub-packaging, though. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:43 Message: Logged In: YES user_id=38388 Replaced by 531901. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2001-10-02 21:10 Message: Logged In: YES user_id=12810 Regarding script code: The preinstall, postinstall, etc. scripts are hooked into the package manager specific subclasses. It's the responsibility of the specific class to "do the right thing". For *NIX package managers, this is usually script code, although changing the help text to be more informative isn't a problem. More specifically, using python scripts under pkgtool and sdux would fail. Install scripts are not executed, they're sourced (in some weird fashion I've yet to identify). Theoretically, using a shell script to find the python interpreter by querying the package manager and calling it with either -i or a runtime created script should work fine. This is intended as a class for instantiating new bdist commands with full support for pep 241. Current bdist commands do their own thing, and they do it very differently. I'd rather see this put in as a migration path than shut down bdist commands that function just fine on their own. Eventual adoption of a standard abstract base would mean that module authors could provide all metadata in a standard format, and distutils would be able to create binary packages for systems the author doesn't have access to. This works for Solaris pkgtool and HP-UX SDUX.
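The idea sketched in these comments -- supply the metadata and script hooks once under [bdist_packager] and let every subclassed packager command pick them up -- might have looked something like the following setup.cfg fragment. All option names and paths here are illustrative; the patch was never merged in this form, so treat this purely as a sketch of the proposal.

```ini
; Hypothetical setup.cfg fragment for the proposed bdist_packager base class.
[bdist_packager]
revision = 1
preinstall = scripts/preinstall.sh
postinstall = scripts/postinstall.sh
preremove = scripts/preremove.sh
postremove = scripts/postremove.sh

; Per-packager extensions live in the subclass sections, e.g. the
; Solaris command's nine-character package-name abbreviation:
[bdist_pkgtool]
pkg-abrev = PYmypkg
```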
All three patches can be included with ZERO side effects on any other aspect of Distutils. I'm really kind of curious why they're not integrated yet so others can try them out. ---------------------------------------------------------------------- Comment By: david arnold (dja) Date: 2001-09-20 09:08 Message: Logged In: YES user_id=78574 I recently struck a case where I wanted the ability to run a post-install script on Windows (from a bdist_wininst-produced package). While I agree with what seems to be the basic intention of this patch, wouldn't it be more useful to have the various scripts run by the Python interpreter, rather than Bourne shell (which is extremely seldom available on Windows, MacOS, etc) ? I went looking for the source of the .exe file embedded in the wininst command, but couldn't find it. Does anyone know where it lives? ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2001-06-07 05:33 Message: Logged In: YES user_id=21627 Shouldn't the patch also modify the existing bdist commands to use this as a base class? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=415226&group_id=5470 From noreply@sourceforge.net Wed Apr 17 20:54:35 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 12:54:35 -0700 Subject: [Patches] [ python-Patches-531901 ] binary packagers Message-ID: Patches item #531901, was opened at 2002-03-19 15:53 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 Category: Distutils and setup.py Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Alexander (mwa) Assigned to: M.-A. Lemburg (lemburg) Summary: binary packagers Initial Comment: zip file with updated Solaris and HP-UX packagers. Replaces 415226, 415227, 415228.
Changes made to take advantage of new PEP241 changes in the Distribution class. ---------------------------------------------------------------------- >Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 19:54 Message: Logged In: YES user_id=38388 I will try to check in your latest version into CVS today. The PSF will still require you to sign a contributor agreement for these additions, though, after these have been through the legal review phase. http://www.python.org/psf/psf-contributor-agreement.html Is that acceptable ? Note: I'm still awaiting the documentation for these files. Thanks. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-04-15 18:54 Message: Logged In: YES user_id=12810 New file submitted. No documentation yet, but I am committed to maintaining them. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-11 16:59 Message: Logged In: YES user_id=38388 Mark, could you reupload the ZIP file ? I cannot download it from the SF page (the file is mostly empty). Also, is the documentation already included in the ZIP file ? If not, it would be nice if you could add them as well. I don't require a special PEP for these changes, BTW, but I do require you to maintain them. Thanks. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 19:55 Message: Logged In: YES user_id=12810 OK, the PEP seems to me to mean most of this is done. These additions are not library modules, they are Distutils "commands". So the way I read it, the Distutils-SIG (where I've been hanging around for some time) are the Maintainers. The documentation will be 2 new chapters for the Distutils manual "Creating Solaris packages" and "Creating HP-UX packages" each looking a whole lot like "Creating RPM packages". Does that clarify anything, or am I still missing a clue? p.s.
Thanks for cleaning up the extra uploads! ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-03-20 15:35 Message: Logged In: YES user_id=21627 You volunteering as the maintainer is part of the prerequisites of accepting new modules, when following PEP 2, see http://python.sourceforge.net/peps/pep-0002.html It says: "developers ... will first form a group of maintainers. Then, this group shall produce a PEP called a library PEP." So existence of a PEP describing these library extensions would be a prerequisite for accepting them. If MAL wants to waive this requirement, it would be fine with me. However, such a PEP could also share text with the documentation, so it might not be wasted effort. ---------------------------------------------------------------------- Comment By: Mark Alexander (mwa) Date: 2002-03-20 14:49 Message: Logged In: YES user_id=12810 Any of the three (they're all the same). SourceForge hiccuped during the upload, and I don't have permission to delete the duplicates. I don't exactly understand what you mean by applying PEP 2. I uploaded this per Marc Lemburg's request for the latest versions of patches 41522[6-8]. He's acting as the integrator in this case (see http://mail.python.org/pipermail/distutils-sig/2001-December/002659.html). I let him know about the duplicate uploads, so hopefully he'll correct it. If you can and want, feel free to delete the 2 of your choice. I agree they need to be documented. As soon as I can, I'll submit changes to the Distutils documentation. Finally, yes, I'll act as maintainer. I'm on the Distutils-sig and as soon as some other poor soul who has to deal with Solaris or HP-UX tries them, I'm there to work out issues. ---------------------------------------------------------------------- Comment By: Martin v.
Löwis (loewis) Date: 2002-03-20 07:35 Message: Logged In: YES user_id=21627 Which of the three attached files is the right one (19633, 19634, or 19635)? Unless they are all needed, we should delete the extra copies. I recommend to apply PEP 2 to this patch: A library PEP is needed (which could be quite short), documentation, perhaps test cases. Most importantly, there must be an identified maintainer of these modules. Are you willing to act as the maintainer? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=531901&group_id=5470 From noreply@sourceforge.net Wed Apr 17 21:50:07 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 13:50:07 -0700 Subject: [Patches] [ python-Patches-432401 ] unicode encoding error callbacks Message-ID: Patches item #432401, was opened at 2001-06-12 15:43 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 Category: Core (C code) Group: None Status: Open Resolution: Postponed Priority: 6 Submitted By: Walter Dörwald (doerwalter) Assigned to: M.-A. Lemburg (lemburg) Summary: unicode encoding error callbacks Initial Comment: This patch adds unicode error handling callbacks to the encode functionality. With this patch it's possible to not only pass 'strict', 'ignore' or 'replace' as the errors argument to encode, but also a callable function, that will be called with the encoding name, the original unicode object and the position of the unencodable character. The callback must return a replacement unicode object that will be encoded instead of the original character. For example replacing unencodable characters with XML character references can be done in the following way. 
u"aäoöuüß".encode( "ascii", lambda enc, uni, pos: u"&#x%x;" % ord(uni[pos]) ) ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 22:50 Message: Logged In: YES user_id=89016 > About the difference between encoding > and decoding: you shouldn't just look > at the case where you work with Unicode > and strings, e.g. take the rot-13 codec > which works on strings only or other > codecs which translate objects into > strings and vice-versa. unicode.encode encodes to str and str.decode decodes to unicode, even for rot-13: >>> u"gürk".encode("rot13") 't\xfcex' >>> "gürk".decode("rot13") u't\xfcex' >>> u"gürk".decode("rot13") Traceback (most recent call last): File "", line 1, in ? AttributeError: 'unicode' object has no attribute 'decode' >>> "gürk".encode("rot13") Traceback (most recent call last): File "", line 1, in ? File "/home/walter/Python-current-readonly/dist/src/Lib/encodings/rot_13.py", line 18, in encode return codecs.charmap_encode(input,errors,encoding_map) UnicodeError: ASCII decoding error: ordinal not in range(128) Here the str is converted to unicode first, before encode is called, but the conversion to unicode fails. Is there an example where something else happens? > Error handling has to be flexible enough > to handle all these situations. Since > the codecs know best how to handle the > situations, I'd make this an implementation > detail of the codec and leave the > behaviour undefined in the general case. OK, but we should suggest that for encoding unencodable characters are collected and for decoding separate byte sequences that are considered broken by the codec are passed to the callback: i.e. for decoding the handler will never get all broken data in one call, e.g. for "\u30\Uffffffff".decode("unicode-escape") the handler will be called twice (once for "\u30" and "truncated \u escape" as the reason and once for "\Uffffffff" and "illegal character" as the reason.)
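For readers following this thread today: the XML character reference idea from the initial comment is essentially what later shipped as PEP 293, where handlers are registered by name via codecs.register_error and receive the exception object instead of separate arguments. A sketch against that later API (the handler name "xml-charref-demo" is made up for the demo):

```python
import codecs

# A sketch of the XML character-reference callback from the initial
# comment, rewritten against the error-handler registry API that
# eventually shipped (PEP 293): the handler receives the exception
# object and returns (replacement, position to resume at). Note that
# exc.start:exc.end may cover a whole run of unencodable characters.
def xml_charref(exc):
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    refs = "".join("&#x%x;" % ord(ch) for ch in exc.object[exc.start:exc.end])
    return (refs, exc.end)

codecs.register_error("xml-charref-demo", xml_charref)  # demo name, not standard

print("aäoöuüß".encode("ascii", "xml-charref-demo"))
# prints b'a&#xe4;o&#xf6;u&#xfc;&#xdf;'
```

The built-in "xmlcharrefreplace" handler that grew out of this discussion does the same thing with decimal character references.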
> For the existing codecs, backward > compatibility should be maintained, > if at all possible. If the patch gets > overly complicated because of this, > we may have to provide a downgrade solution > for this particular problem (I don't think > replace is used in any computational context, > though, since you can never be sure how > many replacement character do get > inserted, so the case may not be > that realistic). > > Raising an exception for the charmap codec > is the right way to go, IMHO. I would > consider the current behaviour a bug. OK, this is implemented in PyUnicode_EncodeCharmap now, and collecting unencodable characters works too. I completely changed the implementation, because the stack approach would have gotten much more complicated when unencodable characters are collected. > For new codecs, I think we should > suggest that replace tries to collect > as much illegal data as possible before > invoking the error handler. The handler > should be aware of the fact that it > won't necessarily get all the broken > data in one call. OK for encoders, for decoders see above. > About the codec error handling > registry: You seem to be using a > Unicode specific approach here. > I'd rather like to see a generic > approach which uses the API > we discussed earlier. Would that be possible? The handlers in the registry are all Unicode specific. and they are different for encoding and for decoding. I renamed the function because of your comment from 2001-06-13 10:05 (which becomes exceedingly difficult to find on this long page! ;)). > In that case, the codec API should > probably be called > codecs.register_error('myhandler', myhandler). > > Does that make sense ? We could require that unique names are used for custom handlers, but for the standard handlers we do have name collisions. 
To prevent them, we could either remove them from the registry and require that the codec implements the error handling for those itself, or we could do some fiddling, so that u"üöä".encode("ascii", "replace") becomes u"üöä".encode("ascii", "unicodeencodereplace") behind the scenes. But I think two Unicode-specific registries are much simpler to handle. > BTW, the patch which uses the callback > registry does not seem to be available > on this SF page (the last patch still > converts the errors argument to a > PyObject, which shouldn't be needed > anymore with the new approach). > Can you please upload your > latest version? OK, I'll upload a preliminary version tomorrow. PyUnicode_EncodeDecimal and PyUnicode_TranslateCharmap are still missing, but otherwise the patch seems to be finished. All decoders work and the encoders collect unencodable characters and implement the handling of known callback handler names themselves. As PyUnicode_EncodeDecimal is only used by the int, long, float, and complex constructors, I'd love to get rid of the errors argument, but for completeness' sake, I'll implement the callback functionality. > Note that the highlighting codec > would make a nice example > for the new feature. This could be part of the codec callback test script, which I've started to write. We could kill two birds with one stone here: 1. Test the implementation. 2. Document and advocate what is possible with the patch. Another idea: we could have as an example a decoding handler that relaxes the UTF-8 minimal encoding restriction, e.g. def relaxedutf8(enc, uni, startpos, endpos, reason, data): if uni[startpos:startpos+2] == u"\xc0\x80": return (u"\x00", startpos+2) else: raise UnicodeError(...) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-04-17 21:40 Message: Logged In: YES user_id=38388 Sorry for the late response.
About the difference between encoding and decoding: you shouldn't just look at the case where you work with Unicode and strings, e.g. take the rot-13 codec which works on strings only or other codecs which translate objects into strings and vice-versa. Error handling has to be flexible enough to handle all these situations. Since the codecs know best how to handle the situations, I'd make this an implementation detail of the codec and leave the behaviour undefined in the general case. For the existing codecs, backward compatibility should be maintained, if at all possible. If the patch gets overly complicated because of this, we may have to provide a downgrade solution for this particular problem (I don't think replace is used in any computational context, though, since you can never be sure how many replacement characters get inserted, so the case may not be that realistic). Raising an exception for the charmap codec is the right way to go, IMHO. I would consider the current behaviour a bug. For new codecs, I think we should suggest that replace tries to collect as much illegal data as possible before invoking the error handler. The handler should be aware of the fact that it won't necessarily get all the broken data in one call. About the codec error handling registry: You seem to be using a Unicode-specific approach here. I'd rather like to see a generic approach which uses the API we discussed earlier. Would that be possible ? In that case, the codec API should probably be called codecs.register_error('myhandler', myhandler). Does that make sense ? BTW, the patch which uses the callback registry does not seem to be available on this SF page (the last patch still converts the errors argument to a PyObject, which shouldn't be needed anymore with the new approach). Can you please upload your latest version ? Note that the highlighting codec would make a nice example for the new feature. Thanks.
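The generic codecs.register_error('myhandler', myhandler) spelling suggested here is the form that ultimately stuck. A minimal sketch of how one registry can serve both directions, with the handler dispatching on the exception type (the name "myhandler-demo" is invented for this example):

```python
import codecs

# One registry shared by encoders and decoders, as Lemburg suggests:
# the handler tells the two cases apart by the exception type and
# returns (replacement, position to resume at).
def myhandler(exc):
    if isinstance(exc, UnicodeEncodeError):
        return ("?" * (exc.end - exc.start), exc.end)  # encoding side
    if isinstance(exc, UnicodeDecodeError):
        return ("\ufffd", exc.end)                     # decoding side
    raise exc

codecs.register_error("myhandler-demo", myhandler)     # demo name, not standard
assert codecs.lookup_error("myhandler-demo") is myhandler

print("ab\u20acc".encode("ascii", "myhandler-demo"))   # prints b'ab?c'
print(b"ab\xffc".decode("ascii", "myhandler-demo"))    # prints 'ab\ufffdc'
```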
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 12:21 Message: Logged In: YES user_id=89016 Another note: the patch will change the meaning of charmap encoding slightly: currently "replace" will put a ? into the output, even if ? is not in the mapping, i.e. codecs.charmap_encode(u"c", "replace", {ord("a"): ord("b")}) will return ('?', 1). With the patch the above example will raise an exception. Of course with the patch many more replace characters can appear, so it is vital that the mapping is done for the replacement string too. Is this semantic change OK? (I guess all of the existing codecs have a mapping ord("?")->ord("?")) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 18:19 Message: Logged In: YES user_id=89016 So this means that the encoder can collect illegal characters and pass them to the callback. "replace" will replace this with (end-start)*u"?". Decoders don't collect all illegal byte sequences, but call the callback once for every byte sequence that has been found illegal and "replace" will replace it with u"?". Does this make sense? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-15 18:06 Message: Logged In: YES user_id=89016 For encoding it's always (end-start)*u"?": >>> u"ää".encode("ascii", "replace") '??' But for decoding, it is neither of the two: >>> "\Ux\U".decode("unicode-escape", "replace") u'\ufffd\ufffd' i.e. a sequence of 5 illegal characters was replaced by two replacement characters. This might mean that decoders can't collect all the illegal characters and call the callback once. They might have to call the callback for every single illegal byte sequence to get the old behaviour.
(It seems that this patch would be much, much simpler, if we only change the encoders) ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 19:36 Message: Logged In: YES user_id=38388 Hmm, whatever it takes to maintain backwards compatibility. Do you have an example ? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 18:31 Message: Logged In: YES user_id=89016 What should replace do: Return u"?" or (end-start)*u"?" ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-08 16:15 Message: Logged In: YES user_id=38388 Sounds like a good idea. Please keep the encoder and decoder APIs symmetric, though, i.e. add the slice information to both APIs. The slice should use the same format as Python's standard slices, that is left inclusive, right exclusive. I like the highlighting feature ! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-08 00:09 Message: Logged In: YES user_id=89016 I'm thinking about extending the API a little bit: Consider the following example: >>> "\u1".decode("unicode-escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 2: truncated \uXXXX escape The error message is a lie: the problem is not the '1' in position 2, but the complete truncated sequence '\u1'. For this the decoder should pass a start and an end position to the handler. For encoding this would be useful too: Suppose I want to have an encoder that colors the unencodable character via ANSI escape sequences. Then I could do the following: >>> import codecs >>> def color(enc, uni, pos, why, sta): ... return (u"\033[1m<%d>\033[0m" % ord(uni[pos]), pos+1) ...
>>> codecs.register_unicodeencodeerrorhandler("color", color) >>> u"aäüöo".encode("ascii", "color") 'a\x1b[1m<228>\x1b[0m\x1b[1m<252>\x1b[0m\x1b[1m<246>\x1b [0mo' But here the sequences "\x1b[0m\x1b[1m" are not needed. To fix this problem the encoder could collect as many unencodable characters as possible and pass those to the error callback in one go (passing a start and end+1 position). This fixes the above problem and reduces the number of calls to the callback, so it should speed up the algorithms in case of custom encoding names. (And it makes the implementation very interesting ;)) What do you think? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-07 02:29 Message: Logged In: YES user_id=89016 I started from scratch, and the current state is this: Encoding mostly works (except that I haven't changed TranslateCharmap and EncodeDecimal yet) and most of the decoding stuff works (DecodeASCII and DecodeCharmap are still unchanged) and the decoding callback helper isn't optimized for the "builtin" names yet (i.e. it still calls the handler). For encoding the callback helper knows how to handle "strict", "replace", "ignore" and "xmlcharrefreplace" itself and won't call the callback. This should make the encoder fast enough. As callback name string comparison results are cached it might even be faster than the original. The patch so far didn't require any changes to unicodeobject.h, stringobject.h or stringobject.c ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2002-03-05 17:49 Message: Logged In: YES user_id=38388 Walter, are you making any progress on the new scheme we discussed on the mailing list (adding an error handler registry much like the codec registry itself instead of trying to redo the complete codec API) ? ---------------------------------------------------------------------- Comment By: M.-A. 
Lemburg (lemburg) Date: 2001-09-20 12:38 Message: Logged In: YES user_id=38388 I am postponing this patch until the PEP process has started. This feature won't make it into Python 2.2. Walter, you may want to reference this patch in the PEP. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-08-16 12:53 Message: Logged In: YES user_id=38388 I think we ought to summarize these changes in a PEP to get some more feedback and testing from others as well. I'll look into this after I'm back from vacation on the 10.09. Given the release schedule I am not sure whether this feature will make it into 2.2. The size of the patch is huge and probably needs a lot of testing first. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-27 05:55 Message: Logged In: YES user_id=89016 Changing the decoding API is done now. There are new functions codec.register_unicodedecodeerrorhandler and codec.lookup_unicodedecodeerrorhandler. Only the standard handlers for 'strict', 'ignore' and 'replace' are preregistered. There may be many reasons for decoding errors in the byte string, so I added an additional argument to the decoding API: reason, which gives the reason for the failure, e.g.: >>> "\U1111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 8: truncated \UXXXXXXXX escape >>> "\U11111111".decode("unicode_escape") Traceback (most recent call last): File "", line 1, in ? UnicodeError: encoding 'unicodeescape' can't decode byte 0x31 in position 9: illegal Unicode character For symmetry I added this to the encoding API too: >>> u"\xff".encode("ascii") Traceback (most recent call last): File "", line 1, in ? 
UnicodeError: encoding 'ascii' can't decode byte 0xff in position 0: ordinal not in range(128) The parameters passed to the callbacks now are: encoding, unicode, position, reason, state. The encoding and decoding API for strings has been adapted too, so now the new API should be usable everywhere: >>> unicode("a\xffb\xffc", "ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' >>> "a\xffb\xffc".decode("ascii", ... lambda enc, uni, pos, rea, sta: (u"", pos+1)) u'abc' I had a problem with the decoding API: all the functions in _codecsmodule.c used the t# format specifier. I changed that to O! with &PyString_Type, because otherwise we would have the problem that the decoding API would have to pass buffer objects around instead of strings, and the callback would have to call str() on the buffer anyway to access a specific character, so this wouldn't be any faster than calling str() on the buffer before decoding. It seems that buffers aren't used anyway. I changed all the old functions to call the new ones so bugfixes don't have to be done in two places. There are two exceptions: I didn't change PyString_AsEncodedString and PyString_AsDecodedString because they are documented as deprecated anyway (although they are called in a few spots). This means that I duplicated part of their functionality in PyString_AsEncodedObjectEx and PyString_AsDecodedObjectEx. There are still a few spots that call the old API: E.g. PyString_Format still calls PyUnicode_Decode (but with strict decoding) because it passes the rest of the format string to PyUnicode_Format when it encounters a Unicode object. Should we switch to the new API everywhere even if strict encoding/decoding is used? The size of this patch begins to scare me. I guess we need an extensive test script for all the new features and documentation. I hope you have time to do that, as I'll be busy with other projects in the next weeks. (BTW, I haven't touched PyUnicode_TranslateCharmap yet.)
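The decode examples above map directly onto the registry API that was eventually adopted: the separate (encoding, unicode, position, reason, state) callback parameters became attributes of the exception object handed to the handler. A sketch reproducing the u'abc' result (the handler name "skip-demo" is invented for the example):

```python
import codecs

# The "a\xffb\xffc" -> u'abc' example above, redone with the registry
# API that eventually shipped: the handler gets a UnicodeDecodeError
# carrying exc.encoding, exc.object, exc.start, exc.end and exc.reason.
def skip_bad(exc):
    if isinstance(exc, UnicodeDecodeError):
        return ("", exc.end)  # insert nothing, resume after the bad bytes
    raise exc

codecs.register_error("skip-demo", skip_bad)  # demo name, not standard

print(b"a\xffb\xffc".decode("ascii", "skip-demo"))  # prints 'abc'
```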
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-23 19:03 Message: Logged In: YES user_id=89016 New version of the patch with the error handling callback registry. > > OK, done, now there's a > > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > > codecs.escapereplace_unicodeencode_errors > > that uses \u (or \U if x>0xffff (with a wide build > > of Python)). > > Great! Now PyCodec_EscapeReplaceUnicodeEncodeErrors uses \x in addition to \u and \U where appropriate. > > [...] > > But for special one-shot error handlers, it might still be > > useful to pass the error handler directly, so maybe we > > should leave error as PyObject *, but implement the > > registry anyway? > > Good idea ! > > One minor nit: codecs.registerError() should be named > codecs.register_errorhandler() to be more inline with > the Python coding style guide. OK, but these function are specific to unicode encoding, so now the functions are called: codecs.register_unicodeencodeerrorhandler codecs.lookup_unicodeencodeerrorhandler Now all callbacks (including the new ones: "xmlcharrefreplace" and "escapereplace") are registered in the codecs.c/_PyCodecRegistry_Init so using them is really simple: u"gürk".encode("ascii", "xmlcharrefreplace") ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-07-13 13:26 Message: Logged In: YES user_id=38388 > > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > > with \uxxxx replacement callback. > > > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > > I'd rather leave the special encoder in place, > > > > since it is being used a lot in Python and > > > > probably some applications too. > > > > > > It would be a slowdown. But callbacks open many > > > possiblities. 
> > > > True, but in this case I believe that we should stick with > > the native implementation for "unicode-escape". Having > > a standard callback error handler which does the \uXXXX > > replacement would be nice to have though, since this would > > also be usable with lots of other codecs (e.g. all the > > code page ones). > > OK, done, now there's a > PyCodec_EscapeReplaceUnicodeEncodeErrors/ > codecs.escapereplace_unicodeencode_errors > that uses \u (or \U if x>0xffff (with a wide build > of Python)). Great ! > > [...] > > > Should the old TranslateCharmap map to the new > > > TranslateCharmapEx and inherit the > > > "multicharacter replacement" feature, > > > or should I leave it as it is? > > > > If possible, please also add the multichar replacement > > to the old API. I think it is very useful and since the > > old APIs work on raw buffers it would be a benefit to have > > the functionality in the old implementation too. > > OK! I will try to find the time to implement that in the > next days. Good. > > [Decoding error callbacks] > > > > About the return value: > > > > I'd suggest to always use the same tuple interface, e.g. > > > > callback(encoding, input_data, input_position, > state) -> > > (output_to_be_appended, new_input_position) > > > > (I think it's better to use absolute values for the > > position rather than offsets.) > > > > Perhaps the encoding callbacks should use the same > > interface... what do you think ? > > This would make the callback feature hypergeneric and a > little slower, because tuples have to be created, but it > (almost) unifies the encoding and decoding API. ("almost" > because, for the encoder output_to_be_appended will be > reencoded, for the decoder it will simply be appended.), > so I'm for it. That's the point. Note that I don't think the tuple creation will hurt much (see the make_tuple() API in codecs.c) since small tuples are cached by Python internally. 
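The (output_to_be_appended, new_input_position) tuple contract with absolute positions discussed in this exchange is what survived into the final design, including the freedom to resume at an arbitrary absolute position. Walter's relaxed-UTF-8 idea from earlier in the thread, restated against that contract (the handler name is made up for the demo):

```python
import codecs

# Walter's relaxed-UTF-8 sketch (accept the overlong C0 80 encoding of
# NUL), written against the (replacement, new_input_position) contract:
# the returned position is absolute, and the decoder resumes there.
def relaxed_utf8(exc):
    if (isinstance(exc, UnicodeDecodeError)
            and exc.object[exc.start:exc.start + 2] == b"\xc0\x80"):
        return ("\x00", exc.start + 2)  # emit NUL, resume two bytes later
    raise exc

codecs.register_error("relaxed-utf8-demo", relaxed_utf8)  # demo name

print(b"a\xc0\x80b".decode("utf-8", "relaxed-utf8-demo"))  # prints 'a\x00b'
```

Since the position is absolute, a careless handler can also jump backwards and loop forever, which is why the thread debates whether new_input_position > input_position should be enforced.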
> I implemented this and changed the encoders to only > lookup the error handler on the first error. The UCS1 > encoder now no longer uses the two-item stack strategy. > (This strategy only makes sense for those encoder where > the encoding itself is much more complicated than the > looping/callback etc.) So now memory overflow tests are > only done, when an unencodable error occurs, so now the > UCS1 encoder should be as fast as it was without > error callbacks. > > Do we want to enforce new_input_position>input_position, > or should jumping back be allowed? No; moving backwards should be allowed (this may be useful in order to resynchronize with the input data). > Here's is the current todo list: > 1. implement a new TranslateCharmap and fix the old. > 2. New encoding API for string objects too. > 3. Decoding > 4. Documentation > 5. Test cases > > I'm thinking about a different strategy for implementing > callbacks > (see http://mail.python.org/pipermail/i18n-sig/2001- > July/001262.html) > > We coould have a error handler registry, which maps names > to error handlers, then it would be possible to keep the > errors argument as "const char *" instead of "PyObject *". > Currently PyCodec_UnicodeEncodeHandlerForObject is a > backwards compatibility hack that will never go away, > because > it's always more convenient to type > u"...".encode("...", "strict") > instead of > import codecs > u"...".encode("...", codecs.raise_encode_errors) > > But with an error handler registry this function would > become the official lookup method for error handlers. > (PyCodec_LookupUnicodeEncodeErrorHandler?) 
> Python code would look like this: > --- > def xmlreplace(encoding, unicode, pos, state): > return (u"&#%d;" % ord(uni[pos]), pos+1) > > import codec > > codec.registerError("xmlreplace",xmlreplace) > --- > and then the following call can be made: > u"äöü".encode("ascii", "xmlreplace") > As soon as the first error is encountered, the encoder uses > its builtin error handling method if it recognizes the name > ("strict", "replace" or "ignore") or looks up the error > handling function in the registry if it doesn't. In this way > the speed for the backwards compatible features is the same > as before and "const char *error" can be kept as the > parameter to all encoding functions. For speed common error > handling names could even be implemented in the encoder > itself. > > But for special one-shot error handlers, it might still be > useful to pass the error handler directly, so maybe we > should leave error as PyObject *, but implement the > registry anyway? Good idea ! One minor nit: codecs.registerError() should be named codecs.register_errorhandler() to be more inline with the Python coding style guide. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-07-12 13:03 Message: Logged In: YES user_id=89016 > > [...] > > so I guess we could change the replace handler > > to always return u'?'. This would make the > > implementation a little bit simpler, but the > > explanation of the callback feature *a lot* > > simpler. > > Go for it. OK, done! > [...] > > > Could you add these docs to the Misc/unicode.txt > > > file ? I will eventually take that file and turn > > > it into a PEP which will then serve as general > > > documentation for these things. > > > > I could, but first we should work out how the > > decoding callback API will work. > > Ok. BTW, Barry Warsaw already did the work of converting > the unicode.txt to PEP 100, so the docs should eventually > go there. OK. 
I guess it would be best to do this when everything is finished. > > > > BTW, I guess PyUnicode_EncodeUnicodeEscape > > > > could be reimplemented as PyUnicode_EncodeASCII > > > > with \uxxxx replacement callback. > > > > > > Hmm, wouldn't that result in a slowdown ? If so, > > > I'd rather leave the special encoder in place, > > > since it is being used a lot in Python and > > > probably some applications too. > > > > It would be a slowdown. But callbacks open many > > possiblities. > > True, but in this case I believe that we should stick with > the native implementation for "unicode-escape". Having > a standard callback error handler which does the \uXXXX > replacement would be nice to have though, since this would > also be usable with lots of other codecs (e.g. all the > code page ones). OK, done, now there's a PyCodec_EscapeReplaceUnicodeEncodeErrors/ codecs.escapereplace_unicodeencode_errors that uses \u (or \U if x>0xffff (with a wide build of Python)). > > For example: > > > > Why can't I print u"gürk"? > > > > is probably one of the most frequently asked > > questions in comp.lang.python. For printing > > Unicode stuff, print could be extended the use an > > error handling callback for Unicode strings (or > > objects where __str__ or tp_str returns a Unicode > > object) instead of using str() which always > > returns an 8bit string and uses strict encoding. > > There might even be a > > sys.setprintencodehandler()/sys.getprintencodehandler () > > There already is a print callback in Python (forgot the > name of the hook though), so this should be possible by > providing the encoding logic in the hook. True: sys.displayhook > [...] > > Should the old TranslateCharmap map to the new > > TranslateCharmapEx and inherit the > > "multicharacter replacement" feature, > > or should I leave it as it is? > > If possible, please also add the multichar replacement > to the old API. 
I think it is very useful and since the > old APIs work on raw buffers it would be a benefit to have > the functionality in the old implementation too. OK! I will try to find the time to implement that in the next days. > [Decoding error callbacks] > > About the return value: > > I'd suggest to always use the same tuple interface, e.g. > > callback(encoding, input_data, input_position, state) -> > (output_to_be_appended, new_input_position) > > (I think it's better to use absolute values for the > position rather than offsets.) > > Perhaps the encoding callbacks should use the same > interface... what do you think ? This would make the callback feature hypergeneric and a little slower, because tuples have to be created, but it (almost) unifies the encoding and decoding API. ("almost" because, for the encoder output_to_be_appended will be reencoded, for the decoder it will simply be appended.), so I'm for it. I implemented this and changed the encoders to only look up the error handler on the first error. The UCS1 encoder now no longer uses the two-item stack strategy. (This strategy only makes sense for those encoders where the encoding itself is much more complicated than the looping/callback etc.) So now memory overflow tests are only done when an unencodable error occurs, so now the UCS1 encoder should be as fast as it was without error callbacks. Do we want to enforce new_input_position>input_position, or should jumping back be allowed? > > > > One additional note: It is vital that errors > > > > is an assignable attribute of the StreamWriter. > > > > > > It is already ! > > > > I know, but IMHO it should be documented that an > > assignable errors attribute must be supported > > as part of the official codec API. > > > > Misc/unicode.txt is not clear on that: > > """ > > It is not required by the Unicode implementation > > to use these base classes, only the interfaces must > > match; this allows writing Codecs as extension types. > > """ > > Good point.
I'll add that to the PEP 100. OK. Here is the current todo list: 1. implement a new TranslateCharmap and fix the old. 2. New encoding API for string objects too. 3. Decoding 4. Documentation 5. Test cases I'm thinking about a different strategy for implementing callbacks (see http://mail.python.org/pipermail/i18n-sig/2001-July/001262.html) We could have an error handler registry, which maps names to error handlers; then it would be possible to keep the errors argument as "const char *" instead of "PyObject *". Currently PyCodec_UnicodeEncodeHandlerForObject is a backwards compatibility hack that will never go away, because it's always more convenient to type u"...".encode("...", "strict") instead of import codecs u"...".encode("...", codecs.raise_encode_errors) But with an error handler registry this function would become the official lookup method for error handlers. (PyCodec_LookupUnicodeEncodeErrorHandler?) Python code would look like this: --- def xmlreplace(encoding, uni, pos, state): return (u"&#%d;" % ord(uni[pos]), pos+1) import codecs codecs.registerError("xmlreplace", xmlreplace) --- and then the following call can be made: u"äöü".encode("ascii", "xmlreplace") As soon as the first error is encountered, the encoder uses its builtin error handling method if it recognizes the name ("strict", "replace" or "ignore") or looks up the error handling function in the registry if it doesn't. In this way the speed for the backwards compatible features is the same as before and "const char *errors" can be kept as the parameter to all encoding functions. For speed, common error handling names could even be implemented in the encoder itself. But for special one-shot error handlers, it might still be useful to pass the error handler directly, so maybe we should leave errors as PyObject *, but implement the registry anyway? ---------------------------------------------------------------------- Comment By: M.-A. 
Lemburg (lemburg) Date: 2001-07-10 14:29 Message: Logged In: YES user_id=38388 Ok, here we go... > > > raise an exception). U+FFFD characters in the > replacement > > > string will be replaced with a character that the > encoder > > > chooses ('?' in all cases). > > > > Nice. > > But the special casing of U+FFFD makes the interface > somewhat > less clean than it could be. It was only done to be 100% > backwards compatible. With the original "replace" > error > handling the codec chose the replacement character. But as > far as I can tell none of the codecs uses anything other > than '?', True. > so I guess we could change the replace handler > to always return u'?'. This would make the implementation a > little bit simpler, but the explanation of the callback > feature *a lot* simpler. Go for it. > And if you still want to handle > an unencodable U+FFFD, you can write a special callback for > that, e.g. > > def FFFDreplace(enc, uni, pos): > if uni[pos] == "\ufffd": > return u"?" > else: > raise UnicodeError(...) > > > ...docs... > > > > Could you add these docs to the Misc/unicode.txt file ? I > > will eventually take that file and turn it into a PEP > which > > will then serve as general documentation for these things. > > I could, but first we should work out how the decoding > callback API will work. Ok. BTW, Barry Warsaw already did the work of converting the unicode.txt to PEP 100, so the docs should eventually go there. > > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > > replacement callback. > > > > Hmm, wouldn't that result in a slowdown ? If so, I'd > rather > > leave the special encoder in place, since it is being > used a > > lot in Python and probably some applications too. > > It would be a slowdown. But callbacks open many > possiblities. True, but in this case I believe that we should stick with the native implementation for "unicode-escape". 
Having a standard callback error handler which does the \uXXXX replacement would be nice to have though, since this would also be usable with lots of other codecs (e.g. all the code page ones). > For example: > > Why can't I print u"gürk"? > > is probably one of the most frequently asked questions in > comp.lang.python. For printing Unicode stuff, print could be > extended the use an error handling callback for Unicode > strings (or objects where __str__ or tp_str returns a > Unicode object) instead of using str() which always returns > an 8bit string and uses strict encoding. There might even > be a > sys.setprintencodehandler()/sys.getprintencodehandler() There already is a print callback in Python (forgot the name of the hook though), so this should be possible by providing the encoding logic in the hook. > > > I have not touched PyUnicode_TranslateCharmap yet, > > > should this function also support error callbacks? Why > > > would one want the insert None into the mapping to > call > > > the callback? > > > > 1. Yes. > > 2. The user may want to e.g. restrict usage of certain > > character ranges. In this case the codec would be used to > > verify the input and an exception would indeed be useful > > (e.g. say you want to restrict input to Hangul + ASCII). > > OK, do we want TranslateCharmap to work exactly like > encoding, > i.e. in case of an error should the returned replacement > string again be mapped through the translation mapping or > should it be copied to the output directly? The former would > be more in line with encoding, but IMHO the latter would > be much more useful. It's better to take the second approach (copy the callback output directly to the output string) to avoid endless recursion and other pitfalls. I suppose this will also simplify the implementation somewhat. > BTW, when I implement it I can implement patch #403100 > ("Multicharacter replacements in > PyUnicode_TranslateCharmap") > along the way. 
I've seen it; will comment on it later. > Should the old TranslateCharmap map to the new > TranslateCharmapEx > and inherit the "multicharacter replacement" feature, > or > should I leave it as it is? If possible, please also add the multichar replacement to the old API. I think it is very useful and since the old APIs work on raw buffers it would be a benefit to have the functionality in the old implementation too. [Decoding error callbacks] > > > A remaining problem is how to implement decoding error > > > callbacks. In Python 2.1 encoding and decoding errors > are > > > handled in the same way with a string value. But with > > > callbacks it doesn't make sense to use the same > callback > > > for encoding and decoding (like > codecs.StreamReaderWriter > > > and codecs.StreamRecoder do). Decoding callbacks have > a > > > different API. Which arguments should be passed to the > > > decoding callback, and what is the decoding callback > > > supposed to do? > > > > I'd suggest adding another set of PyCodec_UnicodeDecode... > () > > APIs for this. We'd then have to augment the base classes > of > > the StreamCodecs to provide two attributes for .errors > with > > a fallback solution for the string case (i.s. "strict" > can > > still be used for both directions). > > Sounds good. Now what is the decoding callback supposed to > do? > I guess it will be called in the same way as the encoding > callback, i.e. with encoding name, original string and > position of the error. It might returns a Unicode string > (i.e. an object of the decoding target type), that will be > emitted from the codec instead of the one offending byte. Or > it might return a tuple with replacement Unicode object and > a resynchronisation offset, i.e. returning (u"?", 1) > means > emit a '?' and skip the offending character. 
But to make > the offset really useful the callback has to know something > about the encoding, perhaps the codec should be allowed to > pass an additional state object to the callback? > > Maybe the same should be added to the encoding callbacks too? > Maybe the encoding callback should be able to tell the > encoder if the replacement returned should be reencoded > (in which case it's a Unicode object), or directly emitted > (in which case it's an 8bit string)? I like the idea of having an optional state object (basically this should be a codec-defined arbitrary Python object) which then allows the callback to apply additional tricks. The object should be documented to be modifiable in place (simplifies the interface). About the return value: I'd suggest to always use the same tuple interface, e.g. callback(encoding, input_data, input_position, state) -> (output_to_be_appended, new_input_position) (I think it's better to use absolute values for the position rather than offsets.) Perhaps the encoding callbacks should use the same interface... what do you think ? > > > One additional note: It is vital that errors is an > > > assignable attribute of the StreamWriter. > > > > It is already ! > > I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API. > > Misc/unicode.txt is not clear on that: > """ > It is not required by the Unicode implementation to use > these base classes, only the interfaces must match; this > allows writing Codecs as extension types. > """ Good point. I'll add that to the PEP 100. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-22 22:51 Message: Logged In: YES user_id=38388 Sorry to keep you waiting, Walter. I will look into this again next week -- this week was way too busy... ---------------------------------------------------------------------- Comment By: M.-A. 
Lemburg (lemburg) Date: 2001-06-13 19:00 Message: Logged In: YES user_id=38388 On your comment about the non-Unicode codecs: let's keep this separated from the current patch. Don't have much time today. I'll comment on the other things tomorrow. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 17:49 Message: Logged In: YES user_id=89016 Guido van Rossum wrote in python-dev: > True, the "codec" pattern can be used for other > encodings than Unicode. But it seems to me that the > entire codecs architecture is rather strongly geared > towards en/decoding Unicode, and it's not clear > how well other codecs fit in this pattern (e.g. I > noticed that all the non-Unicode codecs ignore the > error handling parameter or assert that > it is set to 'strict'). I noticed that too. asserting that errors=='strict' would mean that the encoder is not able to deal in any other way with unencodable stuff than by raising an error. But that is not the problem here, because for zlib, base64, quopri, hex and uu encoding there can be no unencodable characters. The encoders can simply ignore the errors parameter. Should I remove the asserts from those codecs and change the docstrings accordingly, or will this be done separately? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-13 15:57 Message: Logged In: YES user_id=89016 > > [...] > > raise an exception). U+FFFD characters in the replacement > > string will be replaced with a character that the encoder > > chooses ('?' in all cases). > > Nice. But the special casing of U+FFFD makes the interface somewhat less clean than it could be. It was only done to be 100% backwards compatible. With the original "replace" error handling the codec chose the replacement character. 
But as far as I can tell none of the codecs uses anything other than '?', so I guess we could change the replace handler to always return u'?'. This would make the implementation a little bit simpler, but the explanation of the callback feature *a lot* simpler. And if you still want to handle an unencodable U+FFFD, you can write a special callback for that, e.g. def FFFDreplace(enc, uni, pos): if uni[pos] == "\ufffd": return u"?" else: raise UnicodeError(...) > > The implementation of the loop through the string is done > > in the following way. A stack with two strings is kept > > and the loop always encodes a character from the string > > at the stacktop. If an error is encountered and the stack > > has only one entry (during encoding of the original string) > > the callback is called and the unicode object returned is > > pushed on the stack, so the encoding continues with the > > replacement string. If the stack has two entries when an > > error is encountered, the replacement string itself has > > an unencodable character and a normal exception raised. > > When the encoder has reached the end of it's current string > > there are two possibilities: when the stack contains two > > entries, this was the replacement string, so the replacement > > string will be poppep from the stack and encoding continues > > with the next character from the original string. If the > > stack had only one entry, encoding is finished. > > Very elegant solution ! I'll put it as a comment in the source. > > (I hope that's enough explanation of the API and > implementation) > > Could you add these docs to the Misc/unicode.txt file ? I > will eventually take that file and turn it into a PEP which > will then serve as general documentation for these things. I could, but first we should work out how the decoding callback API will work. > > I have renamed the static ...121 function to all lowercase > > names. > > Ok. 
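The two-string stack strategy quoted above can be sketched in pure Python. The helper names (encode_with_callback, can_encode, callback) are hypothetical stand-ins chosen for illustration; the real loop is C code in unicodeobject.c:

```python
def encode_with_callback(text, can_encode, callback):
    # Sketch of the two-string stack loop: the stack holds at most two
    # (string, position) entries -- the original string and, while one is
    # being consumed, a replacement string returned by the callback.
    out = []
    stack = [[text, 0]]
    while stack:
        cur, pos = stack[-1]
        if pos >= len(cur):
            stack.pop()            # replacement done, or encoding finished
            continue
        stack[-1][1] = pos + 1
        ch = cur[pos]
        if can_encode(ch):
            out.append(ch)
        elif len(stack) == 2:
            # An error inside the replacement string itself: raise normally.
            raise UnicodeError("unencodable character in replacement")
        else:
            # Consult the callback and continue with its replacement string.
            stack.append([callback(cur, pos), 0])
    return "".join(out)

print(encode_with_callback("a\xe4b",
                           lambda c: ord(c) < 128,
                           lambda s, p: "&#%d;" % ord(s[p])))  # a&#228;b
```

With an ASCII predicate and an XML-charref callback, the offending character is replaced in place and encoding resumes with the next original character, exactly as the explanation above describes.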
> > > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > > replacement callback. > > Hmm, wouldn't that result in a slowdown ? If so, I'd rather > leave the special encoder in place, since it is being used a > lot in Python and probably some applications too. It would be a slowdown. But callbacks open many possibilities. For example: Why can't I print u"gürk"? is probably one of the most frequently asked questions in comp.lang.python. For printing Unicode stuff, print could be extended to use an error handling callback for Unicode strings (or objects where __str__ or tp_str returns a Unicode object) instead of using str() which always returns an 8bit string and uses strict encoding. There might even be a sys.setprintencodehandler()/sys.getprintencodehandler() > [...] > I think it would be worthwhile to rename the callbacks to > include "Unicode" somewhere, e.g. > PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but > then it points out the application field of the callback > rather well. Same for the callbacks exposed through the > _codecsmodule. OK, done (and PyCodec_XMLCharRefReplaceUnicodeEncodeErrors really is a long name ;)) > > I have not touched PyUnicode_TranslateCharmap yet, > > should this function also support error callbacks? Why > > would one want to insert None into the mapping to call > > the callback? > > 1. Yes. > 2. The user may want to e.g. restrict usage of certain > character ranges. In this case the codec would be used to > verify the input and an exception would indeed be useful > (e.g. say you want to restrict input to Hangul + ASCII). OK, do we want TranslateCharmap to work exactly like encoding, i.e. in case of an error should the returned replacement string again be mapped through the translation mapping or should it be copied to the output directly? The former would be more in line with encoding, but IMHO the latter would be much more useful. 
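For reference, the handler registry and the XML character-reference callback discussed in this thread are what PEP 293 eventually standardized. In modern Python the registered handler receives the exception object (rather than the proposed (encoding, unicode, pos, state) signature) and returns a (replacement, resume position) tuple, and the behavior of the long-named PyCodec_XMLCharRefReplaceUnicodeEncodeErrors survives as the builtin "xmlcharrefreplace" handler. A sketch (the handler name "xmlreplace" is invented here for illustration):

```python
import codecs

def xmlreplace(exc):
    # PEP 293-style handler: receives the UnicodeEncodeError and returns
    # (replacement string, position at which to resume encoding).
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    refs = "".join("&#%d;" % ord(c) for c in exc.object[exc.start:exc.end])
    return (refs, exc.end)

codecs.register_error("xmlreplace", xmlreplace)

print("äöü".encode("ascii", "xmlreplace"))         # custom handler
print("äöü".encode("ascii", "xmlcharrefreplace"))  # builtin equivalent
```

Both calls produce b'&#228;&#246;&#252;', matching the u"äöü".encode("ascii", "xmlreplace") example from the registry proposal above.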
BTW, when I implement it I can implement patch #403100 ("Multicharacter replacements in PyUnicode_TranslateCharmap") along the way. Should the old TranslateCharmap map to the new TranslateCharmapEx and inherit the "multicharacter replacement" feature, or should I leave it as it is? > > A remaining problem is how to implement decoding error > > callbacks. In Python 2.1 encoding and decoding errors are > > handled in the same way with a string value. But with > > callbacks it doesn't make sense to use the same callback > > for encoding and decoding (like codecs.StreamReaderWriter > > and codecs.StreamRecoder do). Decoding callbacks have a > > different API. Which arguments should be passed to the > > decoding callback, and what is the decoding callback > > supposed to do? > > I'd suggest adding another set of PyCodec_UnicodeDecode...() > APIs for this. We'd then have to augment the base classes of > the StreamCodecs to provide two attributes for .errors with > a fallback solution for the string case (i.e. "strict" can > still be used for both directions). Sounds good. Now what is the decoding callback supposed to do? I guess it will be called in the same way as the encoding callback, i.e. with encoding name, original string and position of the error. It might return a Unicode string (i.e. an object of the decoding target type) that will be emitted from the codec instead of the one offending byte. Or it might return a tuple with replacement Unicode object and a resynchronisation offset, i.e. returning (u"?", 1) means emit a '?' and skip the offending character. But to make the offset really useful the callback has to know something about the encoding, perhaps the codec should be allowed to pass an additional state object to the callback? Maybe the same should be added to the encoding callbacks too? 
Maybe the encoding callback should be able to tell the encoder if the replacement returned should be reencoded (in which case it's a Unicode object), or directly emitted (in which case it's an 8bit string)? > > One additional note: It is vital that errors is an > > assignable attribute of the StreamWriter. > > It is already ! I know, but IMHO it should be documented that an assignable errors attribute must be supported as part of the official codec API. Misc/unicode.txt is not clear on that: """ It is not required by the Unicode implementation to use these base classes, only the interfaces must match; this allows writing Codecs as extension types. """ ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-13 10:05 Message: Logged In: YES user_id=38388 > How the callbacks work: > > A PyObject * named errors is passed in. This may by NULL, > Py_None, 'strict', u'strict', 'ignore', u'ignore', > 'replace', u'replace' or a callable object. > PyCodec_EncodeHandlerForObject maps all of these objects to > one of the three builtin error callbacks > PyCodec_RaiseEncodeErrors (raises an exception), > PyCodec_IgnoreEncodeErrors (returns an empty replacement > string, in effect ignoring the error), > PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode > replacement character to signify to the encoder that it > should choose a suitable replacement character) or directly > returns errors if it is a callable object. When an > unencodable character is encounterd the error handling > callback will be called with the encoding name, the original > unicode object and the error position and must return a > unicode object that will be encoded instead of the offending > character (or the callback may of course raise an > exception). U+FFFD characters in the replacement string will > be replaced with a character that the encoder chooses ('?' > in all cases). Nice. 
> The implementation of the loop through the string is done in > the following way. A stack with two strings is kept and the > loop always encodes a character from the string at the > stacktop. If an error is encountered and the stack has only > one entry (during encoding of the original string) the > callback is called and the unicode object returned is pushed > on the stack, so the encoding continues with the replacement > string. If the stack has two entries when an error is > encountered, the replacement string itself has an > unencodable character and a normal exception raised. When > the encoder has reached the end of it's current string there > are two possibilities: when the stack contains two entries, > this was the replacement string, so the replacement string > will be poppep from the stack and encoding continues with > the next character from the original string. If the stack > had only one entry, encoding is finished. Very elegant solution ! > (I hope that's enough explanation of the API and implementation) Could you add these docs to the Misc/unicode.txt file ? I will eventually take that file and turn it into a PEP which will then serve as general documentation for these things. > I have renamed the static ...121 function to all lowercase > names. Ok. > BTW, I guess PyUnicode_EncodeUnicodeEscape could be > reimplemented as PyUnicode_EncodeASCII with a \uxxxx > replacement callback. Hmm, wouldn't that result in a slowdown ? If so, I'd rather leave the special encoder in place, since it is being used a lot in Python and probably some applications too. 
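As a historical footnote, the \uxxxx replacement callback floated in this exchange did become a standard error handler: modern Python spells it "backslashreplace" (it emits \xXX rather than \uXXXX for code points below 256). A quick sketch:

```python
# "backslashreplace" is the standardized descendant of the \uxxxx
# replacement callback discussed here (Python 3 spelling).
print("g\u00fcrk".encode("ascii", "backslashreplace"))  # b'g\\xfcrk'
print("\u20ac".encode("ascii", "backslashreplace"))     # b'\\u20ac'
```

This answers the "Why can't I print u\"gürk\"?" complaint without any per-codec special casing, which is exactly the generality being argued for above.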
> PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, > PyCodec_ReplaceEncodeErrors are globally visible because > they have to be available in _codecsmodule.c to wrap them as > Python function objects, but they can't be implemented in > _codecsmodule, because they need to be available to the > encoders in unicodeobject.c (through > PyCodec_EncodeHandlerForObject), but importing the codecs > module might result in an endless recursion, because > importing a module requires unpickling of the bytecode, > which might require decoding utf8, which ... (but this will > only happen, if we implement the same mechanism for the > decoding API) I think that codecs.c is the right place for these APIs. _codecsmodule.c is only meant as Python access wrapper for the internal codecs and nothing more. One thing I noted about the callbacks: they assume that they will always get Unicode objects as input. This is certainly not true in the general case (it is for the codecs you touch in the patch). I think it would be worthwhile to rename the callbacks to include "Unicode" somewhere, e.g. PyCodec_UnicodeReplaceEncodeErrors(). It's a long name, but then it points out the application field of the callback rather well. Same for the callbacks exposed through the _codecsmodule. > I have not touched PyUnicode_TranslateCharmap yet, > should this function also support error callbacks? Why would > one want the insert None into the mapping to call the callback? 1. Yes. 2. The user may want to e.g. restrict usage of certain character ranges. In this case the codec would be used to verify the input and an exception would indeed be useful (e.g. say you want to restrict input to Hangul + ASCII). > A remaining problem is how to implement decoding error > callbacks. In Python 2.1 encoding and decoding errors are > handled in the same way with a string value. 
But with > callbacks it doesn't make sense to use the same callback for > encoding and decoding (like codecs.StreamReaderWriter and > codecs.StreamRecoder do). Decoding callbacks have a > different API. Which arguments should be passed to the > decoding callback, and what is the decoding callback > supposed to do? I'd suggest adding another set of PyCodec_UnicodeDecode...() APIs for this. We'd then have to augment the base classes of the StreamCodecs to provide two attributes for .errors with a fallback solution for the string case (i.e. "strict" can still be used for both directions). > One additional note: It is vital that errors is an > assignable attribute of the StreamWriter. It is already ! > Consider the XML example: For writing an XML DOM tree one > StreamWriter object is used. When a text node is written, > the error handling has to be set to > codecs.xmlreplace_encode_errors, but inside a comment or > processing instruction replacing unencodable characters with > charrefs is not possible, so here codecs.raise_encode_errors > should be used (or better a custom error handler that raises > an error that says "sorry, you can't have unencodable > characters inside a comment") Sure. > BTW, should we continue the discussion in the i18n SIG > mailing list? An email program is much more comfortable than > an HTML textarea! ;) I'd rather keep the discussions on this patch here -- forking it off to the i18n sig will make it very hard to follow up on it. (This HTML area is indeed damn small ;-) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 21:18 Message: Logged In: YES user_id=89016 One additional note: It is vital that errors is an assignable attribute of the StreamWriter. Consider the XML example: For writing an XML DOM tree one StreamWriter object is used. 
When a text node is written, the error handling has to be set to codecs.xmlreplace_encode_errors, but inside a comment or processing instruction replacing unencodable characters with charrefs is not possible, so here codecs.raise_encode_errors should be used (or better a custom error handler that raises an error that says "sorry, you can't have unencodable characters inside a comment") BTW, should we continue the discussion in the i18n SIG mailing list? An email program is much more comfortable than an HTML textarea! ;) ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 20:59 Message: Logged In: YES user_id=89016 How the callbacks work: A PyObject * named errors is passed in. This may be NULL, Py_None, 'strict', u'strict', 'ignore', u'ignore', 'replace', u'replace' or a callable object. PyCodec_EncodeHandlerForObject maps all of these objects to one of the three builtin error callbacks PyCodec_RaiseEncodeErrors (raises an exception), PyCodec_IgnoreEncodeErrors (returns an empty replacement string, in effect ignoring the error), PyCodec_ReplaceEncodeErrors (returns U+FFFD, the Unicode replacement character to signify to the encoder that it should choose a suitable replacement character) or directly returns errors if it is a callable object. When an unencodable character is encountered the error handling callback will be called with the encoding name, the original unicode object and the error position and must return a unicode object that will be encoded instead of the offending character (or the callback may of course raise an exception). U+FFFD characters in the replacement string will be replaced with a character that the encoder chooses ('?' in all cases). The implementation of the loop through the string is done in the following way. A stack with two strings is kept and the loop always encodes a character from the string at the stacktop. 
If an error is encountered and the stack has only one entry (during encoding of the original string) the callback is called and the unicode object returned is pushed on the stack, so the encoding continues with the replacement string. If the stack has two entries when an error is encountered, the replacement string itself has an unencodable character and a normal exception is raised. When the encoder has reached the end of its current string there are two possibilities: when the stack contains two entries, this was the replacement string, so the replacement string will be popped from the stack and encoding continues with the next character from the original string. If the stack had only one entry, encoding is finished. (I hope that's enough explanation of the API and implementation) I have renamed the static ...121 function to all lowercase names. BTW, I guess PyUnicode_EncodeUnicodeEscape could be reimplemented as PyUnicode_EncodeASCII with a \uxxxx replacement callback. PyCodec_RaiseEncodeErrors, PyCodec_IgnoreEncodeErrors, PyCodec_ReplaceEncodeErrors are globally visible because they have to be available in _codecsmodule.c to wrap them as Python function objects, but they can't be implemented in _codecsmodule, because they need to be available to the encoders in unicodeobject.c (through PyCodec_EncodeHandlerForObject), but importing the codecs module might result in an endless recursion, because importing a module requires unpickling of the bytecode, which might require decoding utf8, which ... (but this will only happen if we implement the same mechanism for the decoding API) I have not touched PyUnicode_TranslateCharmap yet, should this function also support error callbacks? Why would one want to insert None into the mapping to call the callback? A remaining problem is how to implement decoding error callbacks. In Python 2.1 encoding and decoding errors are handled in the same way with a string value. 
But with callbacks it doesn't make sense to use the same callback for encoding and decoding (like codecs.StreamReaderWriter and codecs.StreamRecoder do). Decoding callbacks have a different API. Which arguments should be passed to the decoding callback, and what is the decoding callback supposed to do? ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 20:00 Message: Logged In: YES user_id=38388 About the Py_UNICODE*data, int size APIs: Ok, point taken. In general, I think we ought to keep the callback feature as open as possible, so passing in pointers and sizes would not be very useful. BTW, could you summarize how the callback works in a few lines ? About _Encode121: I'd name this _EncodeUCS1 since that's what it is ;-) About the new functions: I was referring to the new static functions which you gave PyUnicode_... names. If these are not supposed to turn into non-static functions, I'd rather have them use lower case names (since that's how the Python internals work too -- most of the times). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:56 Message: Logged In: YES user_id=89016 > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. Another problem is, that the callback requires a Python object, so in the PyObject *version, the refcount is incref'd and the object is passed to the callback. The Py_UNICODE*/int version would have to create a new Unicode object from the data. 
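The decoding-callback API being debated here is essentially what was later standardized in PEP 293: the decoding handler receives a UnicodeDecodeError and returns a (replacement, resume position) tuple, exactly the (u"?", 1)-style idea from this thread, but with an absolute position instead of an offset. A sketch in modern Python (the handler name "skipone" is invented here):

```python
import codecs

def skip_one(exc):
    # Decoding handler: emit '?' and skip a single offending byte,
    # i.e. resume at exc.start + 1.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    return ("?", exc.start + 1)

codecs.register_error("skipone", skip_one)

print(b"a\xffbc".decode("ascii", "skipone"))  # a?bc
```

Because the handler sees the whole input and an absolute error position, it can resynchronize however the encoding requires, which addresses the "the callback has to know something about the encoding" concern raised above.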
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2001-06-12 18:32 Message: Logged In: YES user_id=89016 > * please don't place more than one C statement on one line > like in: > """ > + unicode = unicode2; unicodepos = > unicode2pos; > + unicode2 = NULL; unicode2pos = 0; > """ OK, done! > * Comments should start with a capital letter and be > prepended > to the section they apply to Fixed! > * There should be spaces between arguments in compares > (a == b) not (a==b) Fixed! > * Where does the name "...Encode121" originate ? "Encode one-to-one"; it implements both ASCII and latin-1 encoding. > * module internal APIs should use lower case names (you > converted some of these to PyUnicode_...() -- this is > normally reserved for APIs which are either marked as > potential candidates for the public API or are very > prominent in the code) Which ones? I introduced a new function for every old one that had a "const char *errors" argument, and a few new ones in codecs.h. Of those, PyCodec_EncodeHandlerForObject is vital, because it is used to map old string arguments to the new function objects. PyCodec_RaiseEncodeErrors can be used in the encoder implementation to raise an encode error, but it could be made static in unicodeobject.h so only those encoders implemented there have access to it. > One thing which I don't like about your API change is that > you removed the Py_UNICODE*data, int size style arguments > -- > this makes it impossible to use the new APIs on non-Python > data or data which is not available as Unicode object. I looked through the code and found no situation where the Py_UNICODE*/int version is really used, and having two (PyObject *)s (the original and the replacement string) instead of Py_UNICODE*/int and PyObject * made the implementation a little easier, but I can fix that. > Please separate the errors.c patch from this patch -- it > seems totally unrelated to Unicode. 
PyCodec_RaiseEncodeErrors uses this to have a \Uxxxx with four hex digits. I removed it. I'll upload a revised patch as soon as it's done. ---------------------------------------------------------------------- Comment By: M.-A. Lemburg (lemburg) Date: 2001-06-12 16:29 Message: Logged In: YES user_id=38388 Thanks for the patch -- it looks very impressive! I'll give it a try later this week. Some first cosmetic tidbits: * please don't place more than one C statement on one line like in: """ + unicode = unicode2; unicodepos = unicode2pos; + unicode2 = NULL; unicode2pos = 0; """ * Comments should start with a capital letter and be prepended to the section they apply to * There should be spaces between arguments in compares (a == b) not (a==b) * Where does the name "...Encode121" originate? * module internal APIs should use lower case names (you converted some of these to PyUnicode_...() -- this is normally reserved for APIs which are either marked as potential candidates for the public API or are very prominent in the code) One thing which I don't like about your API change is that you removed the Py_UNICODE*data, int size style arguments -- this makes it impossible to use the new APIs on non-Python data or data which is not available as a Unicode object. Please separate the errors.c patch from this patch -- it seems totally unrelated to Unicode. Thanks. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=432401&group_id=5470 From noreply@sourceforge.net Wed Apr 17 21:50:56 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 13:50:56 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 08:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None Status: Open Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-17 16:50 Message: Logged In: YES user_id=6380 The test seems fine, and a good addition. Don't worry too much about how to report the failure (though perhaps including the key word "subtype" in the error output might help). I noticed that when I change the Unicode function fixup() to not do a check for subclasses, I only get very few failures: one for capitalize, two for lower, one for upper. I think this is because the test suite doesn't have enough sample cases where the output is the same as the input. Maybe some could be added. But go ahead and check in diff3.txt. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 14:55 Message: Logged In: YES user_id=89016 Diff3.txt adds these tests to Lib/test/test_unicode.py and Lib/test/test_string.py. All tests pass (except that currently test_unicode.py fails the unicode_internal roundtripping test with --enable-unicode=ucs4) and when I change zfill back to always return self they properly fail. I don't know whether the fail message should be made better, and how this would interact with "make test" and whether the "Prefer string methods over string module functions" part in test_string.py might pose problems. And maybe the code could be simplified to always use the subclasses without first trying str and unicode? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:48 Message: Logged In: YES user_id=6380 If you want to be thorough, yes, that's a good test to add! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 14:47 Message: Logged In: YES user_id=89016 Checked in as: Objects/stringobject.c 2.159 Objects/unicodeobject.c 2.139 Maybe we could add a test to Lib/test/test_unicode.py and Lib/test/test_string.py that makes sure that no method returns a str/unicode subinstance even when called for a str/unicode subinstance? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 14:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 14:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it if it's a real str or unicode instance? 
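The subtype guarantee being tested here — that a string method called on a str subclass returns a plain str, never the subclass — can be exercised in a few lines (modern Python shown; the 2.x tests did the same with unicode subclasses, and the subclass name below is hypothetical):

```python
class MyStr(str):
    """A hypothetical str subclass, standing in for the test suite's subtypes."""
    pass

s = MyStr("123")
padded = s.zfill(10)

# zfill pads on the left with zeros to the requested width...
print(padded)               # 0000000123
# ...and returns a plain str even when called on a subclass instance,
# which is exactly what the subtype check verifies.
print(type(padded) is str)  # True
```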
(as was done for lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 10:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 10:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? > I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 09:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 09:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-12 21:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 14:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. 
Löwis (loewis) Date: 2002-04-12 10:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 06:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 06:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 11:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. 
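The str()-vs-repr() point is easy to see in a pure-Python sketch of what string.zfill does (hypothetical code, not the patch itself): repr() of a string object adds quote characters, which is where "0000u'123'" came from, while str() yields the text itself.

```python
def zfill(x, width):
    # Hypothetical pure-Python equivalent of string.zfill: coerce
    # non-string inputs with str() -- not repr(), which would wrap a
    # string object in quotes -- then pad on the left with zeros,
    # keeping any leading sign in front of the padding.
    if not isinstance(x, str):
        x = str(x)
    sign = ""
    if x[:1] in ("+", "-"):
        sign, x = x[:1], x[1:]
    return sign + x.rjust(width - len(sign), "0")

print(zfill("123", 6))  # 000123
print(zfill(-5, 4))     # -005
```

With repr() in place of str(), a unicode argument in Python 2 was first turned into the nine characters u'123' and then padded, producing the broken "0000u'123'" result the initial comment describes.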
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Wed Apr 17 22:35:50 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 14:35:50 -0700 Subject: [Patches] [ python-Patches-536241 ] string.zfill and unicode Message-ID: Patches item #536241, was opened at 2002-03-28 14:26 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 Category: Library (Lib) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Walter Dörwald (doerwalter) Summary: string.zfill and unicode Initial Comment: This patch makes the function string.zfill work with unicode instances (and instances of str and unicode subclasses). Currently string.zfill(u"123", 10) results in "0000u'123'". With this patch the result is u'0000000123'. Should zfill be made a real str and unicode method? I noticed that a zfill implementation is available in unicodeobject.c, but commented out. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 23:35 Message: Logged In: YES user_id=89016 Checked in as: Lib/test/test_string.py 1.16 Lib/test/test_unicode.py 1.56 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-17 22:50 Message: Logged In: YES user_id=6380 The test seems fine, and a good addition. Don't worry too much about how to report the failure (though perhaps including the key word "subtype" in the error output might help). I noticed that when I change the Unicode function fixup() to not do a check for subclasses, I only get very few failures: one for capitalize, two for lower, one for upper. 
I think this is because the test suite doesn't have enough sample cases where the output is the same as the input. Maybe some could be added. But go ahead and check in diff3.txt. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-17 20:55 Message: Logged In: YES user_id=89016 Diff3.txt adds these tests to Lib/test/test_unicode.py and Lib/test/test_string.py. All tests pass (except that currently test_unicode.py fails the unicode_internal roundtripping test with --enable-unicode=ucs4) and when I change zfill back to always return self they properly fail. I don't know whether the fail message should be made better, and how this would interact with "make test" and whether the "Prefer string methods over string module functions" part in test_string.py might pose problems. And maybe the code could be simplified to always use the subclasses without first trying str and unicode? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 20:48 Message: Logged In: YES user_id=6380 If you want to be thorough, yes, that's a good test to add! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:47 Message: Logged In: YES user_id=89016 Checked in as: Objects/stringobject.c 2.159 Objects/unicodeobject.c 2.139 Maybe we could add a test to Lib/test/test_unicode.py and Lib/test/test_string.py that makes sure that no method returns a str/unicode subinstance even when called for a str/unicode subinstance? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 20:29 Message: Logged In: YES user_id=6380 Yes, that's the right thing. Reopened this for now. 
---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 20:23 Message: Logged In: YES user_id=89016 Currently zfill returns the original if nothing has to be done. Should I change this to only do it if it's a real str or unicode instance? (as was done for lots of methods for bug http://www.python.org/sf/460020) ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 16:47 Message: Logged In: YES user_id=6380 Yes, please open a separate bug report for those (I'd open a separate report for each file with warnings, unless you have an obvious fix). ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 16:43 Message: Logged In: YES user_id=89016 > Does your compiler not warn you? Or did > you ignore warnings? > (The latter's a sin in Python-land :-). The warning was just lost in the long list of outputs. Now that you mention it, there are still a few warnings (gcc 2.96 on Linux): Objects/unicodeobject.c: In function `PyUnicodeUCS4_Format': Objects/unicodeobject.c:5574: warning: int format, long int arg (arg 3) Objects/unicodeobject.c:5574: warning: unsigned int format, long unsigned int arg (arg 4) libpython2.3.a(posixmodule.o): In function `posix_tmpnam': Modules/posixmodule.c:5150: the use of `tmpnam_r' is dangerous, better use `mkstemp' libpython2.3.a(posixmodule.o): In function `posix_tempnam': Modules/posixmodule.c:5100: the use of `tempnam' is dangerous, better use `mkstemp' Modules/pwdmodule.c: In function `initpwd': Modules/pwdmodule.c:161: warning: unused variable `d' Modules/readline.c: In function `set_completer_delims': Modules/readline.c:273: warning: passing arg 1 of `free' discards qualifiers from pointer target type Modules/expat/xmlrole.c:7: warning: `RCSId' defined but not used Should I open a separate bug report for that? 
> I've also folded some long lines that weren't > your fault -- but I noticed that elsewhere you > checked in some long lines; > please try to limit line length to 78. I noticed your descrobject.c checkin message. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-15 15:53 Message: Logged In: YES user_id=6380 Thanks, Walter! Some nits: The string_zfill() code you checked in caused two warnings about modifying data pointed to by a const pointer. I've removed the const, but I'd like to understand how come you didn't catch this. Does your compiler not warn you? Or did you ignore warnings? (The latter's a sin in Python-land :-). I've also folded some long lines that weren't your fault -- but I noticed that elsewhere you checked in some long lines; please try to limit line length to 78. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-15 15:41 Message: Logged In: YES user_id=89016 Checked in as: Doc/lib/libstdtypes.tex 1.88 Lib/UserString.py 1.12 Lib/string.py 1.63 test/string_tests.py 1.13 test/test_unicode.py 1.54 Misc/NEWS 1.388 Objects/stringobject.c 2.157 Objects/unicodeobject.c 2.138 ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-13 03:00 Message: Logged In: YES user_id=6380 I'm for making them methods. Walter, just check it in! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-04-12 20:37 Message: Logged In: YES user_id=89016 Now that test_userstring.py works and fails (rev 1.6) should we add zfill as str and unicode methods or change UserString.zfill to use string.zfill? I've made a patch (attached) that implements zfill as methods (i.e. 
activates the version in unicodeobject.c that was commented out and implements the same in stringobject.c) (And it adds the test for unicode support back in.) ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2002-04-12 16:51 Message: Logged In: YES user_id=21627 Re: optional Unicode: Walter is correct; configuring with --disable-unicode currently breaks the string module. One might consider using types.StringTypes; OTOH, pulling in types might not be desirable. As for str vs. repr: Python was always using repr in zfill, so changing it may break things. So I recommend that Walter reverts Andrew's check-in and applies his change. ---------------------------------------------------------------------- Comment By: Michael Hudson (mwh) Date: 2002-03-30 12:25 Message: Logged In: YES user_id=6656 Hah, I was going to say that but was distracted by IE wiping out the machine I'm sitting at. Re-opening. ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2002-03-30 12:16 Message: Logged In: YES user_id=89016 But Python could be compiled without unicode support (by undefining PY_USING_UNICODE), and string.zfill should work even in this case. What about making zfill a real str and unicode method? ---------------------------------------------------------------------- Comment By: A.M. Kuchling (akuchling) Date: 2002-03-29 17:24 Message: Logged In: YES user_id=11375 Thanks for your patch! I've checked it into CVS, with two modifications. First, I removed the code to handle the case where Python doesn't have a unicode() built-in; there's no expectation that you can take the standard library for Python version N and use it with version N-1, so this code isn't needed. Second, I changed string.zfill() to take the str() and not the repr() when it gets a non-string object because that seems to make more sense. 
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=536241&group_id=5470 From noreply@sourceforge.net Thu Apr 18 01:31:08 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 17:31:08 -0700 Subject: [Patches] [ python-Patches-545439 ] interactive help in python-mode Message-ID: Patches item #545439, was opened at 2002-04-17 19:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545439&group_id=5470 Category: Demos and tools Group: None Status: Open Resolution: None Priority: 5 Submitted By: Skip Montanaro (montanaro) Assigned to: Barry Warsaw (bwarsaw) Summary: interactive help in python-mode Initial Comment: If you apply the patch from bug 545436 to python-mode.el, the attached code allows programmers to get help from pydoc about the current possibly dotted expression. This is just a quick-n-dirty hack, but seems at least marginally useful. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545439&group_id=5470 From noreply@sourceforge.net Thu Apr 18 03:19:03 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 19:19:03 -0700 Subject: [Patches] [ python-Patches-540394 ] Remove PyMalloc_* symbols Message-ID: Patches item #540394, was opened at 2002-04-07 01:07 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 Category: Core (C code) Group: None >Status: Closed Resolution: Accepted Priority: 5 Submitted By: Neil Schemenauer (nascheme) Assigned to: Neil Schemenauer (nascheme) Summary: Remove PyMalloc_* symbols Initial Comment: This patch removes all PyMalloc_* symbols from the source. obmalloc now implements PyObject_{Malloc, Realloc, Free}. PyObject_{New,NewVar} allocate using pymalloc. 
I also changed PyObject_Del and PyObject_GC_Del so that they can be used as function designators. Is changing the signature of PyObject_Del going to cause any problems? I had to add some extra typecasts when assigning to tp_free. Please review and assign back to me. The next phase would be to clean up the memory API usage. Do we want to replace all PyObject_Del calls with PyObject_Free? PyObject_Del seems to match better with PyObject_GC_Del. Oh yes, we also need to change PyMem_{Free, Del, ...} to use pymalloc's free. ---------------------------------------------------------------------- >Comment By: Neil Schemenauer (nascheme) Date: 2002-04-18 02:19 Message: Logged In: YES user_id=35752 A modified version of the patch has been committed. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-10 00:53 Message: Logged In: YES user_id=6380 The binary compatibility issue is extensions compiled for 2.2 that have references to _PyObject_Del compiled into them and aren't recompiled for 2.3. I think that should work (even if they get a warning). To make it work, the _PyObject_Del entry point must continue to exist. Back to Neil, I think my instructions are clear enough. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 20:43 Message: Logged In: YES user_id=31435 It'll be a day or two before PLabs can get back to Python work anyway. Reassigning to Guido -- I'm not even going to try to channel him on backwards compatibility, or the feasibility of introducing possible warnings. If I were you I'd check in the patch with the casts in; they can be taken out again later if Guido is agreeable. ---------------------------------------------------------------------- Comment By: Neil Schemenauer (nascheme) Date: 2002-04-09 20:29 Message: Logged In: YES user_id=35752 It might be a day or two before I get to this. 
Regarding the type of tp_free, could we change it to be something like: typedef void (*freefunc)(void *); ... freefunc tp_free; and leave the type of tp_dealloc alone. Maybe it's too late now that 2.2 is out and uses 'destructor'. I don't see how this relates to binary compatibility though. Why does it matter if the function takes a PyObject pointer or a void pointer? The worst I see happening is that people could get warnings when they compile their extension modules. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-09 18:47 Message: Logged In: YES user_id=31435 Clarifying or just repeating Guido here: + Binary compatibility is important. It's better on Unix than it appears -- while you'll get a warning if you run an old 1.5.2 extension with 2.2 today and without recompiling, it will almost certainly work anyway. So in the case of macros that expanded to a private API function before, that private API function must still exist, but the macro needn't expand to that anymore (nor even *be* a macro anymore). _PyObject_Del is a particular problem cuz it's even documented in the C API manual -- there simply wasn't a public API function before that did the same thing and could be used as a function designator. You're making life better for future generations. + Casts on tp_free slots are par for the course, because "destructor" has an impractical signature. I'm afraid that can't change either, so the casts stay. + Fred and I agreed to add PyObject_Del to the "minimal recommended API", so, for the next round of this, feel wholly righteous in leaving existing PyObject_Del calls alone. If anything's unclear, hit me. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-09 16:27 Message: Logged In: YES user_id=6380 I've not fully read Tim's response in email, but instead I've reviewed and discussed the patch with Tim. 
I think the only thing to which I object at this point is the removal of the entry point _PyObject_Del. I believe that for source and binary compatibility with 2.2, that entry point should remain, with the same meaning, but it should not be used at all by the core. (Motivation to keep it: it's the only thing you can reasonably stick in tp_free that works for 2.2 as well as for 2.3.) One minor question: there are a bunch of #undefs in gcmodule.c (e.g. PyObject_GC_Track) that don't seem to make sense -- at least I cannot find where these would be #defined any more. Ditto for #indef PyObject_Malloc in obmalloc.c. I suggest that you check this thing in, but keeping _PyObject_Del alive, and we'll take it from there. ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 19:18 Message: Logged In: YES user_id=6380 (Wouldn't it be more efficient to take this to email between the three of us?) > Extensions that *currently* call PyObject_Del have > its old macro expansion ("_PyObject_Del((PyObject > *)(op))") buried in them, so getting rid of > _PyObject_Del is a binary-API incompatibility > (existing extensions will no longer link without > recompilation). I personally don't mind that, but > I run on Windows and "binary compatability" never > works there across minor releases for other > reasons, so I don't have any real feel for how > much people on other platforms value it. As you > pointed out recently too, binary compatability > has, in reality, not been the case since 1.5.2 > anyway. Still, tradition has it that we keep such entry points around for a long time. I propose that we do so now, too. > So that's one for Python-Dev. If we do break > binary compatibility, I'd be sorely tempted to > change the "destructor" typedef to say destructors > take void*. IMO saying they take PyObject* was a > poor idea, as you almost never have a PyObject* > when calling one of these guys. Huh? 
"destructor" is used to declare tp_dealloc, which definitely needs a PyObject * (or some "subclass" of it, like PyIntObject *). It's also used to declare tp_free, which arguably shouldn't take a PyObject * (since by the time tp_free is called, most of the object's contents have been destroyed by tp_dealloc). So maybe tp_free (a newcomer in 2.2) should be declared to take something else, but then the risk is breaking code that defines a tp_free with the correct signature. > That's why PyObject_Del "had to" be a macro, to > hide the cast to PyObject* almost everyone needs > because of destructor's "correct" but impractical > signature. If "destructor" had a practical > signature, there would have been no temptation to > use a macro. I don't understand this at all. > Note that if the typedef of destructor were so > changed, you wouldn't have needed new casts in > tp_free slots. And I'd rather break binary > compatability than make extension authors add new > casts. Nor this. > Hmm. I'm assigning this to Guido for comment: > Guido, what are your feelings about binary > compatibility here? C didn't define free() as > taking a void* by mistake . I want binary compatibility, but I don't understand your comments very well. > Back to Neil: I wouldn't bother changing PyObject_Del > to PyObject_Free. The former isn't in the > "recommended" minimal API, but neither is it > discouraged. I expect TMTOWTDI here forever. I prefer PyObject_Del -- like PyObject_GC_Del, and like we did in the past. Plus, I like New to match Del and Malloc to match Free. Since it's PyObject_New, it should be _Del. I'm not sure what to say of Neil's patch, except that I'm glad to be rid of the PyMalloc_XXX family. I wish we didn't have to change all the places that used to say _PyObject_Del. Maybe it's best to keep that name around? The patch would (psychologically) become a lot smaller. I almost wish that this would work: #define PyObject_Del ((destructor)PyObject_Free) Or maybe it *does* work??? 
---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2002-04-08 18:47 Message: Logged In: YES user_id=6380 I'm looking at this now... ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 02:59 Message: Logged In: YES user_id=31435 Extensions that *currently* call PyObject_Del have its old macro expansion ("_PyObject_Del((PyObject *)(op))") buried in them, so getting rid of _PyObject_Del is a binary-API incompatibility (existing extensions will no longer link without recompilation). I personally don't mind that, but I run on Windows and "binary compatibility" never works there across minor releases for other reasons, so I don't have any real feel for how much people on other platforms value it. As you pointed out recently too, binary compatibility has, in reality, not been the case since 1.5.2 anyway. So that's one for Python-Dev. If we do break binary compatibility, I'd be sorely tempted to change the "destructor" typedef to say destructors take void*. IMO saying they take PyObject* was a poor idea, as you almost never have a PyObject* when calling one of these guys. That's why PyObject_Del "had to" be a macro, to hide the cast to PyObject* almost everyone needs because of destructor's "correct" but impractical signature. If "destructor" had a practical signature, there would have been no temptation to use a macro. Note that if the typedef of destructor were so changed, you wouldn't have needed new casts in tp_free slots. And I'd rather break binary compatibility than make extension authors add new casts. Hmm. I'm assigning this to Guido for comment: Guido, what are your feelings about binary compatibility here? C didn't define free() as taking a void* by mistake. Back to Neil: I wouldn't bother changing PyObject_Del to PyObject_Free. The former isn't in the "recommended" minimal API, but neither is it discouraged. 
I expect TMTOWTDI here forever. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 02:41 Message: Logged In: YES user_id=31435 Oops -- I hit "Submit" prematurely. More to come. ---------------------------------------------------------------------- Comment By: Tim Peters (tim_one) Date: 2002-04-07 02:40 Message: Logged In: YES user_id=31435 Looks good to me -- thanks! ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=540394&group_id=5470 From noreply@sourceforge.net Thu Apr 18 05:13:32 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 17 Apr 2002 21:13:32 -0700 Subject: [Patches] [ python-Patches-545480 ] Examples for urllib2 Message-ID: Patches item #545480, was opened at 2002-04-18 04:13 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=305470&aid=545480&group_id=5470 Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: Sean Reifschneider (jafo) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Examples for urllib2 Initial Comment: An associate who's learning Python recently complained about a lack of examples for urllib2. As a starting point, I'd like to submit the following: This example gets the python.org main page and displays the first 100 bytes of it: >>> import urllib2 >>> url = urllib2.urlopen('http://www.python.org/') >>> print url.read()[:100]