From noreply@sourceforge.net  Sat Mar  1 01:30:10 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Feb 2003 17:30:10 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18ovpK-0002Et-00@sc8-sf-web3.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 11:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.
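
A minimal sketch of the proposed behaviour, written against the dict.pop(key, default) form that current Python provides. The function name pop_demo and the dictionary contents are illustrative assumptions, not part of the patch:

```python
def pop_demo():
    """Show dict.pop with and without a default (illustrative names only)."""
    settings = {"debug": True}

    # Key present: the stored value is removed and returned; the default is ignored.
    present = settings.pop("debug", False)

    # Key absent: the default is returned instead of raising KeyError.
    absent = settings.pop("verbose", "fallback")

    # No default given: the old behaviour is kept and KeyError is raised.
    try:
        settings.pop("verbose")
        raised = False
    except KeyError:
        raised = True

    return present, absent, raised, settings
```

The third case is the backwards-compatibility point made above: code that calls pop with a single argument sees no change.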

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-02-28 20:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get()-like functionality for
dict.pop().  The nearest parallel is the default argument for
getattr(obj, attr[, default]).  On the plus side, it makes pop
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Sat Mar  1 02:00:34 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Feb 2003 18:00:34 -0800
Subject: [Patches] [ python-Patches-693195 ] Add sys.exc_clear() to clear current exception
Message-ID: <E18owIk-0002vH-00@sc8-sf-web3.sourceforge.net>

Patches item #693195, was opened at 2003-02-25 16:26
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693195&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Kevin Jacobs (jacobs99)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: Add sys.exc_clear() to clear current exception

Initial Comment:
There is no way to clear the "current" exception, which is
available via the sys.exc_info() function.  There are a few
(obscure) reasons why one would want to be able to do so,
mainly due to the implementation details of how exception
information is stored.

Specifically, sys.exc_info() will return information on the
last exception even outside of the 'except:' block that caught
the exception.  So an exception, all of the frame objects on
the stack, and all local variables stored in those frames are
kept alive in the last exception traceback until either 1)
another exception is thrown, or 2) the stack returns to a frame
that is handling another exception (thrown before the "current"
exception).

Thus, it is sometimes useful to be able to clear the "current"
exception, e.g.:

1) Some error handling and logging handlers will report on the
   current or last exception (as a hint about what may have
   gone wrong).  Once that information is handled, additional
   error handling or logging calls should not report it again.

2) Sometimes resources are not released when an exception is
   raised until the next exception is raised.  This causes
   problems for programs that rely on object finalization to
   release resources (like memory, locks, file descriptors,
   etc.).  Such code is suboptimal, but it exists and there are
   few easy alternatives other than creating many 'finally:'
   clauses (which can violate encapsulation and abstraction
   layer boundaries and is syntactically hairy at times).
   Anyhow, such programs may want to clear the current
   exception and trigger garbage collection at certain
   synchronization points, in order to flush pending object
   finalization.  Clearly, this is a somewhat hit-or-miss
   strategy: it works fairly well in practice, but no sane
   developer should ever rely on it.

Anyhow, I've implemented a trivial patch to sysmodule.c
to add an 'exc_clear()' function that clears the current
or last exception.  I've also added a test case and
updated the documentation.
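
The lifetime problem described above can be sketched as follows. This is an illustration only, written for current CPython 3, where sys.exc_clear() no longer exists and exception state is cleared when an except block ends. To recreate the effect, the sketch keeps the sys.exc_info() tuple in a variable, so the traceback stays reachable the way the interpreter's "last exception" did in 2.x. Resource, failing_call and demo are hypothetical names:

```python
import gc
import sys
import weakref


class Resource(object):
    """Stand-in for an object that releases something when finalized."""


def failing_call(refs):
    res = Resource()                 # local variable living in this frame
    refs.append(weakref.ref(res))    # lets the caller observe when it is freed
    raise ValueError("boom")


def demo():
    refs = []
    try:
        failing_call(refs)
    except ValueError:
        saved = sys.exc_info()       # (type, value, traceback)

    # saved -> traceback -> frame of failing_call -> local `res`
    alive_while_held = refs[0]() is not None

    # Dropping the last reference to the traceback plays the role that
    # clearing the current exception plays in the description above.
    del saved
    gc.collect()
    alive_after_drop = refs[0]() is not None

    return alive_while_held, alive_after_drop
```

On CPython the first flag is expected to be true and the second false: the local object survives only as long as something still references the traceback.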


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-28 21:00

Message:
Logged In: YES 
user_id=6380

Grabbing this for review.


----------------------------------------------------------------------

Comment By: Kevin Jacobs (jacobs99)
Date: 2003-02-26 07:39

Message:
Logged In: YES 
user_id=459565

I've updated my patch based on Neal's feedback:

  1) sys_exc_clear and sys_exc_info now use the 
       recommended prototype and the cast to
       PyCFunction was removed.
  2) \versionadded was added to the exc_clear docs.
  3)  The exc_info docs were slightly modified to better
        match the updated doc string.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-25 22:32

Message:
Logged In: YES 
user_id=33168

I have a couple of minor things.  The prototype should be: 
sys_exc_clear(PyObject *self, PyObject *noargs).  This will
remove the (PyCFunction) cast.  (I realize there are other
places in the file you copied, but they are wrong too. :-)

The doc for exc_clear should have a \versionadded{2.3}
before the \end.  Should the doc for exc_info() also be
updated, since the docstring was updated?

----------------------------------------------------------------------

Comment By: Kevin Jacobs (jacobs99)
Date: 2003-02-25 16:39

Message:
Logged In: YES 
user_id=459565

Before someone else says it -- yes, technically there is a way
to "clear" the current exception -- by raising another exception.
However, that leaves a bogus exception in the thread state, which
still stores at least one Python stack frame.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693195&group_id=5470


From noreply@sourceforge.net  Sat Mar  1 02:21:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Feb 2003 18:21:18 -0800
Subject: [Patches] [ python-Patches-695090 ] Make build_py allow modules and packages at the same time
Message-ID: <E18owco-0006Lx-00@sc8-sf-web2.sourceforge.net>

Patches item #695090, was opened at 2003-02-28 09:42
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695090&group_id=5470

Category: Distutils and setup.py
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Bernhard Herzog (bernhard)
Assigned to: A.M. Kuchling (akuchling)
Summary: Make build_py allow modules and packages at the same time

Initial Comment:
The build command of the distutils currently doesn't
support both python modules and python packages in the
same setup.py call. See the distutils-sig for a discussion:
http://mail.python.org/pipermail/distutils-sig/2003-February/003192.html

This patch modifies the build_py command to allow this.
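
As a rough sketch of what such a setup.py would request, the keyword arguments below combine py_modules and packages in a single call. The project, module and package names are hypothetical and do not come from the patch; requested_targets is only a helper for this illustration:

```python
# Hypothetical project layout; none of these names come from the patch.
SETUP_KWARGS = dict(
    name="example",
    version="0.1",
    py_modules=["standalone"],       # a single-file module: standalone.py
    packages=["pkg", "pkg.sub"],     # package directories with __init__.py
)


def requested_targets(kwargs):
    """List the module and package names a setup() call would request."""
    return sorted(kwargs.get("py_modules", []) + kwargs.get("packages", []))
```

In a real setup.py these arguments would be passed as setup(**SETUP_KWARGS), using setup from distutils.core (or from setuptools, which accepts the same two keywords).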

----------------------------------------------------------------------

>Comment By: A.M. Kuchling (akuchling)
Date: 2003-02-28 21:21

Message:
Logged In: YES 
user_id=11375

Checked in; thanks!

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695090&group_id=5470


From noreply@sourceforge.net  Sat Mar  1 02:59:37 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Feb 2003 18:59:37 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18oxDt-0005YZ-00@sc8-sf-web1.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 11:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-28 21:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me; I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-02-28 20:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get()-like functionality for
dict.pop().  The nearest parallel is the default argument for
getattr(obj, attr[, default]).  On the plus side, it makes pop
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Sat Mar  1 03:31:35 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Feb 2003 19:31:35 -0800
Subject: [Patches] [ python-Patches-693195 ] Add sys.exc_clear() to clear current exception
Message-ID: <E18oxip-0008O5-00@sc8-sf-web2.sourceforge.net>

Patches item #693195, was opened at 2003-02-25 16:26
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693195&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Kevin Jacobs (jacobs99)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Add sys.exc_clear() to clear current exception

Initial Comment:
There is no way to clear the "current" exception, which is
available via the sys.exc_info() function.  There are a few
(obscure) reasons why one would want to be able to do so,
mainly due to the implementation details of how exception
information is stored.

Specifically, sys.exc_info() will return information on the
last exception even outside of the 'except:' block that caught
the exception.  So an exception, all of the frame objects on
the stack, and all local variables stored in those frames are
kept alive in the last exception traceback until either 1)
another exception is thrown, or 2) the stack returns to a frame
that is handling another exception (thrown before the "current"
exception).

Thus, it is sometimes useful to be able to clear the "current"
exception, e.g.:

1) Some error handling and logging handlers will report on the
   current or last exception (as a hint about what may have
   gone wrong).  Once that information is handled, additional
   error handling or logging calls should not report it again.

2) Sometimes resources are not released when an exception is
   raised until the next exception is raised.  This causes
   problems for programs that rely on object finalization to
   release resources (like memory, locks, file descriptors,
   etc.).  Such code is suboptimal, but it exists and there are
   few easy alternatives other than creating many 'finally:'
   clauses (which can violate encapsulation and abstraction
   layer boundaries and is syntactically hairy at times).
   Anyhow, such programs may want to clear the current
   exception and trigger garbage collection at certain
   synchronization points, in order to flush pending object
   finalization.  Clearly, this is a somewhat hit-or-miss
   strategy: it works fairly well in practice, but no sane
   developer should ever rely on it.

Anyhow, I've implemented a trivial patch to sysmodule.c
to add an 'exc_clear()' function that clears the current
or last exception.  I've also added a test case and
updated the documentation.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-28 22:31

Message:
Logged In: YES 
user_id=6380

All checked in, thanks. I changed the docstring for
sys.exc_info() again, to:

"Return information about the most recent exception caught
by an except clause in the current stack frame or in an
older stack frame."

I think this is accurate and concise.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-28 21:00

Message:
Logged In: YES 
user_id=6380

Grabbing this for review.


----------------------------------------------------------------------

Comment By: Kevin Jacobs (jacobs99)
Date: 2003-02-26 07:39

Message:
Logged In: YES 
user_id=459565

I've updated my patch based on Neal's feedback:

  1) sys_exc_clear and sys_exc_info now use the 
       recommended prototype and the cast to
       PyCFunction was removed.
  2) \versionadded was added to the exc_clear docs.
  3)  The exc_info docs were slightly modified to better
        match the updated doc string.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-25 22:32

Message:
Logged In: YES 
user_id=33168

I have a couple of minor things.  The prototype should be: 
sys_exc_clear(PyObject *self, PyObject *noargs).  This will
remove the (PyCFunction) cast.  (I realize there are other
places in the file you copied, but they are wrong too. :-)

The doc for exc_clear should have a \versionadded{2.3}
before the \end.  Should the doc for exc_info() also be
updated, since the docstring was updated?

----------------------------------------------------------------------

Comment By: Kevin Jacobs (jacobs99)
Date: 2003-02-25 16:39

Message:
Logged In: YES 
user_id=459565

Before someone else says it -- yes, technically there is a way
to "clear" the current exception -- by raising another exception.
However, that leaves a bogus exception in the thread state, which
still stores at least one Python stack frame.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693195&group_id=5470


From noreply@sourceforge.net  Sat Mar  1 14:04:49 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 01 Mar 2003 06:04:49 -0800
Subject: [Patches] [ python-Patches-695581 ] "returnself" -> "return self" in pydoc.py
Message-ID: <E18p7bd-00005x-00@sc8-sf-web4.sourceforge.net>

Patches item #695581, was opened at 2003-03-01 14:04
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695581&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
Summary: "returnself" -> "return self" in pydoc.py

Initial Comment:
The error has probably been introduced in the process of converting 
the code from using "apply" to "*args". 
 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695581&group_id=5470


From noreply@sourceforge.net  Sat Mar  1 14:06:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 01 Mar 2003 06:06:55 -0800
Subject: [Patches] [ python-Patches-695581 ] "returnself" -> "return self" in pydoc.py
Message-ID: <E18p7df-0006ru-00@sc8-sf-web1.sourceforge.net>

Patches item #695581, was opened at 2003-03-01 14:04
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695581&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Oren Tirosh (orenti)
Assigned to: Nobody/Anonymous (nobody)
>Summary: "returnself" -> "return self" in pydoc.py

Initial Comment:
The error has probably been introduced in the process of converting 
the code from using "apply" to "*args". 
 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695581&group_id=5470


From noreply@sourceforge.net  Sat Mar  1 15:32:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 01 Mar 2003 07:32:55 -0800
Subject: [Patches] [ python-Patches-695581 ] "returnself" -> "return self" in pydoc.py
Message-ID: <E18p8yt-00033a-00@sc8-sf-web4.sourceforge.net>

Patches item #695581, was opened at 2003-03-01 09:04
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695581&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Oren Tirosh (orenti)
>Assigned to: Neal Norwitz (nnorwitz)
>Summary: "returnself" -> "return self" in pydoc.py

Initial Comment:
The error has probably been introduced in the process of converting 
the code from using "apply" to "*args". 
 

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-01 10:32

Message:
Logged In: YES 
user_id=33168

Thanks!

Checked in as: Lib/pydoc.py 1.79

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695581&group_id=5470


From noreply@sourceforge.net  Sat Mar  1 19:49:27 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 01 Mar 2003 11:49:27 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18pCz9-0001zx-00@sc8-sf-web1.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 19:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.
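
For readers unfamiliar with the term, a "self-iterator" is an object whose iter() returns the object itself. The sketch below uses io.StringIO from current Python as a stand-in, since cStringIO is not available there; it illustrates the property and is not the patched C code:

```python
import io


def self_iterator_demo():
    """Check that a StringIO object is its own iterator and yields lines."""
    buf = io.StringIO("first\nsecond\n")

    # A self-iterator returns itself from iter().
    is_self_iterator = iter(buf) is buf

    # next() can therefore be called on the object directly ...
    first = next(buf)

    # ... and a later loop continues from the current position.
    rest = list(buf)

    return is_self_iterator, first, rest
```

Because the object is its own iterator, partially consuming it with next() and then looping over it does not restart from the beginning.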


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470


From noreply@sourceforge.net  Sun Mar  2 02:40:46 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 01 Mar 2003 18:40:46 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18pJPC-0005HR-00@sc8-sf-web3.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 11:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-01 21:40

Message:
Logged In: YES 
user_id=31435

dicts have a .pop() method?  Heh.  I must have slept 
through that one <wink>.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-28 21:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me; I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-02-28 20:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get()-like functionality for
dict.pop().  The nearest parallel is the default argument for
getattr(obj, attr[, default]).  On the plus side, it makes pop
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Sun Mar  2 20:52:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 02 Mar 2003 12:52:31 -0800
Subject: [Patches] [ python-Patches-696184 ] Enable __slots__ for meta-types
Message-ID: <E18paRj-0007EM-00@sc8-sf-web1.sourceforge.net>

Patches item #696184, was opened at 2003-03-02 21:52
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696184&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christian Tismer (tismer)
Assigned to: Nobody/Anonymous (nobody)
Summary: Enable __slots__ for meta-types

Initial Comment:
The new type system allows non-empty __slots__ only
for fixed-size objects.

Meta-types are types whose instances are also types.
Types are variable-sized, because they take the slot
definitions for their instances, so they cannot have
extra members from their meta-type.

The proposed solution allows for two things:
a) meta-types can have slots
b) extensions get access to the whole type object and
    can create extended types with private fields.

The changes providing this are quite simple:
- replace the internal hidden "etype" and turn it into
  an explicit PyHeapTypeObject in object.h
- instead of a fixed offset into the former etype, the
slots
  calculation is based upon tp_basicsize.

To keep things easy, I added a macro which does this
calculation, and member access now reads like so:

before:
	type->tp_members = et->members;
after:
	type->tp_members = PyHeapType_GET_MEMBERS(et);

This patch has been tested thoroughly in my own code since
Python 2.2, and I think it is ripe to get into the
distribution.
It has almost no impact on speed or complexity.
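
The restriction described at the top of this comment can be observed from Python code. The sketch below reflects the behaviour of current CPython 3 as far as I know, and says nothing about what this patch changes; try_define and the slot name "extra" are hypothetical:

```python
def try_define(bases, slots):
    """Return True if a class with these bases and __slots__ can be created."""
    try:
        type("Example", bases, {"__slots__": slots})
        return True
    except TypeError:
        # CPython rejects non-empty __slots__ on variable-sized base types.
        return False


# Fixed-size base: non-empty __slots__ are accepted.
plain_ok = try_define((object,), ("extra",))

# Meta-type (subclass of type) with empty __slots__: accepted.
meta_empty_ok = try_define((type,), ())

# Meta-type with a non-empty __slots__: rejected, since type is variable-sized.
meta_nonempty_ok = try_define((type,), ("extra",))
```

The last case is the one the proposal targets: a meta-type that wants per-type storage declared through __slots__.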

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696184&group_id=5470


From noreply@sourceforge.net  Sun Mar  2 21:02:44 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 02 Mar 2003 13:02:44 -0800
Subject: [Patches] [ python-Patches-696193 ] Enable __slots__ for meta-types
Message-ID: <E18pabc-0007cK-00@sc8-sf-web1.sourceforge.net>

Patches item #696193, was opened at 2003-03-02 22:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christian Tismer (tismer)
Assigned to: Nobody/Anonymous (nobody)
Summary: Enable __slots__ for meta-types

Initial Comment:
The new type system allows non-empty __slots__ only
for fixed-size objects.

Meta-types are types whose instances are also types.
Types are variable-sized, because they take the slot
definitions for their instances, so they cannot have
extra members from their meta-type.

The proposed solution allows for two things:
a) meta-types can have slots
b) extensions get access to the whole type object and
    can create extended types with private fields.

The changes providing this are quite simple:
- replace the internal hidden "etype" and turn it into
  an explicit PyHeapTypeObject in object.h
- instead of a fixed offset into the former etype, the
slots
  calculation is based upon tp_basicsize.

To keep things easy, I added a macro which does this
calculation, and member access now reads like so:

before:
	type->tp_members = et->members;
after:
	type->tp_members = PyHeapType_GET_MEMBERS(et);

This patch has been tested thoroughly in my own code since
Python 2.2, and I think it is ripe to get into the
distribution.
It has almost no impact on speed or simplicity.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 07:38:30 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 02 Mar 2003 23:38:30 -0800
Subject: [Patches] [ python-Patches-696392 ] allow proxy server authentication with pimp
Message-ID: <E18pkWs-0000cT-00@sc8-sf-web4.sourceforge.net>

Patches item #696392, was opened at 2003-03-03 07:38
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696392&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Andrew Straw (astraw)
Assigned to: Jack Jansen (jackjansen)
Summary: allow proxy server authentication with pimp

Initial Comment:
The urllib module does not support http proxy authentication with passwords.  The urllib2 module does, so I changed pimp.py to use urllib2.  I have tested the patch below after setting my http_proxy environment variable to the form "http://user:pass@proxy.com:1234".

It may be possible to remove the dependency on urllib entirely by substituting a urllib2 work-alike for a call to urllib.url2pathname().

This may affect the exception(s) raised when unable to connect.  For example, PackageManager.py catches an IOError, but I believe urllib2 raises a socket.gaierror when unable to resolve the name of the URL. I have not resolved this issue.
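
A small sketch of the proxy setup described above, using the Python 3 names (urllib2 became urllib.request). It is not the pimp.py change itself; proxy_parts and make_opener are hypothetical helpers, and the URL is the example form quoted above:

```python
from urllib.error import URLError
from urllib.parse import urlsplit
from urllib.request import ProxyHandler, build_opener

# Same shape as the http_proxy value quoted above.
PROXY_URL = "http://user:pass@proxy.com:1234"


def proxy_parts(url):
    """Split a proxy URL into (username, password, host, port)."""
    parts = urlsplit(url)
    return parts.username, parts.password, parts.hostname, parts.port


def make_opener(url):
    """Build an opener that routes http requests through the given proxy."""
    return build_opener(ProxyHandler({"http": url}))


# In Python 3, URLError derives from OSError (IOError is an alias of OSError),
# so a handler written for IOError also catches URLError there.
urlerror_is_oserror = issubclass(URLError, OSError)
```

Building the opener does not contact the network; a request is only sent when its open() method is called.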

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696392&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 09:45:24 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 01:45:24 -0800
Subject: [Patches] [ python-Patches-671666 ] Make the default encoding provided on Windows
Message-ID: <E18pmVg-0004ZR-00@sc8-sf-web3.sourceforge.net>

Patches item #671666, was opened at 2003-01-21 09:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671666&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: SUZUKI Hisao (suzuki_hisao)
Assigned to: Martin v. Löwis (loewis)
Summary: Make the default encoding provided on Windows

Initial Comment:
On Windows, some default encodings are not
provided by Python (e.g. "cp932" in Japanese
locale), while they are always available as "mbcs"
in each locale.  This patch ensures them usable in
a very efficient way by aliasing them to "mbcs" in
such a case.

Note that IDLE does not start up on Windows unless
the default encoding is provided.  The patch makes
IDLE operable all over the (Windows) world ;-).
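
The idea of falling back to "mbcs" when the locale's encoding has no codec can be sketched like this. It is an illustration of the lookup-and-fallback logic only, not the code in the patch; usable_encoding is a hypothetical helper:

```python
import codecs


def usable_encoding(name, fallback="mbcs"):
    """Return `name` if Python provides a codec for it, else `fallback`."""
    try:
        codecs.lookup(name)      # raises LookupError for unknown encodings
        return name
    except LookupError:
        return fallback
```

On Windows the name would typically come from locale.getdefaultlocale()[1], as mentioned later in this thread; "mbcs" itself is only available on Windows.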

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 10:45

Message:
Logged In: YES 
user_id=21627

I missed the point of this patch, indeed. Applied as site.py
1.48.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-02-21 03:10

Message:
Logged In: YES 
user_id=31435

Assigning to Martin, in the hopes they can work out their 
differences.

----------------------------------------------------------------------

Comment By: SUZUKI Hisao (suzuki_hisao)
Date: 2003-01-28 06:49

Message:
Logged In: YES 
user_id=495142

I can reproduce the IDLE problem on my Windows
2000 in Japanese locale.  I hope you will confirm it
by asking your  friends in Japan or other countries.

I am afraid you missed the point.  The patch does NOT change
the default encoding of Python itself.
It is ASCII still.  It only makes the encoding of
locale.getdefaultlocale()[1]  be PROVIDED.

Please read that short patch. 


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-21 23:04

Message:
Logged In: YES 
user_id=21627

I'm rejecting this patch. The factory system default
encoding of Python is ASCII, on all platforms (at least, it
should be this way; MacOS currently deviates).

I cannot reproduce the IDLE problem; IDLE starts without
that patch just fine.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671666&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 10:15:37 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 02:15:37 -0800
Subject: [Patches] [ python-Patches-658327 ] Add inet_pton and inet_ntop to socket
Message-ID: <E18pmyv-0006Ev-00@sc8-sf-web3.sourceforge.net>

Patches item #658327, was opened at 2002-12-24 22:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jp Calderone (kuran)
Assigned to: Martin v. Löwis (loewis)
Summary: Add inet_pton and inet_ntop to socket

Initial Comment:
Patch is against current CVS and adds two socket module
functions, inet_pton and inet_ntop.  Both of these
should be available on all platforms (because of other
dependencies in the code) so I don't think portability
is a problem.  inet_ntop converts a packed IP address
to a human-readable '.' or ':' separated string
representation of the IP.  inet_pton performs the
reverse operation.

(Potential) problems: inet_pton sets errno to ENOSPC,
which may lead to a confusing error message.
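
As a minimal sketch of the two conversions described above,
written for present-day Python (an assumption: the patch under
review targets the 2.3 C module, not this interface), a round
trip looks roughly like this.  The helper name roundtrip is
hypothetical.

```python
import socket

def roundtrip(family, text):
    """Pack a textual IP address, then unpack it again."""
    packed = socket.inet_pton(family, text)           # text -> packed bytes
    return packed, socket.inet_ntop(family, packed)   # packed bytes -> text

packed4, text4 = roundtrip(socket.AF_INET, "127.0.0.1")
packed6, text6 = roundtrip(socket.AF_INET6, "::1")
```

The IPv4 form packs to 4 bytes and the IPv6 form to 16 bytes.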


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:15

Message:
Logged In: YES 
user_id=21627

The has_ipv6 test is only there for the tests? In that case,
drop it, and just perform AF_INET6 conversions unconditionally.

OTOH, I think we should not expose the emulated inet_pton:
it doesn't set errno correctly, and offers no advantage over
inet_addr. So wrap the entire code with HAVE_INET_PTON, and
only perform the tests if the function is supported.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-05 03:40

Message:
Logged In: YES 
user_id=33168

I was just about to check this in, but then I ran into a
problem.  IPv6 may not be enabled, even if the constant
AF_INET6 exists.  The cleanest way I saw to address this in
the test was to add a has_ipv6 boolean constant to the
socket module.  Martin, do you think this is acceptable?
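
A rough sketch of how a test could consult such a flag, assuming
the socket.has_ipv6 constant as it exists in present-day Python.
The helper ipv6_cases and the sample addresses are hypothetical.

```python
import socket

def ipv6_cases():
    """Return IPv6 sample addresses only when IPv6 support is reported."""
    if not socket.has_ipv6:
        return []
    return ["::1", "::"]

# Round-trip each sample; the list is empty when IPv6 is not reported.
results = [
    socket.inet_ntop(socket.AF_INET6, socket.inet_pton(socket.AF_INET6, addr))
    for addr in ipv6_cases()
]
```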

Attached is a complete patch which should be safe (based on
the discussion below), includes tests and doc changes.

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2003-01-11 18:04

Message:
Logged In: YES 
user_id=366566

Yea, testing for the proper input length is definitely
something that should be done.  The patch looks good, but
for one thing.  If the specified address family is neither
AF_INET nor AF_INET6, the length won't be tested and the
underlying inet_ntop will be called.  This isn't a problem
now (afaik) because only those two address families are
supported, but in a future libc version with more supported
address families, it might open a similar hole to the one
you've fixed.  Perhaps the

+       } else {
+               PyErr_SetString(socket_error, "unknown address family");
+               return NULL;
+       }

should be moved up from the second if-grouping to follow the
first if-grouping.  Everything else looks good to me. 
Thanks for taking the time to look at this :)


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-11 04:49

Message:
Logged In: YES 
user_id=33168

JP, do you agree with my comment on 2002-12-30 about the
checks?  I have attached an updated patch.  Please review
and verify this is correct.

Thank you for the additional tests.  Feel free to submit
patches with additional tests for any and all modules!

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-31 17:52

Message:
Logged In: YES 
user_id=366566

Doc, NEWS, and test_socket patch attached.  I didn't notice
any inet_aton/inet_ntoa tests in the module so I added a
couple for those as well (I excluded a test for
inet_ntoa('255.255.255.255') ;) Also included are a couple
IPv6 tests.  I'm not sure if these are appropriate, since
many systems may still lack the required support for them to
pass.  I'll leave it up to you to decide whether they should
be commented out or removed or whatever.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-31 14:17

Message:
Logged In: YES 
user_id=21627

I agree that such a change should be added. Neal, you have
given this patch more attention than I did - please check it
in when you consider it complete. I'd just like to point out
that it is missing documentation changes (libsocket.tex), a
NEWS entry, and a test case. kuran, please provide those as
a single patch file.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-31 01:11

Message:
Logged In: YES 
user_id=33168

ISTM that in socket_inet_ntop() you need to verify the size
of the packed value passed in.  If the user passes an empty
string, inet_ntop() could read beyond the buffer passed in,
potentially causing a core dump.

The checks could be something like this:

  if (af == AF_INET && len != sizeof(struct in_addr))
  else if (af == AF_INET6 && len != sizeof(struct in6_addr))

Does this make sense?
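
The same idea can be sketched in Python rather than C.  This is
only an illustration under stated assumptions: checked_ntop and
PACKED_SIZE are hypothetical names, and the sizes 4 and 16 stand
in for sizeof(struct in_addr) and sizeof(struct in6_addr).  The
sketch rejects unknown address families before looking at the
length, so no unvalidated buffer reaches inet_ntop().

```python
import socket

# Expected packed sizes per address family (assumed values, see above).
PACKED_SIZE = {socket.AF_INET: 4, socket.AF_INET6: 16}

def checked_ntop(family, packed):
    """Validate family and length before converting (hypothetical helper)."""
    if family not in PACKED_SIZE:
        raise ValueError("unknown address family")
    if len(packed) != PACKED_SIZE[family]:
        raise ValueError("packed address has the wrong length")
    return socket.inet_ntop(family, packed)
```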

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-27 16:39

Message:
Logged In: YES 
user_id=366566

The use case I have for it at the moment is a DNS server
(Twisted.names).  inet_pton allows me to handle IPv6
addresses, so it allows me to support AAAA and A6 records. 
I believe an IPv6 capable socks proxy would find this useful
as well.  Basically, low level network stuff.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-27 11:23

Message:
Logged In: YES 
user_id=21627

What is the rationale for providing this functionality?

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-26 19:32

Message:
Logged In: YES 
user_id=366566

Ooops, I made two, and uploaded the wrong one >:O  Sorry. 
Dunno if it's still helpful, but here's the unified diff.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-26 19:10

Message:
Logged In: YES 
user_id=33168

Next time, please use context or unified diff.  -c or -u
option to cvs diff:  cvs diff -c ...

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-24 22:05

Message:
Logged In: YES 
user_id=366566

Sourceforge decided not to attach the file the first time...
 Here it is.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 10:59:25 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 02:59:25 -0800
Subject: [Patches] [ python-Patches-671384 ] test_pty hanging on hpux11
Message-ID: <E18pnfJ-0007va-00@sc8-sf-web3.sourceforge.net>

Patches item #671384, was opened at 2003-01-20 22:23
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671384&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Martin v. Löwis (loewis)
Summary: test_pty hanging on hpux11

Initial Comment:
The attached hack fixes a problem which occurs since
switching the pty code.  isatty() hangs if the slave_fd
is closed and reopened as in the deprecated APIs
pty.master_open() and pty.slave_open().

This patch reverts to the old behaviour where
_open_terminal() is called in master_open() to avoid
the hang later.

Here's a very simple test for the problem:

import pty, os

master_fd, slave_name = pty.master_open()
slave_fd = pty.slave_open(slave_name)
print os.isatty(slave_fd)

In slave_open() the first ioctl raises an IOError,
Invalid Argument 22.

I don't know if this problem affects hpux10.  Hopefully
someone will have a better idea how to really fix this
problem.
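
For comparison, a minimal sketch of the same check using
os.openpty(), which allocates both ends in one call.  This is
written for present-day Python, where the deprecated helpers
named above may no longer exist; it does not reproduce the
hpux11 hang, and the helper name slave_is_tty is hypothetical.

```python
import os

def slave_is_tty():
    """Open a pty pair and report whether the slave end is a terminal.

    Returns None when the platform cannot allocate a pseudo-terminal.
    """
    try:
        master_fd, slave_fd = os.openpty()
    except (OSError, AttributeError):
        return None
    try:
        return os.isatty(slave_fd)
    finally:
        # Always release both descriptors.
        os.close(slave_fd)
        os.close(master_fd)
```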

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:59

Message:
Logged In: YES 
user_id=21627

I can't reproduce a test failure for Solaris 8 (on the SF
compile farm) for Python 2.3a2. Can you please try that
specific release and report what test fails for you, in
which way?

I'm concerned that the patch isn't that good, e.g. on Linux,
it would cause usage of the old-style interface to
pseudo-terminals, even though an all-singing all-dancing
Unix98 pty support is available in the C library.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-28 01:21

Message:
Logged In: YES 
user_id=33168

I have attached an updated patch.  It seems Solaris 8 (on
the snake farm) also had a test failure.  I have basically
restored the old functionality in this patch. 
_open_terminal is called if /dev/ptmx exists, so
os.openpty() is not called.  This fixes the test
failures/hangs on both solaris and hpux and should be
equivalent to the 2.2 behaviour.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671384&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 11:23:01 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 03:23:01 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18po29-0001PC-00@sc8-sf-web2.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you
get from os.listdir() should be compatible with (at least) all
the os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will
not break the paths (unlike UTF-8 encoded 8-bit strings, which
tend to break when you slice them).

That said, I think you're right about the ASCII approach,
provided that the os and os.path tools can actually cope
properly with Unicode.

What I worry about is that if os.listdir() gives back Unicode
for e.g. Latin-1 filenames and the application then passes the
Unicode names to a C API using "s", perfectly working code will
break... then again, the C code should really use "es" for
converting to the Py_FileSystemDefaultEncoding, as is done in
e.g. fileobject.c.

I really don't know what to do here...
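
The slicing point can be illustrated with a small sketch in
present-day Python (an assumption: str plays the role of the
Unicode string and bytes the role of the 8-bit string; the file
name is made up).

```python
name = "caf\u00e9.txt"        # 'café.txt' as a Unicode string
raw = name.encode("utf-8")    # é becomes two bytes: 0xC3 0xA9

text_prefix = name[:4]        # slices between characters
byte_prefix = raw[:4]         # cuts the two-byte é in half

try:
    byte_prefix.decode("utf-8")
    byte_slice_ok = True
except UnicodeDecodeError:
    # The truncated sequence is no longer valid UTF-8.
    byte_slice_ok = False
```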

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...
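
As a small sketch of the distinction, present-day Python exposes
the two encodings separately (an assumption: these calls describe
current interpreters, not the 2.3 code under discussion; the file
name is made up).

```python
import os
import sys

# Encoding the interpreter reports as its default string encoding.
default_enc = sys.getdefaultencoding()
# Encoding used to convert file names to and from bytes.
fs_enc = sys.getfilesystemencoding()

# os.fsencode and os.fsdecode use the file system encoding,
# independently of the default encoding above.
name = "example.txt"
round_tripped = os.fsdecode(os.fsencode(name))
```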

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes a one-byte-per-character encoding, which may not
always be the case. With UTF-16 your test will see lots of \0
bytes, but not necessarily bytes with ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones.

There's also the possibility of the encoding mapping the ASCII
range to other non-ASCII characters, e.g. ShiftJIS does this
for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms,
so you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as on most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that the text is not ASCII compatible. Given that .listdir()
involves lots of IO, I think the added performance hit wouldn't
be noticeable.
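
A minimal sketch of the two tests, in present-day Python rather
than the C of the patch.  Both function names are hypothetical,
and the byte-wise check is only an assumed stand-in for the
patch's ASCII test.  The sample is a UTF-16 name whose bytes all
fall below 128.

```python
def naive_is_ascii(raw):
    """Byte-wise test: true when every byte is below 128."""
    return all(b < 128 for b in raw)

def roundtrip_is_ascii(raw, fs_encoding):
    """Decode with the file system encoding, then try to encode as ASCII."""
    try:
        raw.decode(fs_encoding).encode("ascii")
    except UnicodeError:
        return False
    return True

# U+4E2D encodes in UTF-16-LE as the bytes 0x2D 0x4E, both below 128.
raw = "\u4e2d".encode("utf-16-le")
```

Here the byte-wise test accepts the name as ASCII, while the
round trip correctly rejects it.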

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 11:36:06 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 03:36:06 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18poEo-00042e-00@sc8-sf-web1.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).
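
The Unicode-in-Unicode-out rule can be sketched with present-day
Python, where os.listdir() returns names of the same type as its
argument (an assumption: str stands in for Unicode strings and
bytes for byte strings; the file name is made up).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    # Create one empty file to list.
    open(os.path.join(d, "example.txt"), "w").close()
    text_names = os.listdir(d)                 # text path in, text names out
    byte_names = os.listdir(os.fsencode(d))    # bytes path in, bytes names out
```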

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you
get from os.listdir() should be compatible with (at least) all
the os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will
not break the paths (unlike UTF-8 encoded 8-bit strings, which
tend to break when you slice them).

That said, I think you're right about the ASCII approach,
provided that the os and os.path tools can actually cope
properly with Unicode.

What I worry about is that if os.listdir() gives back Unicode
for e.g. Latin-1 filenames and the application then passes the
Unicode names to a C API using "s", perfectly working code will
break... then again, the C code should really use "es" for
converting to the Py_FileSystemDefaultEncoding, as is done in
e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes a one-byte-per-character encoding, which may not
always be the case. With UTF-16 your test will see lots of \0
bytes, but not necessarily bytes with ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones.

There's also the possibility of the encoding mapping the ASCII
range to other non-ASCII characters, e.g. ShiftJIS does this
for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms,
so you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as on most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that the text is not ASCII compatible. Given that .listdir()
involves lots of IO, I think the added performance hit wouldn't
be noticeable.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 11:37:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 03:37:39 -0800
Subject: [Patches] [ python-Patches-679505 ] Deprecate rotor module
Message-ID: <E18poGJ-0000rt-00@sc8-sf-web3.sourceforge.net>

Patches item #679505, was opened at 2003-02-03 15:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=679505&group_id=5470

Category: Modules
Group: None
Status: Open
Resolution: None
Priority: 4
Submitted By: A.M. Kuchling (akuchling)
Assigned to: Nobody/Anonymous (nobody)
Summary: Deprecate rotor module

Initial Comment:
Here's a trivial patch that marks the rotor module as deprecated.

To be used if Paul Rubin's AES module goes into 2.3 (maybe even if it doesn't).


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:37

Message:
Logged In: YES 
user_id=21627

What is the rationale for deprecating the rotor module?

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2003-02-03 15:04

Message:
Logged In: YES 
user_id=11375

Attach patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=679505&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 11:39:07 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 03:39:07 -0800
Subject: [Patches] [ python-Patches-679505 ] Deprecate rotor module
Message-ID: <E18poHj-0000aa-00@sc8-sf-web4.sourceforge.net>

Patches item #679505, was opened at 2003-02-03 15:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=679505&group_id=5470

Category: Modules
Group: None
Status: Open
Resolution: None
Priority: 4
Submitted By: A.M. Kuchling (akuchling)
Assigned to: Nobody/Anonymous (nobody)
Summary: Deprecate rotor module

Initial Comment:
Here's a trivial patch that marks the rotor module as deprecated.

To be used if Paul Rubin's AES module goes into 2.3 (maybe even if it doesn't).


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:39

Message:
Logged In: YES 
user_id=21627

I retract my comment, the rationale is fine.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:37

Message:
Logged In: YES 
user_id=21627

What is the rationale for deprecating the rotor module?

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2003-02-03 15:04

Message:
Logged In: YES 
user_id=11375

Attach patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=679505&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 12:22:32 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 04:22:32 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18poxk-000442-00@sc8-sf-web2.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.
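
As a rough illustration of the "same type in, same type out" rule
described above, here is a minimal sketch. It assumes a current
Python 3 interpreter (where str is the Unicode type and bytes is the
byte string type), not the 2.3 code under review; listing_types is a
hypothetical helper name.

```python
import os
import tempfile


def listing_types(directory):
    """Return the result types of os.listdir() for a str and a bytes path."""
    # Text (Unicode) directory name in, text file names out.
    as_text = {type(name) for name in os.listdir(directory)}
    # Byte string directory name in, byte string file names out.
    as_bytes = {type(name) for name in os.listdir(os.fsencode(directory))}
    return as_text, as_bytes


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so that the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()
    print(listing_types(tmp))
```

The directory name is converted with os.fsencode(), which applies the
file system encoding, in line with the last sentence above.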

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).
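
The "all other file names are lost" concern can be sketched as
follows. This is an illustration in Python 3 with made-up file names,
not the patch's C code; both function names are hypothetical, and the
per-name fallback is only one possible alternative, not something
proposed in this thread.

```python
def decode_all_or_nothing(raw_names, encoding):
    """Strict decoding: one undecodable name raises and the listing is lost."""
    return [name.decode(encoding) for name in raw_names]


def decode_keeping_undecodable(raw_names, encoding):
    """Hypothetical alternative: keep the byte string when decoding fails."""
    result = []
    for name in raw_names:
        try:
            result.append(name.decode(encoding))
        except UnicodeDecodeError:
            # Leave this one name undecoded instead of aborting the listing.
            result.append(name)
    return result


# The second name is Latin-1 encoded and is not valid UTF-8.
names = [b"README", b"caf\xe9.txt", b"notes"]
print(decode_keeping_undecodable(names, "utf-8"))
```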

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ASCII directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the advantage
that slicing and indexing into the path names will not break the
paths (unlike UTF-8 encoded 8-bit strings, which tend to break when
you slice them).
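
A small sketch of the slicing point, assuming Python 3 (str for
Unicode, bytes for the 8-bit string) and a made-up file name;
is_valid_utf8 is a hypothetical helper.

```python
def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


name = "\u00e9\u00e9n.txt"      # two accented letters, then "n.txt"
encoded = name.encode("utf-8")  # each accented letter takes two bytes

text_prefix = name[:1]     # one whole character
byte_prefix = encoded[:1]  # half of a two-byte sequence

print(is_valid_utf8(encoded), is_valid_utf8(byte_prefix))
```

Slicing the Unicode string always lands on a character boundary,
while slicing the encoded bytes can cut a multi-byte sequence in two.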

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again the C code should really use "es" for encoding to the
Py_FileSystemDefaultEncoding as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...
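
To make the mismatch concrete, here is a minimal sketch in Python 3.
The two encodings are picked purely for illustration (UTF-8 standing
in for Py_FileSystemDefaultEncoding, Latin-1 for a differing default
encoding), and the file name is made up.

```python
# Bytes as they might be stored on disk under a UTF-8 file system encoding.
raw_name = "caf\u00e9".encode("utf-8")

# Interpreted with the file system encoding, the original name comes back.
as_fs_encoding = raw_name.decode("utf-8")

# Interpreted with a different "default" encoding, the name is mangled.
as_other_default = raw_name.decode("latin-1")

print(as_fs_encoding == as_other_default)
```

The same bytes yield two different names, which is why converting
through the default encoding can end up pointing at the wrong file.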

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It only happens that your test
assumes one-byte-per-character encodings, which may not
always be true. With UTF-16 your test will see lots of \0 bytes,
but not necessarily any for which ord(x) >= 128.
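
A sketch of that failure mode in Python 3. It assumes the "simple
test" is a scan for bytes with a value of 128 or more; the patch
itself is not shown in this thread, so has_high_byte is a
hypothetical stand-in.

```python
def has_high_byte(raw):
    """Assumed form of the simple test: is any byte value >= 128?"""
    return any(byte >= 128 for byte in raw)


# An ASCII-only name in UTF-16 is full of NUL bytes.
ascii_name = "abc".encode("utf-16-le")

# U+0100 is not ASCII, yet neither of its UTF-16 bytes is >= 128.
non_ascii_name = "\u0100".encode("utf-16-le")

print(ascii_name, has_high_byte(non_ascii_name))
```

Under this assumption the scan reports the non-ASCII name as plain
ASCII, which is the case the round trip through Unicode avoids.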

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the encoding mapping the ASCII
range to other non-ASCII characters, e.g. ShiftJIS does this for
the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as on most platforms the file name can't contain null bytes. Looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be sure
that the text is not ASCII compatible. Given that .listdir()
involves lots of IO I think the added performance hit wouldn't be
noticeable.
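
The round trip described above could look roughly like this in
Python 3. It is a sketch of the idea, not the patch's C code;
is_ascii_name and its parameters are hypothetical names.

```python
def is_ascii_name(raw_name, fs_encoding):
    """Decode with the file system encoding, then try to encode as ASCII."""
    # Step 1: convert the byte string to Unicode.
    text = raw_name.decode(fs_encoding)
    # Step 2: try to convert back to ASCII; failure means non-ASCII text.
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


print(is_ascii_name(b"setup.py", "utf-8"))
print(is_ascii_name("abc".encode("utf-16-le"), "utf-16-le"))
```

Because the check runs on decoded text, it does not depend on the
file system encoding being a superset of ASCII.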

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 12:57:58 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 04:57:58 -0800
Subject: [Patches] [ python-Patches-696193 ] Enable __slots__ for meta-types
Message-ID: <E18ppW2-0003Qi-00@sc8-sf-web4.sourceforge.net>

Patches item #696193, was opened at 2003-03-02 21:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christian Tismer (tismer)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: Enable __slots__ for meta-types

Initial Comment:
The new type system allows non-empty __slots__ only
for fixed-size objects.

Meta-types are types whose instances are also types.
Types are variable-sized, because they hold the slot
definitions for their instances, so they cannot have
extra members from their meta-type.
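
The restriction can be probed from Python code. The sketch below is
written for a current Python 3 interpreter rather than 2.3, and it
reports what the running interpreter does instead of assuming it; the
function and class names are hypothetical.

```python
def nonempty_slots_allowed_on_metatype():
    """Report whether this interpreter accepts non-empty __slots__ on a meta-type."""
    try:
        class Meta(type):
            __slots__ = ("extra",)
    except TypeError:
        # Raised where variable-sized bases may not grow extra members.
        return False
    return True


class EmptySlotsMeta(type):
    # Empty __slots__ adds no members, so it needs no extra storage.
    __slots__ = ()


class Example(metaclass=EmptySlotsMeta):
    pass


print(nonempty_slots_allowed_on_metatype())
```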

The proposed solution allows for two things:
a) meta-types can have slots
b) extensions get access to the whole type object and
    can create extended types with private fields.

The changes providing this are quite simple:
- replace the internal hidden "etype" and turn it into
  an explicit PyHeapTypeObject in object.h
- instead of a fixed offset into the former etype, the slots
  calculation is based upon tp_basicsize.

To keep things easy, I added a macro which does this
calculation, and member access now reads like so:

before:
	type->tp_members = et->members;
after:
	type->tp_members = PyHeapType_GET_MEMBERS(et);

This patch has been tested thoroughly in my own code since
Python 2.2, and I think it is ripe to get into the
distribution.
It has almost no impact on speed or simplicity.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 13:02:30 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 05:02:30 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18ppaQ-0005Ta-00@sc8-sf-web2.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 13:11:04 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 05:11:04 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18ppii-0008Ke-00@sc8-sf-web1.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ASCII directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will not
break the paths (unlike UTF-8 encoded 8-bit strings, which tend to
break when you slice them).
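
The slicing hazard can be shown with a short sketch in current Python 3 (where `str` is Unicode and `bytes` is the 8-bit string type). The file name and the helper `is_valid_utf8` are made up for the illustration.

```python
name = "na\u00efve.txt"           # the third character is i-diaeresis
encoded = name.encode("utf-8")    # that character occupies two bytes

text_prefix = name[:3]            # slices on character boundaries
byte_prefix = encoded[:3]         # cuts the two-byte sequence in half

def is_valid_utf8(data):
    """Report whether the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

The text slice is still a well-formed name, while the byte slice ends in a dangling lead byte and no longer decodes.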

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again the C code should really use "es" for encoding to
the Py_FileSystemDefaultEncoding as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.
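
A small sketch of why a byte-wise "is everything below 128?" test is fooled by UTF-16. The helper name `looks_ascii_bytewise` is an assumption standing in for the simple test under discussion; the actual patch code is not shown in this thread.

```python
def looks_ascii_bytewise(data):
    """Naive check: every byte is below 128."""
    return all(b < 128 for b in data)

raw = "abc".encode("utf-16-le")          # each character becomes two bytes
naive_verdict = looks_ascii_bytewise(raw)
has_nul = b"\x00" in raw
same_as_ascii = raw == b"abc"
```

The naive check accepts the data even though it contains NUL bytes and is not the ASCII byte string for the same text.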

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. By looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that the text is not ASCII compatible. Given that .listdir()
involves lots of IO I think the added performance hit wouldn't be
noticeable.
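
The round-trip test described above might look like the following in current Python 3. This is a sketch under assumptions: the function name `name_from_filesystem` is invented, and returning `bytes` for pure-ASCII names stands in for returning an 8-bit string in Python 2.

```python
def name_from_filesystem(raw, fs_encoding):
    """Decode raw bytes; return a byte string only if the text is pure ASCII."""
    text = raw.decode(fs_encoding)       # may raise UnicodeDecodeError
    try:
        return text.encode("ascii")      # pure ASCII: hand back a byte string
    except UnicodeEncodeError:
        return text                      # otherwise hand back Unicode

plain = name_from_filesystem(b"setup.py", "utf-8")
accented = name_from_filesystem(b"r\xc3\xa9sum\xc3\xa9.txt", "utf-8")
wide = name_from_filesystem("abc".encode("utf-16-le"), "utf-16-le")
```

Because the decision is made on the decoded text rather than on the raw bytes, it also gives a sensible answer for encodings that are not supersets of ASCII.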

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 14:32:02 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 06:32:02 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18pqz4-00005v-00@sc8-sf-web3.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't; I'm working on it).

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 14:57:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 06:57:11 -0800
Subject: [Patches] [ python-Patches-696193 ] Enable __slots__ for meta-types
Message-ID: <E18prNP-0001xE-00@sc8-sf-web2.sourceforge.net>

Patches item #696193, was opened at 2003-03-02 16:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christian Tismer (tismer)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Enable __slots__ for meta-types

Initial Comment:
The new type system allows non-empty __slots__ only
for fixed-size objects.

Meta-types are types whose instances are also types.
Types are variable-sized, because they hold the slot
definitions for their instances, so they cannot have
extra members from their meta-type.
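
The limitation can be observed from Python code. The sketch below reflects the behavior of current CPython 3 as far as I know, where a non-empty __slots__ on a subclass of type is still refused; the class and function names are invented for the illustration.

```python
def try_metatype_with_slots():
    """Attempt to define a meta-type with a non-empty __slots__."""
    try:
        class Meta(type):
            __slots__ = ("extra",)
    except TypeError as exc:
        # Class creation is refused because type objects are variable-sized.
        return str(exc)
    return None

error_message = try_metatype_with_slots()

# An empty __slots__ is accepted, since it adds no per-instance members.
class EmptySlotsMeta(type):
    __slots__ = ()
```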

The proposed solution allows for two things:
a) meta-types can have slots
b) extensions get access to the whole type object and
    can create extended types with private fields.

The changes providing this are quite simple:
- replace the internal hidden "etype" and turn it into
  an explicit PyHeapTypeObject in object.h
- instead of a fixed offset into the former etype, the slots
  calculation is based upon tp_basicsize.

To keep things easy, I added a macro which does this
calculation, and member access now reads like so:

before:
	type->tp_members = et->members;
after:
	type->tp_members = PyHeapType_GET_MEMBERS(et);

This patch has been tested thoroughly in my own code since
Python 2.2, and I think it is ripe to get into the
distribution.
It has almost no impact on speed or simplicity.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-03 09:57

Message:
Logged In: YES 
user_id=6380

I'll look at this on Friday.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 15:03:28 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 07:03:28 -0800
Subject: [Patches] [ python-Patches-691928 ] Use datetime in _strptime
Message-ID: <E18prTU-0001wP-00@sc8-sf-web3.sourceforge.net>

Patches item #691928, was opened at 2003-02-23 18:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=691928&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Use datetime in _strptime

Initial Comment:
To prevent code duplication, I patched _strptime to use datetime's date object to do Julian day, Gregorian, and day of the week calculations (Tim's code has to be more reliable than mine  =).  The patch also includes new regression tests that check the results and that the calculation gets triggered.

Very minor comment changes are also included, and my contact email is updated.
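
For readers unfamiliar with the date object, here is a sketch of the kind of calculations it provides. It uses only the standard datetime module in current Python; it is not the _strptime code from the patch, and the variable names are made up.

```python
from datetime import date

d = date(2003, 3, 3)
day_of_week = d.weekday()              # Monday is 0
day_of_year = d.timetuple().tm_yday    # 1-based day within the year
ordinal = d.toordinal()                # proleptic Gregorian ordinal
round_trip = date.fromordinal(ordinal) # back from ordinal to a date
```

Day of the year is what time tuples refer to as the Julian day, and the ordinal round trip covers the Gregorian conversion in both directions.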

----------------------------------------------------------------------

>Comment By: Skip Montanaro (montanaro)
Date: 2003-03-03 09:03

Message:
Logged In: YES 
user_id=44345

Meta comment - I think that when uploading successive patches it's useful
to either name them differently or delete the prior one to avoid confusion.
In this case it's not a big deal, especially since the submission dates are
different, but after a few revisions it can sometimes be a challenge to
figure out which patch should be downloaded.
 
Comment comment - Unless there's some evidence the elided functions
have been used, I suspect it best to just let people use the relevant
datetime functions.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-02-25 15:51

Message:
Logged In: YES 
user_id=357491

Only in the module (which was removed).  None of the helper functions have ever been publicly advertised (although I think the locale date info might be helpful in locale; MvL wasn't interested, though).

I uploaded a new diff that removes one more line that I forgot to remove when I eliminated the ability to pass in a regex object.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-23 18:56

Message:
Logged In: YES 
user_id=33168

Brett, is there any doc for the functions that were removed?
   firstjulian, gregorian, julianday, dayofweek

Otherwise, the patch seemed fine (but I didn't look that
closely).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=691928&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 15:19:37 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 07:19:37 -0800
Subject: [Patches] [ python-Patches-696613 ] test options don't work on FreeBSD
Message-ID: <E18prj7-0002f0-00@sc8-sf-web3.sourceforge.net>

Patches item #696613, was opened at 2003-03-03 15:19
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696613&group_id=5470

Category: Build
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Ben Laurie (benl)
Assigned to: Nobody/Anonymous (nobody)
Summary: test options don't work on FreeBSD

Initial Comment:
test -L is used during make install - I'm guessing it
is supposed to test for a softlink. Sadly, this is -h
under FreeBSD, so the install fails.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696613&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 15:27:24 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 07:27:24 -0800
Subject: [Patches] [ python-Patches-667730 ] More DictMixin
Message-ID: <E18prqe-0003tJ-00@sc8-sf-web2.sourceforge.net>

Patches item #667730, was opened at 2003-01-14 14:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667730&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Nobody/Anonymous (nobody)
Summary: More DictMixin

Initial Comment:
This patch is intended to provide a more consistent
implementation for the various dictionary-like objects
of the standard library.

test_userdict has been rewritten; it now uses unittest
and defines a test case which allows checking for
conformity with the dictionary protocol.

test_shelve and test_weakref have been rewritten to use
the test_userdict test case.

test_os has been extended: a new test case checks the
environ object's conformity to the dictionary protocol.

The patch modifies the UserDict module:
* The doc says that __contains__ should be one of the
methods to redefine for better efficiency, but the
implementation makes __contains__ dependent on the has_key
definition. The patch reverses the dependency.
* Change iterkeys = __iter__ to def iterkeys(self):
return self.__iter__() so that iterkeys picks up
overridden __iter__ methods.
* I have also added __init__, copy and __repr__
methods to DictMixin.
* The UserDict.UserDict class becomes a subclass of
DictMixin, which simplifies the UserDict
implementation. The patch is rather conservative, since
a lot of method definitions could still be removed from
UserDict.
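
The difference between aliasing and delegating can be shown with a small sketch in current Python. The class names are invented for the illustration and are not the UserDict code.

```python
class AliasMixin:
    def __iter__(self):
        return iter(["base"])
    # Binds the function defined above; subclasses' overrides are not seen.
    iterkeys = __iter__

class DelegatingMixin:
    def __iter__(self):
        return iter(["base"])
    def iterkeys(self):
        # Looks up __iter__ at call time, so overrides are honored.
        return self.__iter__()

class AliasChild(AliasMixin):
    def __iter__(self):
        return iter(["override"])

class DelegatingChild(DelegatingMixin):
    def __iter__(self):
        return iter(["override"])

alias_keys = list(AliasChild().iterkeys())
delegating_keys = list(DelegatingChild().iterkeys())
```

The aliased method keeps returning the base class's keys, while the delegating method follows the subclass's override.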

In the weakref module, the patch makes
WeakValueDictionary and WeakKeyDictionary subclasses
of UserDict.DictMixin. It also uses nested scopes and the
new generator syntax for iterator methods, and rewrites
WeakKeyDictionary.__delitem__. All of this reduces the
module size by 50%.

In the shelve module, the patch adds a copy() method
which returns a dictionary with the keys and values of
the database.

----------------------------------------------------------------------

>Comment By: Sebastien Keim (s_keim)
Date: 2003-03-03 16:27

Message:
Logged In: YES 
user_id=498191

I have uploaded a new version of the patch, updated to
Python 2.3a2.

I hope to have removed all the stuff which could break
backward compatibility, since the new proposed patch now
contains only the testing stuff (well, almost, since I have also
added a pop method to the weak dictionary classes to make
them compatible with the test case).

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-16 03:50

Message:
Logged In: YES 
user_id=80475

Also, +1 on consolidating the test cases though it should 
be done after any other changes to the files so we can 
make sure that nothing got broken.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-16 03:35

Message:
Logged In: YES 
user_id=80475

* UserDict.UserDict should not change. As Martin pointed
out, inheriting from object changes the semantics in a
non-backward-compatible way.  Also, the class is efficiently
implemented in terms of an internal dictionary and would be
slowed down by the nest of calls in Mixin.  Also, I think the
code is incorrect in defining __iter__; there was a reason it
was pulled out into a separate subclass -- that was done in
Py2.2 and is not an easily reversible decision.

*  -0 on the changes to has_key() and __contains__(). 
has_key() was put at a lower level than __contains__ 
because the older dict-style interfaces all define has_key.

* +1 for changing iterkeys() to a full definition (and +1 for
doing the same for __iter__()).  Sebastien is correct in
pointing out the advantages of propagating an overridden
method.

* -1 for altering repr() implementation.  The current 
approach is shorter, cleaner, and faster.

* -1 for adding __nonzero__().  Even dictionaries don't 
implement this method; they let len() do the talking.

* -1 for adding __init__() and copy().  Both need to make 
assumptions about the order and number of parameters 
in the constructor of the class using the mixin.  I think they
are rarely helpful and are sometimes harmful in introducing
surprising, hard-to-find errors.  People who need an init()
or copy() can code them more cleanly and directly in the
extending class.  Also, I don't think the code is correct:
since DictMixin will be a base class, the use of super() is
not what is wanted here -- *if* you were going to do this,
try something like self.__class__().  Further, adding these
methods violates my original intent for this class which 
was to extrapolate four basic mapping methods into a full 
mapping interface.  It was not intended as a stand-alone 
class.  Also, copy() cannot guarantee that it is copying all 
the relevant data for the sub-class and that violates the 
definition of what copy() is supposed to do.  If something 
like this were attempted, it should be its own mixin 
(automatically adding copy support to any class) and it 
should be rather sophisticated about how to perfectly 
replicate itself (not easily done if the underlying data is in a 
file, database, or in a distributed app).

* +0 on changing weakdicts provided it is done minimally 
and carefully with attention to leaving semantics 
unchanged and not slowing performance.  The advantage 
goes beyond consistency: it removes code duplication, 
keeps well thought-out logic in one place, and provides an 
automatic interface update from DictMixin if the dictionary 
interface ever sprouts another method.
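
The intent described above (extrapolating four basic mapping 
methods into a full mapping interface, with no __init__() or 
copy() in the mixin) can be sketched roughly as follows. This is 
a minimal Python 3 illustration, not the UserDict.DictMixin 
code: MiniDictMixin and UpperKeys are hypothetical names, and 
only a few derived methods are shown.

```python
class MiniDictMixin:
    """Hypothetical mixin: derives mapping methods from four primitives.

    Subclasses supply __getitem__, __setitem__, __delitem__ and keys().
    The mixin deliberately defines no __init__() and no copy().
    """

    def __iter__(self):
        # A full definition rather than an alias, so a keys() method
        # overridden in a subclass is picked up automatically.
        return iter(self.keys())

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True

    def __len__(self):
        # Truth testing falls back on len(); no separate hook is needed.
        return len(self.keys())

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def items(self):
        return [(key, self[key]) for key in self]


class UpperKeys(MiniDictMixin):
    """Toy mapping that stores its keys upper-cased."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key.upper()]

    def __setitem__(self, key, value):
        self._data[key.upper()] = value

    def __delitem__(self, key):
        del self._data[key.upper()]

    def keys(self):
        return sorted(self._data)
```

The extending class owns its constructor, so the mixin never has 
to guess the order or number of constructor parameters.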

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-14 22:43

Message:
Logged In: YES 
user_id=21627

This patch breaks backwards compatibility. UserDict is an
oldstyle class on purpose, since changing it to a newstyle
class will certainly break the compatibility in subtle ways
(e.g. by changing what type(userdictinstance) is).

Unless you can bring forward a better rationale than
consistency, this patch will be rejected.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667730&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 15:48:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 07:48:09 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18psAj-0008I9-00@sc8-sf-web1.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).
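
The Unicode-in-Unicode-out convention described above can be 
sketched as follows. This is a Python 3 illustration under 
stated assumptions, not the posixmodule.c patch: bytes and str 
stand in for the byte strings and Unicode strings of 2.3, and 
listdir_like, raw_names and fs_encoding are hypothetical names.

```python
def listdir_like(path, raw_names, fs_encoding="utf-8"):
    """Hypothetical: return names in the same string type as `path`.

    raw_names stands in for the byte strings read from the directory.
    """
    if isinstance(path, bytes):
        # Byte string in, byte strings out: nothing is decoded, so odd
        # byte sequences on disk cannot raise here.
        return list(raw_names)
    # Unicode in, Unicode out: a strict decode of every name. A single
    # undecodable name raises and the whole listing is lost.
    return [raw.decode(fs_encoding) for raw in raw_names]
```

With this shape, only callers that opt in by passing a Unicode 
path are exposed to decoding errors; existing byte-string callers 
keep their old behavior.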

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are 
non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you
get from os.listdir() should be compatible with (at least) all
the os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will
not break the paths (unlike UTF-8 encoded 8-bit strings, which
tend to break when you slice them).

That said, I think you're right about the ASCII approach,
provided that the os and os.path tools can actually cope
properly with Unicode.

What I worry about is that if os.listdir() gives back Unicode
for e.g. Latin-1 filenames and the application then passes the
Unicode names to a C API using "s", perfectly working code will
break... then again, the C code should really use "es" for
converting to the Py_FileSystemDefaultEncoding, as is done in
e.g. fileobject.c.

I really don't know what to do here...
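
The slicing point made earlier in this comment can be 
illustrated with a small Python 3 sketch (bytes and str stand in 
for the 8-bit and Unicode strings of 2.3; the file name is made 
up for the example):

```python
name = "caf\u00e9.txt"        # Unicode file name with one non-ASCII character
raw = name.encode("utf-8")    # the same name as UTF-8 encoded bytes

text_prefix = name[:4]        # slices between characters
raw_prefix = raw[:4]          # slices between bytes, inside the 2-byte sequence

try:
    raw_prefix.decode("utf-8")
    raw_prefix_is_valid = True
except UnicodeDecodeError:
    # The slice cut a multi-byte character in half.
    raw_prefix_is_valid = False
```

The Unicode slice is still a valid name fragment, while the byte 
slice no longer decodes.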

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...
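
A small Python 3 sketch of the mismatch, assuming a file system 
that stores names as UTF-8 and a default encoding of Latin-1 
(both choices, and the file name, are only for illustration):

```python
# Bytes as a UTF-8 file system would hand them back.
raw = "\u00e9t\u00e9".encode("utf-8")

# Interpreted with the file system encoding: the intended name.
as_fs_encoding = raw.decode("utf-8")

# Interpreted with a different default encoding: no error is raised,
# but the result is a different, wrong name.
as_default_encoding = raw.decode("latin-1")
```

The second decode succeeds silently, which is what makes this 
kind of mismatch hard to notice.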

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes one-byte-per-character encodings, which may not always
be what you have. With UTF-16 your test will see lots of \0
bytes, but not necessarily bytes with ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones.

There's also the possibility of the encoding mapping the ASCII
range to other non-ASCII characters, e.g. ShiftJIS does this
for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms,
so you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that the text is not ASCII compatible. Given that .listdir()
involves lots of IO, I think the added performance hit wouldn't
be noticeable.
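
A Python 3 sketch of the two tests, assuming the simple test 
checks whether any byte is >= 128 (the patch's actual test is not 
shown in this thread, and both function names are hypothetical):

```python
def looks_ascii_bytewise(raw: bytes) -> bool:
    # Simple test: no byte has its high bit set.
    return all(b < 128 for b in raw)


def is_ascii_roundtrip(raw: bytes, fs_encoding: str) -> bool:
    # Convert to Unicode with the file system encoding, then try to
    # convert back to ASCII; an error means the name is not ASCII.
    try:
        raw.decode(fs_encoding).encode("ascii")
    except UnicodeError:
        return False
    return True


# U+0100 (LATIN CAPITAL LETTER A WITH MACRON) encoded as UTF-16-LE
# consists of two bytes that are both below 128.
tricky = "\u0100".encode("utf-16-le")
```

For this input the byte-wise test reports ASCII although the 
character is not ASCII, while the round trip gives the right 
answer.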

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 15:49:49 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 07:49:49 -0800
Subject: [Patches] [ python-Patches-696645 ] VMS patches, cleaning part
Message-ID: <E18psCL-0005Bi-00@sc8-sf-web2.sourceforge.net>

Patches item #696645, was opened at 2003-03-03 16:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696645&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Piéronne Jean-François (pieronne)
Assigned to: Nobody/Anonymous (nobody)
Summary: VMS patches, cleaning part

Initial Comment:
These are the cleaning patches.
I will provide other patches in a separate item.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696645&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 15:51:32 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 07:51:32 -0800
Subject: [Patches] [ python-Patches-696645 ] VMS patches, cleaning part
Message-ID: <E18psE0-0005I4-00@sc8-sf-web2.sourceforge.net>

Patches item #696645, was opened at 2003-03-03 16:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696645&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Piéronne Jean-François (pieronne)
>Assigned to: Martin v. Löwis (loewis)
Summary: VMS patches, cleaning part

Initial Comment:
These are the cleaning patches.
I will provide other patches in a separate item.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696645&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 16:08:53 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 08:08:53 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18psUn-0004UE-00@sc8-sf-web4.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.
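
The warning idea could look roughly like this in Python 3. It is 
a sketch only: decode_names is a hypothetical helper, and keeping 
the undecoded byte string in the result is one possible choice, 
not something this patch is stated to do.

```python
import warnings


def decode_names(raw_names, encoding):
    """Hypothetical: decode each name, warning instead of failing silently."""
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Report the problem rather than dropping the exception.
            warnings.warn("could not decode file name %r" % (raw,),
                          UnicodeWarning)
            # Assumption: keep the original byte string for this entry.
            result.append(raw)
    return result
```

A caller can then turn the warning into an error, or silence it, 
through the standard warnings filters.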

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 16:39:34 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 08:39:34 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18psyU-0003Ad-00@sc8-sf-web1.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. what the code removed by the latest patch did), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.
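
The two options can be compared with a small Python sketch. It is a hypothetical illustration in modern Python 3, not the posixmodule.c code; decode_names and its fallback_to_bytes flag are names invented for this example.

```python
def decode_names(raw_names, encoding, fallback_to_bytes=False):
    """Decode a list of raw file names, handling undecodable entries.

    fallback_to_bytes=False: re-raise the error, so the caller gets no
    partial result (the first option above).
    fallback_to_bytes=True: keep the undecodable name as a byte string
    in the result (the second option above).
    """
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            if not fallback_to_bytes:
                raise  # the partially built result list is discarded
            result.append(raw)
    return result


names = [b"ok.txt", b"bad\xff.txt"]
mixed = decode_names(names, "utf-8", fallback_to_bytes=True)
```

With the first option, a caller that anticipates the exception can call again with a byte string argument and deal with the raw names itself.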

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).
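
For readers who want to see which encodings apply on their own system, here is a minimal sketch using current Python 3 standard library calls. The function name describe_filename_encodings is made up for this example, and the values it reports today differ from those of the 2.3-era interpreter discussed here.

```python
import locale
import sys


def describe_filename_encodings():
    """Collect the encodings relevant to file name handling."""
    info = {
        # Encoding used to convert between text and OS file names.
        "filesystem": sys.getfilesystemencoding(),
        # Default encoding for implicit str/bytes conversions.
        "default": sys.getdefaultencoding(),
    }
    # nl_langinfo and CODESET are not available on every platform
    # (for example, they are missing on Windows), so guard the lookup.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        info["locale_codeset"] = locale.nl_langinfo(locale.CODESET)
    return info


encodings = describe_filename_encodings()
```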

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).
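
The "same type in, same type out" rule can be observed in today's Python 3, where os.listdir() returns str names for a str path and bytes names for a bytes path (str and bytes taking the roles of Unicode and byte strings). The sketch below only demonstrates that later behaviour; it says nothing about how the 2.3 patch was coded, and listdir_types is a helper invented for the example.

```python
import os
import tempfile


def listdir_types(path):
    """Return the set of types of the names os.listdir() yields for path."""
    return {type(name) for name in os.listdir(path)}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the directory listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_result = listdir_types(tmp)                 # str path in
    bytes_result = listdir_types(os.fsencode(tmp))   # bytes path in
```

Code that passes byte strings never sees decoded names, so odd byte sequences on disk cannot trigger a decoding error for it.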

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are 
non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you
get from os.listdir() should be compatible with (at least) all
the os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will
not break the paths (unlike UTF-8 encoded 8-bit strings which
tend to break when you slice them).
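
A short Python 3 illustration of the slicing point. The file name and the helper can_decode are invented for this example: slicing the text keeps whole characters, while slicing the UTF-8 bytes can cut a multi-byte character in half.

```python
# A three-character CJK name plus an ASCII extension (example data only).
name = "\u65e5\u672c\u8a9e.txt"
raw = name.encode("utf-8")  # each CJK character takes 3 bytes in UTF-8


def can_decode(data):
    """Return True if data is a complete, valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


text_prefix = name[:2]   # two whole characters
byte_prefix = raw[:4]    # one whole character plus a dangling lead byte
```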

That said, I think you're right about the ASCII approach
provided that the os, os.path tools can actually properly cope
with Unicode.

What I worry about is that if os.listdir() gives back Unicode
for e.g. Latin-1 filenames and the application then passes the
Unicode names to a C API using "s", perfectly working code will
break... then again the C code should really use "es" for
encoding to the Py_FileSystemDefaultEncoding as is done in e.g.
fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...
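
The mismatch can be made concrete with a small Python 3 sketch. The encodings chosen (UTF-8 for the file system, Latin-1 as the default encoding) and the variable names are assumptions for illustration only.

```python
# The name as text, and the bytes a UTF-8 file system stores for it.
text_name = "caf\u00e9"
on_disk = text_name.encode("utf-8")

# If listdir() converted the name back to a byte string using a
# *different* default encoding (Latin-1 here), the caller would get
# bytes that no longer match what is on disk.
returned = text_name.encode("latin-1")

# An 8-bit string handed to file() is used as-is, so the lookup would
# be done with `returned`, not with `on_disk`.
names_match = (returned == on_disk)
```

Restricting the byte-string result to pure ASCII names avoids this, since ASCII text is encoded identically under both encodings.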

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It only happens that your test
assumes a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms,
so you have to hard-code it.
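
The UTF-16 problem with a simple byte test can be reproduced in Python 3. The function high_bit_test below is a guess at the kind of "simple test" being discussed (any byte >= 128 means non-ASCII), not the code from the patch.

```python
def high_bit_test(raw):
    """Naive check: treat the name as ASCII if no byte is >= 128."""
    return all(byte < 128 for byte in raw)


# Plain ASCII text in UTF-16 is full of \0 bytes.
ascii_name = "abc".encode("utf-16-le")

# U+0141 (Latin capital L with stroke) is not ASCII, yet both of its
# UTF-16 bytes are below 128, so the naive test is fooled.
non_ascii_name = "\u0141".encode("utf-16-le")
fooled = high_bit_test(non_ascii_name)
```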


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as on most platforms the file name can't contain null bytes. Looking at the NAMELEN() spaghetti, though, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that the text is not ASCII compatible. Given that .listdir()
involves lots of IO I think the added performance hit wouldn't
be noticeable.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 16:59:41 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 08:59:41 -0800
Subject: [Patches] [ python-Patches-667730 ] More DictMixin
Message-ID: <E18ptHx-0007hg-00@sc8-sf-web4.sourceforge.net>

Patches item #667730, was opened at 2003-01-14 08:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667730&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: More DictMixin

Initial Comment:
This patch is intended to provide a more consistent
implementation for the various dictionary-like objects
of the standard library.

test_userdict has been rewritten; it now uses unittest
and defines a test case which checks for
conformity with the dictionary protocol.

test_shelve and test_weakref have been rewritten to use
the test_userdict test case.

test_os has been extended: a new test case checks that the
environ object conforms to the dictionary protocol.

The patch modifies the UserDict module:
* The doc says that __contains__ should be one of the
methods to redefine for better efficiency, but the
implementation makes __contains__ depend on the has_key
definition. The patch reverses this dependency.
* Changes iterkeys = __iter__ to def iterkeys(self):
return self.__iter__() so that iterkeys picks up
overridden __iter__ methods.
* I have also added __init__, copy and __repr__
methods to DictMixin.
* The UserDict.UserDict class becomes a subclass of
DictMixin, which simplifies the UserDict
implementation. The patch is rather conservative, since
many method definitions could still be removed from
UserDict.
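
The difference between the alias and the full definition in the second item can be shown with a small Python 3 sketch. All class names here are invented for the example; they are not the UserDict classes.

```python
class AliasMixin:
    def __iter__(self):
        return iter(self.keys())

    # Alias: binds the name to *this* function object once, when the
    # class body runs. Later overrides of __iter__ are not seen.
    iterkeys = __iter__


class DelegatingMixin:
    def __iter__(self):
        return iter(self.keys())

    def iterkeys(self):
        # Full definition: __iter__ is looked up on self at call time.
        return self.__iter__()


class SortedAlias(AliasMixin):
    def keys(self):
        return ["b", "a"]

    def __iter__(self):
        return iter(sorted(self.keys()))


class SortedDelegating(DelegatingMixin):
    def keys(self):
        return ["b", "a"]

    def __iter__(self):
        return iter(sorted(self.keys()))


alias_order = list(SortedAlias().iterkeys())
delegating_order = list(SortedDelegating().iterkeys())
```

Only the delegating version honours the subclass's sorted __iter__; the alias keeps using the mixin's original function.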

In the weakref module, the patch makes
WeakValueDictionary and WeakKeyDictionary subclasses
of UserDict.DictMixin. It also uses nested scopes and the
new generator syntax for iterator methods, and rewrites
WeakKeyDictionary.__delitem__. All of this
decreases the module size by 50%.

In the shelve module, the patch adds a copy() method
which returns a dictionary with the keys and values of
the database.

----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2003-03-03 10:27

Message:
Logged In: YES 
user_id=498191

I have uploaded a new version of the patch, updated to
Python 2.3a2.

I hope to have removed everything that could break
backward compatibility: the new proposed patch now contains
only the testing changes (well, almost: I have also
added a pop method to the weak dictionary classes to make
them compatible with the test case).

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-15 21:50

Message:
Logged In: YES 
user_id=80475

Also, +1 on consolidating the test cases though it should 
be done after any other changes to the files so we can 
make sure that nothing got broken.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-15 21:35

Message:
Logged In: YES 
user_id=80475

* UserDict.UserDict should not change. As Martin pointed
out, inheriting from object changes the semantics in a
non-backward-compatible way.  Also, the class is efficiently 
implemented in terms of an internal dictionary and would be 
slowed down by the nest of calls in Mixin.  Also, I think the 
code is incorrect in defining __iter__; there was a reason it 
was pulled out into a separate subclass -- that was done in 
Py2.2 and is not an easily reversible decision.

*  -0 on the changes to has_key() and __contains__(). 
has_key() was put at a lower level than __contains__ 
because the older dict-style interfaces all define has_key.

* +1 for changing iterkeys() to a full definition (and +1 for 
doing the same for __iter__()).  Sebastien is correct in 
pointing out the advantages of propagating an overridden 
method.

* -1 for altering repr() implementation.  The current 
approach is shorter, cleaner, and faster.

* -1 for adding __nonzero__().  Even dictionaries don't 
implement this method; they let len() do the talking.

* -1 for adding __init__() and copy().  Both need to make 
assumptions about the order and number of parameters 
in the constructor of the class using the mixin.  I think they 
are rarely helpful and are sometimes harmful in introducing 
surprising, hard-to-find errors.  People who need an init() 
or copy() can code them more cleanly and directly in the 
extending class.  Also, I don't think the code is correct: 
since DictMixin will be a base class, the use of super() is 
not what is wanted here -- *if* you were going to do this, 
try something like self.__class__().  Further, adding these 
methods violates my original intent for this class which 
was to extrapolate four basic mapping methods into a full 
mapping interface.  It was not intended as a stand-alone 
class.  Also, copy() cannot guarantee that it is copying all 
the relevant data for the sub-class and that violates the 
definition of what copy() is supposed to do.  If something 
like this were attempted, it should be its own mixin 
(automatically adding copy support to any class) and it 
should be rather sophisticated about how to perfectly 
replicate itself (not easily done if the underlying data is in a 
file, database, or in a distributed app).

* +0 on changing weakdicts provided it is done minimally 
and carefully with attention to leaving semantics 
unchanged and not slowing performance.  The advantage 
goes beyond consistency, it removes code duplication, 
keeps well thought-out logic in one place, and provides an 
automatic interface update from DictMixin if the dictionary 
interface ever sprouts another method.
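
The idea of extrapolating four basic mapping methods into a fuller interface can be sketched as follows. This is a simplified Python 3 illustration, not the UserDict.DictMixin source; MiniDictMixin and PairList are hypothetical names, and only a handful of derived methods are shown. Note that the mixin defines no __init__ or copy(), in line with the objection above.

```python
class MiniDictMixin:
    """Derive extra mapping methods from four primitives:
    __getitem__, __setitem__, __delitem__ and keys()."""

    def __iter__(self):
        return iter(self.keys())

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True

    def __len__(self):
        return len(self.keys())

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def items(self):
        return [(key, self[key]) for key in self.keys()]


class PairList(MiniDictMixin):
    """Toy mapping stored as a list of (key, value) pairs."""

    def __init__(self):
        self._pairs = []

    def __getitem__(self, key):
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

    def __setitem__(self, key, value):
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                self._pairs[i] = (key, value)
                return
        self._pairs.append((key, value))

    def __delitem__(self, key):
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                del self._pairs[i]
                return
        raise KeyError(key)

    def keys(self):
        return [k for k, _ in self._pairs]


m = PairList()
m["x"] = 1
m["y"] = 2
```

If the mixin later gained another derived method, PairList would pick it up without any change, which is the automatic interface update mentioned above.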

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-14 16:43

Message:
Logged In: YES 
user_id=21627

This patch breaks backwards compatibility. UserDict is an
oldstyle class on purpose, since changing it to a newstyle
class will certainly break the compatibility in subtle ways
(e.g. by changing what type(userdictinstance) is).

Unless you can bring forward a better rationale than
consistency, this patch will be rejected.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667730&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 17:45:44 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 09:45:44 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18pu0W-0002Ai-00@sc8-sf-web3.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. what the code removed by the latest patch did), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are 
non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different
angle: things that you get from os.listdir() should be
compatible
with (at least) all the os.path tools and os itself.
Converting to 
Unicode has the advantage that slicing and indexing into the
path names will not break the paths (unlike UTF-8 encoded 8-bit
strings which tend to break when you slice them).
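
A minimal sketch of the slicing point, written in present-day
Python 3 rather than the Python 2.3 under discussion; the file name
is made up for illustration:

```python
name = "\u00e9.txt"                 # 'e' with acute accent, then ".txt"
raw = name.encode("utf-8")          # the accented letter takes two bytes

# Slicing the text always keeps whole characters.
first_char = name[:1]

# Slicing the bytes can cut a multi-byte sequence in half.
first_byte = raw[:1]
try:
    first_byte.decode("utf-8")
    sliced_bytes_valid = True
except UnicodeDecodeError:
    sliced_bytes_valid = False
```

Here first_char is still a complete character, while the one-byte
slice of the UTF-8 form no longer decodes.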

That said, I think you're right about the ASCII approach
provided
that the os, os.path tools can actually properly cope with
Unicode.

What I worry about is that if os.listdir() gives back
Unicode for
e.g. Latin-1 filenames and the application then passes the
Unicode
names to a C API using "s", perfectly working code will break...
then again the C code should really use "es" for decoding to
the Py_FileSystemDefaultEncoding as is done in e.g.
fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.
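
A small Python 3 sketch of why a byte-wise check can misfire on
UTF-16. The helper looks_ascii_bytewise is hypothetical and only
stands in for the kind of "any byte >= 128?" test being discussed;
it is not the code from the patch:

```python
def looks_ascii_bytewise(raw: bytes) -> bool:
    """Naive test: treat the name as ASCII if no byte has the high bit set."""
    return all(b < 128 for b in raw)

# Plain ASCII text, but stored as UTF-16: every second byte is \0.
ascii_text_utf16 = "abc".encode("utf-16-le")

# U+0100 is not ASCII, yet neither of its UTF-16 bytes is >= 128.
non_ascii_utf16 = "\u0100".encode("utf-16-le")

has_nulls = b"\x00" in ascii_text_utf16
naive_says_ascii = looks_ascii_bytewise(non_ascii_utf16)
```

The second name passes the naive test even though its text is not
ASCII, which is the failure mode described above.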

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict
the test to an ASCII isalnum(x) test and then try the
encode/decode 
method I described if this test fails.

Note that isalnum() can be locale dependent on some
platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that
the text is not ASCII compatible. Given that .listdir()
involves lots of
IO I think the added performance hit wouldn't be noticeable.
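
The round-trip test can be sketched as follows. This is Python 3,
not the C of the actual patch, and the name is_ascii_compatible and
its parameters are assumptions made for illustration:

```python
def is_ascii_compatible(raw: bytes, fs_encoding: str) -> bool:
    """Decode with the file system encoding, then try to encode as ASCII."""
    text = raw.decode(fs_encoding)
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False          # the text contains non-ASCII characters
    return True

plain = is_ascii_compatible(b"readme.txt", "utf-8")
accented = is_ascii_compatible("\u00e9.txt".encode("utf-8"), "utf-8")
utf16 = is_ascii_compatible("\u0100".encode("utf-16-le"), "utf-16-le")
```

Because the decision is made on the decoded text, it does not depend
on how the encoding lays out its bytes.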

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 17:56:14 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 09:56:14 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18puAg-0002A4-00@sc8-sf-web4.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Just van Rossum (jvr)
>Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. what the code removed by the latest patch did), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.
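
The two options can be sketched side by side. This is a Python 3
illustration only; decode_names and its on_error parameter are
hypothetical and do not correspond to anything in posixmodule.c:

```python
def decode_names(raw_names, encoding, on_error="raise"):
    """Decode directory entries, handling undecodable names in one of two ways."""
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            if on_error == "raise":
                raise              # option 1: propagate, partial result is lost
            result.append(raw)     # option 2: keep the name as a byte string
    return result

names = [b"ok.txt", b"\xff\xfe.dat"]
mixed = decode_names(names, "utf-8", on_error="bytes")
```

With option 1 the caller gets no list at all and has to call again
with a byte string argument; with option 2 the caller gets a list of
mixed types.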

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.
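
For illustration, present-day Python 3 follows the same
type-in, type-out rule (text path gives text names, byte path gives
byte names). A minimal sketch using a throwaway temporary directory;
the file name is made up:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so there is something to list.
    open(os.path.join(tmp, "a.txt"), "w").close()

    text_names = os.listdir(tmp)                 # str path -> str names
    byte_names = os.listdir(os.fsencode(tmp))    # bytes path -> bytes names
```

The same directory yields two differently typed lists depending only
on the type of the argument.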

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more harm than good.

A practical problem is that os.path.walk doesn't work anymore if there are 
non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different
angle: things that you get from os.listdir() should be
compatible
with (at least) all the os.path tools and os itself.
Converting to 
Unicode has the advantage that slicing and indexing into the
path names will not break the paths (unlike UTF-8 encoded 8-bit
strings which tend to break when you slice them).

That said, I think you're right about the ASCII approach
provided
that the os, os.path tools can actually properly cope with
Unicode.

What I worry about is that if os.listdir() gives back
Unicode for
e.g. Latin-1 filenames and the application then passes the
Unicode
names to a C API using "s", perfectly working code will break...
then again the C code should really use "es" for decoding to
the Py_FileSystemDefaultEncoding as is done in e.g.
fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict
the test to an ASCII isalnum(x) test and then try the
encode/decode 
method I described if this test fails.

Note that isalnum() can be locale dependent on some
platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that
the text is not ASCII compatible. Given that .listdir()
involves lots of
IO I think the added performance hit wouldn't be noticeable.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 19:59:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 11:59:54 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18pw6M-00088b-00@sc8-sf-web2.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 16:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.
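
A short sketch of the proposed behaviour, shown with the dict.pop
of present-day Python, which accepts an optional default; the
dictionary contents are made up:

```python
d = {"a": 1}

hit = d.pop("a", None)     # key present: value is removed and returned
miss = d.pop("a", None)    # key absent: the default is returned instead

# Without a default the old behaviour remains: a missing key raises KeyError.
try:
    d.pop("a")
    raised = False
except KeyError:
    raised = True
```

This mirrors d.get(key, default), except that a found key is also
removed from the dict.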

----------------------------------------------------------------------

>Comment By: Michael Stone (mbrierst)
Date: 2003-03-03 19:59

Message:
Logged In: YES 
user_id=670441

Should I make a new NEWS item, or should
I modify the existing NEWS item about dict.pop?

And should I make a new whatsnew23 item or
modify the existing one?

I'm guessing a new NEWS item and a modified
whatsnew item, but I'll post a patch when you tell me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-02 02:40

Message:
Logged In: YES 
user_id=31435

dicts have a .pop() method?  Heh.  I must have slept 
through that one <wink>.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-01 02:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me, I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-01 01:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get()-like functionality for 
dict.pop().  The nearest parallel is the default argument for 
getattr(obj, attr, [default]).  On the plus side, it makes pop 
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 21:14:53 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 13:14:53 -0800
Subject: [Patches] [ python-Patches-691928 ] Use datetime in _strptime
Message-ID: <E18pxGv-0000R9-00@sc8-sf-web1.sourceforge.net>

Patches item #691928, was opened at 2003-02-23 16:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=691928&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Use datetime in _strptime

Initial Comment:
To prevent code duplication, I patched _strptime to use datetime's date object to do Julian day, Gregorian, and day of the week calculations (Tim's code has to be more reliable than mine  =).  The patch also includes new regression tests that check the results and that the calculation gets triggered.

The patch also makes very minor comment changes and updates my contact email.
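
A rough sketch of how a date object can supply these values. It
assumes "Julian day" here means the day of the year as found in a
time tuple; it is not taken from the patch itself, and the example
date is arbitrary:

```python
from datetime import date

d = date(2003, 3, 3)

# Day of the year: ordinal distance from January 1st, counting from 1.
day_of_year = d.toordinal() - date(d.year, 1, 1).toordinal() + 1

# Day of the week, with Monday == 0 as in time tuples.
weekday = d.weekday()

# The date object can also produce a full, consistent time tuple.
tm = d.timetuple()
```

Deriving the fields from one date object keeps them consistent with
each other, which is the point of returning only valid dates.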

----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2003-03-03 13:14

Message:
Logged In: YES 
user_id=357491

Response to meta comment - I would normally delete it, Skip, but last time I tried I was told I didn't have the proper rights to do it.  Unless SF has changed their setup to allow patch creators to manage the files regardless of whether they have CVS access I can't.

Response to comment comment - The reason I am doing this is that I want to make sure that the returned time tuple is a valid date.  If strptime is going to have default values I want those values to lead to a valid time that does not require someone to have to do more processing or wonder whether it is valid.

Now currently the docs say you can't expect anything back in the time tuple but what was in the data string, so doing this does not go against the docs.  But if strptime becomes the only strptime implementation, then I will write a doc patch to make the docs say that all returned time tuples will be valid dates.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-03-03 07:03

Message:
Logged In: YES 
user_id=44345

Meta comment - I think that when uploading successive patches it's useful
to either name them differently or delete the prior one to avoid confusion.
In this case it's not a big deal, especially since the submission dates are
different, but after a few revisions it can sometimes be a challenge to
figure out which patch should be downloaded.
 
Comment comment - Unless there's some evidence the elided functions
have been used, I suspect it best to just let people use the relevant
datetime functions.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-02-25 13:51

Message:
Logged In: YES 
user_id=357491

Only in the module (which was removed).  None of the helper functions have ever been publicly advertised (although I think the locale date info might be helpful in locale; MvL wasn't interested, though).

I uploaded a new diff that removes one more line that I forgot to remove when I eliminated the ability to pass in a regex object.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-23 16:56

Message:
Logged In: YES 
user_id=33168

Brett, is there any doc for the functions that were removed?
   firstjulian, gregorian, julianday, dayofweek

Otherwise, the patch seemed fine (but I didn't look that
closely).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=691928&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 22:23:22 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 14:23:22 -0800
Subject: [Patches] [ python-Patches-671384 ] test_pty hanging on hpux11
Message-ID: <E18pyLC-0005vh-00@sc8-sf-web3.sourceforge.net>

Patches item #671384, was opened at 2003-01-20 16:23
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671384&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Martin v. Löwis (loewis)
Summary: test_pty hanging on hpux11

Initial Comment:
The attached hack fixes a problem which occurs since
switching the pty code.  isatty() hangs if the slave_fd
is closed and reopened as in the deprecated APIs
pty.master_open() and pty.slave_open().

This patch reverts to the old behaviour where
_open_terminal() is called in master_open() to avoid
the hang later.

Here's a very simple test for the problem:

import pty, os

master_fd, slave_name = pty.master_open()
slave_fd = pty.slave_open(slave_name)
print os.isatty(slave_fd)

In slave_open() the first ioctl raises an IOError,
Invalid Argument 22.

I don't know if this problem affects hpux10.  Hopefully
someone will have a better idea how to really fix this
problem.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 17:23

Message:
Logged In: YES 
user_id=33168

I don't understand what you are asking for.  By 'specific
release', do you mean of Solaris/HP-UX?  I believe on
Solaris there's an exception, but on HP-UX it hangs.  But I
don't recall exactly.  I agree this patch is not optimal.  I
can also try on our Solaris box here.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 05:59

Message:
Logged In: YES 
user_id=21627

I can't reproduce a test failure for Solaris 8 (on the SF
compile farm) for Python 2.3a2. Can you please try that
specific release and report what test fails for you, in
which way?

I'm concerned that the patch isn't that good, e.g. on Linux,
it would cause usage of the old-style interface to
pseudo-terminals, even though an all-singing all-dancing
Unix98 pty support is available in the C library.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-27 19:21

Message:
Logged In: YES 
user_id=33168

I have attached an updated patch.  It seems Solaris 8 (on
the snake farm) also had a test failure.  I have basically
restored the old functionality in this patch. 
_open_terminal is called if /dev/ptmx exists, so
os.openpty() is not called.  This fixes the test
failures/hangs on both solaris and hpux and should be
equivalent to the 2.2 behaviour.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671384&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 22:25:38 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 14:25:38 -0800
Subject: [Patches] [ python-Patches-658327 ] Add inet_pton and inet_ntop to socket
Message-ID: <E18pyNO-0006BL-00@sc8-sf-web2.sourceforge.net>

Patches item #658327, was opened at 2002-12-24 16:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jp Calderone (kuran)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: Add inet_pton and inet_ntop to socket

Initial Comment:
Patch is against current CVS and adds two socket module
functions, inet_pton and inet_ntop.  Both of these
should be available on all platforms (because of other
dependencies in the code) so I don't think portability
is a problem.  inet_ntop converts a packed IP address
to a human-readable '.' or ':' separated string
representation of the IP.  inet_pton performs the
reverse operation.
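
For readers unfamiliar with the two functions, a minimal round-trip
sketch follows.  It assumes a Python build in which socket.inet_pton
and socket.inet_ntop are available; the sample addresses are arbitrary
documentation addresses, not taken from the patch.

```python
import socket

# Text form -> packed binary form (4 bytes for IPv4).
packed_v4 = socket.inet_pton(socket.AF_INET, "192.0.2.1")

# Packed binary form -> human-readable text form.
text_v4 = socket.inet_ntop(socket.AF_INET, packed_v4)

# IPv6 addresses pack to 16 bytes and use ':' as the separator.
packed_v6 = socket.inet_pton(socket.AF_INET6, "2001:db8::1")
text_v6 = socket.inet_ntop(socket.AF_INET6, packed_v6)
```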

(Potential) problems: inet_pton sets errno to ENOSPC,
which may lead to a confusing error message.


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 17:25

Message:
Logged In: YES 
user_id=33168

As I recall, yes, has_ipv6 is only for tests.  There was no
way to distinguish if python was built with IPv6 support,
since AF_INET6 was always defined.

Your second approach sounds like it will work.  I need to
review the code, though.  I've forgotten how it works. :-(

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 05:15

Message:
Logged In: YES 
user_id=21627

The has_ipv6 test is only there for the tests? In that case,
drop it, and just perform AF_INET6 conversions unconditionally.

OTOH, I think we should not expose the emulated inet_pton:
it doesn't set errno correctly, and offers no advantage over
inet_addr. So wrap the entire code with HAVE_INET_PTON, and
only perform the tests if the function is supported.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-04 21:40

Message:
Logged In: YES 
user_id=33168

I was just about to check this in, but then I ran into a
problem.  IPv6 may not be enabled, even if the constant
AF_INET6 exists.  The cleanest way I saw to address this in
the test was to add a has_ipv6 boolean constant to the
socket module.  Martin, do you think this is acceptable?

Attached is a complete patch which should be safe (based on
the discussion below), includes tests and doc changes.

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2003-01-11 12:04

Message:
Logged In: YES 
user_id=366566

Yea, testing for the proper input length is definitely
something that should be done.  The patch looks good, but
for one thing.  If the specified address family is neither
AF_INET nor AF_INET6, the length won't be tested and the
underlying inet_ntop will be called.  This isn't a problem
now (afaik) because only those two address families are
supported, but in a future libc version with more supported
address families, it might open a similar hole to the one
you've fixed.  Perhaps the

+       } else {
+               PyErr_SetString(socket_error, "unknown address family");
+               return NULL;
+       }

should be moved up from the second if-grouping to follow the
first if-grouping.  Everything else looks good to me. 
Thanks for taking the time to look at this :)


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-10 22:49

Message:
Logged In: YES 
user_id=33168

JP, do you agree with my comment on 2002-12-30 about the
checks?  I have attached an updated patch.  Please review
and verify this is correct.

Thank you for the additional tests.  Feel free to submit
patches with additional tests for any and all modules!

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-31 11:52

Message:
Logged In: YES 
user_id=366566

Doc, NEWS, and test_socket patch attached.  I didn't notice
any inet_aton/inet_ntoa tests in the module so I added a
couple for those as well (I excluded a test for
inet_ntoa('255.255.255.255') ;) Also included are a couple
IPv6 tests.  I'm not sure if these are appropriate, since
many systems may still lack the required support for them to
pass.  I'll leave it up to you to decide whether they should
be commented out or removed or whatever.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-31 08:17

Message:
Logged In: YES 
user_id=21627

I agree that such a change should be added. Neal, you have
given this patch more attention than I did - please check it
in when you consider it complete. I just like to point out
that it is missing documentation changes (libsocket.tex), a
NEWS entry, and a test case. kuran, please provide those as
a single patch file.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-30 19:11

Message:
Logged In: YES 
user_id=33168

ISTM that in socket_inet_ntop() you need to verify the size
of the packed value passed in.  If the user passes an empty
string, inet_ntop() could read beyond the buffer passed in,
potentially causing a core dump.

The checks could be something like this:

  if (af == AF_INET && len != sizeof(struct in_addr))
  else if (af == AF_INET6 && len != sizeof(struct in6_addr))

Does this make sense?
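
The same validation can be sketched at the Python level.  This is only
an illustration of the idea, not the C code from the patch: the helper
name checked_inet_ntop, the PACKED_SIZES table and the error messages
are all hypothetical.  The sizes 4 and 16 correspond to
sizeof(struct in_addr) and sizeof(struct in6_addr).

```python
import socket

# Expected packed sizes per address family (assumed table for this sketch).
PACKED_SIZES = {socket.AF_INET: 4, socket.AF_INET6: 16}

def checked_inet_ntop(af, packed):
    """Validate family and length before delegating to socket.inet_ntop."""
    expected = PACKED_SIZES.get(af)
    if expected is None:
        # Reject families we cannot length-check instead of passing them on.
        raise ValueError("unknown address family")
    if len(packed) != expected:
        # A short buffer would otherwise be read past its end in C.
        raise ValueError("invalid length of packed IP address string")
    return socket.inet_ntop(af, packed)
```

Rejecting unknown families before the length check means a future
address family can never reach the underlying call unchecked.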

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-27 10:39

Message:
Logged In: YES 
user_id=366566

The use case I have for it at the moment is a DNS server
(Twisted.names).  inet_pton allows me to handle IPv6
addresses, so it allows me to support AAAA and A6 records. 
I believe an IPv6 capable socks proxy would find this useful
as well.  Basically, low level network stuff.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-27 05:23

Message:
Logged In: YES 
user_id=21627

What is the rationale for providing this functionality?

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-26 13:32

Message:
Logged In: YES 
user_id=366566

Ooops, I made two, and uploaded the wrong one >:O  Sorry. 
Dunno if it's still helpful, but here's the unified diff.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-26 13:10

Message:
Logged In: YES 
user_id=33168

Next time, please use context or unified diff.  -c or -u
option to cvs diff:  cvs diff -c ...

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-24 16:05

Message:
Logged In: YES 
user_id=366566

Sourceforge decided not to attach the file the first time...
 Here it is.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470


From noreply@sourceforge.net  Mon Mar  3 23:13:36 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 15:13:36 -0800
Subject: [Patches] [ python-Patches-671384 ] test_pty hanging on hpux11
Message-ID: <E18pz7o-0007u1-00@sc8-sf-web3.sourceforge.net>

Patches item #671384, was opened at 2003-01-20 22:23
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671384&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Martin v. Löwis (loewis)
Summary: test_pty hanging on hpux11

Initial Comment:
The attached hack fixes a problem which occurs since
switching the pty code.  isatty() hangs if the slave_fd
is closed and reopened as in the deprecated APIs
pty.master_open() and pty.slave_open().

This patch reverts to the old behaviour where
_open_terminal() is called in master_open() to avoid
the hang later.

Here's a very simple test for the problem:

import pty, os

master_fd, slave_name = pty.master_open()
slave_fd = pty.slave_open(slave_name)
print os.isatty(slave_fd)

In slave_open() the first ioctl raises an IOError,
Invalid Argument 22.

I don't know if this problem affects hpux10.  Hopefully
someone will have a better idea how to really fix this
problem.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 00:13

Message:
Logged In: YES 
user_id=21627

By 'specific release', I mean Python 2.3a2, with no patches.
I can't reproduce an exception on that Python release, for
Solaris 8.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 23:23

Message:
Logged In: YES 
user_id=33168

I don't understand what you are asking for.  By 'specific
release', do you mean of Solaris/HP-UX?  I believe on
Solaris there's an exception, but on HP-UX it hangs.  But I
don't recall exactly.  I agree this patch is not optimal.  I
can also try on our Solaris box here.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:59

Message:
Logged In: YES 
user_id=21627

I can't reproduce a test failure for Solaris 8 (on the SF
compile farm) for Python 2.3a2. Can you please try that
specific release and report what test fails for you, in
which way?

I'm concerned that the patch isn't that good, e.g. on Linux,
it would cause usage of the old-style interface to
pseudo-terminals, even though an all-singing all-dancing
Unix98 pty support is available in the C library.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-28 01:21

Message:
Logged In: YES 
user_id=33168

I have attached an updated patch.  It seems Solaris 8 (on
the snake farm) also had a test failure.  I have basically
restored the old functionality in this patch. 
_open_terminal is called if /dev/ptmx exists, so
os.openpty() is not called.  This fixes the test
failures/hangs on both solaris and hpux and should be
equivalent to the 2.2 behaviour.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671384&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 03:41:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 19:41:21 -0800
Subject: [Patches] [ python-Patches-658327 ] Add inet_pton and inet_ntop to socket
Message-ID: <E18q3Iv-0006au-00@sc8-sf-web1.sourceforge.net>

Patches item #658327, was opened at 2002-12-24 16:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jp Calderone (kuran)
>Assigned to: Martin v. Löwis (loewis)
Summary: Add inet_pton and inet_ntop to socket

Initial Comment:
Patch is against current CVS and adds two socket module
functions, inet_pton and inet_ntop.  Both of these
should be available on all platforms (because of other
dependencies in the code) so I don't think portability
is a problem.  inet_ntop converts a packed IP address
to a human-readable '.' or ':' separated string
representation of the IP.  inet_pton performs the
reverse operation.

(Potential) problems: inet_pton sets errno to ENOSPC,
which may lead to a confusing error message.


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 22:41

Message:
Logged In: YES 
user_id=33168

I added the #ifdef, but that doesn't address the testing
problem.  If the platform has inet_pton but doesn't have
IPv6 enabled, inet_pton will still be exported, but there's
no good way to tell if you can pass an IPv6 address.  The
only way to test if IPv6 is enabled would be to call
inet_pton with AF_INET6, catch a socket.error and check if
the exception message is "unknown address family".  Since
this is really a testing issue, perhaps that's best after all?

Do you agree this should be done?
 * Remove has_ipv6
 * Export inet_pton & inet_ntop only if defined for platform
 * Only try to test inet_pton/ntop if defined for platform
 * Modify the tests to pass a valid IPv6 address and catch
socket.error: if the error message is "unknown address
family", don't test IPv6 any further; if the error message
is different, raise TestFailed; if there is no exception,
test all IPv6 addresses
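
A rough sketch of the probing step described in the last item.  The
function name ipv6_conversions_usable is hypothetical, and the message
text "unknown address family" is assumed to match what the patch under
discussion raises; other builds may word the error differently.

```python
import socket

def ipv6_conversions_usable():
    """Return True if inet_pton accepts an IPv6 address on this build."""
    if not hasattr(socket, "inet_pton") or not hasattr(socket, "AF_INET6"):
        return False  # nothing exported for this platform, skip the tests
    try:
        socket.inet_pton(socket.AF_INET6, "::1")
    except socket.error as exc:
        # Assumed message text, taken from the patch being discussed.
        if "unknown address family" in str(exc):
            return False  # IPv6 not enabled: skip the IPv6 tests
        raise  # any other error is a genuine test failure
    return True
```

A test module could call this once and run the IPv6 cases only when it
returns True.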

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 17:25

Message:
Logged In: YES 
user_id=33168

As I recall, yes, has_ipv6 is only for tests.  There was no
way to distinguish if python was built with IPv6 support,
since AF_INET6 was always defined.

Your second approach sounds like it will work.  I need to
review the code, though.  I've forgotten how it works. :-(

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 05:15

Message:
Logged In: YES 
user_id=21627

The has_ipv6 test is only there for the tests? In that case,
drop it, and just perform AF_INET6 conversions unconditionally.

OTOH, I think we should not expose the emulated inet_pton:
it doesn't set errno correctly, and offers no advantage over
inet_addr. So wrap the entire code with HAVE_INET_PTON, and
only perform the tests if the function is supported.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-04 21:40

Message:
Logged In: YES 
user_id=33168

I was just about to check this in, but then I ran into a
problem.  IPv6 may not be enabled, even if the constant
AF_INET6 exists.  The cleanest way I saw to address this in
the test was to add a has_ipv6 boolean constant to the
socket module.  Martin, do you think this is acceptable?

Attached is a complete patch which should be safe (based on
the discussion below), includes tests and doc changes.

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2003-01-11 12:04

Message:
Logged In: YES 
user_id=366566

Yea, testing for the proper input length is definitely
something that should be done.  The patch looks good, but
for one thing.  If the specified address family is neither
AF_INET nor AF_INET6, the length won't be tested and the
underlying inet_ntop will be called.  This isn't a problem
now (afaik) because only those two address families are
supported, but in a future libc version with more supported
address families, it might open a similar hole to the one
you've fixed.  Perhaps the

+       } else {
+               PyErr_SetString(socket_error, "unknown address family");
+               return NULL;
+       }

should be moved up from the second if-grouping to follow the
first if-grouping.  Everything else looks good to me. 
Thanks for taking the time to look at this :)


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-10 22:49

Message:
Logged In: YES 
user_id=33168

JP, do you agree with my comment on 2002-12-30 about the
checks?  I have attached an updated patch.  Please review
and verify this is correct.

Thank you for the additional tests.  Feel free to submit
patches with additional tests for any and all modules!

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-31 11:52

Message:
Logged In: YES 
user_id=366566

Doc, NEWS, and test_socket patch attached.  I didn't notice
any inet_aton/inet_ntoa tests in the module so I added a
couple for those as well (I excluded a test for
inet_ntoa('255.255.255.255') ;) Also included are a couple
IPv6 tests.  I'm not sure if these are appropriate, since
many systems may still lack the required support for them to
pass.  I'll leave it up to you to decide whether they should
be commented out or removed or whatever.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-31 08:17

Message:
Logged In: YES 
user_id=21627

I agree that such a change should be added. Neal, you have
given this patch more attention than I did - please check it
in when you consider it complete. I just like to point out
that it is missing documentation changes (libsocket.tex), a
NEWS entry, and a test case. kuran, please provide those as
a single patch file.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-30 19:11

Message:
Logged In: YES 
user_id=33168

ISTM that in socket_inet_ntop() you need to verify the size
of the packed value passed in.  If the user passes an empty
string, inet_ntop() could read beyond the buffer passed in,
potentially causing a core dump.

The checks could be something like this:

  if (af == AF_INET && len != sizeof(struct in_addr))
  else if (af == AF_INET6 && len != sizeof(struct in6_addr))

Does this make sense?

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-27 10:39

Message:
Logged In: YES 
user_id=366566

The use case I have for it at the moment is a DNS server
(Twisted.names).  inet_pton allows me to handle IPv6
addresses, so it allows me to support AAAA and A6 records. 
I believe an IPv6 capable socks proxy would find this useful
as well.  Basically, low level network stuff.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-27 05:23

Message:
Logged In: YES 
user_id=21627

What is the rationale for providing this functionality?

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-26 13:32

Message:
Logged In: YES 
user_id=366566

Ooops, I made two, and uploaded the wrong one >:O  Sorry. 
Dunno if it's still helpful, but here's the unified diff.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-26 13:10

Message:
Logged In: YES 
user_id=33168

Next time, please use context or unified diff.  -c or -u
option to cvs diff:  cvs diff -c ...

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-24 16:05

Message:
Logged In: YES 
user_id=366566

Sourceforge decided not to attach the file the first time...
 Here it is.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 04:03:13 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 20:03:13 -0800
Subject: [Patches] [ python-Patches-696613 ] test options don't work on FreeBSD
Message-ID: <E18q3e5-0007BQ-00@sc8-sf-web3.sourceforge.net>

Patches item #696613, was opened at 2003-03-03 10:19
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696613&group_id=5470

Category: Build
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Ben Laurie (benl)
>Assigned to: Jack Jansen (jackjansen)
Summary: test options don't work on FreeBSD

Initial Comment:
test -L is used during make install - I'm guessing it
is supposed to test for a softlink. Sadly, this is -h
under FreeBSD, so the install fails.


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 23:03

Message:
Logged In: YES 
user_id=33168

What version of FreeBSD?  I'm on 4.7 (SF compile farm), and
the man page says:

     -h file       True if file exists and is a symbolic link.  This operator
                   is retained for compatibility with previous versions of
                   this program.  Do not rely on its existence; use -L
                   instead.

I tested -h on Linux, HPUX11, and Solaris 8.  -h and -L both
work fine.
Assigning to Jack, since he checked in this code.  I wonder
if there's any issue on the Mac?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696613&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 04:26:14 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 20:26:14 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18q40M-0002Nf-00@sc8-sf-web4.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 11:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.
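
A short illustration of the proposed behaviour, assuming the patch is
applied.  The dictionary contents and variable names are made up for
the example.

```python
inventory = {"apples": 3}

# Key present: the value is removed and returned; the default is ignored.
apples = inventory.pop("apples", 0)

# Key absent: the default is returned instead of raising KeyError.
pears = inventory.pop("pears", 0)

# Without a default the old behaviour is unchanged: a missing key raises.
try:
    inventory.pop("pears")
    raised = False
except KeyError:
    raised = True
```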

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-03 23:26

Message:
Logged In: YES 
user_id=80475

For NEWS, add a new entry (so that it documents a 
difference from Py2.3a2).

For whatsnew23, modify the existing entry (since it is a 
delta from Py2.2).

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-03 14:59

Message:
Logged In: YES 
user_id=670441

Should I make a new NEWS item, or should
I modify the existing NEWS item about dict.pop?

And should I make a new whatsnew23 item or
modify the existing one?

I'm guessing a new NEWS item and a modified
whatsnew item, but I'll post a patch when you tell me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-01 21:40

Message:
Logged In: YES 
user_id=31435

dicts have a .pop() method?  Heh.  I must have slept 
through that one <wink>.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-28 21:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me, I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-02-28 20:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get() like functionality for 
dict.pop().  The nearest parallel is the default argument for 
getattr(obj, attr, [default]).  On the plus side, it makes pop 
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 06:49:19 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 22:49:19 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18q6Ep-0006UC-00@sc8-sf-web4.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.
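
The two options can be sketched in pure Python.  This is not the code
in posixmodule.c; the helper decode_names and its strict flag are
hypothetical, and the sketch uses Python 3 spelling (bytes and str in
place of byte strings and Unicode strings).

```python
def decode_names(raw_names, encoding, strict=True):
    """Decode directory entries, handling undecodable names two ways."""
    decoded = []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            if strict:
                # Option 1: re-raise, discarding the partial result.
                raise
            # Option 2: keep the undecodable name as a byte string.
            decoded.append(raw)
    return decoded
```

With strict=True a caller that anticipates the exception can repeat the
listing with a byte string argument and handle the names itself.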

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.
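
As an illustration of the same-type-in, same-type-out rule: Python 3
follows this convention, with bytes and str playing the roles of byte
strings and Unicode strings.  The sketch below assumes Python 3; the
temporary directory and the file name example.txt are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # Text directory name in, text file names out.
    text_names = os.listdir(tmp)

    # Byte string directory name in, byte string file names out.
    byte_names = os.listdir(os.fsencode(tmp))
```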

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).
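
A minimal sketch of the Unicode-in-Unicode-out rule described above, written in Python rather than the C of posixmodule.c. The function name listdir_names, the raw_names parameter (standing in for the byte strings the directory scan would produce), and the utf-8 default are assumptions made for illustration only.

```python
def listdir_names(path, raw_names, fs_encoding="utf-8"):
    """Mirror the argument type: bytes path -> bytes names, text path -> text names.

    raw_names is a hypothetical stand-in for the byte strings read from disk.
    """
    if isinstance(path, bytes):
        # Byte string directory name: return the raw names untouched.
        return list(raw_names)
    # Unicode directory name: decode every entry with the file system encoding.
    return [name.decode(fs_encoding) for name in raw_names]


names = [b"readme.txt", b"caf\xc3\xa9"]
print(listdir_names(b".", names))
print(listdir_names(".", names))
```

With this rule, code that only ever passes byte strings keeps getting byte strings back, which is the compatibility property the comment asks for.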

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ASCII directories in the directory tree (os.listdir will return these as Unicode names, but doesn't accept Unicode on input). See bug #696261. An additional problem is that various other functions in posix don't do the Unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding, which are incompatible with the Unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the advantage
that slicing and indexing into the path names will not break the
paths (unlike UTF-8 encoded 8-bit strings, which tend to break when
you slice them).
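
A small illustration of the slicing point, as a sketch only. The file name and the helper decodes_cleanly are made up for the example; the behaviour shown is that of Python's utf-8 codec.

```python
def decodes_cleanly(chunk, encoding="utf-8"):
    """Return True if the byte string is a complete, valid sequence in encoding."""
    try:
        chunk.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


name = "na\u00efve.txt"          # text: one item per character
encoded = name.encode("utf-8")   # bytes: the i-diaeresis takes two bytes

print(name[:3])                      # slicing text keeps whole characters
print(decodes_cleanly(encoded[:3]))  # slicing bytes can cut a character in half
print(decodes_cleanly(encoded[:4]))
```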

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again, the C code should really use "es" for encoding to
the Py_FileSystemDefaultEncoding as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It only happens that your test assumes
a one-byte-per-character encoding, which may not always be true.
With UTF-16 your test will see lots of \0 bytes, but not necessarily
bytes with ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least restrict
the test to an ASCII isalnum(x) test and then try the encode/decode
method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.
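
A sketch of what a hard-coded, locale-independent alphanumeric test could look like, in Python for brevity (the patch itself is C). ASCII_ALNUM and is_plain_alnum are hypothetical names; a name that fails this fast check would then go through the slower encode/decode test.

```python
# Hard-coded table, so the result never depends on the current locale.
ASCII_ALNUM = frozenset(
    b"abcdefghijklmnopqrstuvwxyz"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"0123456789"
)


def is_plain_alnum(raw):
    """True if every byte of raw is an ASCII letter or digit."""
    return all(byte in ASCII_ALNUM for byte in raw)


print(is_plain_alnum(b"abc123"))
# UTF-16 text has no byte >= 128 here, yet it is full of \0 bytes,
# so a plain "ord(x) >= 128" test would misjudge it.
print(is_plain_alnum("ab".encode("utf-16-le")))
```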


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be sure
that the text is not ASCII compatible. Given that .listdir() involves
lots of IO, I think the added performance hit wouldn't be noticeable.
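
The round trip described above, sketched in Python under the assumption that an ASCII-only name should come back as a byte string and anything else as Unicode. name_from_fs is a hypothetical helper, not a function from the patch.

```python
def name_from_fs(raw, fs_encoding):
    """Decode raw with the file system encoding, then try to go back to ASCII.

    Returns an ASCII byte string when the round trip succeeds and the
    decoded text otherwise.
    """
    text = raw.decode(fs_encoding)
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        # Not ASCII compatible: keep the Unicode form.
        return text


print(name_from_fs(b"setup.py", "utf-8"))
print(name_from_fs(b"caf\xc3\xa9", "utf-8"))
print(name_from_fs(b"a\x00b\x00", "utf-16-le"))
```

Unlike a byte-level test, this gives a sensible answer for UTF-16 input as well, at the cost of one extra conversion per name.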

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 07:05:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 03 Mar 2003 23:05:11 -0800
Subject: [Patches] [ python-Patches-658327 ] Add inet_pton and inet_ntop to socket
Message-ID: <E18q6UB-000738-00@sc8-sf-web4.sourceforge.net>

Patches item #658327, was opened at 2002-12-24 22:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jp Calderone (kuran)
Assigned to: Martin v. Löwis (loewis)
Summary: Add inet_pton and inet_ntop to socket

Initial Comment:
Patch is against current CVS and adds two socket module
functions, inet_pton and inet_ntop.  Both of these
should be available on all platforms (because of other
dependencies in the code) so I don't think portability
is a problem.  inet_ntop converts a packed IP address
to a human-readable '.' or ':' separated string
representation of the IP.  inet_pton performs the
reverse operation.

(Potential) problems: inet_pton sets errno to ENOSPC,
which may lead to a confusing error message.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 08:05

Message:
Logged In: YES 
user_id=21627

My two suggestions aren't exclusive: If you have the native
inet_pton, you can *always* support IPv6 addresses with
that, regardless of whether --enable-ipv6 was passed to
configure or not.

If that is done, it will be a legitimate test failure for
inet_pton not to support IPv6 - after all, the primary
reason to define this function was to support IPv6, so if
the native function fails to do so, there is clearly a bug
in the system.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-04 04:41

Message:
Logged In: YES 
user_id=33168

I added the #ifdef, but that doesn't address the testing
problem.  If the platform has inet_pton but doesn't have
IPv6 enabled, inet_pton will be exported, but there's
no good way to tell if you can pass an IPv6 address.  The
only way to test if IPv6 is enabled would be to call
inet_pton with AF_INET6, catch a socket.error and check if
the exception message is "unknown address family".  Since
this is really a testing issue, perhaps that's best after all?

Do you agree this should be done?
 * Remove has_ipv6
 * Export inet_pton & inet_ntop only if defined for platform
 * Only try to test inet_pton/ntop if defined for platform
 * Modify the tests to pass a valid IPv6 address, catch
   socket.error; if the error message is "unknown address
   family", don't test IPv6 any further; if the error message
   is different, raise TestFailed; if no exception, test all
   IPv6 addresses
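
The probing step in the last item could be sketched as follows. This is only an illustration: ipv6_pton_usable is a hypothetical helper, the "unknown address family" text comes from the patch under discussion, and the errno check is an added assumption for platforms that report EAFNOSUPPORT instead.

```python
import errno
import socket


def ipv6_pton_usable(sample="::1"):
    """Return True if inet_pton accepts an IPv6 address on this platform.

    Returns False when inet_pton or AF_INET6 is missing or the address
    family is rejected; any other error is re-raised as a real failure.
    """
    if not hasattr(socket, "inet_pton") or not hasattr(socket, "AF_INET6"):
        return False
    try:
        socket.inet_pton(socket.AF_INET6, sample)
    except OSError as exc:
        unsupported = getattr(errno, "EAFNOSUPPORT", None)
        if unsupported is not None and exc.errno == unsupported:
            return False
        if "unknown address family" in str(exc):
            return False
        raise
    return True


if ipv6_pton_usable():
    print("run the IPv6 conversion tests")
else:
    print("skip the IPv6 conversion tests")
```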

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 23:25

Message:
Logged In: YES 
user_id=33168

As I recall, yes, has_ipv6 is only for tests.  There was no
way to distinguish if python was built with IPv6 support,
since AF_INET6 was always defined.

Your second approach sounds like it will work.  I need to
review the code, though.  I've forgotten how it works. :-(

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:15

Message:
Logged In: YES 
user_id=21627

The has_ipv6 test is only there for the tests? In that case,
drop it, and just perform AF_INET6 conversions unconditionally.

OTOH, I think we should not expose the emulated inet_pton:
it doesn't set errno correctly, and offers no advantage over
inet_addr. So wrap the entire code with HAVE_INET_PTON, and
only perform the tests if the function is supported.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-05 03:40

Message:
Logged In: YES 
user_id=33168

I was just about to check this in, but then I ran into a
problem.  IPv6 may not be enabled, even if the constant
AF_INET6 exists.  The cleanest way I saw to address this in
the test was to add a has_ipv6 boolean constant to the
socket module.  Martin, do you think this is acceptable?

Attached is a complete patch which should be safe (based on
the discussion below), includes tests and doc changes.

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2003-01-11 18:04

Message:
Logged In: YES 
user_id=366566

Yea, testing for the proper input length is definitely
something that should be done.  The patch looks good, but
for one thing.  If the specified address family is neither
AF_INET nor AF_INET6, the length won't be tested and the
underlying inet_ntop will be called.  This isn't a problem
now (afaik) because only those two address families are
supported, but in a future libc version with more supported
address families, it might open a similar hole to the one
you've fixed.  Perhaps the

+       } else {
+               PyErr_SetString(socket_error, "unknown address family");
+               return NULL;
+       }

should be moved up from the second if-grouping to follow the
first if-grouping.  Everything else looks good to me. 
Thanks for taking the time to look at this :)


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-11 04:49

Message:
Logged In: YES 
user_id=33168

JP, do you agree with my comment on 2002-12-30 about the
checks?  I have attached an updated patch.  Please review
and verify this is correct.

Thank you for the additional tests.  Feel free to submit
patches with additional tests for any and all modules!

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-31 17:52

Message:
Logged In: YES 
user_id=366566

Doc, NEWS, and test_socket patch attached.  I didn't notice
any inet_aton/inet_ntoa tests in the module so I added a
couple for those as well (I excluded a test for
inet_ntoa('255.255.255.255') ;) Also included are a couple
IPv6 tests.  I'm not sure if these are appropriate, since
many systems may still lack the required support for them to
pass.  I'll leave it up to you to decide whether they should
be commented out or removed or whatever.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-31 14:17

Message:
Logged In: YES 
user_id=21627

I agree that such a change should be added. Neal, you have
given this patch more attention than I did - please check it
in when you consider it complete. I just like to point out
that it is missing documentation changes (libsocket.tex), a
NEWS entry, and a test case. kuran, please provide those as
a single patch file.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-31 01:11

Message:
Logged In: YES 
user_id=33168

ISTM that in socket_inet_ntop() you need to verify the size
of the packed value passed in.  If the user passes an empty
string, inet_ntop() could read beyond the buffer passed in,
potentially causing a core dump.

The checks could be something like this:

  if (af == AF_INET && len != sizeof(struct in_addr))
  else if (af == AF_INET6 && len != sizeof(struct in6_addr))

Does this make sense?
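
A Python sketch of the same validation, for illustration only. The sizes 4 and 16 are the usual values of sizeof(struct in_addr) and sizeof(struct in6_addr); PACKED_SIZES and check_packed_length are hypothetical names, and rejecting unknown families before any length test is an assumption layered on top of the two checks shown.

```python
import socket

# Expected packed lengths per address family.
PACKED_SIZES = {socket.AF_INET: 4}        # sizeof(struct in_addr)
if hasattr(socket, "AF_INET6"):
    PACKED_SIZES[socket.AF_INET6] = 16    # sizeof(struct in6_addr)


def check_packed_length(af, packed):
    """Raise ValueError unless packed has the right size for address family af."""
    try:
        expected = PACKED_SIZES[af]
    except KeyError:
        raise ValueError("unknown address family")
    if len(packed) != expected:
        raise ValueError("invalid length of packed IP address string")


check_packed_length(socket.AF_INET, b"\x7f\x00\x00\x01")  # passes silently
try:
    check_packed_length(socket.AF_INET, b"")
except ValueError as exc:
    print(exc)
```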

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-27 16:39

Message:
Logged In: YES 
user_id=366566

The use case I have for it at the moment is a DNS server
(Twisted.names).  inet_pton allows me to handle IPv6
addresses, so it allows me to support AAAA and A6 records. 
I believe an IPv6 capable socks proxy would find this useful
as well.  Basically, low level network stuff.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-27 11:23

Message:
Logged In: YES 
user_id=21627

What is the rationale for providing this functionality?

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-26 19:32

Message:
Logged In: YES 
user_id=366566

Ooops, I made two, and uploaded the wrong one >:O  Sorry. 
Dunno if it's still helpful, but here's the unified diff.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-26 19:10

Message:
Logged In: YES 
user_id=33168

Next time, please use context or unified diff.  -c or -u
option to cvs diff:  cvs diff -c ...

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-24 22:05

Message:
Logged In: YES 
user_id=366566

Sourceforge decided not to attach the file the first time...
 Here it is.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 08:01:56 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 00:01:56 -0800
Subject: [Patches] [ python-Patches-681780 ] Faster commonprefix (OS independent)
Message-ID: <E18q7N6-0000UH-00@sc8-sf-web4.sourceforge.net>

Patches item #681780, was opened at 2003-02-06 18:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=681780&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Nobody/Anonymous (nobody)
Summary: Faster commonprefix (OS independent)

Initial Comment:
This routine is about 20% faster on a test set of 7 sets 
of strings run 100000 times each (I can provide the test 
if requested).  The longer the common prefix is, the 
faster the routine becomes relative to original 
commonprefix.

My only worry is that it might get rejected if it is 
considered too fancy; therefore I wasn't shy on 
commenting.

I think we should also write a commonpathprefix, that 
will do what commonprefix should do, being in the 
*path.py module.  I'll do that if no one else does.

The provided patch is for posixpath.py and ntpath.py, 
but since it's OS neutral it should work as is.  It uses 
itertools for speed, though, so it is not backportable, but 
it can be if requested by substituting map for imap and a 
normal slice for islice.

----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2003-03-04 09:01

Message:
Logged In: YES 
user_id=498191

I would suggest another possibility. This one uses a property
of string ordering: if you have a<=b<=c and c.startswith(a) then
b.startswith(a).

I have tested two implementations:

# A five-line function with really straightforward code.
# It can degenerate rather badly in the worst case (large strings
# with a short common prefix) but is generally quite fast.
def commonprefix1(m):
    if not m: return ''
    prefix, greater = min(m), max(m)
    while not greater.startswith(prefix):
        prefix = prefix[:-1]
    return prefix

# The second uses bisection to avoid the worst case. This makes
# the implementation a little more complex but seems to provide the
# fastest result.
def commonprefix2(m):
    prefix = ''
    if m:
        low, high = min(m), max(m)
        while low:
            n = len(low)//2 + 1
            l, h = low[:n], high[:n]
            if h==l:
                prefix += l
                low, high = low[n:], high[n:]
            else:
                low, high = l[:-1], h
    return prefix
      
I personally prefer the commonprefix1 implementation: it's
the simplest one and it is probably fast enough for the few
commonprefix use-cases (anyway, it is still faster than the
current implementation).

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-07 12:11

Message:
Logged In: YES 
user_id=539787

I did my homework better, and found out that the buffer object 
quite probably will be deprecated.  So I rewrote the routine 
without the buffer object (using str.startswith), which by the 
way got another 10% speedup (relative to the latest version 
using buffer.)
The commonprefix_nobuf.diff patch applies directly to the 
original posixpath.py, ntpath.py.  I will try to delete the other 
patches, but I don't think I am allowed to do it.

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-06 19:02

Message:
Logged In: YES 
user_id=539787

Best case: comparing this to the old version with a list: 
['/usr/local/lib/python2.3/posixpath.py']*120, 10000 iterations, 
the speed difference is:
old: 319.58 sec
new: 34.43 sec

Since prefix_len always grows in the "while next_bit:" loop, 
applying commonprefix2.diff to the *patched* version does a 
very minor speedup (comparing smaller buffers in every 
iteration); but it is only a matter of overoptimisation (ie it does 
not hurt, but it's a trivial one, just 0.1%).

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-06 18:25

Message:
Logged In: YES 
user_id=33168

As much as I'd like to blame IE, it's a SF bug AFAIK.  
http://sf.net/tracker/?func=detail&atid=200001&aid=675910&group_id=1


----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-06 18:04

Message:
Logged In: YES 
user_id=539787

For some reason, my IE never uploads the file on the first 
attempt.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=681780&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 11:04:32 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 03:04:32 -0800
Subject: [Patches] [ python-Patches-696613 ] test options don't work on FreeBSD
Message-ID: <E18qADo-0004UL-00@sc8-sf-web3.sourceforge.net>

Patches item #696613, was opened at 2003-03-03 16:19
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696613&group_id=5470

Category: Build
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Ben Laurie (benl)
Assigned to: Jack Jansen (jackjansen)
Summary: test options don't work on FreeBSD

Initial Comment:
test -L is used during make install - I'm guessing it
is supposed to test for a softlink. Sadly, this is -h
under FreeBSD, so the install fails.


----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2003-03-04 12:04

Message:
Logged In: YES 
user_id=45365

Checked in as Makefile.pre.in 1.116.

Now let's hope there are no platforms out there that only have -h and not -L; if there are, it should become clear when 2.3b1 hits the street.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-04 05:03

Message:
Logged In: YES 
user_id=33168

What version of FreeBSD?  I'm on 4.7 (SF compile farm), and
the man page says:

     -h file       True if file exists and is a symbolic link.  This
                   operator is retained for compatibility with previous
                   versions of this program.  Do not rely on its
                   existence; use -L instead.

I tested -h on Linux, HPUX11, and Solaris 8.  -h and -L both
work fine.
Assigning to Jack, since he checked in this code.  I wonder
if there's any issue on the Mac?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696613&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 12:32:04 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 04:32:04 -0800
Subject: [Patches] [ python-Patches-696613 ] test options don't work on FreeBSD
Message-ID: <E18qBaW-0007jA-00@sc8-sf-web3.sourceforge.net>

Patches item #696613, was opened at 2003-03-03 15:19
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696613&group_id=5470

Category: Build
Group: Python 2.3
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Ben Laurie (benl)
Assigned to: Jack Jansen (jackjansen)
Summary: test options don't work on FreeBSD

Initial Comment:
test -L is used during make install - I'm guessing it
is supposed to test for a softlink. Sadly, this is -h
under FreeBSD, so the install fails.


----------------------------------------------------------------------

>Comment By: Ben Laurie (benl)
Date: 2003-03-04 12:32

Message:
Logged In: YES 
user_id=14333

As always, it's coz I'm running an ancient version of
FreeBSD. Perhaps it's time I built a new machine :-)

Mine is 3.2!

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-04 11:04

Message:
Logged In: YES 
user_id=45365

Checked in as Makefile.pre.in 1.116.

Now let's hope there are no platforms out there that only have -h and not -L; if there are, it should become clear when 2.3b1 hits the street.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-04 04:03

Message:
Logged In: YES 
user_id=33168

What version of FreeBSD?  I'm on 4.7 (SF compile farm), and
the man page says:

     -h file       True if file exists and is a symbolic link.  This
                   operator is retained for compatibility with previous
                   versions of this program.  Do not rely on its
                   existence; use -L instead.

I tested -h on Linux, HPUX11, and Solaris 8.  -h and -L both
work fine.
Assigning to Jack, since he checked in this code.  I wonder
if there's any issue on the Mac?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696613&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 14:01:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 06:01:18 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qCys-0006dL-00@sc8-sf-web2.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 16:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 09:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 01:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 12:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 12:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.
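
The two options can be contrasted with a short Python sketch. Both function names and the sample names are hypothetical; the real code is C in posixmodule.c, and the applied patch propagates the error as in the first function.

```python
def decode_strict(raw_names, encoding):
    """Option 1: let the decoding error propagate; the partial result is dropped."""
    return [raw.decode(encoding) for raw in raw_names]


def decode_with_fallback(raw_names, encoding):
    """Option 2: keep an undecodable name as a byte string in the result."""
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            result.append(raw)
    return result


names = [b"notes.txt", b"\xff"]
print(decode_with_fallback(names, "utf-8"))
try:
    decode_strict(names, "utf-8")
except UnicodeDecodeError:
    # An application that anticipates this can fall back to the raw byte strings.
    print(names)
```

The first option keeps the result type uniform and leaves the decision to the caller; the second never loses a file name but returns a list of mixed types.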

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 11:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 10:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 09:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 08:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 08:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 07:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 06:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).
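
The Unicode-in-Unicode-out rule described above can be modelled in a few lines. This is only a toy model in Python 3 (bytes standing in for the byte string), not the C change itself; model_listdir, raw_names and fs_encoding are made-up names for the illustration.

```python
def model_listdir(path, raw_names, fs_encoding="utf-8"):
    """Toy model of the rule: the result type follows the argument type.

    raw_names stands in for the byte strings the OS hands back for the
    directory; nothing here touches the real file system.
    """
    if isinstance(path, bytes):
        # Byte string directory name: names come back untouched.
        return list(raw_names)
    # Unicode directory name: every name is decoded with the
    # file system encoding.
    return [name.decode(fs_encoding) for name in raw_names]
```

With this rule an application that never passes Unicode never sees Unicode, so existing byte-string code keeps working unchanged.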

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 06:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 12:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 10:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 13:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the advantage
that slicing and indexing into the path names will not break the
paths (unlike UTF-8 encoded 8-bit strings, which tend to break when
you slice them).
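
A small illustration of the slicing point, written for a current Python 3 where str is the Unicode type and bytes the 8-bit string. The file name and the helper is_valid_utf8 are invented for the example.

```python
name = "na\u00efve.txt"        # 'naive.txt' with an i-diaeresis
raw = name.encode("utf-8")     # the i-diaeresis becomes two bytes: C3 AF


def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


text_prefix = name[:3]   # slices between characters
raw_prefix = raw[:3]     # slices between bytes, splitting C3 AF in half
```

The Unicode slice is still a usable name prefix, while the byte slice ends in a dangling lead byte and no longer decodes.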

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again, the C code should really use "es" for converting to the
Py_FileSystemDefaultEncoding, as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 06:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 06:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 05:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 05:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 05:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars even if
it supports UTF-16. It just happens that your test assumes a
one-byte-per-character encoding, which may not always be the case.
With UTF-16 your test will see lots of \0 bytes, but not necessarily
bytes with ord(x) >= 128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the encoding mapping the ASCII range
to other non-ASCII characters, e.g. ShiftJIS does this for the Yen
sign.

If you absolutely want to use the simple test, I'd at least restrict
the test to an ASCII isalnum(x) test and then try the encode/decode
method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 04:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. By looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 04:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then try
to convert back to ASCII. If you get an error you can be sure that
the text is not ASCII compatible. Given that .listdir() involves
lots of IO, I think the added performance hit wouldn't be noticeable.
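
The difference between a byte-value test and the round trip can be sketched as follows. This is Python 3 rather than the C in the patch, and it assumes the simple test is a check that every byte is below 128 (the patch itself is not shown here). Both function names are hypothetical.

```python
def naive_is_ascii(raw):
    """Byte-value test: true if no byte has the high bit set."""
    return all(b < 128 for b in raw)


def ascii_or_unicode(raw, fs_encoding):
    """Round trip: decode with the file system encoding, then try ASCII.

    Returns an ASCII byte string when the name is pure ASCII, and the
    Unicode text otherwise.
    """
    text = raw.decode(fs_encoding)
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return text


# U+0100 in UTF-16 (little endian) is the two bytes 00 01: both are
# below 128, so the byte-value test is fooled.
tricky = "\u0100".encode("utf-16-le")
```

For tricky, naive_is_ascii() reports ASCII although the name is a single non-ASCII character, while ascii_or_unicode(tricky, "utf-16-le") gives back the Unicode text.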

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 04:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-09 22:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-09 20:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 14:19:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 06:19:21 -0800
Subject: [Patches] [ python-Patches-671384 ] test_pty hanging on hpux11
Message-ID: <E18qDGL-0004dR-00@sc8-sf-web1.sourceforge.net>

Patches item #671384, was opened at 2003-01-20 16:23
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671384&group_id=5470

Category: Modules
Group: Python 2.3
>Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: test_pty hanging on hpux11

Initial Comment:
The attached hack fixes a problem which occurs since
switching the pty code.  isatty() hangs if the slave_fd
is closed and reopened as in the deprecated APIs
pty.master_open() and pty.slave_open().

This patch reverts to the old behaviour where
_open_terminal() is called in master_open() to avoid
the hang later.

Here's a very simple test for the problem:

import pty, os

master_fd, slave_name = pty.master_open()
slave_fd = pty.slave_open(slave_name)
print os.isatty(slave_fd)

In slave_open() the first ioctl raises an IOError,
Invalid Argument 22.

I don't know if this problem affects hpux10.  Hopefully
someone will have a better idea how to really fix this
problem.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-04 09:19

Message:
Logged In: YES 
user_id=33168

I don't seem to have this problem any more on either HP-UX
or Solaris.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 18:13

Message:
Logged In: YES 
user_id=21627

By 'specific release', I mean Python 2.3a2, with no patches.
I can't reproduce an exception on that Python release, for
Solaris 8.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 17:23

Message:
Logged In: YES 
user_id=33168

I don't understand what you are asking for.  By 'specific
release', do you mean of Solaris/HP-UX?  I believe on
Solaris there's an exception, but on HP-UX it hangs.  But I
don't recall exactly.  I agree this patch is not optimal.  I
can also try on our Solaris box here.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 05:59

Message:
Logged In: YES 
user_id=21627

I can't reproduce a test failure for Solaris 8 (on the SF
compile farm) for Python 2.3a2. Can you please try that
specific release and report what test fails for you, in
which way?

I'm concerned that the patch isn't that good, e.g. on Linux,
it would cause usage of the old-style interface to
pseudo-terminals, even though an all-singing all-dancing
Unix98 pty support is available in the C library.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-27 19:21

Message:
Logged In: YES 
user_id=33168

I have attached an updated patch.  It seems Solaris 8 (on
the snake farm) also had a test failure.  I have basically
restored the old functionality in this patch. 
_open_terminal is called if /dev/ptmx exists, so
os.openpty() is not called.  This fixes the test
failures/hangs on both solaris and hpux and should be
equivalent to the 2.2 behaviour.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=671384&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 14:31:41 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 06:31:41 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qDSH-0004hu-00@sc8-sf-web3.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch; libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the advantage
that slicing and indexing into the path names will not break the
paths (unlike UTF-8 encoded 8-bit strings, which tend to break when
you slice them).

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again, the C code should really use "es" for converting to the
Py_FileSystemDefaultEncoding, as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars even if
it supports UTF-16. It just happens that your test assumes a
one-byte-per-character encoding, which may not always be the case.
With UTF-16 your test will see lots of \0 bytes, but not necessarily
bytes with ord(x) >= 128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the encoding mapping the ASCII range
to other non-ASCII characters, e.g. ShiftJIS does this for the Yen
sign.

If you absolutely want to use the simple test, I'd at least restrict
the test to an ASCII isalnum(x) test and then try the encode/decode
method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. By looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then try
to convert back to ASCII. If you get an error you can be sure that
the text is not ASCII compatible. Given that .listdir() involves
lots of IO, I think the added performance hit wouldn't be noticeable.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 14:40:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 06:40:39 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qDax-0000W7-00@sc8-sf-web4.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.
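
As a rough illustration of the two ideas above (adopting the user's
locale, and inspecting the file system encoding), here is a minimal
sketch. It assumes a present-day Python that provides
sys.getfilesystemencoding() and locale.getpreferredencoding(); the
helper name describe_encodings is made up for this example and is
not part of the patch.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings relevant to file name conversion."""
    # Adopt the user's locale settings for character handling. This can
    # raise locale.Error if the environment names an unknown locale.
    try:
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        pass  # keep whatever locale is currently active
    return {
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(False),
    }


info = describe_encodings()
print(info)
```

The values printed depend entirely on the platform and the user's
environment, so no particular output should be expected.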

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch; libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.
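
The two options can be sketched in a few lines of Python. This is
only an illustration of the behaviour being discussed, not the C code
in posixmodule.c; the function decode_names and its fallback_to_bytes
flag are hypothetical names chosen for the example.

```python
def decode_names(raw_names, encoding, fallback_to_bytes=False):
    """Decode directory entries that are given as byte strings.

    fallback_to_bytes=False: the first undecodable name raises
    UnicodeDecodeError and the partial result is discarded.
    fallback_to_bytes=True: the undecodable name is kept in the
    result as a byte string.
    """
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            if not fallback_to_bytes:
                raise
            result.append(raw)
    return result


names = [b"readme.txt", b"\xff"]
mixed = decode_names(names, "utf-8", fallback_to_bytes=True)
print(mixed)
```

With the strict variant, a caller that anticipates the exception can
catch UnicodeDecodeError and fall back to working with the byte
strings directly.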

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.
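
For illustration, the same "type in, type out" rule is what
os.listdir() follows in present-day Python 3, where text strings play
the role of Unicode strings. The sketch below assumes Python 3 and
uses a temporary directory; list_both_ways is a made-up helper name.

```python
import os
import tempfile


def list_both_ways(path):
    """List a directory once with a text path and once with a byte path."""
    as_text = os.listdir(path)                # text in, text out
    as_bytes = os.listdir(os.fsencode(path))  # bytes in, bytes out
    return as_text, as_bytes


with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so there is something to list.
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_names, byte_names = list_both_ways(tmp)
    print(text_names, byte_names)
```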

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are 
non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you
get from os.listdir() should be compatible with (at least) all
the os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will not
break the paths (unlike UTF-8 encoded 8-bit strings, which tend
to break when you slice them).
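
The slicing point can be seen in a small sketch. It assumes Python 3
syntax, and the file name is invented for the example.

```python
name = "caf\u00e9.txt"           # the fourth character is e-acute
encoded = name.encode("utf-8")   # e-acute occupies two bytes here

# Slicing the Unicode string keeps whole characters.
text_prefix = name[:4]

# Slicing the byte string at the same index cuts the e-acute in half,
# so the result is no longer valid UTF-8.
byte_prefix = encoded[:4]
try:
    byte_prefix.decode("utf-8")
    sliced_ok = True
except UnicodeDecodeError:
    sliced_ok = False

print(repr(text_prefix), repr(byte_prefix), sliced_ok)
```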

That said, I think you're right about the ASCII approach,
provided that the os and os.path tools can actually cope properly
with Unicode.

What I worry about is that if os.listdir() gives back Unicode
for e.g. Latin-1 filenames and the application then passes the
Unicode names to a C API using "s", perfectly working code will
break... then again, the C code should really use "es" for
encoding to the Py_FileSystemDefaultEncoding, as is done in e.g.
fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes one-byte-per-character encodings, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms,
so you have to hard-code it.
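
A small sketch may make the UTF-16 concern concrete. It compares a
bytewise "no byte >= 128" check with the decode-then-encode round
trip. Both function names are invented for this example, and the
bytewise check is only a guess at what the patch's simple test looks
like, not a copy of it.

```python
def looks_ascii_bytewise(raw):
    """Simple test: no byte has its high bit set."""
    return all(byte < 128 for byte in raw)


def is_ascii_roundtrip(raw, encoding):
    """Decode with the given encoding, then try to encode as ASCII."""
    try:
        raw.decode(encoding).encode("ascii")
    except UnicodeError:
        return False
    return True


# U+0141 (LATIN CAPITAL LETTER L WITH STROKE) is not ASCII, yet its
# UTF-16-LE encoding consists of two bytes that are both below 128.
raw = "\u0141".encode("utf-16-le")
bytewise = looks_ascii_bytewise(raw)
roundtrip = is_ascii_roundtrip(raw, "utf-16-le")
print(raw, bytewise, roundtrip)
```

Here the bytewise check reports the name as ASCII while the round
trip correctly reports that it is not.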


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as on most platforms a file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error, you can be
sure that the text is not ASCII compatible. Given that .listdir()
involves lots of I/O, I think the added performance hit wouldn't
be noticeable.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 14:51:28 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 06:51:28 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qDlQ-0006MI-00@sc8-sf-web1.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch; libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are 
non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you
get from os.listdir() should be compatible with (at least) all
the os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will not
break the paths (unlike UTF-8 encoded 8-bit strings, which tend
to break when you slice them).

That said, I think you're right about the ASCII approach,
provided that the os and os.path tools can actually cope properly
with Unicode.

What I worry about is that if os.listdir() gives back Unicode
for e.g. Latin-1 filenames and the application then passes the
Unicode names to a C API using "s", perfectly working code will
break... then again, the C code should really use "es" for
encoding to the Py_FileSystemDefaultEncoding, as is done in e.g.
fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes one-byte-per-character encodings, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.
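
A small sketch of the UTF-16 point in modern Python. The patch's actual test is not shown in this thread, so has_high_byte() is only an assumed shape for it (a bytewise check for values >= 128):

```python
def has_high_byte(data):
    """Assumed shape of the simple test: is any byte >= 128?"""
    return any(b >= 128 for b in data)


# ASCII text encoded as UTF-16 (little endian, no BOM): every other byte is NUL.
ascii_name = "abc".encode("utf-16-le")

# A non-ASCII character (U+0142) whose UTF-16 bytes are both below 128,
# so a bytewise check would wrongly classify it as ASCII.
non_ascii_name = "\u0142".encode("utf-16-le")
```

Both byte strings pass the bytewise check, although only the first one holds ASCII text.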

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict
the test to an ASCII isalnum(x) test and then try the
encode/decode 
method I described if this test fails.

Note that isalnum() can be locale dependent on some
platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. By looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that
the text is not ASCII compatible. Given that .listdir()
involves lots of
IO I think the added performance hit wouldn't be noticeable.
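
The round-trip test described above could be sketched roughly as follows. This is modern Python rather than the C of the patch, and name_for_caller() and its arguments are hypothetical names:

```python
def name_for_caller(raw, fs_encoding):
    """Decode `raw` with the file system encoding, then try ASCII.

    Returns an ASCII byte string when the name is pure ASCII text,
    otherwise the decoded (Unicode) text.
    """
    text = raw.decode(fs_encoding)      # may raise UnicodeDecodeError
    try:
        return text.encode("ascii")     # ASCII-compatible: byte string
    except UnicodeEncodeError:
        return text                     # not ASCII: Unicode


plain = name_for_caller(b"README", "utf-8")
accented = name_for_caller("caf\u00e9".encode("utf-8"), "utf-8")
wide = name_for_caller("abc".encode("utf-16-le"), "utf-16-le")
```

Because the decision is made on the decoded text rather than on raw bytes, it does not depend on the file system encoding being a superset of ASCII.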

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 14:54:19 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 06:54:19 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qDoB-0001Ab-00@sc8-sf-web2.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 16:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 09:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 09:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 09:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.
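
A minimal sketch of the setlocale call and of inspecting the resulting encodings, using calls that exist in current Python 3 (locale.setlocale, locale.getpreferredencoding, sys.getfilesystemencoding). In current Python 3 the file system encoding is normally chosen at interpreter startup, so this only illustrates how to look at the two values, not the behaviour of the patch:

```python
import locale
import sys

# Adopt the user's locale for character handling; an unsupported or
# misconfigured locale raises locale.Error, which is ignored here.
try:
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    pass

# Encoding the user prefers for text, according to the locale.
preferred = locale.getpreferredencoding(False)

# Encoding the interpreter uses to convert file names to and from Unicode.
fs_encoding = sys.getfilesystemencoding()
```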

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 09:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 09:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?
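
The failure can be reproduced without touching the file system. A small sketch in modern Python, where try_decode() is a hypothetical helper and the two encodings are merely examples of what the file system encoding might be:

```python
raw_name = b"\xff"   # the one-byte file name from the report


def try_decode(data, encoding):
    """Return the decoded text, or None if `data` is invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_ascii = try_decode(raw_name, "ascii")   # 0xff is outside the ASCII range
as_utf8 = try_decode(raw_name, "utf-8")    # 0xff never occurs in valid UTF-8
```

Both results are None, so any listing that insists on decoding this name with either encoding has to fail or fall back to something else.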

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 01:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 12:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 12:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. what the code removed by the latest patch did), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.
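
The two options can be sketched side by side. This is an illustration in modern Python, not the posixmodule.c code; decode_names() and its fallback_to_bytes flag are hypothetical:

```python
def decode_names(raw_names, encoding, fallback_to_bytes=False):
    """Decode a list of byte-string file names.

    fallback_to_bytes=False: re-raise the decoding error, so the caller
    gets no partial result (first option).
    fallback_to_bytes=True: keep undecodable names as byte strings in an
    otherwise Unicode result (second option).
    """
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            if not fallback_to_bytes:
                raise
            result.append(raw)
    return result


names = [b"ok.txt", b"\xff"]
mixed = decode_names(names, "utf-8", fallback_to_bytes=True)
```

With the first option a caller that anticipates the exception would list the directory again with a byte-string argument; with the second it has to cope with a list of mixed types.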

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 11:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 10:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 09:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 08:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 08:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 07:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 06:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.
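
The "type of the argument decides the type of the result" interface can be observed in current Python 3, where os.listdir() returns text for a text path and byte strings for a byte-string path. A small sketch, assuming a temporary directory and a hypothetical file name; it shows today's behaviour, not the state of the patch under discussion:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # Text directory name in, text file names out.
    text_listing = os.listdir(tmp)

    # Byte-string directory name in, byte-string file names out.
    byte_listing = os.listdir(os.fsencode(tmp))
```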

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 06:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are 
non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 12:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 10:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 13:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different
angle: things that you get from os.listdir() should be
compatible
with (at least) all the os.path tools and os itself.
Converting to 
Unicode has the advantage that slicing and indexing into the
path names will not break the paths (unlike UTF-8 encoded 8-bit
strings which tend to break when you slice them).

That said, I think you're right about the ASCII approach
provided
that the os, os.path tools can actually properly cope with
Unicode.

What I worry about is that if os.listdir() gives back
Unicode for
e.g. Latin-1 filenames and the application then passes the
Unicode
names to a C API using "s", perfectly working code will break...
then again the C code should really use "es" for decoding to
the Py_FileSystemDefaultEncoding as is done in e.g.
fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 06:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 06:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 05:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 05:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 05:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes one-byte-per-character encodings, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict
the test to an ASCII isalnum(x) test and then try the
encode/decode 
method I described if this test fails.

Note that isalnum() can be locale dependent on some
platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 04:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. By looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 04:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that
the text is not ASCII compatible. Given that .listdir()
involves lots of
IO I think the added performance hit wouldn't be noticeable.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 04:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-09 22:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-09 20:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 15:03:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 07:03:31 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qDx5-0006Is-00@sc8-sf-web3.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 16:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 10:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.
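
The 1-1 mapping is easy to check. A short sketch in modern Python, decoding every possible byte value as Latin-1:

```python
# Every possible byte value, 0 through 255.
all_bytes = bytes(range(256))

# Latin-1 decoding never fails: each byte becomes one character.
text = all_bytes.decode("latin-1")

# The code point of each character equals the value of the original byte.
code_points = [ord(ch) for ch in text]

# Encoding back to Latin-1 restores the original bytes exactly.
round_trip = text.encode("latin-1")
```

The conversion therefore cannot fail, although for a name that was really written in another encoding the resulting characters will not be the ones the user intended.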

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 09:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 09:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 09:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 09:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 09:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 01:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 12:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 12:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. what the code removed by the latest patch did), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 11:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 10:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 09:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 08:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 08:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 07:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 06:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 06:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 12:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 10:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 13:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the advantage
that slicing and indexing into the path names will not break the
paths (unlike UTF-8 encoded 8-bit strings, which tend to break when
you slice them).

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again, the C code should really use "es" for converting to
the Py_FileSystemDefaultEncoding, as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 06:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 06:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 05:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 05:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 05:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars even if
it supports UTF-16. It just happens that your test assumes a
one-byte-per-character encoding, which may not always be true. With
UTF-16 your test will see lots of \0 bytes, but not necessarily
bytes with ord(x)>=128.

I'm not sure whether other variable-length encodings, e.g. the Asian
ones, can result in \0 bytes.

There's also the possibility of the encoding mapping the ASCII range
to other non-ASCII characters, e.g. ShiftJIS does this for the Yen
sign.

If you absolutely want to use the simple test, I'd at least restrict
the test to an ASCII isalnum(x) test and then try the encode/decode
method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 04:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 04:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be sure
that the text is not ASCII compatible. Given that .listdir()
involves lots of IO, I think the added performance hit wouldn't be
noticeable.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 04:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-09 22:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-09 20:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 15:07:37 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 07:07:37 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qE13-0001pk-00@sc8-sf-web2.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:07

Message:
Logged In: YES 
user_id=92689

I think it would be better to simply return byte strings if the file system encoding isn't known. (This btw. was what my original patch did.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 16:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.
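
The 1-1 claim can be checked directly. The following is a minimal sketch in present-day Python 3 syntax (bytes vs. str), not the 2.3 code under discussion; the variable names are purely illustrative.

```python
# Every possible byte value decodes under Latin-1 to the code point
# with the same number, so decoding cannot raise an exception.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")

# Byte value N becomes code point N ...
same_code_points = [ord(ch) for ch in decoded] == list(range(256))
# ... and encoding again restores the original bytes.
round_trips = decoded.encode("latin-1") == all_bytes

print(same_code_points, round_trips)
```

Note that this only shows the conversion cannot fail; it says nothing about whether the resulting characters are the ones the user intended.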

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.
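
As an illustration of the two ideas above (adopting the user's locale, then inspecting the resulting encodings), here is a hedged sketch using present-day Python 3. It only reads values; it makes no claim about how the 2.3 code wires the locale to the file system encoding, and the reported names depend entirely on the machine it runs on.

```python
import locale
import sys

# Adopt the user's preferred locale for character handling. An invalid
# locale in the environment raises locale.Error, which is ignored here.
try:
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    pass

# Read-only inspection of the encodings Python reports.
fs_encoding = sys.getfilesystemencoding()
preferred = locale.getpreferredencoding(False)

print("file system encoding:", fs_encoding)
print("preferred encoding:  ", preferred)
```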

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?
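
For illustration, the failure can be reproduced at the codec level alone. This is a minimal sketch in present-day Python 3 syntax, not the posixmodule.c code path; which encoding was actually in effect on the reporting machine is not stated, so ASCII and UTF-8 are both tried as assumptions, and can_decode is a hypothetical helper.

```python
raw_name = b"\xff"  # file name consisting of the single byte 255


def can_decode(raw, encoding):
    """Return True if raw decodes cleanly under the given encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Neither ASCII nor UTF-8 accepts a lone 0xFF byte.
results = {enc: can_decode(raw_name, enc) for enc in ("ascii", "utf-8")}
print(results)
```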

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the advantage
that slicing and indexing into the path names will not break the
paths (unlike UTF-8 encoded 8-bit strings, which tend to break when
you slice them).
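
The slicing point can be illustrated as follows. This is a minimal sketch in present-day Python 3 syntax, with a made-up file name; it is not taken from the patch.

```python
name = "caf\u00e9.txt"          # hypothetical file name with one non-ASCII letter
encoded = name.encode("utf-8")  # the e-acute occupies two bytes here

# Slicing the text keeps whole characters.
text_prefix = name[:4]

# Slicing the UTF-8 bytes at the same index cuts the two-byte
# sequence in half, leaving bytes that no longer decode.
byte_prefix = encoded[:4]
try:
    byte_prefix.decode("utf-8")
    prefix_is_valid = True
except UnicodeDecodeError:
    prefix_is_valid = False

print(repr(text_prefix), repr(byte_prefix), prefix_is_valid)
```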

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again, the C code should really use "es" for converting to
the Py_FileSystemDefaultEncoding, as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars even if
it supports UTF-16. It just happens that your test assumes a
one-byte-per-character encoding, which may not always be true. With
UTF-16 your test will see lots of \0 bytes, but not necessarily
bytes with ord(x)>=128.

I'm not sure whether other variable-length encodings, e.g. the Asian
ones, can result in \0 bytes.

There's also the possibility of the encoding mapping the ASCII range
to other non-ASCII characters, e.g. ShiftJIS does this for the Yen
sign.

If you absolutely want to use the simple test, I'd at least restrict
the test to an ASCII isalnum(x) test and then try the encode/decode
method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be sure
that the text is not ASCII compatible. Given that .listdir()
involves lots of IO, I think the added performance hit wouldn't be
noticeable.
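
A sketch of the round-trip test described above, next to a byte-level check, in present-day Python 3 syntax. The byte-level check is an assumption about what the patch's simple test looks like (the patch itself is not quoted in this thread), and both function names are hypothetical.

```python
def simple_byte_test(raw):
    """Assumed shape of the simple test: no byte is >= 128."""
    return all(b < 128 for b in raw)


def is_ascii_name(raw, fs_encoding):
    """Round-trip test: decode with fs_encoding, then encode as ASCII."""
    try:
        raw.decode(fs_encoding).encode("ascii")
    except UnicodeError:
        return False
    return True


# U+0141 encoded as UTF-16 (little endian) is the two bytes 0x41 0x01,
# both below 128, so the byte-level check wrongly reports "ASCII".
raw = "\u0141".encode("utf-16-le")
print(simple_byte_test(raw), is_ascii_name(raw, "utf-16-le"))
```

The two checks agree for plain names in an ASCII-superset encoding, e.g. is_ascii_name(b"plain.txt", "utf-8"), and differ only in cases like the one shown.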

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 15:11:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 07:11:31 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qE4p-0002Cg-00@sc8-sf-web4.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:11

Message:
Logged In: YES 
user_id=21627

I disagree with the last assertion: In *particular* if the
file system encoding is UTF-8, there is a good chance that
decoding will fail (unlike if it is latin-1; decoding will
then never fail - it may just produce mojibake). 

OS X seems to make a guarantee to always return UTF-8 from
its low-level API, but I distrust this guarantee until I see
it with my own eyes :-) E.g. what happens if you mount an
NFS tree, and the NFS server gives file names in some other
encoding?

I see the following options:

- only enable the code for OS X. I dislike this option, as
it essentially freezes the Unix status to non-Unicode (we
won't get further insights, the de jure status won't change,
de facto, all files will be encoded in the locale's encoding).

- leave the code as-is, documenting the possibility of
exceptions.

- add byte strings instead of Unicode strings into the
result for non-decodable strings. This gives a mixed-type
result, which is fine if you only pass the resulting file
names to stat() or open(), and will likely break the
application if it tries to display the file names somehow.
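
A minimal sketch of the third option (mixed-type result), written
against modern Python 3 bytes/str semantics rather than the 2.3 C
code under discussion. decode_names is a hypothetical helper used
only for illustration; it is not part of the patch.

```python
def decode_names(raw_names, encoding):
    """Decode each raw file name; keep the original bytes when decoding fails."""
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Non-decodable name: fall back to the byte string,
            # which gives the mixed-type result described above.
            result.append(raw)
    return result


# One decodable name and one that is not valid UTF-8.
names = decode_names([b"readme.txt", b"\xff"], "utf-8")
```

A caller that only passes the names on to stat() or open() can use
either type; a caller that displays them has to handle both.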

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:07

Message:
Logged In: YES 
user_id=92689

I think it would be better to simply return byte strings if the file system encoding isn't known. (This btw. was what my original patch did.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 16:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.
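
The Latin-1 claim can be checked directly. This is a small sketch in
modern Python 3 (not the 2.3 code discussed here); utf8_decodes is a
hypothetical helper name.

```python
# Every possible byte value decodes under Latin-1, and the resulting
# code point equals the byte value.
all_bytes = bytes(range(256))
text = all_bytes.decode("latin-1")
one_to_one = all(ord(ch) == b for ch, b in zip(text, all_bytes))


def utf8_decodes(raw):
    """Return True if raw is valid UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

By contrast, utf8_decodes(b"\xff") is False, which is the case from
the Linux complaint earlier in this thread.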

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.
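
For illustration, a sketch of the setlocale-then-inspect sequence
using the names available in current Python 3. Which of these calls
existed at the time of this thread is not something the sketch
relies on, and the values it reports depend on the environment.

```python
import locale
import sys

# Adopt the user's locale for character handling, as suggested above.
try:
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    # The environment names a locale this system does not provide;
    # keep whatever locale is already active.
    pass

# Read-only inspection of the encodings in effect.
preferred = locale.getpreferredencoding(False)
fs_encoding = sys.getfilesystemencoding()
```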

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.
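
The same-type-in, same-type-out rule can be observed in current
Python 3, where the two types are bytes and str instead of the 2.x
str and unicode. A small sketch, assuming a writable temporary
directory; the file name a.txt is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file to list.
    open(os.path.join(tmp, "a.txt"), "w").close()

    # Text directory name in, text file names out.
    text_names = os.listdir(tmp)

    # Byte-string directory name in, byte-string file names out.
    byte_names = os.listdir(os.fsencode(tmp))
```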

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you
get from os.listdir() should be compatible with (at least) all
the os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will
not break the paths (unlike UTF-8 encoded 8-bit strings, which
tend to break when you slice them).
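
A short illustration of the slicing point, in modern Python 3 terms
(str for Unicode, bytes for 8-bit strings); the file name is made up.

```python
name = "caf\u00e9.txt"        # 'cafe' with an acute accent on the e
raw = name.encode("utf-8")    # the accented letter takes two bytes

# Slicing the Unicode string keeps whole characters.
text_prefix = name[:4]

# Slicing the UTF-8 bytes at the same index cuts the accented
# letter in half, so the prefix is no longer valid UTF-8.
raw_prefix = raw[:4]
try:
    raw_prefix.decode("utf-8")
    raw_slice_ok = True
except UnicodeDecodeError:
    raw_slice_ok = False
```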

That said, I think you're right about the ASCII approach,
provided that the os and os.path tools can actually cope
properly with Unicode.

What I worry about is that if os.listdir() gives back Unicode
for e.g. Latin-1 filenames and the application then passes the
Unicode names to a C API using "s", perfectly working code will
break... then again, the C code should really use "es" for
encoding to the Py_FileSystemDefaultEncoding, as is done in
e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It only happens that your test
assumes a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0
bytes, but not necessarily ones which are ord(x)>=128.
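
To make the UTF-16 case concrete, here is a sketch in modern
Python 3. looks_ascii_bytewise is a hypothetical stand-in for a
"no byte >= 128" test; it is not the code from the patch, which is
not shown in this thread.

```python
def looks_ascii_bytewise(raw):
    """Byte-level check: True if no byte has a value of 128 or above."""
    return all(b < 128 for b in raw)


# Plain ASCII text encoded as UTF-16 is full of \0 bytes.
ascii_as_utf16 = "abc".encode("utf-16-le")

# U+0100 is not an ASCII character, yet both of its UTF-16 bytes
# (0x00 and 0x01) are below 128, so the byte-level check is fooled.
non_ascii_as_utf16 = "\u0100".encode("utf-16-le")
fooled = looks_ascii_bytewise(non_ascii_as_utf16)
```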

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms,
so you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error, you can be
sure that the text is not ASCII compatible. Given that .listdir()
involves lots of IO, I think the added performance hit wouldn't
be noticeable.
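
The round-trip test could look roughly like this in modern Python 3.
is_ascii_name is a hypothetical helper written for illustration; the
patch itself would do the equivalent through the C API.

```python
def is_ascii_name(raw, fs_encoding):
    """Decode raw with the file system encoding, then try ASCII.

    Decoding errors are deliberately not caught here; they propagate
    to the caller as UnicodeDecodeError.
    """
    text = raw.decode(fs_encoding)
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # At least one character lies outside the ASCII range.
        return False
```

Because the check runs on decoded characters, it does not depend on
how many bytes the file system encoding uses per character.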

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 15:15:14 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 07:15:14 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qE8Q-0007an-00@sc8-sf-web1.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:15

Message:
Logged In: YES 
user_id=21627

Setting the file system encoding on startup should be fine,
except that we need another setlocale/query/restore locale
sequence. This is, in principle, bad, as there is no
guarantee that the restore locale operation really produces
the original state, and may cause problems if other threads
are already running. In practice, it appears to work out
just fine, as we use such sequences already (e.g. to undo
the readline initialization).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:11

Message:
Logged In: YES 
user_id=21627

I disagree with the last assertion: In *particular* if the
file system encoding is UTF-8, there is a good chance that
decoding will fail (unlike if it is latin-1; decoding will
then never fail - it may just produce mojibake). 

OS X seems to make a guarantee to always return UTF-8 from
its low-level API, but I distrust this guarantee until I see
it with my own eyes :-) E.g. what happens if you mount an
NFS tree, and the NFS server gives file names in some other
encoding?

I see the following options:

- only enable the code for OS X. I dislike this option, as
it essentially freezes the Unix status to non-Unicode (we
won't get further insights, the de jure status won't change,
de facto, all files will be encoded in the locale's encoding).

- leave the code as-is, documenting the possibility of
exceptions.

- add byte strings instead of Unicode strings into the
result for non-decodable strings. This gives a mixed-type
result, which is fine if you only pass the resulting file
names to stat() or open(), and will likely break the
application if it tries to display the file names somehow.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:07

Message:
Logged In: YES 
user_id=92689

I think it would be better to simply return byte strings if the file system encoding isn't known. (This btw. was what my original patch did.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 16:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will not
break the paths (unlike UTF-8 encoded 8-bit strings which tend to
break when you slice them).
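
The slicing point can be illustrated with a short sketch. It is written for modern Python 3 (not the 2.3 under discussion), and the file name is a made-up example:

```python
# Hypothetical file name containing one non-ASCII character (e-acute).
name = "caf\u00e9.txt"
raw = name.encode("utf-8")      # the accented letter occupies two bytes

text_prefix = name[:4]          # slices on character boundaries
byte_prefix = raw[:4]           # cuts the two-byte sequence in half

try:
    byte_prefix.decode("utf-8")
    broken = False
except UnicodeDecodeError:
    # The byte slice ends in the middle of a character.
    broken = True

print(repr(text_prefix), repr(byte_prefix), broken)
```

Slicing the Unicode string keeps whole characters, while the same slice on the UTF-8 bytes leaves a truncated sequence that no longer decodes.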

That said, I think you're right about the ASCII approach provided
that the os, os.path tools can actually properly cope with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again the C code should really use "es" for encoding to
the Py_FileSystemDefaultEncoding as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It only happens that your test assumes
a one-byte-per-character encoding, which may not always be true.
With UTF-16 your test will see lots of \0 bytes but not
necessarily ones which are ord(x)>=128.
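
A small sketch of the UTF-16 case, in modern Python 3 and using an arbitrary sample string, shows why a byte-level "is any byte >= 128?" check can be misleading:

```python
# Encode a plain ASCII-range string as little-endian UTF-16 (no BOM).
encoded = "abc".encode("utf-16-le")

has_nul = b"\x00" in encoded                     # every other byte is NUL
all_low = all(byte < 128 for byte in encoded)    # yet no byte has the high bit set

print(encoded, has_nul, all_low)
```

A high-bit test would accept these bytes as ASCII even though they are not the ASCII encoding of the text.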

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the encoding mapping the ASCII
range to other non-ASCII characters, e.g. ShiftJIS does this for
the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be sure
that the text is not ASCII compatible. Given that .listdir()
involves lots of IO I think the added performance hit wouldn't be
noticeable.
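
The round-trip test could be sketched roughly as follows. This is modern Python 3 rather than the C code in the patch, and is_ascii_name and its parameters are hypothetical names chosen for illustration:

```python
def is_ascii_name(raw, fs_encoding):
    """Return True if raw (bytes) decodes with fs_encoding to pure-ASCII text.

    Hypothetical helper: decode to Unicode first, then try to encode
    the result back to ASCII, treating any Unicode error as "not ASCII".
    """
    try:
        raw.decode(fs_encoding).encode("ascii")
    except UnicodeError:
        # Covers both decode and encode failures.
        return False
    return True


print(is_ascii_name(b"README", "utf-8"))
print(is_ascii_name("caf\u00e9".encode("utf-8"), "utf-8"))
print(is_ascii_name("ab".encode("utf-16-le"), "utf-16-le"))
```

Because the check operates on the decoded text and not on raw bytes, it does not depend on the file system encoding being a superset of ASCII.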

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 15:19:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 07:19:54 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18qECw-000784-00@sc8-sf-web3.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 11:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.
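
For illustration, the proposed behavior would look roughly like this from the Python side. The snippet runs on current Python versions, where dict.pop accepts a default; the dictionary contents are made up:

```python
settings = {"verbose": True}

# Key present: the value is returned and the key is removed.
first = settings.pop("verbose", False)

# Key absent: the default is returned and no exception is raised.
second = settings.pop("verbose", False)

# Without a default the old behavior remains: KeyError on a missing key.
try:
    settings.pop("verbose")
    raised = False
except KeyError:
    raised = True

print(first, second, raised, settings)
```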

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 10:19

Message:
Logged In: YES 
user_id=6380

You don't need to update whatsnew23.tex; its editor prefers
to do this himself.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-03 23:26

Message:
Logged In: YES 
user_id=80475

For NEWS, add a new entry (so that it documents a 
difference from Py2.3a2).

For whatsnew23, modify the existing entry (since it is a 
delta from Py2.2).

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-03 14:59

Message:
Logged In: YES 
user_id=670441

Should I make a new NEWS item, or should
I modify the existing NEWS item about dict.pop?

And should I make a new whatsnew23 item or
modify the existing one?

I'm guessing a new NEWS item and a modified
whatsnew item, but I'll post a patch when you tell me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-01 21:40

Message:
Logged In: YES 
user_id=31435

dicts have a .pop() method?  Heh.  I must have slept 
through that one <wink>.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-28 21:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me, I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-02-28 20:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get() like functionality for 
dict.pop().  The nearest parallel is the default argument for 
getattr(obj, attr, [default]).  On the plus side, it makes pop 
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 15:44:40 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 07:44:40 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qEau-00046h-00@sc8-sf-web4.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2003-03-04 16:44

Message:
Logged In: YES 
user_id=45365

I just did a test (created 254 files with all bytes except / and null in their names on a linux server, mounted the partition over NFS on MacOSX) and indeed MacOSX tries to interpret the bytes as UTF-8 and fails.

I know that conversion works for HFS and HFS+ volumes (which carry a filename encoding with them, or you have to specify it when mounting). I assume it works for AFP and SMB (which also carries encoding info, IIRC) but I can't test this. I haven't a clue about webdav and such.

Something to keep in mind is that we are really trying to solve someone else's problem: the inability of NFS and most unixen to handle file system encodings. If I'm on a latin-1 machine and I nfs-mount your latin-2 partition I will see garbage filenames.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:15

Message:
Logged In: YES 
user_id=21627

Setting the file system encoding on startup should be fine,
except that we need another setlocale/query/restore locale
sequence. This is, in principle, bad, as there is no
guarantee that the restore locale operation really produces
the original state, and may cause problems if other threads
are already running. In practice, it appears to work out
just fine, as we use such sequences already (e.g. to undo
the readline initialization).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:11

Message:
Logged In: YES 
user_id=21627

I disagree with the last assertion: In *particular* if the
file system encoding is UTF-8, there is a good chance that
decoding will fail (unlike if it is latin-1; decoding will
then never fail - it may just produce mojibake). 

OS X seems to make a guarantee to always return UTF-8 from
its low-level API, but I distrust this guarantee until I see
it with my own eyes :-) E.g. what happens if you mount an
NFS tree, and the NFS server gives file names in some other
encoding?

I see the following options:

- only enable the code for OS X. I dislike this option, as
it essentially freezes the Unix status to non-Unicode (we
won't get further insights, the de jure status won't change,
de facto, all files will be encoded in the locale's encoding).

- leave the code as-is, documenting the possibility of
exceptions.

- add byte strings instead of Unicode strings into the
result for non-decodable strings. This gives a mixed-type
result, which is fine if you only pass the resulting file
names to stat() or open(), and will likely break the
application if it tries to display the file names somehow.
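
The third option (a mixed-type result) might be sketched like this. It is a pure-Python illustration in modern Python 3, not the posixmodule.c code; decode_names and its arguments are hypothetical:

```python
def decode_names(raw_names, fs_encoding):
    """Decode each raw file name, keeping undecodable ones as bytes.

    Hypothetical helper illustrating the mixed-type result: callers
    get text where possible and the original bytes otherwise.
    """
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(fs_encoding))
        except UnicodeDecodeError:
            result.append(raw)  # fall back to the byte string
    return result


names = decode_names([b"notes.txt", b"\xff"], "utf-8")
print(names)
```

Code that only hands such names back to the OS would be unaffected, while code that joins or displays them has to cope with both types.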

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:07

Message:
Logged In: YES 
user_id=92689

I think it would be better to simply return byte strings if the file system encoding isn't known. (This btw. was what my original patch did.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 16:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.
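
The 1-1 mapping can be checked with a short sketch in modern Python 3, which also contrasts it with UTF-8 for the single byte 0xff:

```python
# Every possible byte value decodes under Latin-1 to the code point
# with the same number.
all_bytes = bytes(range(256))
latin1_text = all_bytes.decode("latin-1")
one_to_one = [ord(ch) for ch in latin1_text] == list(range(256))

# The same is not true of UTF-8: the lone byte 0xff is never valid.
try:
    b"\xff".decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

print(one_to_one, utf8_failed)
```

Latin-1 decoding cannot raise, but it only yields the intended characters when the names really were written in Latin-1.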

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.
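
As a rough sketch of what inspecting the encoding looks like from Python code: the snippet below uses sys.getfilesystemencoding(). It is written for modern Python 3, where the file system encoding is generally fixed at interpreter startup, so the setlocale call here should not be assumed to change it as described for the 2.3 code above:

```python
import locale
import sys

# Adopt the user's preferred locale for character handling; this may
# fail if the environment names a locale that is not installed.
try:
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    pass

# Read-only view of the encoding used for file names.
fs_encoding = sys.getfilesystemencoding()
print(fs_encoding)
```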

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.
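
The "same type out as in" interface can be observed in modern Python 3, where os.listdir() returns bytes names for a bytes path and text names otherwise. The sketch below assumes Python 3 (not 2.3) and uses a throwaway temporary directory with a made-up file name:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    text_names = os.listdir(tmp)                  # text path in, text names out
    byte_names = os.listdir(os.fsencode(tmp))     # bytes path in, bytes names out

print(text_names, byte_names)
```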

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will not
break the paths (unlike UTF-8 encoded 8-bit strings which tend to
break when you slice them).

That said, I think you're right about the ASCII approach provided
that the os, os.path tools can actually properly cope with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again the C code should really use "es" for encoding to
the Py_FileSystemDefaultEncoding as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test assumes
a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones for which ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 
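
To make the \0 point concrete, here is a small sketch in present-day Python 3 (not the 2.3 code under discussion; the helper name has_nul_byte is made up for illustration). It checks whether encoding a plain ASCII name produces null bytes under a few encodings.

```python
def has_nul_byte(name, encoding):
    """Return True if encoding `name` yields at least one \\0 byte."""
    return b"\x00" in name.encode(encoding)


# Compare two single-byte-friendly encodings with UTF-16.
for encoding in ("utf-8", "latin-1", "utf-16-le"):
    print(encoding, has_nul_byte("abc", encoding))
```

Of the three, only UTF-16 pads the ASCII letters with \0 bytes, which is why a byte-wise test misreads it.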

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least restrict
the test to an ASCII isalnum(x) test and then try the encode/decode
method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. By looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be sure that
the text is not ASCII compatible. Given that .listdir() involves lots of
IO I think the added performance hit wouldn't be noticeable.
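
The round-trip test described above can be sketched as follows. This is an illustration in present-day Python 3, not the C code from the patch; the function name is_ascii_name and the sample file names are assumptions chosen for the example.

```python
def is_ascii_name(raw, fs_encoding):
    """Decode raw bytes with fs_encoding, then try to encode back to ASCII."""
    try:
        raw.decode(fs_encoding).encode("ascii")
    except UnicodeError:
        # Either the bytes were not valid in fs_encoding,
        # or the decoded text contains non-ASCII characters.
        return False
    return True


# U+0141 encodes in UTF-16-LE as two bytes that are both below 128,
# so a byte-wise "ord(x) >= 128" check would wrongly call it ASCII.
tricky = "\u0141".encode("utf-16-le")

print(is_ascii_name(b"README.txt", "utf-8"))
print(is_ascii_name("caf\u00e9".encode("utf-8"), "utf-8"))
print(all(byte < 128 for byte in tricky), is_ascii_name(tricky, "utf-16-le"))
```

The last line is the UTF-16 failure case: the byte-wise check passes while the round trip correctly reports non-ASCII text.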

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 15:50:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 07:50:18 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qEgM-0004HD-00@sc8-sf-web2.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:50

Message:
Logged In: YES 
user_id=92689

Here's a note about file system encodings on OSX, including a few words about NFS: http://developer.apple.com/qa/qa2001/qa1173.html.

I propose to fall back to a byte string if conversion to unicode fails.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-04 16:44

Message:
Logged In: YES 
user_id=45365

I just did a test (created 254 files with all bytes except / and null in their names on a linux server, mounted the partition over NFS on MacOSX) and indeed MacOSX tries to interpret the bytes as UTF-8 and fails.

I know that conversion works for HFS and HFS+ volumes (which carry a filename encoding with them, or you have to specify it when mounting). I assume it works for AFP and SMB (which also carries encoding info, IIRC) but I can't test this. I haven't a clue about webdav and such.

Something to keep in mind is that we are really trying to solve someone else's problem: the inability of NFS and most unixen to handle file system encodings. If I'm on a latin-1 machine and I nfs-mount your latin-2 partition I will see garbage filenames.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:15

Message:
Logged In: YES 
user_id=21627

Setting the file system encoding on startup should be fine,
except that we need another setlocale/query/restore locale
sequence. This is, in principle, bad, as there is no
guarantee that the restore locale operation really produces
the original state, and may cause problems if other threads
are already running. In practice, it appears to work out
just fine, as we use such sequences already (e.g. to undo
the readline initialization).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:11

Message:
Logged In: YES 
user_id=21627

I disagree with the last assertion: In *particular* if the
file system encoding is UTF-8, there is a good chance that
decoding will fail (unlike if it is latin-1; decoding will
then never fail - it may just produce mojibake). 

OS X seems to make a guarantee to always return UTF-8 from
its low-level API, but I distrust this guarantee until I see
it with my own eyes :-) E.g. what happens if you mount an
NFS tree, and the NFS server gives file names in some other
encoding?

I see the following options:

- only enable the code for OS X. I dislike this option, as
it essentially freezes the Unix status to non-Unicode (we
won't get further insights, the de jure status won't change,
de facto, all files will be encoded in the locale's encoding).

- leave the code as-is, documenting the possibility of
exceptions.

- add byte strings instead of Unicode strings into the
result for non-decodable strings. This gives a mixed-type
result, which is fine if you only pass the resulting file
names to stat() or open(), and will likely break the
application if it tries to display the file names somehow.
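
The third option (a mixed-type result) could look roughly like this. It is a sketch in present-day Python 3 under stated assumptions: decode_names is a hypothetical helper, and the list of raw names stands in for what the operating system would return.

```python
def decode_names(raw_names, fs_encoding):
    """Decode each name; keep the original bytes when decoding fails."""
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(fs_encoding))
        except UnicodeDecodeError:
            # Last resort: hand back the undecodable name unchanged.
            result.append(raw)
    return result


print(decode_names([b"ok.txt", b"\xff"], "utf-8"))
```

The caller can still pass either kind of entry back to stat() or open(), but has to check the type before displaying a name.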

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:07

Message:
Logged In: YES 
user_id=92689

I think it would be better to simply return byte strings if the file system encoding isn't known. (This, btw, was what my original patch did.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 16:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.
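
A quick way to check the Latin-1 claim, and to contrast it with UTF-8, is the sketch below. It uses present-day Python 3 rather than the 2.3 API, and the helper name decodes is an assumption made for the example.

```python
def decodes(raw, encoding):
    """Return True if raw bytes decode without error in the given encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


all_bytes = bytes(range(256))

# Every byte value maps to the code point with the same number in Latin-1.
print([ord(ch) for ch in all_bytes.decode("latin-1")] == list(range(256)))
print(decodes(all_bytes, "latin-1"))
# A lone 0xff byte is never valid UTF-8.
print(decodes(b"\xff", "utf-8"))
```

Decoding as Latin-1 cannot fail, but it says nothing about whether the resulting characters are the ones the file's creator intended.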

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.
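
For readers who want to inspect these values themselves, here is a hedged sketch in present-day Python 3. It assumes nothing about the 2.3 internals discussed here; in particular, modern interpreters choose the file system encoding at startup, so the setlocale call below is not expected to change it.

```python
import locale
import sys

try:
    # Adopt the user's locale settings for character classification.
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    # The environment names a locale this system does not provide.
    pass

# Read-only view of the encoding used for file names.
print(sys.getfilesystemencoding())
# The encoding the user's locale prefers for text.
print(locale.getpreferredencoding(False))
```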

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).
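
The Unicode-in-Unicode-out rule argued for above can be observed in present-day Python 3, where str is the text type and bytes is the byte string type. The sketch below is only an illustration of that rule, not the 2.3 implementation; the file name example.txt is an assumption.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # Text path in, text names out.
    text_names = os.listdir(tmp)
    # Byte string path in, byte string names out.
    byte_names = os.listdir(os.fsencode(tmp))

print(text_names, byte_names)
```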

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will not
break the paths (unlike UTF-8 encoded 8-bit strings, which tend
to break when you slice them).
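
The slicing problem is easy to reproduce. The following sketch uses present-day Python 3 with a made-up file name; decodes_cleanly is a hypothetical helper introduced only for this illustration.

```python
name = "r\u00e9sum\u00e9.txt"
encoded = name.encode("utf-8")


def decodes_cleanly(raw):
    """Return True if raw is complete, valid UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Slicing the text keeps whole characters.
print(repr(name[:2]), decodes_cleanly(name[:2].encode("utf-8")))
# Slicing the bytes at the same index cuts the two-byte character in half.
print(repr(encoded[:2]), decodes_cleanly(encoded[:2]))
```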

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with
Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again, the C code should really use "es" for encoding to
the Py_FileSystemDefaultEncoding, as is done in e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test assumes
a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones for which ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least restrict
the test to an ASCII isalnum(x) test and then try the encode/decode
method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. By looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be sure that
the text is not ASCII compatible. Given that .listdir() involves lots of
IO I think the added performance hit wouldn't be noticeable.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 16:00:35 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 08:00:35 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qEqJ-0002Pz-00@sc8-sf-web1.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 17:00

Message:
Logged In: YES 
user_id=21627

I only partially agree that this is somebody else's problem:
On Unix, it is always considered application responsibility
to interpret file names as characters if they need to -
hence the lack of a system-provided encoding strategy. So it
is the problem of Python or the Python application, and I
think we should try to shield the application from these
issues as well as we can.

Therefore, I'm in favour of jvr's latest proposal (use byte
strings as the last resort), hoping that the error case will
be infrequent.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:50

Message:
Logged In: YES 
user_id=92689

Here's a note about file system encodings on OSX, including a few words about NFS: http://developer.apple.com/qa/qa2001/qa1173.html.

I propose to fall back to a byte string if conversion to unicode fails.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-04 16:44

Message:
Logged In: YES 
user_id=45365

I just did a test (created 254 files with all bytes except / and null in their names on a linux server, mounted the partition over NFS on MacOSX) and indeed MacOSX tries to interpret the bytes as UTF-8 and fails.

I know that conversion works for HFS and HFS+ volumes (which carry a filename encoding with them, or you have to specify it when mounting). I assume it works for AFP and SMB (which also carries encoding info, IIRC) but I can't test this. I haven't a clue about webdav and such.

Something to keep in mind is that we are really trying to solve someone else's problem: the inability of NFS and most unixen to handle file system encodings. If I'm on a latin-1 machine and I nfs-mount your latin-2 partition I will see garbage filenames.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:15

Message:
Logged In: YES 
user_id=21627

Setting the file system encoding on startup should be fine,
except that we need another setlocale/query/restore locale
sequence. This is, in principle, bad, as there is no
guarantee that the restore locale operation really produces
the original state, and may cause problems if other threads
are already running. In practice, it appears to work out
just fine, as we use such sequences already (e.g. to undo
the readline initialization).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:11

Message:
Logged In: YES 
user_id=21627

I disagree with the last assertion: In *particular* if the
file system encoding is UTF-8, there is a good chance that
decoding will fail (unlike if it is latin-1; decoding will
then never fail - it may just produce mojibake). 

OS X seems to make a guarantee to always return UTF-8 from
its low-level API, but I distrust this guarantee until I see
it with my own eyes :-) E.g. what happens if you mount an
NFS tree, and the NFS server gives file names in some other
encoding?

I see the following options:

- only enable the code for OS X. I dislike this option, as
it essentially freezes the Unix status to non-Unicode (we
won't get further insights, the de jure status won't change,
de facto, all files will be encoded in the locale's encoding).

- leave the code as-is, documenting the possibility of
exceptions.

- add byte strings instead of Unicode strings into the
result for non-decodable strings. This gives a mixed-type
result, which is fine if you only pass the resulting file
names to stat() or open(), and will likely break the
application if it tries to display the file names somehow.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:07

Message:
Logged In: YES 
user_id=92689

I think it would be better to simply return byte strings if the file system encoding isn't known. (This, btw, was what my original patch did.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 16:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.
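
The Latin-1 property mentioned above can be checked directly. The following is a minimal sketch in modern Python 3 (where bytes and str replace the str and unicode types discussed in this thread); the variable names are illustrative only.

```python
# Every possible byte value decodes under Latin-1, and each byte maps
# to the Unicode code point with the same number.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")
code_points = [ord(ch) for ch in decoded]

# By contrast, the single byte 0xff is not valid UTF-8.
try:
    b"\xff".decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

Decoding never fails under Latin-1, but nothing here says the resulting characters are the ones the file's creator intended.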

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.
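
A minimal sketch of the two ideas above, written for modern Python 3 rather than the 2.3-era interpreter under discussion: adopt the user's locale, then inspect the encodings involved. In current Python, sys.getfilesystemencoding() gives a read-only view of the file system encoding; the actual values depend entirely on the environment, so none are shown here.

```python
import locale
import sys

# Adopt the user's preferred locale for character handling.
# This can fail if the environment names a locale that is not installed.
try:
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    pass

# Encoding implied by the user's locale settings.
preferred = locale.getpreferredencoding(False)

# Encoding Python uses to convert file names (read-only inspection).
fs_encoding = sys.getfilesystemencoding()

print(preferred, fs_encoding)
```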

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you get
from os.listdir() should be compatible with (at least) all the
os.path tools and os itself. Converting to Unicode has the advantage
that slicing and indexing into the path names will not break the
paths (unlike UTF-8 encoded 8-bit strings, which tend to break when
you slice them).

That said, I think you're right about the ASCII approach, provided
that the os and os.path tools can actually cope properly with Unicode.

What I worry about is that if os.listdir() gives back Unicode for
e.g. Latin-1 filenames and the application then passes the Unicode
names to a C API using "s", perfectly working code will break...
then again, the C code should really use "es" for encoding to
the Py_FileSystemDefaultEncoding, as is done in e.g. fileobject.c.

I really don't know what to do here...
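
The slicing point made above can be illustrated with a short sketch in modern Python 3 (bytes and str standing in for the 8-bit and Unicode strings of this thread). The file name is a made-up example.

```python
name_text = "caf\u00e9.txt"              # 'e' with acute accent is one character
name_bytes = name_text.encode("utf-8")   # the same character is two bytes here

# Slicing the text keeps whole characters.
text_prefix = name_text[:4]

# Slicing the bytes at the same index cuts the two-byte sequence in half.
byte_prefix = name_bytes[:4]

try:
    byte_prefix.decode("utf-8")
    prefix_is_valid = True
except UnicodeDecodeError:
    prefix_is_valid = False
```

After this runs, text_prefix is a valid four-character name prefix, while byte_prefix ends in a dangling lead byte and no longer decodes as UTF-8.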

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It only happens that your test assumes
one-byte-per-character encodings, which may not always be true.
With UTF-16 your test will see lots of \0 bytes, but not
necessarily bytes with ord(x)>=128.

I'm not sure whether other variable-length encodings can result
in \0 bytes, e.g. the Asian ones.

There's also the possibility of the encoding mapping the ASCII
range to other non-ASCII characters, e.g. ShiftJIS does this for
the Yen sign.

If you absolutely want to use the simple test, I'd at least restrict
the test to an ASCII isalnum(x) test and then try the encode/decode
method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error, you can be sure
that the text is not ASCII compatible. Given that .listdir() involves
lots of IO, I think the added performance hit wouldn't be noticeable.
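
The round-trip test described above might look like the following sketch in modern Python 3. The helper names is_ascii_compatible and bytewise_looks_ascii are hypothetical, not functions from the patch; the second one is only an assumed stand-in for the simple per-byte check being criticised.

```python
def is_ascii_compatible(raw, encoding):
    """Decode raw bytes to text, then try to encode that text as ASCII."""
    try:
        raw.decode(encoding).encode("ascii")
    except UnicodeError:
        # Either the bytes do not decode, or the text is not pure ASCII.
        return False
    return True


def bytewise_looks_ascii(raw):
    """Naive check: no byte has a value of 128 or above."""
    return all(b < 128 for b in raw)


# U+0141 encoded as UTF-16-LE is the two bytes 0x41 0x01: both are below
# 128, so the naive check is fooled while the round trip is not.
tricky = "\u0141".encode("utf-16-le")
```

With these definitions, bytewise_looks_ascii(tricky) accepts the name, whereas is_ascii_compatible(tricky, "utf-16-le") correctly rejects it.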

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 16:26:58 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 08:26:58 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qFFq-0003pw-00@sc8-sf-web1.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 16:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 11:26

Message:
Logged In: YES 
user_id=6380

On the one hand a user who isn't interested in encodings
shouldn't be passing a Unicode argument. On the other hand,
Unicode strings have a way of sneaking into your application
when you least suspect them. E.g. Tkinter returns them, so
does IDLE, and I see them used more and more in Zope 3.

FWIW, I like Just's "fall back to bytestrings" approach.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 11:00

Message:
Logged In: YES 
user_id=21627

I only partially agree that this is somebody else's problem:
On Unix, it is always considered application responsibility
to interpret file names as characters if they need to -
hence the lack of a system-provided encoding strategy. So it
is the problem of Python or the Python application, and I
think we should try to shield the application from these
issues as well as we can.

Therefore, I'm in favour of jvr's latest proposal (use byte
strings as the last resort), hoping that the error case will
be infrequent.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 10:50

Message:
Logged In: YES 
user_id=92689

Here's a note about file system encodings on OSX, including a few words about NFS: http://developer.apple.com/qa/qa2001/qa1173.html.

I propose to fall back to a byte string if conversion to unicode fails.
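
A rough sketch of this fallback, in modern Python 3 and operating on an in-memory list rather than a real directory: names that decode become text, names that do not are kept as raw bytes. The function name decode_names and the sample data are hypothetical, and this is not the code from the patch itself.

```python
def decode_names(raw_names, encoding):
    """Decode each raw file name; keep the original bytes when decoding fails."""
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Last resort: pass the undecodable name through unchanged.
            result.append(raw)
    return result


# One valid UTF-8 name and one name consisting of the single byte 0xff.
names = decode_names([b"caf\xc3\xa9", b"\xff"], "utf-8")
```

The result is a mixed-type list, so a caller that wants to display names needs to check the type of each entry, while a caller that only passes names back to the operating system can use them as they are.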

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-04 10:44

Message:
Logged In: YES 
user_id=45365

I just did a test (created 254 files with all bytes except / and null in their names on a linux server, mounted the partition over NFS on MacOSX) and indeed MacOSX tries to interpret the bytes as UTF-8 and fails.

I know that conversion works for HFS and HFS+ volumes (which carry a filename encoding with them, or you have to specify it when mounting). I assume it works for AFP and SMB (which also carries encoding info, IIRC) but I can't test this. I haven't a clue about webdav and such.

Something to keep in mind is that we are really trying to solve someone else's problem: the inability of NFS and most unixen to handle file system encodings. If I'm on a latin-1 machine and I nfs-mount your latin-2 partition I will see garbage filenames.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 10:15

Message:
Logged In: YES 
user_id=21627

Setting the file system encoding on startup should be fine,
except that we need another setlocale/query/restore locale
sequence. This is, in principle, bad, as there is no
guarantee that the restore locale operation really produces
the original state, and may cause problems if other threads
are already running. In practice, it appears to work out
just fine, as we use such sequences already (e.g. to undo
the readline initialization).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 10:11

Message:
Logged In: YES 
user_id=21627

I disagree with the last assertion: In *particular* if the
file system encoding is UTF-8, there is a good chance that
decoding will fail (unlike if it is latin-1; decoding will
then never fail - it may just produce mojibake). 

OS X seems to make a guarantee to always return UTF-8 from
its low-level API, but I distrust this guarantee until I see
it with my own eyes :-) E.g. what happens if you mount an
NFS tree, and the NFS server gives file names in some other
encoding?

I see the following options:

- only enable the code for OS X. I dislike this option, as
it essentially freezes the Unix status to non-Unicode (we
won't get further insights, the de jure status won't change,
de facto, all files will be encoded in the locale's encoding).

- leave the code as-is, documenting the possibility of
exceptions.

- add byte strings instead of Unicode strings into the
result for non-decodable strings. This gives a mixed-type
result, which is fine if you only pass the resulting file
names to stat() or open(), and will likely break the
application if it tries to display the file names somehow.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 10:07

Message:
Logged In: YES 
user_id=92689

I think it would be better to simply return byte strings if the file system encoding isn't known. (This, btw, was what my original patch did.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 10:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 09:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 09:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 09:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 09:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 09:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 01:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 12:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 12:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. as the code did that the latest patch removes), OR add
a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 11:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 10:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 09:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 08:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 08:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 07:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 06:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.
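
For readers on a current interpreter, the in-kind rule argued for
above is easy to observe. The sketch below assumes Python 3, where
str and bytes stand in for the Unicode and byte strings discussed in
this thread; the temporary directory, the file name and the helper
listing_types() are made up for illustration.

```python
import os
import tempfile


def listing_types(directory):
    """Return the set of element types os.listdir() yields for `directory`."""
    return {type(name) for name in os.listdir(directory)}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    text_types = listing_types(tmp)               # str directory name
    byte_types = listing_types(os.fsencode(tmp))  # bytes directory name

print(text_types, byte_types)
```

The type of the argument decides the type of every returned name, so
a caller never has to handle a mixed list.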

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 06:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more bad than good.

A practical problem is that os.path.walk doesn't work anymore if there are non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 12:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 10:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 13:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different angle: things that you
get from os.listdir() should be compatible with (at least) all
the os.path tools and os itself. Converting to Unicode has the
advantage that slicing and indexing into the path names will
not break the paths (unlike UTF-8 encoded 8-bit strings, which
tend to break when you slice them).
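
To make the slicing point concrete, here is a small sketch. It
assumes Python 3 (str for Unicode, bytes for 8-bit strings); the
sample name and the helper decodes_cleanly() are invented for the
example.

```python
name = "caf\u00e9"               # four characters, the last one non-ASCII
encoded = name.encode("utf-8")   # five bytes: the last character takes two

# Slicing the text always keeps whole characters.
text_prefix = name[:4]

# Slicing the bytes at the same index cuts the two-byte sequence in half.
byte_prefix = encoded[:4]


def decodes_cleanly(raw, encoding="utf-8"):
    """Return True if `raw` is a complete, valid byte sequence in `encoding`."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


print(decodes_cleanly(encoded), decodes_cleanly(byte_prefix))
```

The sliced byte string is no longer valid UTF-8, which is the
breakage described above; the sliced text is still a usable name.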

That said, I think you're right about the ASCII approach,
provided that the os and os.path tools can actually cope
properly with Unicode.

What I worry about is that if os.listdir() gives back Unicode
for e.g. Latin-1 filenames and the application then passes the
Unicode names to a C API using "s", perfectly working code will
break... then again, the C code should really use "es" for
converting to the Py_FileSystemDefaultEncoding, as is done in
e.g. fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 06:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 06:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 05:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 05:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 05:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes a one-byte-per-character encoding, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict the test to an ASCII isalnum(x) test and then try the
encode/decode method I described if this test fails.

Note that isalnum() can be locale dependent on some platforms,
so you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 04:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. By looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 04:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that the text is not ASCII compatible. Given that
.listdir() involves lots of IO I think the added performance hit
wouldn't be noticeable.
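
A minimal sketch of that round-trip test next to the simple
per-byte test, assuming Python 3. The function names and the sample
file names are hypothetical, and this is not the code from the
patch.

```python
def is_ascii_name(raw, fs_encoding):
    """Round-trip test: decode with the file system encoding, then try ASCII."""
    text = raw.decode(fs_encoding)  # raises UnicodeDecodeError on undecodable input
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def naive_is_ascii(raw):
    """Simple per-byte test: accept the name if no byte is >= 128."""
    return all(byte < 128 for byte in raw)


# U+0100 in little-endian UTF-16 is the two bytes 0x00 0x01: both are
# below 128, yet the character itself is not ASCII.
utf16_name = "\u0100".encode("utf-16-le")

print(naive_is_ascii(utf16_name), is_ascii_name(utf16_name, "utf-16-le"))
```

The per-byte test accepts the UTF-16 name while the round trip
rejects it, which is the failure mode mentioned above.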

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 04:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-09 22:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-09 20:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 17:19:38 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 09:19:38 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18qG4o-0001E3-00@sc8-sf-web4.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 16:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.
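
For illustration, the behavior described above as it can be tried
in a current interpreter; the dictionary contents are made up.

```python
settings = {"colour": "blue"}

# Key present: the stored value is returned and the key is removed.
colour = settings.pop("colour", "red")

# Key absent, default supplied: the default is returned, no exception.
size = settings.pop("size", 10)

# Key absent, no default: the old behavior, a KeyError.
try:
    settings.pop("size")
except KeyError:
    missing_raises = True
else:
    missing_raises = False

print(colour, size, missing_raises, settings)
```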

----------------------------------------------------------------------

>Comment By: Michael Stone (mbrierst)
Date: 2003-03-04 17:19

Message:
Logged In: YES 
user_id=670441

Okay, here's patchpop2 with the diff'ed dictobject,
UserDict, test_types, test_userdict, NEWS, and
Doc/lib/libstdtypes.  whew.

Let me know if you need any changes.
The change to DictMixin seems a bit
clumsy, but I liked it better than other things
I came up with.
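
For readers without the attachment, one way a pure-Python mapping
can support the optional default is sketched below. This is not the
code from patchpop2; SimpleMapping is a hypothetical stand-in for a
DictMixin-style class, and using *args is just one way to tell "no
default given" apart from "default is None".

```python
class SimpleMapping:
    """Hypothetical minimal mapping; not the real UserDict.DictMixin."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]

    def pop(self, key, *args):
        # At most one extra positional argument (the default) is allowed.
        if len(args) > 1:
            raise TypeError(
                "pop expected at most 2 arguments, got %d" % (1 + len(args))
            )
        try:
            value = self[key]
        except KeyError:
            if args:
                return args[0]  # default supplied: return it
            raise               # no default: keep the old KeyError behavior
        del self[key]
        return value


m = SimpleMapping({"a": 1})
print(m.pop("a"), m.pop("a", None), m.pop("b", 0))
```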

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:19

Message:
Logged In: YES 
user_id=6380

You don't need to update whatsnew23.tex; its editor prefers
to do this himself.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-04 04:26

Message:
Logged In: YES 
user_id=80475

For NEWS, add a new entry (so that it documents a 
difference from Py2.3a2).

For whatsnew23, modify the existing entry (since it is a 
delta from Py2.2).

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-03 19:59

Message:
Logged In: YES 
user_id=670441

Should I make a new NEWS item, or should
I modify the existing NEWS item about dict.pop?

And should I make a new whatsnew23 item or
modify the existing one?

I'm guessing a new NEWS item and a modified
whatsnew item, but I'll post a patch when you tell me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-02 02:40

Message:
Logged In: YES 
user_id=31435

dicts have a .pop() method?  Heh.  I must have slept 
through that one <wink>.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-01 02:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me, I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-01 01:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get() like functionality for 
dict.pop().  The nearest parallel is the default argument for 
getattr(obj, attr, [default]).  On the plus side, it makes pop 
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 17:44:28 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 09:44:28 -0800
Subject: [Patches] [ python-Patches-684256 ] AutoThreadState implementation
Message-ID: <E18qGSq-0008Hn-00@sc8-sf-web1.sourceforge.net>

Patches item #684256, was opened at 2003-02-10 14:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=684256&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Mark Hammond (mhammond)
Assigned to: Mark Hammond (mhammond)
Summary: AutoThreadState implementation

Initial Comment:
An implementation of the AutoThreadState API, mainly
for discussion purposes at this point.  To be a PEP soon.

----------------------------------------------------------------------

Comment By: Greg Chapman (glchapman)
Date: 2003-03-04 08:44

Message:
Logged In: YES 
user_id=86307

It appears to me that PyAutoThreadState_Release calls 
PyThreadState_Clear after releasing the GIL (if the thread 
state was created by PyAutoThreadState_Ensure, then old 
state will be UNLOCKED, so PyEval_ReleaseThread will be 
called).  It looks to me that, if the thread state is going to be 
deleted, the call to Clear it should be moved up to just before 
ReleaseThread, i.e.:

if (oldstate == PyAutoThreadState_UNLOCKED) {
    /* Clear the thread state while this thread still holds the GIL. */
    if (tcur->autothreadstate_counter == 1)
        PyThreadState_Clear(tcur);
    PyEval_ReleaseThread(tcur);
}


----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2003-02-13 04:05

Message:
Logged In: YES 
user_id=14198

Attaching a new patch that works perfectly.  2 checks remain
in the code that will be debug only, but apart from that, it
is pretty good.  No changes at all to existing semantics.

Tested on Linux and Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=684256&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 18:55:26 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 10:55:26 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18qHZW-0003HD-00@sc8-sf-web1.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 16:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.

----------------------------------------------------------------------

>Comment By: Michael Stone (mbrierst)
Date: 2003-03-04 18:55

Message:
Logged In: YES 
user_id=670441

argh... I put the NEWS item in the wrong place.
Ignore patchpop2 (I can't delete it); look at patchpop3.


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-04 17:19

Message:
Logged In: YES 
user_id=670441

Okay, here's patchpop2 with the diff'ed dictobject,
UserDict, test_types, test_userdict, NEWS, and
Doc/lib/libstdtypes.  whew.

Let me know if you need any changes.
The change to DictMixin seems a bit
clumsy, but I liked it better than other things
I came up with.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:19

Message:
Logged In: YES 
user_id=6380

You don't need to update whatsnew23.tex; its editor prefers
to do this himself.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-04 04:26

Message:
Logged In: YES 
user_id=80475

For NEWS, add a new entry (so that it documents a 
difference from Py2.3a2).

For whatsnew23, modify the existing entry (since it is a 
delta from Py2.2).

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-03 19:59

Message:
Logged In: YES 
user_id=670441

Should I make a new NEWS item, or should
I modify the existing NEWS item about dict.pop?

And should I make a new whatsnew23 item or
modify the existing one?

I'm guessing a new NEWS item and a modified
whatsnew item, but I'll post a patch when you tell me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-02 02:40

Message:
Logged In: YES 
user_id=31435

dicts have a .pop() method?  Heh.  I must have slept 
through that one <wink>.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-01 02:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me, I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-01 01:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get() like functionality for 
dict.pop().  The nearest parallel is the default argument for 
getattr(obj, attr, [default]).  On the plus side, it makes pop 
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 19:43:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 11:43:48 -0800
Subject: [Patches] [ python-Patches-683592 ] unicode support for os.listdir()
Message-ID: <E18qIKK-0000ra-00@sc8-sf-web4.sourceforge.net>

Patches item #683592, was opened at 2003-02-09 22:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Just van Rossum (jvr)
Assigned to: Martin v. Löwis (loewis)
Summary: unicode support for os.listdir()

Initial Comment:
The attached patch makes os.listdir() return unicode strings, on platforms that have Py_FileSystemDefaultEncoding defined as non-NULL.

I'm by no means sure this is the right thing to do; it does seem right on OSX where Py_FileSystemDefaultEncoding is (or rather: will be real soon, I'm waiting for Jack's approval) utf-8. I'd be happy to add the code in an OSX-specific switch.

A more subtle variant could perhaps only return unicode strings if the file name is not ASCII.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-04 20:43

Message:
Logged In: YES 
user_id=92689

I've committed the "fallback-to-byte-strings" behavior.
It's in posixmodule.c rev. 2.290.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 17:26

Message:
Logged In: YES 
user_id=6380

On the one hand a user who isn't interested in encodings
shouldn't be passing a Unicode argument. On the other hand,
Unicode strings have a way of sneaking into your application
when you least suspect them. E.g. Tkinter returns them, so
does IDLE, and I see them used more and more in Zope 3.

FWIW, I like Just's "fall back to bytestrings" approach.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 17:00

Message:
Logged In: YES 
user_id=21627

I only partially agree that this is somebody else's problem:
On Unix, it is always considered application responsibility
to interpret file names as characters if they need to -
hence the lack of a system-provided encoding strategy. So it
is the problem of Python or the Python application, and I
think we should try to shield the application from these
issues as well as we can.

Therefore, I'm in favour of jvr's latest proposal (use byte
strings as the last resort), hoping that the error case will
be infrequent.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:50

Message:
Logged In: YES 
user_id=92689

Here's a note about file system encodings on OSX, including a few words about NFS: http://developer.apple.com/qa/qa2001/qa1173.html.

I propose to fall back to a byte string if conversion to unicode fails.
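
A rough Python rendering of that fallback, assuming Python 3 (bytes
in, str out where possible). The real change is C code in
posixmodule.c; decode_names() and the sample names are hypothetical.

```python
def decode_names(raw_names, fs_encoding):
    """Decode each raw file name; keep the byte string when decoding fails."""
    result = []
    for raw in raw_names:
        try:
            result.append(raw.decode(fs_encoding))
        except UnicodeDecodeError:
            # Fall back to the undecoded byte string for this entry only,
            # so one bad name does not lose the rest of the listing.
            result.append(raw)
    return result


names = decode_names([b"notes.txt", b"\xff"], "utf-8")
print(names)
```

The result can be a mixed list: decodable names come back as text,
the others as the original bytes.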

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-04 16:44

Message:
Logged In: YES 
user_id=45365

I just did a test (created 254 files with all bytes except / and null in their names on a linux server, mounted the partition over NFS on MacOSX) and indeed MacOSX tries to interpret the bytes as UTF-8 and fails.

I know that conversion works for HFS and HFS+ volumes (which carry a filename encoding with them, or you have to specify it when mounting). I assume it works for AFP and SMB (which also carries encoding info, IIRC) but I can't test this. I haven't a clue about webdav and such.

Something to keep in mind is that we are really trying to solve someone else's problem: the inability of NFS and most unixen to handle file system encodings. If I'm on a latin-1 machine and I nfs-mount your latin-2 partition I will see garbage filenames.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:15

Message:
Logged In: YES 
user_id=21627

Setting the file system encoding on startup should be fine,
except that we need another setlocale/query/restore locale
sequence. This is, in principle, bad, as there is no
guarantee that the restore locale operation really produces
the original state, and may cause problems if other threads
are already running. In practice, it appears to work out
just fine, as we use such sequences already (e.g. to undo
the readline initialization).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 16:11

Message:
Logged In: YES 
user_id=21627

I disagree with the last assertion: In *particular* if the
file system encoding is UTF-8, there is a good chance that
decoding will fail (unlike if it is latin-1; decoding will
then never fail - it may just produce mojibake). 

OS X seems to make a guarantee to always return UTF-8 from
its low-level API, but I distrust this guarantee until I see
it with my own eyes :-) E.g. what happens if you mount an
NFS tree, and the NFS server gives file names in some other
encoding?

I see the following options:

- only enable the code for OS X. I dislike this option, as
it essentially freezes the Unix status to non-Unicode (we
won't get further insights, the de jure status won't change,
de facto, all files will be encoded in the locale's encoding).

- leave the code as-is, documenting the possibility of
exceptions.

- add byte strings instead of Unicode strings into the
result for non-decodable strings. This gives a mixed-type
result, which is fine if you only pass the resulting file
names to stat() or open(), and will likely break the
application if it tries to display the file names somehow.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 16:07

Message:
Logged In: YES 
user_id=92689

I think it would be better to simply return byte strings if the file system encoding isn't known. (This btw. was what my original patch did.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 16:03

Message:
Logged In: YES 
user_id=6380

Maybe the filesystem default encoding should be set to
Latin-1 by default (when nothing better is known about it)?
Then it's hard to imagine how the conversion could fail,
since every Latin-1 byte maps 1-1 to the corresponding
Unicode code point.
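
The 1-1 claim can be checked directly. The sketch below assumes
Python 3 and its "latin-1" and "utf-8" codecs; the variable names
are made up. For contrast it also tries the single byte \xff as
UTF-8.

```python
# Every possible byte value, 0 through 255.
all_bytes = bytes(range(256))

# Latin-1 decodes any byte sequence: byte N becomes code point N.
latin1_text = all_bytes.decode("latin-1")
maps_one_to_one = [ord(ch) for ch in latin1_text] == list(range(256))

# UTF-8 is stricter: the single byte 0xff is never valid.
try:
    b"\xff".decode("utf-8")
except UnicodeDecodeError:
    utf8_rejects_ff = True
else:
    utf8_rejects_ff = False

print(maps_one_to_one, utf8_rejects_ff)
```

Decoding as Latin-1 cannot fail, but it only yields the right
characters when the names really are Latin-1.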

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:54

Message:
Logged In: YES 
user_id=6380

The setlocale call indeed works.

I think I'd be happier if this was set by default, but I
don't know what other consequences there would be.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:51

Message:
Logged In: YES 
user_id=92689

It would seem that even with a user's locale there's a chance os.listdir() fails when passed a unicode argument. I'm not sure it's reasonable for os.listdir() to fail at all (if the directory to be listed exists and we have the right permissions).

If it's all too difficult to get right, I'm happy to put the listdir unicode support in a MacOSX switch. I know nothing about locales so I'm really not in a position to straighten this out. All I know is that if Py_FileSystemDefaultEncoding is known to be utf-8, it's just dumb _not_ to return unicode. You guys figure out the rest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 15:40

Message:
Logged In: YES 
user_id=21627

Guido's scenario was precisely the reason why Unix was left
out from consideration for PEP 277.

However, it is better than it sounds: There is a good chance
that invoking locale.setlocale(locale.LC_CTYPE, "") prior to
invoking listdir will overcome the problem, as the setlocale
call will set the file system encoding to the user's
preference. If \xff is a valid file name in the user's
preferred encoding, then listdir will succeed in converting
this file name to a Unicode string.

It might be useful to set the file system encoding on Unix
to the user's preferred encoding unconditionally (i.e. not
as a side effect of invoking setlocale). It might also be
useful to expose the file system encoding read-only for
inspection.
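
A small sketch of the setlocale call together with a read-only look
at the encodings, assuming a current Python 3. There, as far as I
know, the file system encoding is chosen at interpreter startup, so
the setlocale call illustrates the idea rather than the 2.3
behavior discussed here.

```python
import locale
import sys

try:
    # Adopt the user's preferred locale for character handling.
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    # The environment names a locale this system does not provide;
    # carry on with whatever locale is already active.
    pass

preferred = locale.getpreferredencoding(False)  # encoding of the active locale
fs_encoding = sys.getfilesystemencoding()       # read-only, for inspection

print(preferred, fs_encoding)
```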

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-04 15:31

Message:
Logged In: YES 
user_id=92689

Would you prefer the error be silenced and a byte string be used instead? If so, should there be a warning?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:01

Message:
Logged In: YES 
user_id=6380

I haven't seen the code, but I have a complaint.

On Linux, when I have a file named '\xff' (i.e. its name is
the single byte with value 255), os.listdir(u'.') gives me a
UnicodeDecodeError.

Is that really progress?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 07:49

Message:
Logged In: YES 
user_id=21627

The current code looks fine to me. Closing this patch.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:56

Message:
Logged In: YES 
user_id=92689

Martin, assigning this item to you. Please close it if you deem the changes in CVS correct.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 18:45

Message:
Logged In: YES 
user_id=92689

Applied to CVS as:
  Modules/posixmodule.c: 2.288
  Doc/lib/libos.tex: 1.115
  Misc/NEWS: 1.687

Unicode errors are propagated as in the original version of the patch, libos.tex mentions Win NT/2k/XP and Unix.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 17:39

Message:
Logged In: YES 
user_id=21627

Clearing the error is bad, I agree. I see two options:
reraise the exception, deleting the result obtained so far
(i.e. what the code did before the latest patch removed it), OR
add a byte string instead of the Unicode string into the result.
Even though I have proposed the latter in the past, I could
also accept the former; applications that anticipate that
exception then just need to re-invoke listdir with a byte
string, and deal with the result themselves.

With these changes, the patch is fine with me.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 17:08

Message:
Logged In: YES 
user_id=92689

I think this could be achieved by removing the "Py_FileSystemDefaultEncoding != NULL" part of the condition on line 1805, as indeed passing NULL as the encoding to PyUnicode_FromEncodedObject causes the default encoding to be used. Shall I check it in like that?

I'm not quite happy with the fact that exceptions are silently dropped: should a warning be issued instead? Especially when using the default encoding, exceptions are not unlikely I suppose.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 16:48

Message:
Logged In: YES 
user_id=21627

I see. The right thing, IMO, is to always return Unicode
objects for Unicode arguments, just the same way the "et"
parser works: if the file system encoding is NULL, fall back
to the system default encoding. Then, you can generalize the
docs to [NT and Unix] (with OS X being a flavour of Unix),
or drop the OS reference completely (in which case the other
os modules are effectively buggy).

There might be a function already to fall back to the system
default encoding; perhaps just passing NULL works.

There should be a documentation section on Unicode file
names; I volunteer to write it (Summary: NT+ uses Unicode
natively, W9x uses "mbcs", OS X uses UTF-8, which equates to
"Unicode natively", Unices with nl_langinfo(CODESET) use
that, all others use the system default encoding).

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 15:32

Message:
Logged In: YES 
user_id=92689

Ok, done, including a minor patch to Doc/lib/libos.tex. I also adapted the Misc/NEWS items. I'm not sure how to change the os.listdir() doco to better reflect the actual situation without mentioning Py_FileSystemDefaultEncoding...

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 14:11

Message:
Logged In: YES 
user_id=21627

Looks good, but incomplete: If the argument is Unicode,
*all* results should be Unicode. There should also be
documentation changes.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 14:02

Message:
Logged In: YES 
user_id=92689

I've attached a patch that fixes the bug as well as addresses the unicode arg vs. return value inconsistency that Martin noted. The exception behavior has not yet been changed.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-03 13:22

Message:
Logged In: YES 
user_id=92689

Jack, as noted on bug #696261, the bug is that os.listdir() doesn't do the right thing with a Unicode string argument (it should use Py_FileSystemDefaultEncoding but it doesn't); I'm working on it.

Martin: I now see that PEP 277 says "Under this proposal, [os.listdir] will return a list of Unicode strings when its path argument is Unicode". I don't like this much (I really think we should push Unicode a little harder onto the users), but I'll look into changing the unix end of os.listdir() to do the same. I'll also review your exception comment.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 12:36

Message:
Logged In: YES 
user_id=21627

I dislike this change, as it introduces inconsistency across
platforms. On Win32, as a result of PEP 277, Unicode file
names are only returned for Unicode directory names. There
was an explicit discussion about this aspect of PEP 277, and
this interface was accepted as The Right Thing. So I think
Unix should follow here: return byte string file names for
byte string directory names, and Unicode file names for
Unicode directory names. Support for Unicode directory names
should also invoke the file system encoding for the
directory name.

I'm also unsure about the exception handling. If there is a
file name that doesn't decode according to the file system
encoding, it raises the Unicode error. This means that all
other file names are lost. This might be acceptable if the
Unicode-in-Unicode-out strategy is used; in its current
form, the change can and will break existing applications
(which find all kinds of funny byte sequences on disk that
don't work with the user's file system encoding).
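
The "byte string in, byte string out; Unicode in, Unicode out" rule can be sketched with present-day Python 3, where os.listdir follows the same convention (bytes path gives bytes names, str path gives str names). This illustrates the interface only, not the 2.3 patch itself; the file name example.txt is made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    text_names = os.listdir(tmp)               # text path in, text names out
    byte_names = os.listdir(os.fsencode(tmp))  # bytes path in, bytes names out

all_text = all(isinstance(n, str) for n in text_names)
all_bytes = all(isinstance(n, bytes) for n in byte_names)
```

Code that never passes a Unicode path never sees a Unicode name, which is what keeps existing applications working.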

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-03 12:23

Message:
Logged In: YES 
user_id=45365

I think this patch does more harm than good.

A practical problem is that os.path.walk doesn't work anymore if there are 
non-ascii directories in the directory tree (os.listdir will return these as unicode names, but doesn't accept unicode on input). See bug #696261. An additional problem is that various other methods in posix don't do the unicode conversion, so for instance os.getcwd() will return 8-bit strings in Py_FileSystemDefaultEncoding which are incompatible with the unicode returned by listdir.

My preferred solution would be to do the unicode trick everywhere. Second best would be to retract the whole thing and think about it a bit more for Python 2.4.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 22:52

Message:
Logged In: YES 
user_id=92689

Checked in as rev. 2.287 of Modules/posixmodule.c. Leaving this item open for now, in case MvL has comments when he gets back.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-25 18:22

Message:
Logged In: YES 
user_id=6380

OK, check it in, just be prepared for contingencies. I
really cannot judge whether this is right on all platforms.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-25 16:55

Message:
Logged In: YES 
user_id=92689

Having missed 2.3a2, I'd like to get this in way ahead of 2.3b1. Any objections?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 19:17

Message:
Logged In: YES 
user_id=92689

I'm pretty sure os.path deals just fine with unicode strings (it's all pure string manipulations, isn't it?)

Worries: well, apparently on Windows os.listdir() has been returning unicode for some time, so it's not like we're breaking completely new ground here.

If anything breaks it's probably good this happens, as it gives an opportunity to fix things... I just found several examples of potential breakage: _bsddb.c parses a filename arg with the "z" format specifier. gdbmmodule.c uses "s". bsddbmodule.c and dbmmodule.c as well.

I'm not sure the above modules work on Windows with non-ascii filenames at all, but it doesn't look like it. Besides Windows (for which my patch is not relevant), only OSX sets Py_FileSystemDefaultEncoding, so any new breakage won't reach a mass market right away <wink>.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 18:46

Message:
Logged In: YES 
user_id=38388

Ok, let's look at it from a different
angle: things that you get from os.listdir() should be
compatible 
to (at least) all the os.path tools and os itself.
Converting to 
Unicode has the advantage that slicing and indexing into the
path names will not break the paths (unlike UTF-8 encoded 8-bit
strings which tend to break when you slice them).
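
A small Python 3 sketch of the slicing problem, using a made-up file name; the helper is_valid_utf8 is hypothetical.

```python
def is_valid_utf8(data):
    """Return True if the byte string is well-formed UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


name = "caf\u00e9.txt"      # "cafe" with an acute accent on the e
raw = name.encode("utf-8")  # the accented letter becomes two bytes: c3 a9

text_prefix = name[:4]      # slices between characters
byte_prefix = raw[:4]       # cuts the two-byte sequence in half
```

The text slice is still a usable name, while the byte slice ends in a dangling lead byte and is no longer valid UTF-8.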

That said, I think you're right about the ASCII approach
provided
that the os, os.path tools can actually properly cope with
Unicode.

What I worry about is that if os.listdir() gives back
Unicode for
e.g. Latin-1 filenames and the application then passes the
Unicode
names to a C API using "s", perfectly working code will break...
then again the C code should really use "es" for decoding to
the Py_FileSystemDefaultEncoding as is done in e.g.
fileobject.c.

I really don't know what to do here...

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 17:24

Message:
Logged In: YES 
user_id=92689

Here's an argument for ASCII and against the default encoding: if the default encoding is different from Py_FileSystemDefaultEncoding, things go wrong: an 8-bit string passed to file() will be interpreted as Py_FileSystemDefaultEncoding (more precisely: will not be interpreted at all), not the default encoding...

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 12:24

Message:
Logged In: YES 
user_id=38388

Right, except that injecting Unicode into Unicode-unaware code
can be dangerous (e.g. some code might require a string object
to work on).

E.g. if someone sets the default encoding to Latin-1 he wouldn't
expect os.listdir() to suddenly return Unicode for him.

This may be a problem in general for the change to os.listdir().
We'll just have to see what happens during the alpha and beta
phases.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 12:08

Message:
Logged In: YES 
user_id=92689

On the other hand, if it's not ASCII, wouldn't a unicode string be more appropriate to begin with? If it's encodable with the default encoding, this will happen as soon as the string is used in a piece of unicode-unaware code, right?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:55

Message:
Logged In: YES 
user_id=38388

Good question. The default encoding would better fit 
into the concept, I guess.

Instead of PyUnicode_AsASCIIString(v) you'd
have to use PyUnicode_AsEncodedString(v, NULL, "strict").


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 11:49

Message:
Logged In: YES 
user_id=92689

Ok, I went for your original suggestion: always convert to unicode and then try to convert to ascii. See new patch. Or should this use the default encoding? Hm.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 11:17

Message:
Logged In: YES 
user_id=38388

The file system does not need to support embedded \0 chars
even if it supports UTF-16. It just happens that your test
assumes one-byte-per-character encodings, which may not
always be true. With UTF-16 your test will see lots of \0 bytes
but not necessarily ones which are ord(x)>=128.

I'm not sure whether other variable length encodings can result
in \0 bytes, e.g. the Asian ones. 

There's also the possibility of the
encoding mapping the ASCII range to other non-ASCII characters,
e.g. ShiftJIS does this for the Yen sign.

If you absolutely want to use the simple test, I'd at least
restrict
the test to an ASCII isalnum(x) test and then try the
encode/decode 
method I described if this test fails.

Note that isalnum() can be locale dependent on some
platforms, so
you have to hard-code it.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:51

Message:
Logged In: YES 
user_id=92689

I don't see how UTF-16 could be a valid value for Py_FileSystemDefaultEncoding, as for most platforms the file name can't contain null bytes. From looking at the NAMELEN() spaghetti, it seems platforms without HAVE_DIRENT_H might still support embedded null bytes. Any wisdom on this?

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-10 10:24

Message:
Logged In: YES 
user_id=38388

Your test will probably catch most cases, but it could fail
for e.g. UTF-16.

The only true test would be to first convert to Unicode and then
try to convert back to ASCII. If you get an error you can be
sure that
the text is not ASCII compatible. Given that .listdir()
involves lots of
IO I think the added performance hit wouldn't be noticeable.
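
The round-trip test can be sketched in Python 3 as follows. The function names are hypothetical, and naive_byte_test is only a guess at the kind of per-byte check under discussion, not the code from the patch.

```python
def is_ascii_name(raw_name, fs_encoding):
    """Round trip: decode with the file system encoding, then encode as ASCII."""
    try:
        raw_name.decode(fs_encoding).encode("ascii")
    except UnicodeError:
        # Either the decode or the ASCII encode failed.
        return False
    return True


def naive_byte_test(raw_name):
    """The simpler check: every byte is below 128."""
    return all(b < 128 for b in raw_name)


# U+0141 (LATIN CAPITAL LETTER L WITH STROKE) in UTF-16-LE is b"A\x01":
# both bytes are below 128, yet the character is not ASCII.
sample = "\u0141".encode("utf-16-le")
```

For this sample the per-byte check reports ASCII while the round trip does not, which is the UTF-16 failure mode mentioned above.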

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-10 10:12

Message:
Logged In: YES 
user_id=92689

Applied both suggestions.

However, I'm not sure if my ASCII test does the right thing, or at least I don't think it does if Py_FileSystemDefaultEncoding is not a superset of ASCII.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-10 04:07

Message:
Logged In: YES 
user_id=33168

The code which uses unicode APIs should probably be wrapped 
with:

#ifdef Py_USING_UNICODE
 /* code */
#endif


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-10 02:16

Message:
Logged In: YES 
user_id=6380

At the very least, I'd like it to return Unicode only when
the original string isn't just ASCII.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=683592&group_id=5470


From noreply@sourceforge.net  Tue Mar  4 23:24:20 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 04 Mar 2003 15:24:20 -0800
Subject: [Patches] [ python-Patches-697613 ] fix bug #670311: sys.exit and PYTHONINSPECT
Message-ID: <E18qLlk-0002sZ-00@sc8-sf-web4.sourceforge.net>

Patches item #697613, was opened at 2003-03-04 23:24
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697613&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug #670311: sys.exit and PYTHONINSPECT

Initial Comment:

So we want to stop SystemExit from causing
an actual exit() when using python -i.
This patch introduces two new API calls,
PyRun_BlockSysExit and PyRun_UnblockSysExit,
to set a flag to avoid the exit() call in
PyErr_PrintEx.

There are several other ways to fix this
bug, but I think all of the others I came up
with would cause more backwards compatibility
problems and/or be a lot more work.
Some possibilities, if anyone is interested,
would be:
1) Add a new PyCompilerFlags flag
This seems a bit ugly, as it's not really
a "compile flag".
2) Add some special run routines that
block the exit.
3) Add another parameter to existing
run routines.
4) Change PyErr_PrintEx so it doesn't
exit when printing a SystemExit, instead
having the run routines responsible for
exiting when catching that exception.
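
As a Python-level analogy for option 4 (the real change would be in C, so treat this as a sketch under that assumption), a run routine can catch SystemExit itself and decide what to do, rather than letting the error-printing code exit. The name run_for_inspection is hypothetical.

```python
def run_for_inspection(source, namespace=None):
    """Run source; report SystemExit to the caller instead of exiting."""
    namespace = {} if namespace is None else namespace
    try:
        exec(compile(source, "<script>", "exec"), namespace)
    except SystemExit as exc:
        # The run routine, not the printing code, decides what happens next.
        return exc.code, namespace
    return None, namespace


code, ns = run_for_inspection("x = 41 + 1\nimport sys\nsys.exit(3)")
```

With the exit status and the namespace both still available, an interactive session in the style of python -i could then be started on the surviving state.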

What do you think?  Is my patch good enough,
or would you like something else?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697613&group_id=5470


From noreply@sourceforge.net  Wed Mar  5 11:43:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 03:43:18 -0800
Subject: [Patches] [ python-Patches-697939 ] optparse unit tests + fixes
Message-ID: <E18qXIs-0007oT-00@sc8-sf-web1.sourceforge.net>

Patches item #697939, was opened at 2003-03-05 12:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697939&group_id=5470

Category: Tests
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Johannes Gijsbers (jlgijsbers)
Assigned to: Nobody/Anonymous (nobody)
Summary: optparse unit tests + fixes

Initial Comment:
Here's a patch that mostly converts the tests from optik 
1.4 to the unittest format and makes it usable in the 
Python library. I've also added some tests, of which five 
fail with current CVS:

test_opt_string_empty
test_opt_string_too_short
test_opt_string_long_invalid
test_opt_string_short_invalid
test_help_long_opts_first

I changed the following to fix the tests:

* format_option_strings_short_first and 
format_option_strings_long_first have been merged into 
one function, format_options, to eliminate the almost 
complete duplication. To make this possible, short_first 
is now an attribute, which conveniently also eases 
changing short_first after instantiation.

* _short_opts and _long_opts are set in the Option 
constructor, instead of in _check_option_strings, to 
prevent an AttributeError which would occur when no 
option strings were passed, making the "at least one 
option string must be supplied" OptionError useless.

* Removed the check that would raise a RuntimeError in 
Option.__str__ when no option strings existed in 
_short_opts or _long_opts. A RuntimeError would be 
raised when an OptionError was raised in 
_set_opt_strings, because, quite logically, no option 
strings were set at that point.
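
The kind of check these tests exercise can be sketched against the optparse module shipped with current Python. The helper try_option is hypothetical, and the exact exception class for the empty case is an assumption that may differ between versions, so the sketch accepts either TypeError or OptionError.

```python
import optparse


def try_option(*opt_strings):
    """Return the name of the exception raised by Option(), or None."""
    try:
        optparse.Option(*opt_strings)
    except (TypeError, optparse.OptionError) as exc:
        return type(exc).__name__
    return None


empty_result = try_option()               # no option strings at all
short_result = try_option("a")            # too short to be an option string
valid_result = try_option("-a", "--all")  # a well-formed short/long pair
```

The point of the fix is that the first two cases fail with a meaningful error about the option strings rather than an unrelated AttributeError or RuntimeError.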

I'm not sure why the check was there, because 
_short_opts and _long_opts are only empty when 
instantiation fails, or when somebody set those *internal* 
attributes to false. And the moment you start mucking 
with internal attributes, you're on your own. :)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697939&group_id=5470


From noreply@sourceforge.net  Wed Mar  5 11:48:50 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 03:48:50 -0800
Subject: [Patches] [ python-Patches-697941 ] optparse OptionGroup docs
Message-ID: <E18qXOE-0003nv-00@sc8-sf-web3.sourceforge.net>

Patches item #697941, was opened at 2003-03-05 12:48
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697941&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Johannes Gijsbers (jlgijsbers)
Assigned to: Nobody/Anonymous (nobody)
Summary: optparse OptionGroup docs

Initial Comment:
A small patch to add a bit about the new OptionGroup, 
added in Optik 1.4 and Python CVS but currently 
undocumented.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697941&group_id=5470


From noreply@sourceforge.net  Wed Mar  5 14:26:50 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 06:26:50 -0800
Subject: [Patches] [ python-Patches-696645 ] VMS patches, cleaning part
Message-ID: <E18qZr8-0001VX-00@sc8-sf-web3.sourceforge.net>

Patches item #696645, was opened at 2003-03-03 16:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696645&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Piéronne Jean-François (pieronne)
Assigned to: Martin v. Löwis (loewis)
Summary: VMS patches, cleaning part

Initial Comment:
This is the cleaning patches.
I will provide other patches in a separate item.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-05 15:26

Message:
Logged In: YES 
user_id=21627

Thanks for the patch. Applied as 

getbuildinfo.c 2.10
main.c 1.73
posixmodule.c 2.291


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696645&group_id=5470


From noreply@sourceforge.net  Wed Mar  5 17:00:33 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 09:00:33 -0800
Subject: [Patches] [ python-Patches-698082 ] Modulefinder and excludes
Message-ID: <E18qcFt-0006tC-00@sc8-sf-web4.sourceforge.net>

Patches item #698082, was opened at 2003-03-05 18:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698082&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Thomas Heller (theller)
Assigned to: Just van Rossum (jvr)
Summary: Modulefinder and excludes

Initial Comment:
Modulefinder doesn't exclude modules in packages 
correctly. Attached patch fixes this.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698082&group_id=5470


From noreply@sourceforge.net  Wed Mar  5 17:01:35 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 09:01:35 -0800
Subject: [Patches] [ python-Patches-698082 ] Modulefinder and excludes
Message-ID: <E18qcGt-0006xJ-00@sc8-sf-web4.sourceforge.net>

Patches item #698082, was opened at 2003-03-05 18:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698082&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Thomas Heller (theller)
Assigned to: Just van Rossum (jvr)
Summary: Modulefinder and excludes

Initial Comment:
Modulefinder doesn't exclude modules in packages 
correctly. Attached patch fixes this.

----------------------------------------------------------------------

>Comment By: Thomas Heller (theller)
Date: 2003-03-05 18:01

Message:
Logged In: YES 
user_id=11105

IMO the patch speaks for itself.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698082&group_id=5470


From noreply@sourceforge.net  Wed Mar  5 17:35:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 09:35:09 -0800
Subject: [Patches] [ python-Patches-698082 ] Modulefinder and excludes
Message-ID: <E18qcnN-00005v-00@sc8-sf-web4.sourceforge.net>

Patches item #698082, was opened at 2003-03-05 18:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698082&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Thomas Heller (theller)
Assigned to: Just van Rossum (jvr)
Summary: Modulefinder and excludes

Initial Comment:
Modulefinder doesn't exclude modules in packages 
correctly. Attached patch fixes this.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-05 18:35

Message:
Logged In: YES 
user_id=92689

Looks good, applied. It's in rev. 1.6 of Lib/modulefinder.py

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-05 18:01

Message:
Logged In: YES 
user_id=11105

IMO the patch speaks for itself.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698082&group_id=5470


From noreply@sourceforge.net  Wed Mar  5 22:16:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 14:16:31 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18qhBf-0004Sv-00@sc8-sf-web4.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 19:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.
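
For readers unfamiliar with the term, a self-iterator is an object whose iterator is the object itself. A minimal Python 3 illustration follows, using io.StringIO as a stand-in because cStringIO does not exist in Python 3:

```python
import io

buf = io.StringIO("first\nsecond\n")

same_object = iter(buf) is buf  # a self-iterator returns itself
first = next(buf)               # iterating consumes the stream itself
rest = list(buf)                # continues from where next() stopped
```

Because there is no separate iterator object, the read position is shared between next() calls, loops, and ordinary reads.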

It also does a tiny bit of cleanup to cStringIO.


----------------------------------------------------------------------

>Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 22:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470


From noreply@sourceforge.net  Thu Mar  6 05:36:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 21:36:31 -0800
Subject: [Patches] [ python-Patches-698505 ] docs tor hotshot module
Message-ID: <E18qo3T-0005IK-00@sc8-sf-web2.sourceforge.net>

Patches item #698505, was opened at 2003-03-06 16:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Anthony Baxter (anthonybaxter)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: docs tor hotshot module

Initial Comment:
The attached provides documentation for the hotshot
module. Assigning to Fred for review.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470


From noreply@sourceforge.net  Thu Mar  6 05:39:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 21:39:54 -0800
Subject: [Patches] [ python-Patches-698505 ] docs tor hotshot module
Message-ID: <E18qo6k-0005LX-00@sc8-sf-web2.sourceforge.net>

Patches item #698505, was opened at 2003-03-06 16:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Anthony Baxter (anthonybaxter)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: docs tor hotshot module

Initial Comment:
The attached provides documentation for the hotshot
module. Assigning to Fred for review.


----------------------------------------------------------------------

>Comment By: Anthony Baxter (anthonybaxter)
Date: 2003-03-06 16:39

Message:
Logged In: YES 
user_id=29957

stupid sourceforge tracker. 


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470


From noreply@sourceforge.net  Thu Mar  6 05:40:28 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 21:40:28 -0800
Subject: [Patches] [ python-Patches-698505 ] docs for hotshot module
Message-ID: <E18qo7I-0005MR-00@sc8-sf-web2.sourceforge.net>

Patches item #698505, was opened at 2003-03-06 16:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Anthony Baxter (anthonybaxter)
Assigned to: Fred L. Drake, Jr. (fdrake)
>Summary: docs for hotshot module

Initial Comment:
The attached provides documentation for the hotshot
module. Assigning to Fred for review.


----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2003-03-06 16:39

Message:
Logged In: YES 
user_id=29957

stupid sourceforge tracker. 


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470


From noreply@sourceforge.net  Thu Mar  6 06:37:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 05 Mar 2003 22:37:48 -0800
Subject: [Patches] [ python-Patches-698520 ] Iterator for urllib.URLOpener
Message-ID: <E18qp0m-0000yL-00@sc8-sf-web3.sourceforge.net>

Patches item #698520, was opened at 2003-03-05 22:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Iterator for urllib.URLOpener

Initial Comment:
4 line patch to give urllib.URLOpener an iterator.  Follows design of module and adds methods only if the file object used internally has __iter__ and adds 'next' only if __iter__ was added.
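
The conditional approach described above can be sketched roughly as follows. The class and function names are hypothetical and this is not the patch itself: Python 3 looks up special methods on the type, so the sketch picks a class based on what the wrapped file object supports instead of attaching methods to an instance.

```python
import io


class PlainWrapper:
    """Hypothetical wrapper that delegates reads to the wrapped file object."""

    def __init__(self, fp):
        self.fp = fp
        self.read = fp.read


class IterableWrapper(PlainWrapper):
    """Same wrapper, with iteration delegated to the wrapped object."""

    def __iter__(self):
        return iter(self.fp)


def wrap(fp):
    # Offer iteration only when the underlying object supports it.
    if hasattr(fp, "__iter__"):
        return IterableWrapper(fp)
    return PlainWrapper(fp)


lines = list(wrap(io.StringIO("a\nb\n")))
```

A wrapped object without __iter__ simply yields a wrapper that cannot be iterated, mirroring the "only if the file object has it" condition.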

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470


From webmaster@pferdemarkt.ws  Thu Mar  6 14:11:46 2003
From: webmaster@pferdemarkt.ws (webmaster@pferdemarkt.ws)
Date: Thu, 6 Mar 2003 06:11:46 -0800
Subject: [Patches] Pferdemarkt.ws informiert! Newsletter 03/2003 http://www.pferdemarkt.ws
Message-ID: <200303061411.GAA25766@eagle.he.net>

http://www.pferdemarkt.ws

We have made a successful start to the new "horse year" 2003.

We would like to thank you for the rapid success of our marketplace.

Today, 06.03.2003, we have been online for a good 2 months!

Our database grows by 30 new listings every day.

As a private individual, you too can put the horses you have for sale on the Internet directly and

completely free of charge.

To make your listings more visible, you can add up to one picture to your

horse ad free of charge!

If you want to appear directly on the first page, we can recommend our bonus system.

Click here: 

http://www.pferdemarkt.ws/bestellung.html 

Your http://Pferdemarkt.ws team


Click here to log in directly: http://www.Pferdemarkt.ws

Offer for free, search for free! Directly from private seller to private buyer!

If you still have questions, mailto: webmaster@pferdemarkt.ws


From noreply@sourceforge.net  Thu Mar  6 16:57:30 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 06 Mar 2003 08:57:30 -0800
Subject: [Patches] [ python-Patches-698833 ] ZipFile - support for file decryption
Message-ID: <E18qygU-0007aH-00@sc8-sf-web2.sourceforge.net>

Patches item #698833, was opened at 2003-03-06 17:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698833&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Giovanni Bajo (giovannibajo)
Assigned to: Nobody/Anonymous (nobody)
Summary: ZipFile - support for file decryption

Initial Comment:
The attached patch adds support for the ZIP file 
decryption. Right now, only decryption is supported (not 
encryption), but I will work on this as well if there are no 
problems with this patch.

The ZIP encryption scheme uses 96-bit keys, so there 
might be some US law annoyances (see
http://www.info-zip.org/pub/infozip/FAQ.html#crypto). To me,
everything seems legit, but I am not a lawyer.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698833&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 00:08:50 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 06 Mar 2003 16:08:50 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18r5Pu-0004Kx-00@sc8-sf-web2.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 11:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.
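
A minimal sketch of the behaviour described above, written against the
dict.pop(key[, default]) form this patch introduces. The dictionary and
its keys are made up for illustration.

```python
# Behaviour of dict.pop with and without the optional default.
stock = {"apples": 3, "pears": 0}

# Key present: the value is returned and the key removed; the default is ignored.
assert stock.pop("apples", None) == 3
assert "apples" not in stock

# Key absent, default given: the default is returned and nothing is raised.
assert stock.pop("kiwis", 0) == 0

# Key absent, no default: the old behaviour, a KeyError.
try:
    stock.pop("kiwis")
except KeyError:
    missing_raises = True
else:
    missing_raises = False
```

This mirrors dict.get, except that a found key is also removed.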

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-06 19:08

Message:
Logged In: YES 
user_id=80475

Misc/NEWS 1.69
Objects/dictobject.c 2.141
Doc/lib/libstdtypes.tex 1.120
Lib/UserDict.py 1.24
Lib/test/test_types.py 1.47
Lib/test/test_userdict.py 1.13


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-04 13:55

Message:
Logged In: YES 
user_id=670441

argh... I put the NEWS item in the wrong place.
Ignore patchpop2 (I can't delete it); look at patchpop3.


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-04 12:19

Message:
Logged In: YES 
user_id=670441

Okay, here's patchpop2 with the diff'ed dictobject,
UserDict, test_types, test_userdict, NEWS, and
Doc/lib/libstdtypes.  whew.

Let me know if you need any changes.
The change to DictMixin seems a bit
clumsy, but I liked it better than other things
I came up with.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 10:19

Message:
Logged In: YES 
user_id=6380

You don't need to update whatsnew23.tex; its editor prefers
to do this himself.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-03 23:26

Message:
Logged In: YES 
user_id=80475

For NEWS, add a new entry (so that it documents a 
difference from Py2.3a2).

For whatsnew23, modify the existing entry (since it is a 
delta from Py2.3).

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-03 14:59

Message:
Logged In: YES 
user_id=670441

Should I make a new NEWS item, or should
I modify the existing NEWS item about dict.pop?

And should I make a new whatsnew23 item or
modify the existing one?

I'm guessing a new NEWS item and a modified
whatsnew item, but I'll post a patch when you tell me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-01 21:40

Message:
Logged In: YES 
user_id=31435

dicts have a .pop() method?  Heh.  I must have slept 
through that one <wink>.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-28 21:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me, I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-02-28 20:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get()-like functionality for 
dict.pop().  The nearest parallel is the default argument for 
getattr(obj, attr, [default]).  On the plus side, it makes pop 
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 04:31:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 06 Mar 2003 20:31:31 -0800
Subject: [Patches] [ python-Patches-693753 ] fix for bug 639806: default for dict.pop
Message-ID: <E18r9W7-00017q-00@sc8-sf-web1.sourceforge.net>

Patches item #693753, was opened at 2003-02-26 16:51
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for bug 639806: default for dict.pop

Initial Comment:
This patch adds an optional default value to dict.pop,
so that it parallels dict.get, see discussion in bug
639806.

If no default is given, the old behavior still exists,
so backwards compatibility is no problem.
The new pop must use METH_VARARGS
and PyArg_UnpackTuple, somewhat affecting
efficiency.

If this is considered desirable, I could also
provide the same behavior for list.pop.

----------------------------------------------------------------------

>Comment By: Michael Stone (mbrierst)
Date: 2003-03-07 04:31

Message:
Logged In: YES 
user_id=670441

Thanks for fixing up my UserDict.DictMixin patch.
Much nicer.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 00:08

Message:
Logged In: YES 
user_id=80475

Misc/NEWS 1.69
Objects/dictobject.c 2.141
Doc/lib/libstdtypes.tex 1.120
Lib/UserDict.py 1.24
Lib/test/test_types.py 1.47
Lib/test/test_userdict.py 1.13


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-04 18:55

Message:
Logged In: YES 
user_id=670441

argh... I put the NEWS item in the wrong place.
Ignore patchpop2 (I can't delete it); look at patchpop3.


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-04 17:19

Message:
Logged In: YES 
user_id=670441

Okay, here's patchpop2 with the diff'ed dictobject,
UserDict, test_types, test_userdict, NEWS, and
Doc/lib/libstdtypes.  whew.

Let me know if you need any changes.
The change to DictMixin seems a bit
clumsy, but I liked it better than other things
I came up with.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-04 15:19

Message:
Logged In: YES 
user_id=6380

You don't need to update whatsnew23.tex; its editor prefers
to do this himself.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-04 04:26

Message:
Logged In: YES 
user_id=80475

For NEWS, add a new entry (so that it documents a 
difference from Py2.3a2).

For whatsnew23, modify the existing entry (since it is a 
delta from Py2.3).

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-03 19:59

Message:
Logged In: YES 
user_id=670441

Should I make a new NEWS item, or should
I modify the existing NEWS item about dict.pop?

And should I make a new whatsnew23 item or
modify the existing one?

I'm guessing a new NEWS item and a modified
whatsnew item, but I'll post a patch when you tell me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-02 02:40

Message:
Logged In: YES 
user_id=31435

dicts have a .pop() method?  Heh.  I must have slept 
through that one <wink>.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-01 02:59

Message:
Logged In: YES 
user_id=6380

Alex Martelli's argument convinced me, I'm +0.5 on the
feature. The 0.5 is because it's definitely feature bloat.
Given how few use cases there are for dict.pop() in the
first place, I'm not worried about the minor slowdown due to
extra argument parsing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-01 01:30

Message:
Logged In: YES 
user_id=80475

The patch looks fine.  Assigning to Guido for 
pronouncement.

Guido, the patch adds optional get()-like functionality for 
dict.pop().  The nearest parallel is the default argument for 
getattr(obj, attr, [default]).  On the plus side, it makes pop 
easier to use and more flexible.  On the minus side, it adds 
more complexity to the mapping interface and it slows 
down the normal case for d.pop(k).

If it is accepted the poster should add test cases, a NEWS 
item, doc updates, and parallel changes to 
UserDict.UserDict and UserDict.DictMixin.  Then, re-assign 
to me and I'll check it all and apply it.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=693753&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 05:52:59 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 06 Mar 2003 21:52:59 -0800
Subject: [Patches] [ python-Patches-667730 ] More DictMixin
Message-ID: <E18rAmx-0000YO-00@sc8-sf-web4.sourceforge.net>

Patches item #667730, was opened at 2003-01-14 08:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667730&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Raymond Hettinger (rhettinger)
Summary: More DictMixin

Initial Comment:
This patch is intended to provide a more consistent
implementation for the various dictionary-like objects
of the standard library.

test_userdict has been rewritten: it now uses unittest
and defines a test case which allows checking for
conformity with the dictionary protocol.

test_shelve and test_weakref have been rewritten to use
the test_userdict test case.

test_os has been extended: a new test case checks the
environ object's conformity to the dictionary protocol.

The patch modifies the UserDict module:
* The docs say that __contains__ should be one of the
methods to redefine for better efficiency, but the
implementation makes __contains__ depend on the has_key
definition. The patch reverses the method dependencies.
* Change iterkeys = __iter__ to def iterkeys(self):
return self.__iter__() so that iterkeys picks up
overridden __iter__ methods.
* I have also added __init__, copy and __repr__
methods to DictMixin.
* The UserDict.UserDict class is now a subclass of
DictMixin, which simplifies the UserDict
implementation. The patch is rather conservative, since
many method definitions could still be removed from
UserDict.

In the weakref module, the patch makes
WeakValueDictionary and WeakKeyDictionary subclasses
of UserDict.DictMixin. It also uses nested scopes and the
new generator syntax for iterator methods, and rewrites
WeakKeyDictionary.__delitem__. All of this reduces the
module size by 50%.

In the shelve module, the patch adds a copy() method
which returns a dictionary with the keys and values of
the database.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 00:52

Message:
Logged In: YES 
user_id=80475

The patch looks good.
Please make two adjustments and re-submit.

1) Change the test_func docstrings to comment blocks.  If 
a docstring is present, test support will print it in the 
summary instead of the test name.

2) Change the logic for mapping.pop() to accommodate 
the new default argument option which was added 
yesterday.  The format is m.pop(key[, default]).
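
One way the m.pop(key[, default]) logic could look in a mixin is sketched
below. This is not the checked-in UserDict code; PopMixin, Store and the
_MISSING sentinel are hypothetical names used only for illustration.

```python
_MISSING = object()  # hypothetical sentinel meaning "no default was passed"


class PopMixin:
    """Hypothetical mixin that derives pop() from __getitem__/__delitem__."""

    def pop(self, key, default=_MISSING):
        try:
            value = self[key]
        except KeyError:
            if default is _MISSING:
                raise  # no default: keep the original KeyError behaviour
            return default
        del self[key]
        return value


class Store(PopMixin):
    """Toy mapping backed by a dict, used only to exercise the mixin."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]


store = Store()
store["a"] = 1
popped = store.pop("a")            # key present: value returned, key removed
fallback = store.pop("a", "gone")  # key absent: default returned
```

A private sentinel is used rather than None so that None itself remains a
valid default value.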

----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2003-03-03 10:27

Message:
Logged In: YES 
user_id=498191

I have uploaded a new version of the patch, updated to
Python2.3a2.

I hope to have removed everything that could break
backward compatibility: the newly proposed patch now contains
only the testing stuff (well, almost, since I have also
added a pop method to the weak dictionary classes to make
them compatible with the test case).

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-15 21:50

Message:
Logged In: YES 
user_id=80475

Also, +1 on consolidating the test cases though it should 
be done after any other changes to the files so we can 
make sure that nothing got broken.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-15 21:35

Message:
Logged In: YES 
user_id=80475

* UserDict.UserDict should not change. As Martin pointed 
out, inheriting from object changes the semantics in a 
non-backward-compatible way.  Also, the class is efficiently 
implemented in terms of an internal dictionary and would be 
slowed down by the nest of calls in the Mixin.  Also, I think the 
code is incorrect in defining __iter__; there was a reason it 
was pulled out into a separate subclass -- that was done in 
Py2.2 and is not an easily reversible decision.

*  -0 on the changes to has_key() and __contains__(). 
has_key() was put at a lower level than __contains__ 
because the older dict-style interfaces all define has_key.

* +1 for changing iterkeys() to a full definition (and +1 for 
doing the same for __iter__()).  Sebastien is correct in 
pointing out the advantages for propagating an overridden 
method.

* -1 for altering repr() implementation.  The current 
approach is shorter, cleaner, and faster.

* -1 for adding __nonzero__().  Even dictionaries don't 
implement this method; they let len() do the talking.

* -1 for adding __init__() and copy().  Both need to make 
assumptions about the order and number of parameters 
in the constructor of the class using the mixin.  I think they 
are rarely helpful and are sometimes harmful in introducing 
surprising, hard-to-find errors.  People who need an init() 
or copy() can code them more cleanly and directly in the 
extending class.  Also, I don't think the code is correct: 
since DictMixin will be a base class, the use of super() is 
not what is wanted here -- *if* you were going to do this, 
try something like self.__class__().  Further, adding these 
methods violates my original intent for this class which 
was to extrapolate four basic mapping methods into a full 
mapping interface.  It was not intended as a stand-alone 
class.  Also, copy() cannot guarantee that it is copying all 
the relevant data for the sub-class and that violates the 
definition of what copy() is supposed to do.  If something 
like this were attempted, it should be its own mixin 
(automatically adding copy support to any class) and it 
should be rather sophisticated about how to perfectly 
replicate itself (not easily done if the underlying data is in a 
file, database, or in a distributed app).

* +0 on changing weakdicts provided it is done minimally 
and carefully with attention to leaving semantics 
unchanged and not slowing performance.  The advantage 
goes beyond consistency, it removes code duplication, 
keeps well thought-out logic in one place, and provides an 
automatic interface update from DictMixin if the dictionary 
interface ever sprouts another method.
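
The iterkeys() point in the list above can be illustrated with a small
sketch. The four classes are hypothetical stand-ins, not the DictMixin
code: an alias is bound once when the base class is created, while a full
definition looks up __iter__ on the instance at call time.

```python
class AliasBase:
    def __iter__(self):
        return iter(["base"])

    iterkeys = __iter__  # alias: bound to the base function at class creation


class DefBase:
    def __iter__(self):
        return iter(["base"])

    def iterkeys(self):  # full definition: resolves __iter__ at call time
        return self.__iter__()


class AliasChild(AliasBase):
    def __iter__(self):
        return iter(["child"])


class DefChild(DefBase):
    def __iter__(self):
        return iter(["child"])


alias_result = list(AliasChild().iterkeys())  # still uses the base __iter__
def_result = list(DefChild().iterkeys())      # picks up the override
```

With the alias, a subclass that overrides __iter__ silently gets stale
behaviour from iterkeys(); with the full definition the override propagates.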

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-14 16:43

Message:
Logged In: YES 
user_id=21627

This patch breaks backwards compatibility. UserDict is an
oldstyle class on purpose, since changing it to a newstyle
class will certainly break the compatibility in subtle ways
(e.g. by changing what type(userdictinstance) is).

Unless you can bring forward a better rationale than
consistency, this patch will be rejected.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667730&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 06:22:19 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 06 Mar 2003 22:22:19 -0800
Subject: [Patches] [ python-Patches-698520 ] Iterator for urllib.URLOpener
Message-ID: <E18rBFL-0004UM-00@sc8-sf-web1.sourceforge.net>

Patches item #698520, was opened at 2003-03-06 01:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Iterator for urllib.URLOpener

Initial Comment:
4 line patch to give urllib.URLOpener an iterator.  Follows design of module and adds methods only if the file object used internally has __iter__ and adds 'next' only if __iter__ was added.
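
A rough sketch of the "add the methods only if the wrapped file object has
them" design, written for current Python rather than as the actual patch.
FileWrapper and ReadOnly are hypothetical names. In current Python a for
loop looks up __iter__ on the type rather than the instance, so the sketch
calls the instance attribute explicitly.

```python
import io


class FileWrapper:
    """Hypothetical stand-in for a wrapper around a file-like object."""

    def __init__(self, fp):
        self.fp = fp
        self.read = fp.read
        # Expose iteration only when the wrapped object supports it.
        if hasattr(fp, "__iter__"):
            self.__iter__ = fp.__iter__
            # 'next' is added only alongside __iter__
            # (current Python spells the underlying method __next__).
            if hasattr(fp, "__next__"):
                self.next = fp.__next__


class ReadOnly:
    """File-like object that offers read() but no iteration."""

    def read(self):
        return ""


wrapped = FileWrapper(io.StringIO("a\nb\n"))
plain = FileWrapper(ReadOnly())
lines = list(wrapped.__iter__())
```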

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 01:22

Message:
Logged In: YES 
user_id=80475

Looks good.
Tests out okay.
Use double quotes throughout.
Consider adding a news item, docs, and a test.
Assign back to me when you think it's ready to go. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 06:43:50 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 06 Mar 2003 22:43:50 -0800
Subject: [Patches] [ python-Patches-698505 ] docs for hotshot module
Message-ID: <E18rBaA-0001h7-00@sc8-sf-web4.sourceforge.net>

Patches item #698505, was opened at 2003-03-06 00:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Anthony Baxter (anthonybaxter)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: docs for hotshot module

Initial Comment:
The attached provides documentation for the hotshot
module. Assigning to Fred for review.


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 01:43

Message:
Logged In: YES 
user_id=80475

The TeX markup checks out fine.

Consider documenting lineevents and linetimings which 
are exposed upon:  import hotshot.

For the example, consider adding a comment line at
the beginning with hints that the example produces large 
files and takes a long time to run.


----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2003-03-06 00:39

Message:
Logged In: YES 
user_id=29957

stupid sourceforge tracker. 


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 14:25:06 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 06:25:06 -0800
Subject: [Patches] [ python-Patches-675422 ] Add tzset method to time module
Message-ID: <E18rImY-0000lz-00@sc8-sf-web2.sourceforge.net>

Patches item #675422, was opened at 2003-01-27 08:42
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Add tzset method to time module

Initial Comment:
Adds access to the tzset method, allowing you to change your local timezone as required. In addition to invoking the tzset system
call, the code also updates the timezone attributes (time.timezone etc). This lets you do timezone conversions amongst other things.

Also includes changes to configure.in to only build new code if the tzset method correctly switches timezones on your platform. This 
should be for all modern Unixes, and possibly other platforms.

Also includes tests in test_time.py

Docs would be along the lines of:

tzset() -- 
Initialize, or reinitialize, the local timezone to the value stored in os.environ['TZ']. The TZ environment variable should be specified in
standard Unix timezone format as documented in the tzset man page
(eg. 'US/Eastern', 'Europe/Amsterdam'). Unknown timezones will silently fall back to UTC. If the TZ environment variable is not set, the local timezone is set to the system's best guess of wallclock time.
Changing the TZ environment variable without calling tzset *may* change the local timezone used by methods such as localtime, but this behaviour should not be relied on.

eg::

>>> now = time.time()
>>> os.environ['TZ'] = 'Europe/Amsterdam'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 14:35:17 2003'
>>> time.tzname  
('CET', 'CEST')
>>> os.environ['TZ'] = 'US/Eastern'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 08:35:17 2003'
>>> time.tzname
('EST', 'EDT')
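
The timezone-conversion use mentioned above could be wrapped in a helper
that restores TZ afterwards. This is only a sketch: hour_in_zone is a
hypothetical name, it assumes a Unix platform where time.tzset() is
available, and it uses POSIX-style TZ strings ('UTC0', 'EST5') so that it
does not depend on an installed zoneinfo database.

```python
import os
import time


def hour_in_zone(timestamp, tz):
    """Return the local hour of `timestamp` in zone `tz`, restoring TZ afterwards."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()  # re-read TZ (Unix only)
    try:
        return time.localtime(timestamp).tm_hour
    finally:
        # Put the environment back and re-initialize the local timezone.
        if saved is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = saved
        time.tzset()


utc_hour = hour_in_zone(0, "UTC0")  # the epoch, seen from UTC
est_hour = hour_in_zone(0, "EST5")  # the same instant, five hours west
```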

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 09:25

Message:
Logged In: YES 
user_id=6380

zenzen: when I run the test suite on my Red Hat Linux 7.3
box, I get one failure: the test line
  self.failUnless(time.tzname[0] in ('UTC','GMT'))
fails when the timezone is set to 'Luna/Tycho', because
tzname is in fact set to  ('Luna/Tych', 'Luna/Tych').

If I comment out that one line the tzset test suite passes.

What should I do?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 16:49

Message:
Logged In: YES 
user_id=6380

Sorry, not a chance.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-21 16:45

Message:
Logged In: YES 
user_id=46639

It is a patch to 2.3, but I thought I'd try to sneak this
new feature past people into 2.2.3 as I want to be able to
use it in Zope 2 :-)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 07:56

Message:
Logged In: YES 
user_id=6380

Uh? This is a new feature, so doesn't apply to 2.2.3.

Maybe you meant 2.3?

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-20 23:29

Message:
Logged In: YES 
user_id=46639

Assigning to Guido for consideration of being added to
2.2.3, and since he thought this patch was a good idea in
the first place :-)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 15:04:50 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 07:04:50 -0800
Subject: [Patches] [ python-Patches-696193 ] Enable __slots__ for meta-types
Message-ID: <E18rJP0-0004cq-00@sc8-sf-web4.sourceforge.net>

Patches item #696193, was opened at 2003-03-02 16:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christian Tismer (tismer)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Enable __slots__ for meta-types

Initial Comment:
The new type system allows non-empty __slots__ only
for fixed-size objects.

Meta-types are types whose instances are also types.
Types are variable-sized, because they take the slot
definitions for their instances, so they cannot have
extra members from their meta-type.

The proposed solution allows for two things:
a) meta-types can have slots
b) extensions get access to the whole type object and
    can create extended types with private fields.

The changes providing this are quite simple:
- replace the internal hidden "etype" and turn it into
  an explicit PyHeapTypeObject in object.h
- instead of a fixed offset into the former etype, the
  slots calculation is based upon tp_basicsize.

To keep things easy, I added a macro which does this
calculation, and member access now reads like so:

before:
	type->tp_members = et->members;
after:
	type->tp_members = PyHeapType_GET_MEMBERS(et);

This patch has been tested thoroughly in my own code since
Python 2.2, and I think it is ripe to get into the
distribution.
It has almost no impact on speed or simplicity.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 10:04

Message:
Logged In: YES 
user_id=6380

Everything looks fine, except subtracting 1 from the
expression in the PyHeapType_GET_MEMBERS() macro. That
makes the first members slot overlap with the 'name' and
'slots' struct members. I'll get rid of the "-1" part.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-03 09:57

Message:
Logged In: YES 
user_id=6380

I'll look at this on Friday.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 15:24:23 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 07:24:23 -0800
Subject: [Patches] [ python-Patches-696193 ] Enable __slots__ for meta-types
Message-ID: <E18rJhv-0002cU-00@sc8-sf-web3.sourceforge.net>

Patches item #696193, was opened at 2003-03-02 16:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Christian Tismer (tismer)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Enable __slots__ for meta-types

Initial Comment:
The new type system allows non-empty __slots__ only
for fixed-size objects.

Meta-types are types whose instances are also types.
Types are variable-sized, because they take the slot
definitions for their instances, so they cannot have
extra members from their meta-type.

The proposed solution allows for two things:
a) meta-types can have slots
b) extensions get access to the whole type object and
    can create extended types with private fields.

The changes providing this are quite simple:
- replace the internal hidden "etype" and turn it into
  an explicit PyHeapTypeObject in object.h
- instead of a fixed offset into the former etype, the
  slots calculation is based upon tp_basicsize.

To keep things easy, I added a macro which does this
calculation, and member access now reads like so:

before:
	type->tp_members = et->members;
after:
	type->tp_members = PyHeapType_GET_MEMBERS(et);

This patch has been tested thoroughly in my own code since
Python 2.2, and I think it is ripe to get into the
distribution.
It has almost no impact on speed or simplicity.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 10:24

Message:
Logged In: YES 
user_id=6380

Checked in, with that one fix.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 10:04

Message:
Logged In: YES 
user_id=6380

Everything looks fine, except subtracting 1 from the
expression in the PyHeapType_GET_MEMBERS() macro. That
makes the first members slot overlap with the 'name' and
'slots' struct members. I'll get rid of the "-1" part.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-03 09:57

Message:
Logged In: YES 
user_id=6380

I'll look at this on Friday.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 15:51:35 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 07:51:35 -0800
Subject: [Patches] [ python-Patches-696193 ] Enable __slots__ for meta-types
Message-ID: <E18rK8F-0001ky-00@sc8-sf-web1.sourceforge.net>

Patches item #696193, was opened at 2003-03-02 22:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Christian Tismer (tismer)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Enable __slots__ for meta-types

Initial Comment:
The new type system allows non-empty __slots__ only
for fixed-size objects.

Meta-types are types whose instances are also types.
Types are variable-sized, because they take the slot
definitions for their instances, so they cannot have
extra members from their meta-type.

The proposed solution allows for two things:
a) meta-types can have slots
b) extensions get access to the whole type object and
    can create extended types with private fields.

The changes providing this are quite simple:
- replace the internal hidden "etype" and turn it into
  an explicit PyHeapTypeObject in object.h
- instead of a fixed offset into the former etype, the
  slots calculation is based upon tp_basicsize.

To keep things easy, I added a macro which does this
calculation, and member access now reads like so:

before:
	type->tp_members = et->members;
after:
	type->tp_members = PyHeapType_GET_MEMBERS(et);

This patch has been tested thoroughly in my own code since
Python 2.2, and I think it is ripe to get into the
distribution.
It has almost no impact on speed or simplicity.


----------------------------------------------------------------------

>Comment By: Christian Tismer (tismer)
Date: 2003-03-07 16:51

Message:
Logged In: YES 
user_id=105700

Oops! You are right.
I forgot to back-port that change into the future. My
2.2.2 version already reads like this:

/* access macro to the members which are floating "behind"
   the object */
#define PyHeapType_GET_MEMBERS(etype) \
	((PyMemberDef *)(((char *)etype) + \
	 (etype)->type.ob_type->tp_basicsize))

Thanks for taking care -- chris

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 16:24

Message:
Logged In: YES 
user_id=6380

Checked in, with that one fix.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 16:04

Message:
Logged In: YES 
user_id=6380

Everything looks fine, except subtracting 1 from the
expression in the PyHeapType_GET_MEMBERS() macro. That
makes the first members slot overlap with the 'name' and
'slots' struct members. I'll get rid of the "-1" part.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-03 15:57

Message:
Logged In: YES 
user_id=6380

I'll look at this on Friday.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 17:37:27 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 09:37:27 -0800
Subject: [Patches] [ python-Patches-696193 ] Enable __slots__ for meta-types
Message-ID: <E18rLmh-0001Pr-00@sc8-sf-web3.sourceforge.net>

Patches item #696193, was opened at 2003-03-02 16:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Christian Tismer (tismer)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Enable __slots__ for meta-types

Initial Comment:
The new type system allows non-empty __slots__ only
for fixed-size objects.

Meta-types are types whose instances are also types.
Types are variable-sized, because they take the slot
definitions for their instances, so they cannot have
extra members from their meta-type.

The proposed solution allows for two things:
a) meta-types can have slots
b) extensions get access to the whole type object and
    can create extended types with private fields.

The changes providing this are quite simple:
- replace the internal hidden "etype" and turn it into
  an explicit PyHeapTypeObject in object.h
- instead of a fixed offset into the former etype, the
  slots calculation is based upon tp_basicsize.

To keep things easy, I added a macro which does this
calculation, and member access now reads like so:

before:
	type->tp_members = et->members;
after:
	type->tp_members = PyHeapType_GET_MEMBERS(et);

This patch has been tested thoroughly in my own code since
Python 2.2, and I think it is ripe to get into the
distribution.
It has almost no impact on speed or simplicity.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 12:37

Message:
Logged In: YES 
user_id=6380

You're welcome. That's what I'm here for. :-)

----------------------------------------------------------------------

Comment By: Christian Tismer (tismer)
Date: 2003-03-07 10:51

Message:
Logged In: YES 
user_id=105700

Oops! You are right.
I forgot to back-port that change into the future. My
2.2.2 version already reads like this:

/* access macro to the members which are floating "behind"
the object */
#define PyHeapType_GET_MEMBERS(etype) \
	((PyMemberDef *)(((char *)etype) + \
	 (etype)->type.ob_type->tp_basicsize))

Thanks for taking care -- chris

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 10:24

Message:
Logged In: YES 
user_id=6380

Checked in, with that one fix.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 10:04

Message:
Logged In: YES 
user_id=6380

Everything looks fine, except subtracting 1 from the
expression in the PyHeapType_GET_MEMBERS() macro. That
makes the first members slot overlap with the 'name' and
'slots' struct members. I'll get rid of the "-1" part.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-03 09:57

Message:
Logged In: YES 
user_id=6380

I'll look at this on Friday.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696193&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 21:18:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 13:18:09 -0800
Subject: [Patches] [ python-Patches-698520 ] Iterator for urllib.URLOpener
Message-ID: <E18rPEH-00053l-00@sc8-sf-web4.sourceforge.net>

Patches item #698520, was opened at 2003-03-05 22:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Iterator for urllib.URLOpener

Initial Comment:
4 line patch to give urllib.URLOpener an iterator.  Follows design of module and adds methods only if the file object used internally has __iter__ and adds 'next' only if __iter__ was added.
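
A minimal sketch of the "add iteration only if the wrapped file supports it" idea described above, written for present-day Python rather than the 2.3-era module. The names PlainWrapper, IterableWrapper, wrap and ReadOnly are hypothetical and are not taken from urllib, and __next__ stands in for the 'next' method mentioned in the comment. Present-day Python looks up special methods on the type, so this sketch selects a class instead of attaching methods to an instance.

```python
import io


class PlainWrapper:
    """Hypothetical wrapper around a file-like object (not urllib's class)."""

    def __init__(self, fp):
        self.fp = fp
        self.read = fp.read


class IterableWrapper(PlainWrapper):
    """Same wrapper, plus the iterator protocol delegated to the file."""

    def __iter__(self):
        return self

    def __next__(self):
        # Delegate to the wrapped file; StopIteration propagates at EOF.
        return next(self.fp)


def wrap(fp):
    # Offer iteration only when the wrapped object itself supports it.
    if hasattr(fp, "__iter__"):
        return IterableWrapper(fp)
    return PlainWrapper(fp)


class ReadOnly:
    """File-like object that has read() but no __iter__."""

    def read(self):
        return b""


lines = list(wrap(io.BytesIO(b"a\nb\n")))
plain = wrap(ReadOnly())
```

Here `lines` holds the two lines of the in-memory file, while `plain` exposes read() only and cannot be used in a for loop.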

----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2003-03-07 13:18

Message:
Logged In: YES 
user_id=357491

The quotes thing was just a slip-up.  It's fixed in my local copy and thus it will show up when I upload another patch.

I will write up patches to the docs, although the docs guarantee certain methods that are actually conditionally added to the object; should I go ahead and just change the docs to reflect this or rip out the conditionality of the adding of the methods since the file object, if using a socket, is coming from socket.makefile() (I think; urllib seems to be from the 1.5 days and thus is using httplib.HTTP() and thus had to read the code)?

I will also come up with a news item to be pasted into Misc/NEWS by the person who checks this in.

As for the test, though, test_urllib only tests quote().  The module itself has some tests that can be run when the module is __main__, but all it does is fetch various pages and print the output; nothing really there that wouldn't be caught from people using it day-to-day.  In other words there is no good place to put a test since there basically are no tests for this part of the module.  =)  Yes, I could fix this, but that would be a completely separate patch since the quote() tests are not even a PyUnit testing suite.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-06 22:22

Message:
Logged In: YES 
user_id=80475

Looks good.
Tests out okay.
Use double quotes throughout.
Consider adding a news item, docs, and a test.
Assign back to me when you think it's ready to go. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 23:17:56 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 15:17:56 -0800
Subject: [Patches] [ python-Patches-698520 ] Iterator for urllib.URLOpener
Message-ID: <E18rR6C-0007Hd-00@sc8-sf-web3.sourceforge.net>

Patches item #698520, was opened at 2003-03-06 01:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Iterator for urllib.URLOpener

Initial Comment:
4 line patch to give urllib.URLOpener an iterator.  Follows design of module and adds methods only if the file object used internally has __iter__ and adds 'next' only if __iter__ was added.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 18:17

Message:
Logged In: YES 
user_id=80475

That's fine.  Go ahead and load the patch without the 
tests.  Keep it on your todo list.  It would be nice to have 
some good PyUnit tests for this module.

Assign it to me when it's ready and I'll load it.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-07 16:18

Message:
Logged In: YES 
user_id=357491

The quotes thing was just a slip-up.  It's fixed in my local copy and thus it will show up when I upload another patch.

I will write up patches to the docs, although the docs guarantee certain methods that are actually conditionally added to the object; should I go ahead and just change the docs to reflect this or rip out the conditionality of the adding of the methods since the file object, if using a socket, is coming from socket.makefile() (I think; urllib seems to be from the 1.5 days and thus is using httplib.HTTP() and thus had to read the code)?

I will also come up with a news item to be pasted into Misc/NEWS by the person who checks this in.

As for the test, though, test_urllib only tests quote().  The module itself has some tests that can be run when the module is __main__, but all it does is fetch various pages and print the output; nothing really there that wouldn't be caught from people using it day-to-day.  In other words there is no good place to put a test since there basically are no tests for this part of the module.  =)  Yes, I could fix this, but that would be a completely separate patch since the quote() tests are not even a PyUnit testing suite.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 01:22

Message:
Logged In: YES 
user_id=80475

Looks good.
Tests out okay.
Use double quotes throughout.
Consider adding a news item, docs, and a test.
Assign back to me when you think it's ready to go. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 23:58:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 15:58:21 -0800
Subject: [Patches] [ python-Patches-698520 ] Iterator for urllib.URLOpener
Message-ID: <E18rRjJ-0000Pv-00@sc8-sf-web3.sourceforge.net>

Patches item #698520, was opened at 2003-03-05 22:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Iterator for urllib.URLOpener

Initial Comment:
4 line patch to give urllib.URLOpener an iterator.  Follows design of module and adds methods only if the file object used internally has __iter__ and adds 'next' only if __iter__ was added.

----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2003-03-07 15:58

Message:
Logged In: YES 
user_id=357491

OK, the new patch has the quote fix.  I also added a single line to the urllib doc saying that it supports the iterator protocol.  I don't know how the naming works for the \ref{} tex directive so I didn't put that in for referencing the iterator type although I suspect it wouldn't hurt.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 15:17

Message:
Logged In: YES 
user_id=80475

That's fine.  Go ahead and load the patch without the 
tests.  Keep it on your todo list.  It would be nice to have 
some good PyUnit tests for this module.

Assign it to me when it's ready and I'll load it.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-07 13:18

Message:
Logged In: YES 
user_id=357491

The quotes thing was just a slip-up.  It's fixed in my local copy and thus it will show up when I upload another patch.

I will write up patches to the docs, although the docs guarantee certain methods that are actually conditionally added to the object; should I go ahead and just change the docs to reflect this or rip out the conditionality of the adding of the methods since the file object, if using a socket, is coming from socket.makefile() (I think; urllib seems to be from the 1.5 days and thus is using httplib.HTTP() and thus had to read the code)?

I will also come up with a news item to be pasted into Misc/NEWS by the person who checks this in.

As for the test, though, test_urllib only tests quote().  The module itself has some tests that can be run when the module is __main__, but all it does is fetch various pages and print the output; nothing really there that wouldn't be caught from people using it day-to-day.  In other words there is no good place to put a test since there basically are no tests for this part of the module.  =)  Yes, I could fix this, but that would be a completely separate patch since the quote() tests are not even a PyUnit testing suite.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-06 22:22

Message:
Logged In: YES 
user_id=80475

Looks good.
Tests out okay.
Use double quotes throughout.
Consider adding a news item, docs, and a test.
Assign back to me when you think it's ready to go. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470


From noreply@sourceforge.net  Fri Mar  7 23:58:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 15:58:17 -0800
Subject: [Patches] [ python-Patches-698520 ] Iterator for urllib.URLOpener
Message-ID: <E18rRjF-0001r1-00@sc8-sf-web4.sourceforge.net>

Patches item #698520, was opened at 2003-03-05 22:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: Iterator for urllib.URLOpener

Initial Comment:
4 line patch to give urllib.URLOpener an iterator.  Follows design of module and adds methods only if the file object used internally has __iter__ and adds 'next' only if __iter__ was added.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-07 15:58

Message:
Logged In: YES 
user_id=357491

OK, the new patch has the quote fix.  I also added a single line to the urllib doc saying that it supports the iterator protocol.  I don't know how the naming works for the \ref{} tex directive so I didn't put that in for referencing the iterator type although I suspect it wouldn't hurt.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 15:17

Message:
Logged In: YES 
user_id=80475

That's fine.  Go ahead and load the patch without the 
tests.  Keep it on your todo list.  It would be nice to have 
some good PyUnit tests for this module.

Assign it to me when it's ready and I'll load it.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-07 13:18

Message:
Logged In: YES 
user_id=357491

The quotes thing was just a slip-up.  It's fixed in my local copy and thus it will show up when I upload another patch.

I will write up patches to the docs, although the docs guarantee certain methods that are actually conditionally added to the object; should I go ahead and just change the docs to reflect this or rip out the conditionality of the adding of the methods since the file object, if using a socket, is coming from socket.makefile() (I think; urllib seems to be from the 1.5 days and thus is using httplib.HTTP() and thus had to read the code)?

I will also come up with a news item to be pasted into Misc/NEWS by the person who checks this in.

As for the test, though, test_urllib only tests quote().  The module itself has some tests that can be run when the module is __main__, but all it does is fetch various pages and print the output; nothing really there that wouldn't be caught from people using it day-to-day.  In other words there is no good place to put a test since there basically are no tests for this part of the module.  =)  Yes, I could fix this, but that would be a completely separate patch since the quote() tests are not even a PyUnit testing suite.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-06 22:22

Message:
Logged In: YES 
user_id=80475

Looks good.
Tests out okay.
Use double quotes throughout.
Consider adding a news item, docs, and a test.
Assign back to me when you think it's ready to go. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470


From noreply@sourceforge.net  Sat Mar  8 04:42:20 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 07 Mar 2003 20:42:20 -0800
Subject: [Patches] [ python-Patches-675422 ] Add tzset method to time module
Message-ID: <E18rWA8-0000yu-00@sc8-sf-web4.sourceforge.net>

Patches item #675422, was opened at 2003-01-28 00:42
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Add tzset method to time module

Initial Comment:
Adds access to the tzset method, allowing you to change your local timezone as required. In addition to invoking the tzset system
call, the code also updates the timezone attributes (time.timezone etc). This lets you do timezone conversions amongst other things.

Also includes changes to configure.in to only build new code if the tzset method correctly switches timezones on your platform. This 
should be for all modern Unixes, and possibly other platforms.

Also includes tests in test_time.py

Docs would be along the lines of:

tzset() -- 
Initialize, or reinitialize, the local timezone to the value stored in os.environ['TZ']. The TZ environment variable should be specified in
standard Unix timezone format as documented in the tzset man page
(eg. 'US/Eastern', 'Europe/Amsterdam'). Unknown timezones will silently fall back to UTC. If the TZ environment variable is not set, the local timezone is set to the system's best guess of wallclock time.
Changing the TZ environment variable without calling tzset *may* change the local timezone used by methods such as localtime, but this behaviour should not be relied on.

eg::

>>> now = time.time()
>>> os.environ['TZ'] = 'Europe/Amsterdam'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 14:35:17 2003'
>>> time.tzname  
('CET', 'CEST')
>>> os.environ['TZ'] = 'US/Eastern'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 08:35:17 2003'
>>> time.tzname
('EST', 'EDT')
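
The session above can also be wrapped in a small helper that restores the previous TZ setting afterwards. This is only a sketch, assuming a Unix platform where time.tzset is available; format_in_zone is a hypothetical name, and POSIX-style TZ strings are used here (instead of names such as 'US/Eastern') so that the result does not depend on an installed zoneinfo database.

```python
import os
import time


def format_in_zone(timestamp, tz):
    """Format timestamp as local time in tz, then restore the previous TZ."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()
    try:
        return time.strftime("%Y-%m-%d %H:%M %Z", time.localtime(timestamp))
    finally:
        # Put the environment back the way it was and re-read it.
        if saved is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = saved
        time.tzset()


# Timestamp 0 is 1970-01-01 00:00 UTC; view it in two zones.
utc_text = format_in_zone(0, "UTC0")
est_text = format_in_zone(0, "EST5EDT,M3.2.0,M11.1.0")
```

Restoring TZ in the finally clause keeps the change from leaking into the rest of the process, which matters because the setting is global.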

----------------------------------------------------------------------

>Comment By: Stuart Bishop (zenzen)
Date: 2003-03-08 15:42

Message:
Logged In: YES 
user_id=46639

Leave it commented out or remove that line. It is testing
unimportant behaviour that looks more platform dependent
than I suspected (and now I look at it again, what tzname
should be set to if the timezone is unknown is unspecified by
the tzset(3) docs). The important behaviour is that:

a) the system silently falls back to UTC if the timezone is
unknown, and this is tested elsewhere 

b) calling tzset resets tzname, which is also tested elsewhere.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-08 01:25

Message:
Logged In: YES 
user_id=6380

zenzen: when I run the test suite on my Red Hat Linux 7.3
box, I get one failure: the test line
  self.failUnless(time.tzname[0] in ('UTC','GMT'))
fails when the timezone is set to 'Luna/Tycho', because
tzname is in fact set to  ('Luna/Tych', 'Luna/Tych').

If I comment out that one line the tzset test suite passes.

What should I do?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-22 08:49

Message:
Logged In: YES 
user_id=6380

Sorry, not a chance.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-22 08:45

Message:
Logged In: YES 
user_id=46639

It is a patch to 2.3, but I thought I'd try and sneak this
new feature past people into 2.2.3 as I want to be able to
use it in Zope 2 :-)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 23:56

Message:
Logged In: YES 
user_id=6380

Uh? This is a new feature, so doesn't apply to 2.2.3.

Maybe you meant 2.3?

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-21 15:29

Message:
Logged In: YES 
user_id=46639

Assigning to Guido for consideration of being added to
2.2.3, and since he thought this patch was a good idea in
the first place :-)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470


From noreply@sourceforge.net  Sat Mar  8 12:06:27 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 08 Mar 2003 04:06:27 -0800
Subject: [Patches] [ python-Patches-658327 ] Add inet_pton and inet_ntop to socket
Message-ID: <E18rd5v-0002OD-00@sc8-sf-web4.sourceforge.net>

Patches item #658327, was opened at 2002-12-24 22:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jp Calderone (kuran)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: Add inet_pton and inet_ntop to socket

Initial Comment:
Patch is against current CVS and adds two socket module
functions, inet_pton and inet_ntop.  Both of these
should be available on all platforms (because of other
dependencies in the code) so I don't think portability
is a problem.  inet_ntop converts a packed IP address
to a human-readable '.' or ':' separated string
representation of the IP.  inet_pton performs the
reverse operation.

(Potential) problems: inet_pton sets errno to ENOSPC,
which may lead to a confusing error message.
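
For illustration, a round trip through the two functions might look like the sketch below. It uses socket.inet_pton and socket.inet_ntop as they exist in present-day Python, which is an assumption here; the exact behaviour of the patch under review may differ. The addresses come from the documentation ranges and are arbitrary.

```python
import socket

# Text -> packed bytes -> text, for IPv4.
packed4 = socket.inet_pton(socket.AF_INET, "192.0.2.1")
text4 = socket.inet_ntop(socket.AF_INET, packed4)

# The same round trip for IPv6; the packed form is 16 bytes long.
packed6 = socket.inet_pton(socket.AF_INET6, "2001:db8::1")
text6 = socket.inet_ntop(socket.AF_INET6, packed6)
```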


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-04 08:05

Message:
Logged In: YES 
user_id=21627

My two suggestions aren't exclusive: If you have the native
inet_pton, you can *always* support IPv6 addresses with
that, regardless of whether --enable-ipv6 was passed to
configure or not.

If that is done, it will be a legitimate test failure for
inet_pton not to support IPv6 - after all, the primary
reason to define this function was to support IPv6, so if
the native function fails to do so, there is clearly a bug
in the system.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-04 04:41

Message:
Logged In: YES 
user_id=33168

I added the #ifdef, but that doesn't address the testing
problem.  If the platform has inet_pton but doesn't have
IPv6 enabled, inet_pton will be exported, but there's
no good way to tell if you can pass an IPv6 address.  The
only way to test if IPv6 is enabled would be to call
inet_pton with AF_INET6, catch a socket.error and check if
the exception message is "unknown address family".  Since
this is really a testing issue, perhaps that's best after all?

Do you agree this should be done?
 * Remove has_ipv6
 * Export inet_pton & inet_ntop only if defined for platform
 * Only try to test inet_pton/ntop if defined for platform
 * Modify the tests to pass a valid IPv6 test, catch
socket.error, if the error message is "unknown address
family", don't test ipv6 any further, if the error message
is different, raise TestFailed, if no exception, test all
IPv6 addresses
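
A rough sketch of the probing step proposed above, written against present-day Python. ipv6_conversion_available is a hypothetical helper name, and the sketch simplifies the proposal: it treats any OSError (socket.error is an alias of OSError today) as "IPv6 conversion not available" instead of inspecting the error message.

```python
import socket


def ipv6_conversion_available():
    """Probe whether inet_pton accepts an IPv6 address on this platform."""
    # inet_pton may not be exported at all on some platforms.
    if not hasattr(socket, "inet_pton") or not hasattr(socket, "AF_INET6"):
        return False
    try:
        socket.inet_pton(socket.AF_INET6, "::1")
    except OSError:
        return False
    return True


result = ipv6_conversion_available()
```

A test suite could call this once and skip its IPv6 cases when the result is False.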

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-03 23:25

Message:
Logged In: YES 
user_id=33168

As I recall, yes, has_ipv6 is only for tests.  There was no
way to distinguish if python was built with IPv6 support,
since AF_INET6 was always defined.

Your second approach sounds like it will work.  I need to
review the code, though.  I've forgotten how it works. :-(

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-03 11:15

Message:
Logged In: YES 
user_id=21627

The has_ipv6 test is only there for the tests? In that case,
drop it, and just perform AF_INET6 conversions unconditionally.

OTOH, I think we should not expose the emulated inet_pton:
it doesn't set errno correctly, and offers no advantage over
inet_addr. So wrap the entire code with HAVE_INET_PTON, and
only perform the tests if the function is supported.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-05 03:40

Message:
Logged In: YES 
user_id=33168

I was just about to check this in, but then I ran into a
problem.  IPv6 may not be enabled, even if the constant
AF_INET6 exists.  The cleanest way I saw to address this in
the test was to add a has_ipv6 boolean constant to the
socket module.  Martin, do you think this is acceptable?

Attached is a complete patch which should be safe (based on
the discussion below), includes tests and doc changes.

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2003-01-11 18:04

Message:
Logged In: YES 
user_id=366566

Yea, testing for the proper input length is definitely
something that should be done.  The patch looks good, but
for one thing.  If the specified address family is neither
AF_INET nor AF_INET6, the length won't be tested and the
underlying inet_ntop will be called.  This isn't a problem
now (afaik) because only those two address families are
supported, but in a future libc version with more supported
address families, it might open a similar hole to the one
you've fixed.  Perhaps the

+       } else {
+               PyErr_SetString(socket_error,
+                               "unknown address family");
+               return NULL;
+       }

should be moved up from the second if-grouping to follow the
first if-grouping.  Everything else looks good to me. 
Thanks for taking the time to look at this :)


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-11 04:49

Message:
Logged In: YES 
user_id=33168

JP, do you agree with my comment on 2002-12-30 about the
checks?  I have attached an updated patch.  Please review
and verify this is correct.

Thank you for the additional tests.  Feel free to submit
patches with additional tests for any and all modules!

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-31 17:52

Message:
Logged In: YES 
user_id=366566

Doc, NEWS, and test_socket patch attached.  I didn't notice
any inet_aton/inet_ntoa tests in the module so I added a
couple for those as well (I excluded a test for
inet_ntoa('255.255.255.255') ;) Also included are a couple
IPv6 tests.  I'm not sure if these are appropriate, since
many systems may still lack the required support for them to
pass.  I'll leave it up to you to decide whether they should
be commented out or removed or whatever.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-31 14:17

Message:
Logged In: YES 
user_id=21627

I agree that such a change should be added. Neal, you have
given this patch more attention than I did - please check it
in when you consider it complete. I just like to point out
that it is missing documentation changes (libsocket.tex), a
NEWS entry, and a test case. kuran, please provide those as
a single patch file.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-31 01:11

Message:
Logged In: YES 
user_id=33168

ISTM that in socket_inet_ntop() you need to verify the size
of the packed value passed in.  If the user passes an empty
string, inet_ntop() could read beyond the buffer passed in,
potentially causing a core dump.

The checks could be something like this:

  if (af == AF_INET && len != sizeof(struct in_addr))
  else if (af == AF_INET6 && len != sizeof(struct in6_addr))

Does this make sense?
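
The same length check can be sketched at the Python level. checked_ntop and EXPECTED_LEN are hypothetical names; the sizes 4 and 16 correspond to sizeof(struct in_addr) and sizeof(struct in6_addr). Rejecting unknown address families up front is an extra assumption of this sketch, not part of the check quoted above.

```python
import socket

# Packed address sizes in bytes for each supported family.
EXPECTED_LEN = {socket.AF_INET: 4, socket.AF_INET6: 16}


def checked_ntop(af, packed):
    """Validate the packed length before handing it to inet_ntop."""
    expected = EXPECTED_LEN.get(af)
    if expected is None:
        raise ValueError("unknown address family")
    if len(packed) != expected:
        raise ValueError("packed address has wrong length")
    return socket.inet_ntop(af, packed)


loopback = checked_ntop(socket.AF_INET, b"\x7f\x00\x00\x01")
```

With this guard an empty string is rejected before the underlying conversion ever sees it.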

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-27 16:39

Message:
Logged In: YES 
user_id=366566

The use case I have for it at the moment is a DNS server
(Twisted.names).  inet_pton allows me to handle IPv6
addresses, so it allows me to support AAAA and A6 records. 
I believe an IPv6 capable socks proxy would find this useful
as well.  Basically, low level network stuff.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-27 11:23

Message:
Logged In: YES 
user_id=21627

What is the rationale for providing this functionality?

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-26 19:32

Message:
Logged In: YES 
user_id=366566

Ooops, I made two, and uploaded the wrong one >:O  Sorry. 
Dunno if it's still helpful, but here's the unified diff.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-26 19:10

Message:
Logged In: YES 
user_id=33168

Next time, please use context or unified diff.  -c or -u
option to cvs diff:  cvs diff -c ...

----------------------------------------------------------------------

Comment By: Jp Calderone (kuran)
Date: 2002-12-24 22:05

Message:
Logged In: YES 
user_id=366566

Sourceforge decided not to attach the file the first time...
 Here it is.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658327&group_id=5470


From noreply@sourceforge.net  Sat Mar  8 19:15:56 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 08 Mar 2003 11:15:56 -0800
Subject: [Patches] [ python-Patches-700047 ] unicode object leaks refcount on resizing
Message-ID: <E18rjnY-0006Qz-00@sc8-sf-web3.sourceforge.net>

Patches item #700047, was opened at 2003-03-09 04:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700047&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode object leaks refcount on resizing

Initial Comment:
This code duplicates the situation:

static PyObject *
leaktest(PyObject *self, PyObject *args)
{
    PyObject *u;

    u = PyUnicode_FromUnicode(NULL, 1);
    if (u == NULL)
        return NULL;

    if (PyUnicode_Resize(&u, 0) == -1)
        return NULL;

    return u;
}

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700047&group_id=5470


From noreply@sourceforge.net  Sat Mar  8 20:02:42 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 08 Mar 2003 12:02:42 -0800
Subject: [Patches] [ python-Patches-684677 ] Allow freeze to exclude implicits
Message-ID: <E18rkWo-00082G-00@sc8-sf-web3.sourceforge.net>

Patches item #684677, was opened at 2003-02-11 16:59
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=684677&group_id=5470

Category: Demos and tools
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Lawrence Hudson (lhudson)
Assigned to: Just van Rossum (jvr)
Summary: Allow freeze to exclude implicits

Initial Comment:
Freeze always freezes site and exceptions.  This patch
allows these implicit modules to be excluded using the
-x switch.


----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-08 21:02

Message:
Logged In: YES 
user_id=92689

Applied, it's in rev. 1.43 of freeze.py.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-16 19:36

Message:
Logged In: YES 
user_id=92689

The patch looks good, I'll have a closer look later.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=684677&group_id=5470


From noreply@sourceforge.net  Sun Mar  9 05:46:02 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 08 Mar 2003 21:46:02 -0800
Subject: [Patches] [ python-Patches-698520 ] Iterator for urllib.URLOpener
Message-ID: <E18rtdK-0001Sb-00@sc8-sf-web2.sourceforge.net>

Patches item #698520, was opened at 2003-03-06 01:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Raymond Hettinger (rhettinger)
Summary: Iterator for urllib.URLOpener

Initial Comment:
4 line patch to give urllib.URLOpener an iterator.  Follows design of module and adds methods only if the file object used internally has __iter__ and adds 'next' only if __iter__ was added.
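
A minimal sketch of the delegation idea described above, written for current Python rather than 2.3. The class name ResponseWrapper is hypothetical and is not the patch's code. The patch attached the methods to the instance only when the file object had __iter__; modern Python looks special methods up on the type, so this sketch defines __iter__ on the class and performs the same check at call time.

```python
import io


class ResponseWrapper:
    """Hypothetical wrapper that forwards file-like calls to `fp`."""

    def __init__(self, fp):
        self.fp = fp
        # Plain methods can still be forwarded per instance.
        self.read = fp.read
        self.readline = fp.readline

    def __iter__(self):
        # Offer iteration only when the wrapped object supports it.
        if not hasattr(self.fp, "__iter__"):
            raise TypeError("underlying file object is not iterable")
        return iter(self.fp)


lines = list(ResponseWrapper(io.StringIO("first\nsecond\n")))
```

Here lines holds the two lines of the wrapped buffer, each with its trailing newline.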

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-09 00:46

Message:
Logged In: YES 
user_id=80475

Committed as:

Lib/urllib.py 1.155
Misc/NEWS 1.693
Doc/lib/liburllib.tex 1.45


----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-07 18:58

Message:
Logged In: YES 
user_id=357491

OK, the new patch has the quote fix.  I also added a single line to the urllib doc saying that it supports the iterator protocol.  I don't know how the naming works for the \ref{} tex directive so I didn't put that in for referencing the iterator type although I suspect it wouldn't hurt.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 18:17

Message:
Logged In: YES 
user_id=80475

That's fine.  Go ahead and load the patch without the 
tests.  Keep it on your todo list.  It would be nice to have 
some good PyUnit tests for this module.

Assign it to me when it's ready and I'll load it.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-07 16:18

Message:
Logged In: YES 
user_id=357491

The quotes thing was just a slip-up.  It's fixed in my local copy, so it will show up when I upload another patch.

I will write up patches to the docs, although the docs guarantee certain methods that are actually conditionally added to the object; should I go ahead and just change the docs to reflect this or rip out the conditionality of the adding of the methods since the file object, if using a socket, is coming from socket.makefile() (I think; urllib seems to be from the 1.5 days and thus is using httplib.HTTP() and thus had to read the code)?

I will also come up with a news item to be pasted into Misc/NEWS by the person who checks this in.

As for the test, though, test_urllib only tests quote().  The module itself has some tests that can be run when the module is __main__, but all it does is fetch various pages and print the output; nothing really there that wouldn't be caught from people using it day-to-day.  In other words there is no good place to put a test since there basically are no tests for this part of the module.  =)  Yes, I could fix this, but that would be a completely separate patch since the quote() tests are not even a PyUnit testing suite.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 01:22

Message:
Logged In: YES 
user_id=80475

Looks good.
Tests out okay.
Use double quotes throughout.
Consider adding a news item, docs, and a test.
Assign back to me when you think it's ready to go. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698520&group_id=5470


From noreply@sourceforge.net  Sun Mar  9 07:23:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 08 Mar 2003 23:23:31 -0800
Subject: [Patches] [ python-Patches-667730 ] More DictMixin
Message-ID: <E18rv9f-0001fQ-00@sc8-sf-web3.sourceforge.net>

Patches item #667730, was opened at 2003-01-14 08:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667730&group_id=5470

Category: Library (Lib)
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Sebastien Keim (s_keim)
Assigned to: Raymond Hettinger (rhettinger)
Summary: More DictMixin

Initial Comment:
This patch is intended to provide a more consistent
implementation for the various dictionary like objects
of the standard library.

test_userdict has been rewritten; it now uses unittest
and defines a test case which checks for conformity
with the dictionary protocol.

test_shelve and test_weakref have been rewritten to use
the test_userdict test case.

test_os has been extended: a new test case checks that
the environ object conforms to the dictionary protocol.

The patch modifies the UserDict module:
* The doc says that __contains__ should be one of the
methods to redefine for better efficiency, but the
implementation makes __contains__ depend on the has_key
definition. The patch reverses the dependency.
* Change iterkeys = __iter__ to def iterkeys(self):
return self.__iter__() so that iterkeys picks up
overridden __iter__ methods.
* I have also added __init__, copy and __repr__
methods to DictMixin.
* The UserDict.UserDict class is made a subclass of
DictMixin, which simplifies the UserDict
implementation. The patch is rather conservative, since
many method definitions could still be removed from
UserDict.

In the weakref module, the patch makes
WeakValueDictionary and WeakKeyDictionary subclasses
of UserDict.DictMixin. It also uses nested scopes and the
new generator syntax for iterator methods, and rewrites
WeakKeyDictionary.__delitem__. All of this
decreases the module size by 50%.

In the shelve module, the patch adds a copy() method
which returns a dictionary with the keys and values of
the database.
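
The difference between aliasing a method and delegating to it can be sketched as follows. The class names are hypothetical and the example is written for current Python; it only illustrates why a full definition picks up an overridden __iter__ while a class-level alias does not.

```python
class AliasMixin:
    def __iter__(self):
        return iter(["base"])

    # The alias is bound to the base function when the class is created.
    iterkeys = __iter__


class DelegatingMixin:
    def __iter__(self):
        return iter(["base"])

    def iterkeys(self):
        # __iter__ is looked up on the instance at call time.
        return self.__iter__()


class AliasChild(AliasMixin):
    def __iter__(self):
        return iter(["override"])


class DelegatingChild(DelegatingMixin):
    def __iter__(self):
        return iter(["override"])


alias_result = list(AliasChild().iterkeys())            # ['base']
delegating_result = list(DelegatingChild().iterkeys())  # ['override']
```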

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-09 02:23

Message:
Logged In: YES 
user_id=80475

Accepted patch.  Made the suggested fix-ups.  Fixed 
spelling.  Replaced the _tested_class method with an 
equivalent class variable.

Applied as:
Lib/weakref.py 1.19
Lib/test/test_userdict.py 1.14
Lib/test/test_os.py 1.14
Lib/test/test_shelve.py 1.3
Lib/test/test_weakref.py 1.22

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 00:52

Message:
Logged In: YES 
user_id=80475

The patch looks good.
Please make two adjustments and re-submit.

1) Change the test_func docstrings to comment blocks.  If 
a docstring is present, test support will print them in the 
summary instead of the test name.

2) Change the logic for mapping.pop() to accommodate 
the new default argument option which was added 
yesterday.  The format is m.pop(key[, default]).
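
A rough sketch of what m.pop(key[, default]) could look like in a mixin, assuming the host class supplies __getitem__ and __delitem__. The names PopMixin, Store and _MISSING are hypothetical; this is written for current Python and is not the code that was checked in.

```python
_MISSING = object()  # sentinel, so that None remains usable as a default


class PopMixin:
    """Hypothetical mixin; relies on the host's __getitem__ and __delitem__."""

    def pop(self, key, default=_MISSING):
        try:
            value = self[key]
        except KeyError:
            if default is _MISSING:
                raise  # old behaviour: no default given, so KeyError propagates
            return default
        del self[key]
        return value


class Store(PopMixin):
    """Hypothetical host class backed by a private dict."""

    def __init__(self, **items):
        self._data = dict(items)

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]


store = Store(a=1)
first = store.pop("a")         # 1, and the key is removed
second = store.pop("a", None)  # None, because the key is gone
```

The sentinel keeps the one-argument form backward compatible: a missing key still raises KeyError unless a default is passed explicitly.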

----------------------------------------------------------------------

Comment By: Sebastien Keim (s_keim)
Date: 2003-03-03 10:27

Message:
Logged In: YES 
user_id=498191

I have downloaded a new version of the patch updated to
Python2.3a2

I hope to have removed all the stuff which could break
backward compatibility since the new proposed patch contain
now only the testing stuff (well, almost since I have also
added a pop method to the weak dictionary classes to make
them compatible with the test case).

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-15 21:50

Message:
Logged In: YES 
user_id=80475

Also, +1 on consolidating the test cases though it should 
be done after any other changes to the files so we can 
make sure that nothing got broken.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-15 21:35

Message:
Logged In: YES 
user_id=80475

* UserDict.UserDict should not change. As Martin pointed 
out, inheriting from object changes the semantics in a 
non-backward-compatible way.  Also, the class is efficiently 
implemented in terms of an internal dictionary and would be 
slowed down by the nest of calls in Mixin.  Also, I think the 
code is incorrect in defining __iter__; there was a reason it 
was pulled out into a separate subclass -- that was done in 
Py2.2 and is not an easily reversible decision.

*  -0 on the changes to has_key() and __contains__(). 
has_key() was put at a lower level than __contains__ 
because the older dict-style interfaces all define has_key.

* +1 for changing iterkeys() to a full definition (and +1 for 
doing the same for __iter__()).  Sebastien is correct in 
pointing out the advantages of propagating an overridden 
method.

* -1 for altering repr() implementation.  The current 
approach is shorter, cleaner, and faster.

* -1 for adding __nonzero__().  Even dictionaries don't 
implement this method; they let len() do the talking.

* -1 for adding __init__() and copy().  Both need to make 
assumptions about the order and number of parameters 
in the constructor of the class using the mixin.  I think they 
are rarely helpful and are sometimes harmful in introducing 
surprising, hard-to-find errors.  People who need an init() 
or copy() can code them more cleanly and directly in the 
extending class.  Also, I don't think the code is correct: 
since DictMixin will be a base class, the use of super() is 
not what is wanted here -- *if* you were going to do this, 
try something like self.__class__().  Further, adding these 
methods violates my original intent for this class which 
was to extrapolate four basic mapping methods into a full 
mapping interface.  It was not intended as a stand-alone 
class.  Also, copy() cannot guarantee that it is copying all 
the relevant data for the sub-class and that violates the 
definition of what copy() is supposed to do.  If something 
like this were attempted, it should be its own mixin 
(automatically adding copy support to any class) and it 
should be rather sophisticated about how to perfectly 
replicate itself (not easily done if the underlying data is in a 
file, database, or in a distributed app).

* +0 on changing weakdicts provided it is done minimally 
and carefully with attention to leaving semantics 
unchanged and not slowing performance.  The advantage 
goes beyond consistency, it removes code duplication, 
keeps well thought-out logic in one place, and provides an 
automatic interface update from DictMixin if the dictionary 
interface ever sprouts another method.
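
The idea of extrapolating a few basic methods into a fuller mapping interface can be sketched roughly as below. This is a reduced, hypothetical mixin for current Python, not the UserDict.DictMixin source; it assumes the host class supplies __getitem__, __setitem__, __delitem__ and keys(), and the names MiniDictMixin and UpperKeys are made up for the example.

```python
class MiniDictMixin:
    """Hypothetical reduced mixin: everything here is derived from the
    host's __getitem__, __setitem__, __delitem__ and keys()."""

    def __iter__(self):
        return iter(self.keys())

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True

    def __len__(self):
        # No separate truth-value method: len() decides truthiness.
        return len(self.keys())

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def items(self):
        return [(key, self[key]) for key in self]

    def update(self, other):
        for key in other.keys():
            self[key] = other[key]


class UpperKeys(MiniDictMixin):
    """Example host: stores keys upper-cased in a private dict."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key.upper()]

    def __setitem__(self, key, value):
        self._data[key.upper()] = value

    def __delitem__(self, key):
        del self._data[key.upper()]

    def keys(self):
        return list(self._data)


d = UpperKeys()
d["a"] = 1
d.update({"b": 2})
```

Note that the mixin defines neither __init__ nor copy(), in line with the objection above that those need knowledge of the host class's constructor.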

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-14 16:43

Message:
Logged In: YES 
user_id=21627

This patch breaks backwards compatibility. UserDict is an
oldstyle class on purpose, since changing it to a newstyle
class will certainly break the compatibility in subtle ways
(e.g. by changing what type(userdictinstance) is).

Unless you can bring forward a better rationale than
consistency, this patch will be rejected.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667730&group_id=5470


From noreply@sourceforge.net  Sun Mar  9 07:42:58 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 08 Mar 2003 23:42:58 -0800
Subject: [Patches] [ python-Patches-700047 ] unicode object leaks refcount on resizing
Message-ID: <E18rvSU-00023c-00@sc8-sf-web3.sourceforge.net>

Patches item #700047, was opened at 2003-03-08 14:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700047&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Nobody/Anonymous (nobody)
Summary: unicode object leaks refcount on resizing

Initial Comment:
This code duplicates the situation:

static PyObject *
leaktest(PyObject *self, PyObject *args)
{
    PyObject *u;

    u = PyUnicode_FromUnicode(NULL, 1);
    if (u == NULL)
        return NULL;

    if (PyUnicode_Resize(&u, 0) == -1)
        return NULL;

    return u;
}

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-09 02:42

Message:
Logged In: YES 
user_id=80475

Applied patch as:  Objects/unicodeobject.c 2.184

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700047&group_id=5470


From noreply@sourceforge.net  Sun Mar  9 07:56:49 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 08 Mar 2003 23:56:49 -0800
Subject: [Patches] [ python-Patches-691928 ] Use datetime in _strptime
Message-ID: <E18rvft-0002Gy-00@sc8-sf-web3.sourceforge.net>

Patches item #691928, was opened at 2003-02-23 19:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=691928&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Use datetime in _strptime

Initial Comment:
To prevent code duplication, I patched _strptime to use datetime's date object for the Julian day, Gregorian, and day-of-the-week calculations (Tim's code has to be more reliable than mine  =).  The patch also includes new regression tests that check the results and that the calculations get triggered.

There are also very minor comment changes, and my contact email has been updated.
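
As an illustration of letting datetime's date object do these calculations, here is a small sketch. The function name fill_in_date_fields is hypothetical and this is not the _strptime code; "Julian day" is read here as the 1-based day of the year, which is an assumption.

```python
import datetime


def fill_in_date_fields(year, month, day):
    """Return (day_of_week, day_of_year) for a calendar date.

    day_of_week follows the time-tuple convention: Monday is 0.
    day_of_year is 1-based, so 1 January is day 1.
    """
    date = datetime.date(year, month, day)
    first_of_year = datetime.date(year, 1, 1)
    day_of_year = date.toordinal() - first_of_year.toordinal() + 1
    return date.weekday(), day_of_year


# 9 March 2003 was a Sunday and the 68th day of the year.
weekday, yearday = fill_in_date_fields(2003, 3, 9)
```

Constructing a date object also validates the input: datetime.date raises ValueError for an impossible date such as 30 February.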

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-09 02:56

Message:
Logged In: YES 
user_id=80475

Applied patch as:

Lib/_strptime.py 1.13
Lib/test/test_strptime.py 1.10

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-03 16:14

Message:
Logged In: YES 
user_id=357491

Response to meta comment - I would normally delete it, Skip, but last time I tried I was told I didn't have the proper rights to do it.  Unless SF has changed their setup to allow patch creators to manage the files regardless of whether they have CVS access I can't.

Response to comment comment - The reason I am doing this is that I want to make sure that the returned time tuple is a valid date.  If strptime is going to have default values I want those values to lead to a valid time that does not require someone to have to do more processing or wonder whether it is valid.

Now currently the docs say you can't expect anything back in the time tuple but what was in the data string, so doing this does not go against the docs.  But if strptime becomes the only strptime implementation, then I will write a doc patch to make the docs say that all returned time tuples will be valid dates.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-03-03 10:03

Message:
Logged In: YES 
user_id=44345

Meta comment - I think that when uploading successive patches it's useful
to either name them differently or delete the prior one to avoid confusion.
In this case it's not a big deal, especially since the submission dates are
different, but after a few revisions it can sometimes be a challenge to
figure out which patch should be downloaded.
 
Comment comment - Unless there's some evidence the elided functions
have been used, I suspect it best to just let people use the relevant
datetime functions.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-02-25 16:51

Message:
Logged In: YES 
user_id=357491

Only in the module (which was removed).  None of the helper functions have ever been publicly advertised (although I think the locale date info might be helpful in locale; MvL wasn't interested, though).

I uploaded a new diff that removes one more line that I forgot to remove when I eliminated the ability to pass in a regex object.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-23 19:56

Message:
Logged In: YES 
user_id=33168

Brett, is there any doc for the functions that were removed?
   firstjulian, gregorian, julianday, dayofweek

Otherwise, the patch seemed fine (but I didn't look that
closely).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=691928&group_id=5470


From noreply@sourceforge.net  Sun Mar  9 07:57:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 08 Mar 2003 23:57:09 -0800
Subject: [Patches] [ python-Patches-691928 ] Use datetime in _strptime
Message-ID: <E18rvgD-0002Hj-00@sc8-sf-web3.sourceforge.net>

Patches item #691928, was opened at 2003-02-23 19:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=691928&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: Use datetime in _strptime

Initial Comment:
To prevent code duplication, I patched _strptime to use datetime's date object for the Julian day, Gregorian, and day-of-the-week calculations (Tim's code has to be more reliable than mine  =).  The patch also includes new regression tests that check the results and that the calculations get triggered.

There are also very minor comment changes, and my contact email has been updated.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-09 02:56

Message:
Logged In: YES 
user_id=80475

Applied patch as:

Lib/_strptime.py 1.13
Lib/test/test_strptime.py 1.10

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-03 16:14

Message:
Logged In: YES 
user_id=357491

Response to meta comment - I would normally delete it, Skip, but last time I tried I was told I didn't have the proper rights to do it.  Unless SF has changed their setup to allow patch creators to manage the files regardless of whether they have CVS access I can't.

Response to comment comment - The reason I am doing this is that I want to make sure that the returned time tuple is a valid date.  If strptime is going to have default values I want those values to lead to a valid time that does not require someone to have to do more processing or wonder whether it is valid.

Now currently the docs say you can't expect anything back in the time tuple but what was in the data string, so doing this does not go against the docs.  But if strptime becomes the only strptime implementation, then I will write a doc patch to make the docs say that all returned time tuples will be valid dates.

----------------------------------------------------------------------

Comment By: Skip Montanaro (montanaro)
Date: 2003-03-03 10:03

Message:
Logged In: YES 
user_id=44345

Meta comment - I think that when uploading successive patches it's useful
to either name them differently or delete the prior one to avoid confusion.
In this case it's not a big deal, especially since the submission dates are
different, but after a few revisions it can sometimes be a challenge to
figure out which patch should be downloaded.
 
Comment comment - Unless there's some evidence the elided functions
have been used, I suspect it best to just let people use the relevant
datetime functions.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-02-25 16:51

Message:
Logged In: YES 
user_id=357491

Only in the module (which was removed).  None of the helper functions have ever been publicly advertised (although I think the locale date info might be helpful in locale; MvL wasn't interested, though).

I uploaded a new diff that removes one more line that I forgot to remove when I eliminated the ability to pass in a regex object.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-23 19:56

Message:
Logged In: YES 
user_id=33168

Brett, is there any doc for the functions that were removed?
   firstjulian, gregorian, julianday, dayofweek

Otherwise, the patch seemed fine (but I didn't look that
closely).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=691928&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 14:22:23 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 06:22:23 -0800
Subject: [Patches] [ python-Patches-700839 ] various gettext fixes
Message-ID: <E18sOAZ-000535-00@sc8-sf-web4.sourceforge.net>

Patches item #700839, was opened at 2003-03-10 15:22
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700839&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Juan David Ibáñez Palomar (jdavid)
Assigned to: Nobody/Anonymous (nobody)
Summary: various gettext fixes

Initial Comment:
Based on a message from Bruno Haible [1], here is a
patch that fixes several gettext bugs:

- The ! operator was treated incorrectly if not
  followed by a space.

- Now unbalanced parentheses in a plural forms
  expression give a more meaningful error.

- Provide a plural forms expression default as
  libintl and msgfmt do.

- Don't test that the header entry starts with
  'Project-Id-Version:', the PO format does not
  require it.
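
The first two fixes can be illustrated with a small sketch. The helper name check_and_translate is hypothetical and this is not the patched gettext code; it only shows one way to reject unbalanced parentheses and to rewrite C's logical ! regardless of whether a space follows it, while leaving != alone.

```python
import re


def check_and_translate(plural_expr):
    """Hypothetical helper: validate parentheses, then turn C's ! into 'not'."""
    depth = 0
    for char in plural_expr:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced parenthesis in plural form")
    if depth != 0:
        raise ValueError("unbalanced parenthesis in plural form")
    # "!" followed by anything except "=" is negation; "!=" stays as it is.
    return re.sub(r"!(?!=)", " not ", plural_expr)


negated = check_and_translate("(!(n == 1))")   # '( not (n == 1))'
unchanged = check_and_translate("n != 1")      # 'n != 1'
```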


[1]
http://mail.python.org/pipermail/i18n-sig/2003-February/001543.html

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700839&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 15:08:24 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 07:08:24 -0800
Subject: [Patches] [ python-Patches-700858 ] Replacing and deleting files in a zipfile archive.
Message-ID: <E18sOt6-0007Oi-00@sc8-sf-web4.sourceforge.net>

Patches item #700858, was opened at 2003-03-11 01:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Nev Delap (nevdelap)
Assigned to: Nobody/Anonymous (nobody)
Summary: Replacing and deleting files in a zipfile archive.

Initial Comment:
Addition of replace, replacestr and delete methods into 
zipfile.py.
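
For readers without the patch at hand, one common way to delete a member is to rewrite the archive without it. The sketch below assumes that approach and uses only the existing zipfile API; the function name delete_member is hypothetical, it is not the patch's implementation, and it does not preserve per-member metadata such as timestamps.

```python
import io
import zipfile


def delete_member(archive_bytes, name):
    """Return new archive bytes without the member called `name`."""
    source = zipfile.ZipFile(io.BytesIO(archive_bytes))
    output = io.BytesIO()
    with source, zipfile.ZipFile(output, "w") as target:
        for info in source.infolist():
            if info.filename != name:
                # Copy every other member across unchanged.
                target.writestr(info.filename, source.read(info.filename))
    return output.getvalue()
```

Replacing a member could follow the same pattern: skip the old entry during the copy, then write the new data under the same name.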

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 15:14:51 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 07:14:51 -0800
Subject: [Patches] [ python-Patches-700858 ] Replacing and deleting files in a zipfile archive.
Message-ID: <E18sOzL-0007VV-00@sc8-sf-web4.sourceforge.net>

Patches item #700858, was opened at 2003-03-11 01:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Nev Delap (nevdelap)
Assigned to: Nobody/Anonymous (nobody)
Summary: Replacing and deleting files in a zipfile archive.

Initial Comment:
Addition of replace, replacestr and delete methods into 
zipfile.py.

----------------------------------------------------------------------

>Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:14

Message:
Logged In: YES 
user_id=730416

The file upload says "Successful" but the file isn't listed!? I've 
tried it several times and yes I've checked the checkbox.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 15:15:10 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 07:15:10 -0800
Subject: [Patches] [ python-Patches-700858 ] Replacing and deleting files in a zipfile archive.
Message-ID: <E18sOze-0007W0-00@sc8-sf-web4.sourceforge.net>

Patches item #700858, was opened at 2003-03-11 01:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Nev Delap (nevdelap)
Assigned to: Nobody/Anonymous (nobody)
Summary: Replacing and deleting files in a zipfile archive.

Initial Comment:
Addition of replace, replacestr and delete methods into 
zipfile.py.

----------------------------------------------------------------------

>Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:15

Message:
Logged In: YES 
user_id=730416

The file upload says "Successful" but the file isn't listed!? I've 
tried it several times and yes I've checked the checkbox.

----------------------------------------------------------------------

Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:14

Message:
Logged In: YES 
user_id=730416

The file upload says "Successful" but the file isn't listed!? I've 
tried it several times and yes I've checked the checkbox.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 15:16:57 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 07:16:57 -0800
Subject: [Patches] [ python-Patches-700858 ] Replacing and deleting files in a zipfile archive.
Message-ID: <E18sP1N-0007Xg-00@sc8-sf-web4.sourceforge.net>

Patches item #700858, was opened at 2003-03-11 01:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Nev Delap (nevdelap)
Assigned to: Nobody/Anonymous (nobody)
Summary: Replacing and deleting files in a zipfile archive.

Initial Comment:
Addition of replace, replacestr and delete methods into 
zipfile.py.

----------------------------------------------------------------------

>Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:16

Message:
Logged In: YES 
user_id=730416

.

----------------------------------------------------------------------

Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:15

Message:
Logged In: YES 
user_id=730416

The file upload says "Successful" but the file isn't listed!? I've 
tried it several times and yes I've checked the checkbox.

----------------------------------------------------------------------

Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:14

Message:
Logged In: YES 
user_id=730416

The file upload says "Successful" but the file isn't listed!? I've 
tried it several times and yes I've checked the checkbox.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 15:19:26 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 07:19:26 -0800
Subject: [Patches] [ python-Patches-700858 ] Replacing and deleting files in a zipfile archive.
Message-ID: <E18sP3m-0007b3-00@sc8-sf-web4.sourceforge.net>

Patches item #700858, was opened at 2003-03-11 01:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Nev Delap (nevdelap)
Assigned to: Nobody/Anonymous (nobody)
Summary: Replacing and deleting files in a zipfile archive.

Initial Comment:
Addition of replace, replacestr and delete methods into 
zipfile.py.

----------------------------------------------------------------------

>Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:19

Message:
Logged In: YES 
user_id=730416

OK, so after refreshing it finally decided to show the files I'd 
added.

----------------------------------------------------------------------

Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:16

Message:
Logged In: YES 
user_id=730416

.

----------------------------------------------------------------------

Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:15

Message:
Logged In: YES 
user_id=730416

The file upload says "Successful" but the file isn't listed!? I've 
tried it several times and yes I've checked the checkbox.

----------------------------------------------------------------------

Comment By: Nev Delap (nevdelap)
Date: 2003-03-11 01:14

Message:
Logged In: YES 
user_id=730416

The file upload says "Successful" but the file isn't listed!? I've 
tried it several times and yes I've checked the checkbox.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700858&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 15:28:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 07:28:48 -0800
Subject: [Patches] [ python-Patches-649762 ] Fix: asynchat.py: endless loop
Message-ID: <E18sPCq-000146-00@sc8-sf-web1.sourceforge.net>

Patches item #649762, was opened at 2002-12-06 16:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=649762&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Bernhard Reiter (ber)
Assigned to: A.M. Kuchling (akuchling)
Summary: Fix: asynchat.py: endless loop

Initial Comment:
Patch against asynchat.py revision 1.19 in Python SF CVS. 
 
Fixes endless loop when terminator='' is used. 
 
Diagnosis: 
If we do not catch the empty string, no buffer will be consumed 
in line 134 and the while loop does not terminate. 
 
Cure:  
Go back to the old behaviour and collect everything when the terminator is '' or None. 
 
Background: 
Especially annoying because early versions (rev 1.1, coming with 
python1.5) 
did not have this bug and the comment in set_terminator()  
says that strings of all lengths are okay (among other things). 
The bug was introduced in rev 1.2. 
 
	Bernhard Reiter <bernhard@intevation.de> 

----------------------------------------------------------------------

>Comment By: A.M. Kuchling (akuchling)
Date: 2003-03-10 10:28

Message:
Logged In: YES 
user_id=11375

A fix has been checked in as rev.1.21 of asynchat.py in the CVS tree.  Thanks for your help!


----------------------------------------------------------------------

Comment By: Bernhard Reiter (ber)
Date: 2003-02-03 14:39

Message:
Logged In: YES 
user_id=113859

Yes, of course.
I stopped experimenting with numeric and empty string
terminators
after hitting this bug, so I uploaded the flawed fix.

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2003-02-03 14:34

Message:
Logged In: YES 
user_id=11375

Surely in your patched version, the code should be 
'if not terminator: ...'.  Otherwise the patch reverses the sense of the test.


----------------------------------------------------------------------

Comment By: Bernhard Reiter (ber)
Date: 2002-12-06 17:37

Message:
Logged In: YES 
user_id=113859

The patch also fixes the terminator=0 problem 
which is similar. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=649762&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 16:14:05 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 08:14:05 -0800
Subject: [Patches] [ python-Patches-700839 ] various gettext fixes
Message-ID: <E18sPuf-0003wM-00@sc8-sf-web2.sourceforge.net>

Patches item #700839, was opened at 2003-03-10 15:22
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700839&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Juan David Ibáñez Palomar (jdavid)
Assigned to: Nobody/Anonymous (nobody)
Summary: various gettext fixes

Initial Comment:
Taken from a message by Bruno Haible [1], here is a
patch that fixes several gettext bugs:

- The ! operator was treated incorrectly if not
  followed by a space.

- Now unbalanced parentheses in a plural forms
  expression give a more meaningful error.

- Provide a plural forms expression default as
  libintl and msgfmt do.

- Don't test that the header entry starts with
  'Project-Id-Version:', the PO format does not
  require it.


[1]
http://mail.python.org/pipermail/i18n-sig/2003-February/001543.html

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-10 17:14

Message:
Logged In: YES 
user_id=21627

Thanks for the patch. Committed as gettext.py 1.17.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=700839&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 17:02:22 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 09:02:22 -0800
Subject: [Patches] [ python-Patches-698505 ] docs for hotshot module
Message-ID: <E18sQfO-0006EJ-00@sc8-sf-web2.sourceforge.net>

Patches item #698505, was opened at 2003-03-06 00:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470

Category: Documentation
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Anthony Baxter (anthonybaxter)
>Assigned to: Anthony Baxter (anthonybaxter)
Summary: docs for hotshot module

Initial Comment:
The attached provides documentation for the hotshot
module. Assigning to Fred for review.


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2003-03-10 12:02

Message:
Logged In: YES 
user_id=3066

Please commit.  Further changes can be made in CVS.

Thanks!

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-07 01:43

Message:
Logged In: YES 
user_id=80475

The TeX markup checks out fine.

Consider documenting lineevents and linetimings which 
are exposed upon:  import hotshot.

For the example, consider adding a comment line at
the beginning with hints that the example produces large 
files and takes a long time to run.


----------------------------------------------------------------------

Comment By: Anthony Baxter (anthonybaxter)
Date: 2003-03-06 00:39

Message:
Logged In: YES 
user_id=29957

stupid sourceforge tracker. 


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=698505&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 17:52:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 09:52:39 -0800
Subject: [Patches] [ python-Patches-663369 ] (email) Escape backslashes in specialsre and escapesre
Message-ID: <E18sRS3-00073K-00@sc8-sf-web3.sourceforge.net>

Patches item #663369, was opened at 2003-01-06 17:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=663369&group_id=5470

Category: Library (Lib)
Group: None
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Matthew Woodcraft (mhf)
Assigned to: Barry A. Warsaw (bwarsaw)
Summary: (email) Escape backslashes in specialsre and escapesre

Initial Comment:
(email/Utils.py) Escape backslashes in character
classes in specialsre and escapesre.

Patch against sourceforge CVS as of 2003-01-06
python/dist/src/Lib/email/Utils.py  rev 1.21
python/dist/src/Lib/email/test/test_email.py  rev 1.29


----------------------------------------------------------------------

>Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-10 12:52

Message:
Logged In: YES 
user_id=12800

This patch doesn't look right.  First, we're using raw
strings so we don't need to escape backslashes.  Second, why
did you add backslashes around the word Silly in the test case?


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=663369&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 18:49:32 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 10:49:32 -0800
Subject: [Patches] [ python-Patches-663369 ] (email) Escape backslashes in specialsre and escapesre
Message-ID: <E18sSL6-0003CA-00@sc8-sf-web1.sourceforge.net>

Patches item #663369, was opened at 2003-01-06 22:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=663369&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Matthew Woodcraft (mhf)
Assigned to: Barry A. Warsaw (bwarsaw)
Summary: (email) Escape backslashes in specialsre and escapesre

Initial Comment:
(email/Utils.py) Escape backslashes in character
classes in specialsre and escapesre.

Patch against sourceforge CVS as of 2003-01-06
python/dist/src/Lib/email/Utils.py  rev 1.21
python/dist/src/Lib/email/test/test_email.py  rev 1.29


----------------------------------------------------------------------

>Comment By: Matthew Woodcraft (mhf)
Date: 2003-03-10 18:49

Message:
Logged In: YES 
user_id=57248

The backslashes need to be escaped, not for the Python string
interpreter, but for the regular expression compiler --
backslashes in
character classes need to be doubled in order to stand for
themselves.
Currently, the backslashes in the character classes are
'escaping' the
following open parenthesis characters, and effectively being
ignored.

The change to the testcase is there in order to test for the
bug being
fixed: backslashes in quoted-strings must be escaped (rfc822
3.3 /
rfc2822 3.2.5).
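
A small sketch of the point about character classes, using
illustrative patterns rather than the exact specialsre/escapesre
definitions from email/Utils.py. Both patterns are raw strings, so the
difference lies only in what the regular expression compiler sees.

```python
import re

# Illustrative patterns only; not the ones from email/Utils.py.
single = re.compile(r'[\()"]')    # the backslash merely escapes "("
doubled = re.compile(r'[\\()"]')  # the doubled backslash stands for itself

text = 'a\\b'  # three characters: a, backslash, b

single_matches_backslash = single.search(text) is not None
doubled_matches_backslash = doubled.search(text) is not None
single_matches_paren = single.search("(") is not None
```

With the single backslash the class matches parentheses and quotes but
never a backslash, which is the behaviour described above.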


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-10 17:52

Message:
Logged In: YES 
user_id=12800

This patch doesn't look right.  First, we're using raw
strings so we don't need to escape backslashes.  Second, why
did you add backslashes around the word Silly in the test case?


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=663369&group_id=5470


From noreply@sourceforge.net  Mon Mar 10 19:30:29 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 10 Mar 2003 11:30:29 -0800
Subject: [Patches] [ python-Patches-663369 ] (email) Escape backslashes in specialsre and escapesre
Message-ID: <E18sSyj-0002mK-00@sc8-sf-web3.sourceforge.net>

Patches item #663369, was opened at 2003-01-06 17:37
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=663369&group_id=5470

Category: Library (Lib)
Group: None
Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Matthew Woodcraft (mhf)
Assigned to: Barry A. Warsaw (bwarsaw)
Summary: (email) Escape backslashes in specialsre and escapesre

Initial Comment:
(email/Utils.py) Escape backslashes in character
classes in specialsre and escapesre.

Patch against sourceforge CVS as of 2003-01-06
python/dist/src/Lib/email/Utils.py  rev 1.21
python/dist/src/Lib/email/test/test_email.py  rev 1.29


----------------------------------------------------------------------

>Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-10 14:30

Message:
Logged In: YES 
user_id=12800

Gotcha, thanks.  The unittest patch isn't right but I'll
commit a correct one.

----------------------------------------------------------------------

Comment By: Matthew Woodcraft (mhf)
Date: 2003-03-10 13:49

Message:
Logged In: YES 
user_id=57248

The backslashes need to be escaped, not for the Python string
interpreter, but for the regular expression compiler --
backslashes in
character classes need to be doubled in order to stand for
themselves.
Currently, the backslashes in the character classes are
'escaping' the
following open parenthesis characters, and effectively being
ignored.

The change to the testcase is there in order to test for the
bug being
fixed: backslashes in quoted-strings must be escaped (rfc822
3.3 /
rfc2822 3.2.5).


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-10 12:52

Message:
Logged In: YES 
user_id=12800

This patch doesn't look right.  First, we're using raw
strings so we don't need to escape backslashes.  Second, why
did you add backslashes around the word Silly in the test case?


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=663369&group_id=5470


From noreply@sourceforge.net  Tue Mar 11 08:08:34 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 00:08:34 -0800
Subject: [Patches] [ python-Patches-701395 ] Wrong prototype for PyUnicode_Splitlines on documentation
Message-ID: <E18seoM-0004IH-00@sc8-sf-web4.sourceforge.net>

Patches item #701395, was opened at 2003-03-11 17:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701395&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Nobody/Anonymous (nobody)
Summary: Wrong prototype for PyUnicode_Splitlines on documentation

Initial Comment:
A mismatch of prototype and description between 
documentation and implementation. 
 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701395&group_id=5470


From noreply@sourceforge.net  Tue Mar 11 09:18:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 01:18:31 -0800
Subject: [Patches] [ python-Patches-701395 ] Wrong prototype for PyUnicode_Splitlines on documentation
Message-ID: <E18sfu3-00076T-00@sc8-sf-web4.sourceforge.net>

Patches item #701395, was opened at 2003-03-11 09:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701395&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Wrong prototype for PyUnicode_Splitlines on documentation

Initial Comment:
A mismatch of prototype and description between 
documentation and implementation. 
 

----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-11 10:18

Message:
Logged In: YES 
user_id=38388

Looks good. Assigned to Fred.

Thanks.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701395&group_id=5470


From noreply@sourceforge.net  Tue Mar 11 12:32:59 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 04:32:59 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18siwF-00081A-00@sc8-sf-web3.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 14:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Nobody/Anonymous (nobody)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (ie 
compileall.compile_dir throws no errors), but please help 
testing that functionality is the same.  I am testing at 
the moment for lib-tk changes.
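
For readers unfamiliar with the rewrite, a call of the form
apply(func, args, kwargs) becomes func(*args, **kwargs). The sketch
below runs on modern Python, where apply() no longer exists, so
legacy_apply is a hypothetical stand-in for the old builtin and
describe is an invented example function.

```python
def legacy_apply(func, args=(), kwargs=None):
    """Stand-in for the old builtin apply(); hypothetical helper."""
    return func(*args, **(kwargs or {}))


def describe(name, size, color="red"):
    return "%s/%d/%s" % (name, size, color)


args = ("box", 3)
kwargs = {"color": "blue"}

old_style = legacy_apply(describe, args, kwargs)  # was: apply(describe, args, kwargs)
new_style = describe(*args, **kwargs)             # the replacement spelling
```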

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Tue Mar 11 18:34:53 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 10:34:53 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18soaT-0005lw-00@sc8-sf-web1.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 13:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Nobody/Anonymous (nobody)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (ie 
compileall.compile_dir throws no errors), but please help 
testing that functionality is the same.  I am testing at 
the moment for lib-tk changes.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 19:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Tue Mar 11 18:59:27 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 10:59:27 -0800
Subject: [Patches] [ python-Patches-701743 ] Reloading pseudo modules
Message-ID: <E18soyF-0002bM-00@sc8-sf-web4.sourceforge.net>

Patches item #701743, was opened at 2003-03-11 19:59
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: Reloading pseudo modules

Initial Comment:
Python allows putting something that is not a module in
sys.modules. Unfortunately, reload() does not work with
such a pseudo module ("TypeError: reload() argument
must be module" is raised). This patch changes
Python/import.c::PyImport_ReloadModule() so that it
works with anything that has a __name__ attribute that
can be found in sys.modules.keys().
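
A rough illustration of a pseudo module, written for modern Python.
PseudoModule and has_registered_name are hypothetical names; the helper
only mirrors the check described above and is not the C code in
Python/import.c. How reload() itself reacts to such an object depends
on the Python version, so it is not called here.

```python
import sys


class PseudoModule:
    """A plain object standing in for a module (hypothetical example)."""

    def __init__(self, name):
        self.__name__ = name
        self.answer = 42


def has_registered_name(obj):
    """Return True if obj has a __name__ that is a key in sys.modules."""
    name = getattr(obj, "__name__", None)
    return isinstance(name, str) and name in sys.modules


pseudo = PseudoModule("pseudo_settings")
sys.modules["pseudo_settings"] = pseudo

import pseudo_settings  # the import system hands back the registered object

same_object = pseudo_settings is pseudo
registered = has_registered_name(pseudo)
unregistered = has_registered_name(PseudoModule("not_registered"))
```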

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470


From noreply@sourceforge.net  Tue Mar 11 19:15:24 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 11:15:24 -0800
Subject: [Patches] [ python-Patches-662807 ] Port tests to unittest
Message-ID: <E18spDg-0008C8-00@sc8-sf-web1.sourceforge.net>

Patches item #662807, was opened at 2003-01-05 21:50
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662807&group_id=5470

Category: Tests
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: Port tests to unittest

Initial Comment:
This patch ports the three tests test_pow.py, 
test_charmapcodec.py and test_userdict.py to unittest.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 20:15

Message:
Logged In: YES 
user_id=89016

Here's the next one: test___all__.py ported to PyUnit and
updated.

A better solution might be to replace __builtin__.__import__
in regrtest.py and test for the __all__ attribute there.
Additionally this might allow us to check which modules are
imported by regrtest.py and which are not and require
additional tests.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-26 16:08

Message:
Logged In: YES 
user_id=89016

Checked in as:
Lib/test/test_ucn.py 1.12
Lib/test/test_unicodedata.py 1.7
Lib/test/output/test_ucn delete
Lib/test/output/test_unicodedata delete


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-26 14:42

Message:
Logged In: YES 
user_id=38388

test_ucn and test_unicodedata look OK.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-25 18:53

Message:
Logged In: YES 
user_id=89016

OK, here are the next few ports: test_ucn and
test_unicodedata. I'm not actually sure, whether changing
test_unicodedata (which uses the comparison of generated
output with expected output) is a good thing, as now updates
to the database require manual changes. I've added a few
error checks which increase coverage in unicodedata.c from
87% to 95%.

Marc-André, can you check if this is OK?

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-21 14:05

Message:
Logged In: YES 
user_id=89016

Checked in as:
Lib/test/string_tests.py 1.27
Lib/test/test_str.py 1.1
Lib/test/test_string.py 1.24
Lib/test/test_unicode.py 1.79
Lib/test/test_userstring.py 1.10
Lib/test/output/test_string delete

I've removed the sets import and renamed the mixin tests to
contain the relevant class/module names (e.g.
MixinStrStringUserStringTest)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-02-21 04:39

Message:
Logged In: YES 
user_id=80475

* test_string.py imports sets but does not use it.

* the names of the mixin classes could possibly
   be made clearer so I won't have to search into
   the comments to find out which mixins are
   appropriate for each class.

Overall, it looks like a nice factoring job and ought
to go a long ways toward keeping these guys 
in sync in the future.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-17 19:29

Message:
Logged In: YES 
user_id=89016

Here is the next bunch of ports: the string tests have been
ported to PyUnit and made as reusable as possible. Tests are
now shared between str, unicode, UserString and the string
module. As a result of reusing a part of the unicode tests
for str, the coverage in stringobject.c goes from 83% to
86%. Furthermore it should help keep the API consistent
between str and unicode (Example: "%c" % 0xffffffff raises
OverflowError, u"%c" % 0xffffffff raises ValueError)

Raymond, can you look through the scripts and check that
everything is OK?

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-16 10:33

Message:
Logged In: YES 
user_id=89016

I'm currently working on a PyUnit port of the string tests (i.e. 
str, unicode, UserString and the string module). Uploading 
the result to this patch would be easier, as it already has an 
established audience, but I can open a new patch for that if 
you want.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-14 21:11

Message:
Logged In: YES 
user_id=33168

Walter, can this patch be closed now?

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-14 12:30

Message:
Logged In: YES 
user_id=89016

Checked in as:
Lib/test/output/test_charmapcodec delete
Lib/test/test_charmapcodec.py 1.6


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-02-14 09:52

Message:
Logged In: YES 
user_id=38388

test_charmapcodec looks OK. Just remove
the DOS-lineends before checking it in.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-13 19:16

Message:
Logged In: YES 
user_id=89016

OK, checked in as test_userlist.py 1.7.

Assigned back to MAL for the review of test_charmapcodec.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-13 19:08

Message:
Logged In: YES 
user_id=6380

Walter, feel free to check in test_userlist.py!

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-13 19:02

Message:
Logged In: YES 
user_id=89016

Here's another one: test_userlist has been ported to PyUnit
and a few tests have been added to increase coverage.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-13 04:12

Message:
Logged In: YES 
user_id=33168

MAL, could you look at the test_charmapcodec.py?  I think
that's the only file outstanding from this patch.  It's a
pretty straightforward test.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-04 00:13

Message:
Logged In: YES 
user_id=89016

OK, test_sys.py is checked in.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-03 23:56

Message:
Logged In: YES 
user_id=6380

I think you can check this in -- if it fails with Jython,
Finn or Samuele will quickly patch it. :-)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-03 23:44

Message:
Logged In: YES 
user_id=89016

OK, here's a new test_sys.py

> test_sys.py:
>
> - I agree that it's not worth testing the code 
> paths that will invoke a custom __displayhook__ or
> __excepthook__, but I regret it nevertheless. :-)
> maybe this deserves a comment?

Testing a custom displayhook is now done (via compile(...,
"single")/exec). Testing a custom excepthook seems to be
trickier. This could probably be done by calling the
interpreter recursively via os.system() or os.popen(). I've
added a comment for now that this isn't tested.
Unfortunately this leaves a large block in
Python/pythonrun.c uncovered.

> - sys.exit() should also be callable with a string

OK, done.

> - you could check that the value of the SystemExit exception
> has the right exit code

Done.

- Have you checked this with Jython? I don't know if it
implements all
   of these; in particular I doubt it has getrefcount().

I haven't tested Jython yet, but I guess test_sys.py will
have too many exceptions for Jython. I'll try this tomorrow.

- I presume you've tested this on Windows?

Linux & Windows
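
The two checks discussed above might look roughly like this. It is a
sketch in modern Python syntax, not the contents of test_sys.py;
exit_code_of is a hypothetical helper and the "<sketch>" filename is
arbitrary.

```python
import sys


def exit_code_of(arg):
    """Call sys.exit(arg) and return the value carried by SystemExit."""
    try:
        sys.exit(arg)
    except SystemExit as exc:
        return exc.code


# Custom displayhook: code compiled in "single" mode passes the value
# of an expression statement to sys.displayhook.
seen = []
saved_hook = sys.displayhook
sys.displayhook = seen.append
try:
    exec(compile("6 * 7", "<sketch>", "single"))
finally:
    sys.displayhook = saved_hook  # always restore the original hook

numeric_code = exit_code_of(42)
string_code = exit_code_of("bye")
```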

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-03 22:10

Message:
Logged In: YES 
user_id=6380

test_sys.py:

- I agree that it's not worth testing the code paths that
will invoke a custom
   __displayhook__ or __excepthook__, but I regret it
nevertheless. :-)
   maybe this deserves a comment?

- sys.exit() should also be callable with a string

- you could check that the value of the SystemExit exception
has the right
   exit code

- Have you checked this with Jython? I don't know if it
implements all
   of these; in particular I doubt it has getrefcount().

- I presume you've tested this on Windows?

Sorry, I can't help you with charmapcodec


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-03 21:36

Message:
Logged In: YES 
user_id=89016

Here's a new one: test_sys.py tests Python/sysmodule.c.
Coverage goes from 68% to 77%.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-19 15:46

Message:
Logged In: YES 
user_id=80475

All are approved except test_charmapcodec.py -- 
someone else should look at that one.

Be sure to follow GvR's advice and replace assertEquals 
with assertEqual.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-01-16 21:47

Message:
Logged In: YES 
user_id=89016

test_unicode is ported and enhanced (coverage goes from
80.81% to 85.05%)

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-01-10 18:17

Message:
Logged In: YES 
user_id=89016

> In general, don't do tests that hardwire implementation 
details

So should we remove
self.assertEquals(reduce(42, "1"), "1")
self.assertEquals(reduce(42, "", "1"), "1")
from test_filter?

BTW, you should look at test_builtin first, as the others
are still simply ports to PyUnit.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-01-10 18:03

Message:
Logged In: YES 
user_id=80475

Good to hear the news on increasing the coverage.

In general, don't do tests that hardwire implementation 
details.  Test it if it is a documented variable,  exposed 
through __all__, is a key constant (like the magic numbers 
in random.py), or a variable that a module user is likely to 
be relying upon.  Otherwise, no -- it should be possible to 
improve an implementation without crashing the suite.

I'll try to review a few of these over the next few days.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-01-10 17:53

Message:
Logged In: YES 
user_id=89016

test_builtin.py is now updated to test more error
situations. This increases the coverage of bltinmodule.c
from 75.13% to 92.20%, and it actually revealed one or two
potential bugs:
http://www.python.org/sf/665761 and
http://www.python.org/sf/665835

I'm not 100% sure that test_intern() and test_execfile() do
the right thing.

I'm not sure, whether the test script should check for
undocumented implementation artefacts, like:

a = 1
self.assert_(min(a, 1L) is a)

but in this way at least we get notified if something is
changed unintentionally.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-01-08 20:05

Message:
Logged In: YES 
user_id=89016

test_b1 and test_b2 are combined into test_builtin now

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-01-08 15:03

Message:
Logged In: YES 
user_id=6380

Two random suggestions:

- a blank line before each method, even trivial ones, even
the first one

- use assertEqual, not assertEquals

BTW, I see you've picked up on the convention that unit test
methods should not have doc strings. Good! (But they may
have comments.)
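
A minimal sketch of a test class following these conventions: a blank
line before every method, assertEqual rather than assertEquals, and
comments instead of doc strings. PowSketchTest and its methods are
invented for illustration and are not taken from the ported test_pow.py.

```python
import unittest


class PowSketchTest(unittest.TestCase):

    def test_small_integers(self):
        # A comment, not a doc string, describes the test.
        self.assertEqual(pow(2, 10), 1024)

    def test_modulus(self):
        # Three-argument pow() reduces the result modulo the third argument.
        self.assertEqual(pow(2, 10, 1000), 24)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PowSketchTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```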

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-01-07 17:37

Message:
Logged In: YES 
user_id=89016

test_b1.py has been ported too.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-01-05 21:56

Message:
Logged In: YES 
user_id=89016

The patch is hard to read, so I'll upload all three test scripts.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662807&group_id=5470


From noreply@sourceforge.net  Wed Mar 12 01:38:00 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 17:38:00 -0800
Subject: [Patches] [ python-Patches-701907 ] More use of fast_next_opcode
Message-ID: <E18svBw-0000Zh-00@sc8-sf-web3.sourceforge.net>

Patches item #701907, was opened at 2003-03-11 20:38
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701907&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: More use of fast_next_opcode

Initial Comment:
Applies "goto fast_next_opcode" instead of continue in 
op codes that don't make intervening C calls.  Makes 
the common tiny quick opcodes just a little quicker.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701907&group_id=5470


From noreply@sourceforge.net  Wed Mar 12 01:41:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 17:41:48 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18svFc-0000hV-00@sc8-sf-web3.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 07:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Nobody/Anonymous (nobody)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (ie 
compileall.compile_dir throws no errors), but please help 
testing that functionality is the same.  I am testing at 
the moment for lib-tk changes.
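
For anyone comparing the two spellings, here is a small sketch of the kind
of rewrite involved. It assumes a current Python 3 interpreter, where the
apply() builtin no longer exists, so a stand-in named apply is defined
purely for the comparison. describe() and its arguments are invented:

```python
def apply(func, args=(), kwargs=None):
    # Stand-in for the old builtin (an assumption for this sketch only);
    # it calls func with positional args and keyword args.
    return func(*args, **(kwargs or {}))


def describe(name, size=0, **extra):  # hypothetical callee
    return (name, size, sorted(extra.items()))


args = ("canvas",)
kwargs = {"size": 3, "fill": "red"}

old_style = apply(describe, args, kwargs)   # spelling being removed
new_style = describe(*args, **kwargs)       # extended call syntax
```

Both calls should return the same tuple. The f(*args, **kwargs) form was
added in Python 2.0, so it is not available to code that has to keep
running on 1.5.2.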

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 20:41

Message:
Logged In: YES 
user_id=80475

Also, be sure to read the PEP on which modules should 
not be modernized.  Sometimes that information is written 
in the file itself rather than the pep.  For instance, the 
logging package is supposed to be kept in a form that 
runs on older pythons.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 13:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Wed Mar 12 01:44:41 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 17:44:41 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18svIP-0000lb-00@sc8-sf-web3.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 14:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.
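
To illustrate what "self-iterator" means here, a short sketch. It assumes
a current Python 3 interpreter and uses io.StringIO as a stand-in, since
cStringIO does not exist there; it does not show the patched C code:

```python
import io

# io.StringIO stands in for cStringIO.StringIO (an assumption of this sketch).
buf = io.StringIO("first\nsecond\n")

# A self-iterator returns itself from iter() ...
same_object = iter(buf) is buf

# ... so next() and a later loop share one file position.
first = next(buf)
rest = list(buf)
```

Because the object is its own iterator, the list() call only sees the
lines that next() has not already consumed.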


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 20:44

Message:
Logged In: YES 
user_id=80475

I don't know about the other reviewers, but I prefer that the 
patches be attached to the original bug instead of to a new 
item in the patch tracker on SF.  This makes it easier to follow 
the dialogue on this issue.

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 17:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470


From noreply@sourceforge.net  Wed Mar 12 02:35:08 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 11 Mar 2003 18:35:08 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18sw5E-0003iM-00@sc8-sf-web4.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 19:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Raymond Hettinger (rhettinger)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.


----------------------------------------------------------------------

>Comment By: Michael Stone (mbrierst)
Date: 2003-03-12 02:35

Message:
Logged In: YES 
user_id=670441

I prefer that too, but I can't attach patches to
existing bug reports in sourceforge, only
to bug reports or patches I open myself.
Nor can I delete patches I have attached
if I don't like them.

Actually, the advice I read somewhere or
other (python.org developer faq?) recommends
opening a separate patch all the time, but
I'd rather be able to put them with the bug reports.

I used to paste patches directly into the text
of a message, but this is only good for extremely
short patches on sourceforge.  When doing that
I noticed that patches for old bugs that haven't
been discussed in a few months tend to get ignored,
which is another plus for opening a separate patch.
(There seem to be several very old bugs that
have solutions attached, or whose discussion indicates
they should be closed.)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-12 01:44

Message:
Logged In: YES 
user_id=80475

I don't know about the other reviewers, but I prefer that the 
patches be attached to the original bug instead of to a new 
item in the patch tracker on SF.  This makes it easier to follow 
the dialogue on this issue.

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 22:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470


From noreply@sourceforge.net  Wed Mar 12 08:46:27 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 12 Mar 2003 00:46:27 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18t1sZ-000616-00@sc8-sf-web4.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 14:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Nobody/Anonymous (nobody)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (ie 
compileall.compile_dir throws no errors), but please help 
testing that functionality is the same.  I am testing at 
the moment for lib-tk changes.

----------------------------------------------------------------------

>Comment By: Christos Georgiou (tzot)
Date: 2003-03-12 10:46

Message:
Logged In: YES 
user_id=539787

Walter: I untargzipped the python-latest.tgz of 2003-03-10 
over an older directory (I think about a month ago), therefore 
the existence of test_b1.py.  All files that exist in the current 
dist were also current.
Raymond: you are correct about my not reading the file 
headers (it was a multifile vi session with a +/"apply(" 
option...)
I just had a little time available for non-creative work, so I 
checked, saw that Guido already had changed most of the 
library files, and offered the change of the rest of them; you 
guys can do whatever you want with it :)
The lib-tk changes seem to be ok, after running some UI 
python scripts I have.  I haven't checked bsddb yet.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-12 03:41

Message:
Logged In: YES 
user_id=80475

Also, be sure to read the PEP on which modules should 
not be modernized.  Sometimes that information is written 
in the file itself rather than the pep.  For instance, the 
logging package is supposed to be kept in a form that 
runs on older pythons.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 20:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Wed Mar 12 20:14:47 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 12 Mar 2003 12:14:47 -0800
Subject: [Patches] [ python-Patches-702463 ] AE Enum and Attribute support fixes
Message-ID: <E18tCch-0001le-00@sc8-sf-web1.sourceforge.net>

Patches item #702463, was opened at 2003-03-12 11:14
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702463&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Enum and Attribute support fixes

Initial Comment:
This patch contains two somewhat unrelated minor patches to python's AppleEvent infrastructure. Details:

1) Currently, Enum parameters are encoded as four character strings in the actual AppleEvent. The vast majority of applications and events seem to be able to handle this, but to be correct and support those applications that don't, we should simply wrap the four-character-code string with an Enum before encoding the event. This is the fix at the bottom of the attached patch.

2) Currently, AppleEvent attributes which may be passed to any of the methods generated by gensuitemodule are encoded using AEPutParamDesc. This has probably not been an issue because there are almost no cases where Python code wants to attach attributes to an AppleEvent. However, to be correct, attributes should be attached to an AppleEvent using AEPutAttributeDesc.

Donovan

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702463&group_id=5470


From noreply@sourceforge.net  Wed Mar 12 20:44:26 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 12 Mar 2003 12:44:26 -0800
Subject: [Patches] [ python-Patches-702463 ] AE Enum and Attribute support fixes
Message-ID: <E18tD5O-0002TO-00@sc8-sf-web2.sourceforge.net>

Patches item #702463, was opened at 2003-03-12 11:14
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702463&group_id=5470

Category: Macintosh
Group: Python 2.3
>Status: Closed
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Enum and Attribute support fixes

Initial Comment:
This patch contains two somewhat unrelated minor patches to python's AppleEvent infrastructure. Details:

1) Currently, Enum parameters are encoded as four character strings in the actual AppleEvent. The vast majority of applications and events seem to be able to handle this, but to be correct and support those applications that don't, we should simply wrap the four-character-code string with an Enum before encoding the event. This is the fix at the bottom of the attached patch.

2) Currently, AppleEvent attributes which may be passed to any of the methods generated by gensuitemodule are encoded using AEPutParamDesc. This has probably not been an issue because there are almost no cases where Python code wants to attach attributes to an AppleEvent. However, to be correct, attributes should be attached to an AppleEvent using AEPutAttributeDesc.

Donovan

----------------------------------------------------------------------

>Comment By: Donovan Preston (dsposx)
Date: 2003-03-12 11:44

Message:
Logged In: YES 
user_id=111050

I note that this has already been patched, sorry :(

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702463&group_id=5470


From noreply@sourceforge.net  Thu Mar 13 00:07:04 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 12 Mar 2003 16:07:04 -0800
Subject: [Patches] [ python-Patches-702620 ] AE Inheritance fixes
Message-ID: <E18tGFU-0008PU-00@sc8-sf-web3.sourceforge.net>

Patches item #702620, was opened at 2003-03-12 15:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Inheritance fixes

Initial Comment:
A while ago, I submitted a patch that attempted to make modules generated by gensuitemodule inheritance aware. It was quite a hack, but it did the job. Some patches to cvs in the meantime have made this stop working for me. Here are my attempted fixes.

If for some reason there's some use case besides mine where this implementation doesn't work, I'd like to know about it so we can come up with an implementation that works everywhere :)

1) We don't ever want an _instance_ of ComponentItem to have a personal _propdict and _elemdict. They need to inherit these attributes from the class, which was set up in the __init__.py to have the correct entries. Thus, I moved the initialization of _propdict
and _elemdict out of __init__ and into the class definition.

2) getbaseclasses needs to look through the inheritance tree specified by _superclassnames and, for each class in the tree, copy _privpropdict and _privelemdict to _propdict and _elemdict. Then, it needs to copy _propdict and _elemdict from each superclass into its own _propdict and _elemdict, where ComponentItem.__getattr__ will find them. Making these into flat dictionaries on each class that include all of the properties and elements from the superclasses greatly speeds up execution time, since only a single, non-recursive lookup is required, and the only recursion occurs at import time.

Here's a detailed description of what getbaseclasses does:

## v should be a class object.
## Why did I name it 'v'? :(

def getbaseclasses(v):

## Have we already set up the _propdict and _elemdict 
## for this class object? If so, don't do it again.

	if not v._propdict:

## This step is required so we get a fresh dictionary on
## this class object, and don't mutate the one on
## ComponentItem or one of our superclasses

		v._propdict = {}
		v._elemdict = {}

## Run through all of the strings in _superclassnames
## evaluating them to get a class object.

		for superclassname in getattr(v, '_superclassnames', []):
			superclass = eval(superclassname)

## Immediately recurse into getbaseclasses, so that
## the base class _propdict and _elemdict are set up
## properly before we copy its entries into ours.

			getbaseclasses(superclass)

## Copy all of the entries from this base class into
## our _propdict and _elemdict so that we get a flat
## dictionary of all of the elements and properties
## that should be available to instances of this class.

			v._propdict.update(getattr(superclass, '_propdict', {}))
			v._elemdict.update(getattr(superclass, '_elemdict', {}))

## Finally, copy those properties and elements that
## are defined directly on this class object in 
## _privpropdict and _privelemdict into the
## _propdict and _elemdict that
## ComponentItem.__getattr__ looks in.
## Note that if we entered getbaseclasses through the
## recursion above, our subclass will then copy our
## _propdict and _elemdict into its own after we exit
## the recursion, giving it a copy of all the properties 
## and elements defined on the superclass object.

		v._propdict.update(v._privpropdict)
		v._elemdict.update(v._privelemdict)
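
Here is a self-contained sketch of the same routine that can be run on its
own, assuming a current Python 3 interpreter. The Window and DocumentWindow
classes, their property codes and the CLASSES table are invented for the
example, and a dictionary lookup replaces the eval() used in the patch so
that the sketch does not depend on module globals:

```python
class ComponentItem:
    # Class-level defaults, shared until getbaseclasses replaces them.
    _propdict = {}
    _elemdict = {}


class Window(ComponentItem):  # hypothetical generated class
    _superclassnames = []
    _privpropdict = {"name": "pnam"}  # illustrative property code
    _privelemdict = {}


class DocumentWindow(ComponentItem):  # hypothetical generated class
    _superclassnames = ["Window"]
    _privpropdict = {"modified": "imod"}  # illustrative property code
    _privelemdict = {}


# Name-to-class table; an assumption of this sketch, not part of the patch.
CLASSES = {"Window": Window, "DocumentWindow": DocumentWindow}


def getbaseclasses(v):
    # Skip classes whose flat dictionaries have already been filled in.
    if not v._propdict:
        # Fresh dictionaries, so ComponentItem's are never mutated.
        v._propdict = {}
        v._elemdict = {}
        for superclassname in getattr(v, "_superclassnames", []):
            superclass = CLASSES[superclassname]  # the patch uses eval() here
            getbaseclasses(superclass)
            v._propdict.update(getattr(superclass, "_propdict", {}))
            v._elemdict.update(getattr(superclass, "_elemdict", {}))
        v._propdict.update(v._privpropdict)
        v._elemdict.update(v._privelemdict)


getbaseclasses(DocumentWindow)
```

After the call, DocumentWindow._propdict holds both its own entry and the
one inherited from Window, while ComponentItem._propdict stays empty.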

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470


From noreply@sourceforge.net  Thu Mar 13 00:08:08 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 12 Mar 2003 16:08:08 -0800
Subject: [Patches] [ python-Patches-702620 ] AE Inheritance fixes
Message-ID: <E18tGGW-0008SI-00@sc8-sf-web3.sourceforge.net>

Patches item #702620, was opened at 2003-03-12 15:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Inheritance fixes

Initial Comment:
A while ago, I submitted a patch that attempted to make modules generated by gensuitemodule inheritance aware. It was quite a hack, but it did the job. Some patches to cvs in the meantime have made this stop working for me. Here are my attempted fixes.

If for some reason there's some use case besides mine where this implementation doesn't work, I'd like to know about it so we can come up with an implementation that works everywhere :)

1) We don't ever want an _instance_ of ComponentItem to have a personal _propdict and _elemdict. They need to inherit these attributes from the class, which was set up in the __init__.py to have the correct entries. Thus, I moved the initialization of _propdict
and _elemdict out of __init__ and into the class definition.

2) getbaseclasses needs to look through the inheritance tree specified by _superclassnames and, for each class in the tree, copy _privpropdict and _privelemdict to _propdict and _elemdict. Then, it needs to copy _propdict and _elemdict from each superclass into its own _propdict and _elemdict, where ComponentItem.__getattr__ will find them. Making these into flat dictionaries on each class that include all of the properties and elements from the superclasses greatly speeds up execution time, since only a single, non-recursive lookup is required, and the only recursion occurs at import time.

Here's a detailed description of what getbaseclasses does:

## v should be a class object.
## Why did I name it 'v'? :(

def getbaseclasses(v):

## Have we already set up the _propdict and _elemdict 
## for this class object? If so, don't do it again.

	if not v._propdict:

## This step is required so we get a fresh dictionary on
## this class object, and don't mutate the one on
## ComponentItem or one of our superclasses

		v._propdict = {}
		v._elemdict = {}

## Run through all of the strings in _superclassnames
## evaluating them to get a class object.

		for superclassname in getattr(v, '_superclassnames', []):
			superclass = eval(superclassname)

## Immediately recurse into getbaseclasses, so that
## the base class _propdict and _elemdict are set up
## properly before we copy its entries into ours.

			getbaseclasses(superclass)

## Copy all of the entries from this base class into
## our _propdict and _elemdict so that we get a flat
## dictionary of all of the elements and properties
## that should be available to instances of this class.

			v._propdict.update(getattr(superclass, '_propdict', {}))
			v._elemdict.update(getattr(superclass, '_elemdict', {}))

## Finally, copy those properties and elements that
## are defined directly on this class object in 
## _privpropdict and _privelemdict into the
## _propdict and _elemdict that
## ComponentItem.__getattr__ looks in.
## Note that if we entered getbaseclasses through the
## recursion above, our subclass will then copy our
## _propdict and _elemdict into its own after we exit
## the recursion, giving it a copy of all the properties 
## and elements defined on the superclass object.

		v._propdict.update(v._privpropdict)
		v._elemdict.update(v._privelemdict)

----------------------------------------------------------------------

>Comment By: Donovan Preston (dsposx)
Date: 2003-03-12 15:08

Message:
Logged In: YES 
user_id=111050

Attaching diff.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470


From noreply@sourceforge.net  Thu Mar 13 00:08:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 12 Mar 2003 16:08:55 -0800
Subject: [Patches] [ python-Patches-702620 ] AE Inheritance fixes
Message-ID: <E18tGHH-0008Ud-00@sc8-sf-web3.sourceforge.net>

Patches item #702620, was opened at 2003-03-12 15:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Inheritance fixes

Initial Comment:
A while ago, I submitted a patch that attempted to make modules generated by gensuitemodule inheritance aware. It was quite a hack, but it did the job. Some patches to cvs in the meantime have made this stop working for me. Here are my attempted fixes.

If for some reason there's some use case besides mine where this implementation doesn't work, I'd like to know about it so we can come up with an implementation that works everywhere :)

1) We don't ever want an _instance_ of ComponentItem to have a personal _propdict and _elemdict. They need to inherit these attributes from the class, which was set up in the __init__.py to have the correct entries. Thus, I moved the initialization of _propdict
and _elemdict out of __init__ and into the class definition.

2) getbaseclasses needs to look through the inheritance tree specified by _superclassnames and, for each class in the tree, copy _privpropdict and _privelemdict to _propdict and _elemdict. Then, it needs to copy _propdict and _elemdict from each superclass into its own _propdict and _elemdict, where ComponentItem.__getattr__ will find them. Making these into flat dictionaries on each class that include all of the properties and elements from the superclasses greatly speeds up execution time, since only a single, non-recursive lookup is required, and the only recursion occurs at import time.

Here's a detailed description of what getbaseclasses does:

## v should be a class object.
## Why did I name it 'v'? :(

def getbaseclasses(v):

## Have we already set up the _propdict and _elemdict 
## for this class object? If so, don't do it again.

	if not v._propdict:

## This step is required so we get a fresh dictionary on
## this class object, and don't mutate the one on
## ComponentItem or one of our superclasses

		v._propdict = {}
		v._elemdict = {}

## Run through all of the strings in _superclassnames
## evaluating them to get a class object.

		for superclassname in getattr(v, '_superclassnames', []):
			superclass = eval(superclassname)

## Immediately recurse into getbaseclasses, so that
## the base class _propdict and _elemdict are set up
## properly before we copy its entries into ours.

			getbaseclasses(superclass)

## Copy all of the entries from this base class into
## our _propdict and _elemdict so that we get a flat
## dictionary of all of the elements and properties
## that should be available to instances of this class.

			v._propdict.update(getattr(superclass, '_propdict', {}))
			v._elemdict.update(getattr(superclass, '_elemdict', {}))

## Finally, copy those properties and elements that
## are defined directly on this class object in 
## _privpropdict and _privelemdict into the
## _propdict and _elemdict that
## ComponentItem.__getattr__ looks in.
## Note that if we entered getbaseclasses through the
## recursion above, our subclass will then copy our
## _propdict and _elemdict into its own after we exit
## the recursion, giving it a copy of all the properties 
## and elements defined on the superclass object.

		v._propdict.update(v._privpropdict)
		v._elemdict.update(v._privelemdict)

----------------------------------------------------------------------

>Comment By: Donovan Preston (dsposx)
Date: 2003-03-12 15:08

Message:
Logged In: YES 
user_id=111050

Whoops. Have to click the checkbox.

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-12 15:08

Message:
Logged In: YES 
user_id=111050

Attaching diff.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470


From noreply@sourceforge.net  Thu Mar 13 11:19:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 13 Mar 2003 03:19:17 -0800
Subject: [Patches] [ python-Patches-697939 ] optparse unit tests + fixes
Message-ID: <E18tQk1-0005AR-00@sc8-sf-web4.sourceforge.net>

Patches item #697939, was opened at 2003-03-05 12:43
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697939&group_id=5470

Category: Tests
Group: Python 2.3
>Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: Johannes Gijsbers (jlgijsbers)
Assigned to: Nobody/Anonymous (nobody)
Summary: optparse unit tests + fixes

Initial Comment:
Here's a patch that mostly converts the tests from optik 
1.4 to the unittest format and makes it usable in the 
Python library. I've also added some tests, of which five 
fail with current CVS:

test_opt_string_empty
test_opt_string_too_short
test_opt_string_long_invalid
test_opt_string_short_invalid
test_help_long_opts_first

I changed the following to fix the tests:

* format_option_strings_short_first and 
format_option_strings_long_first have been merged into 
one function, format_options, to eliminate the almost 
complete duplication. To make this possible, short_first 
is now an attribute, which conveniently also eases 
changing short_first after instantiation.

* _short_opts and _long_opts are set in the Option 
constructor, instead of in _check_option_strings, to 
prevent an AttributeError which would occur when no 
option strings were passed, making the "at least one 
option string must be supplied" OptionError useless.

* Removed the check that would raise a RuntimeError in 
Option.__str__ when no option strings existed in 
_short_opts or _long_opts. A RuntimeError would be 
raised when an OptionError was raised in 
_set_opt_strings, because, quite logically, no option 
strings were set at that point.

I'm not sure why the check was there, because 
_short_opts and _long_opts are only empty when 
instantiation fails, or when somebody set those *internal* 
attributes to false. And the moment you start mucking 
with internal attributes, you're on your own. :)
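
The first change above, merging the two formatting functions behind a
short_first attribute, can be sketched roughly as follows. This assumes a
current Python 3 interpreter; HelpFormatterSketch is a made-up class for
illustration and is not the optparse or Optik code from the patch:

```python
class HelpFormatterSketch:  # hypothetical name, not the optparse class

    def __init__(self, short_first=True):
        # Ordering is an attribute, so it can be changed after instantiation.
        self.short_first = short_first

    def format_options(self, short_opts, long_opts):
        # One function covers both orderings instead of two near-duplicates.
        if self.short_first:
            opts = short_opts + long_opts
        else:
            opts = long_opts + short_opts
        return ", ".join(opts)


formatter = HelpFormatterSketch()
short_then_long = formatter.format_options(["-f"], ["--file"])

formatter.short_first = False
long_then_short = formatter.format_options(["-f"], ["--file"])
```

Flipping the attribute on an existing formatter is enough to change the
order in which the option strings are listed.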

----------------------------------------------------------------------

>Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2003-03-13 12:19

Message:
Logged In: YES 
user_id=469548

According to Greg, I should have submitted a patch to the Optik 
code.  I'll close this one and resubmit to him.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697939&group_id=5470


From noreply@sourceforge.net  Thu Mar 13 11:19:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 13 Mar 2003 03:19:39 -0800
Subject: [Patches] [ python-Patches-697941 ] optparse OptionGroup docs
Message-ID: <E18tQkN-0005C9-00@sc8-sf-web4.sourceforge.net>

Patches item #697941, was opened at 2003-03-05 12:48
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697941&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Johannes Gijsbers (jlgijsbers)
>Assigned to: Greg Ward (gward)
Summary: optparse OptionGroup docs

Initial Comment:
A small patch to add a bit about the new OptionGroup, 
added in Optik 1.4 and Python CVS but currently 
undocumented.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=697941&group_id=5470


From noreply@sourceforge.net  Thu Mar 13 13:10:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 13 Mar 2003 05:10:31 -0800
Subject: [Patches] [ python-Patches-702933 ] Kill off docs for unsafe macros
Message-ID: <E18tSTf-0002Si-00@sc8-sf-web3.sourceforge.net>

Patches item #702933, was opened at 2003-03-13 08:10
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702933&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: David Abrahams (david_abrahams)
Assigned to: Nobody/Anonymous (nobody)
Summary: Kill off docs for unsafe macros

Initial Comment:
I'll also attach the patch, but a message body is 
required:
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/api/memory.tex,v
retrieving revision 1.2
diff -w -u -r1.2 memory.tex
--- memory.tex	6 Apr 2002 09:14:33 -0000	1.2
+++ memory.tex	13 Mar 2003 12:56:26 -0000
@@ -195,9 +195,7 @@
 In addition to the functions aimed at handling raw memory blocks from
 the Python heap, objects in Python are allocated and released with
 \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and
-\cfunction{PyObject_Del()}, or with their corresponding macros
-\cfunction{PyObject_NEW()}, \cfunction{PyObject_NEW_VAR()} and
-\cfunction{PyObject_DEL()}.
+\cfunction{PyObject_Del()}.
 
 These will be explained in the next chapter on defining and
 implementing new object types in C.
Index: newtypes.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/api/newtypes.tex,v
retrieving revision 1.21
diff -w -u -r1.21 newtypes.tex
--- newtypes.tex	10 Feb 2003 19:18:21 -0000	1.21
+++ newtypes.tex	13 Mar 2003 12:56:27 -0000
@@ -62,23 +62,6 @@
   after this call as the memory is no longer a valid Python object.
 \end{cfuncdesc}
 
-\begin{cfuncdesc}{\var{TYPE}*}{PyObject_NEW}{TYPE, PyTypeObject *type}
-  Macro version of \cfunction{PyObject_New()}, to gain performance at
-  the expense of safety.  This does not check \var{type} for a \NULL{}
-  value.
-\end{cfuncdesc}
-
-\begin{cfuncdesc}{\var{TYPE}*}{PyObject_NEW_VAR}{TYPE, PyTypeObject *type,
-                                                int size}
-  Macro version of \cfunction{PyObject_NewVar()}, to gain performance
-  at the expense of safety.  This does not check \var{type} for a
-  \NULL{} value.
-\end{cfuncdesc}
-
-\begin{cfuncdesc}{void}{PyObject_DEL}{PyObject *op}
-  Macro version of \cfunction{PyObject_Del()}.
-\end{cfuncdesc}
-
 \begin{cfuncdesc}{PyObject*}{Py_InitModule}{char *name,
                                              PyMethodDef *methods}
   Create a new module object based on a name and table of functions,


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702933&group_id=5470


From noreply@sourceforge.net  Thu Mar 13 13:11:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 13 Mar 2003 05:11:31 -0800
Subject: [Patches] [ python-Patches-702933 ] Kill off docs for unsafe macros
Message-ID: <E18tSUd-0001UI-00@sc8-sf-web4.sourceforge.net>

Patches item #702933, was opened at 2003-03-13 08:10
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702933&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: David Abrahams (david_abrahams)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Kill off docs for unsafe macros

Initial Comment:
I'll also attach the patch, but a message body is 
required:
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/api/memory.tex,v
retrieving revision 1.2
diff -w -u -r1.2 memory.tex
--- memory.tex	6 Apr 2002 09:14:33 -0000	1.2
+++ memory.tex	13 Mar 2003 12:56:26 -0000
@@ -195,9 +195,7 @@
 In addition to the functions aimed at handling raw memory blocks from
 the Python heap, objects in Python are allocated and released with
 \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and
-\cfunction{PyObject_Del()}, or with their corresponding macros
-\cfunction{PyObject_NEW()}, \cfunction{PyObject_NEW_VAR()} and
-\cfunction{PyObject_DEL()}.
+\cfunction{PyObject_Del()}.
 
 These will be explained in the next chapter on defining and
 implementing new object types in C.
Index: newtypes.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/api/newtypes.tex,v
retrieving revision 1.21
diff -w -u -r1.21 newtypes.tex
--- newtypes.tex	10 Feb 2003 19:18:21 -0000	1.21
+++ newtypes.tex	13 Mar 2003 12:56:27 -0000
@@ -62,23 +62,6 @@
   after this call as the memory is no longer a valid Python object.
 \end{cfuncdesc}
 
-\begin{cfuncdesc}{\var{TYPE}*}{PyObject_NEW}{TYPE, PyTypeObject *type}
-  Macro version of \cfunction{PyObject_New()}, to gain performance at
-  the expense of safety.  This does not check \var{type} for a \NULL{}
-  value.
-\end{cfuncdesc}
-
-\begin{cfuncdesc}{\var{TYPE}*}{PyObject_NEW_VAR}{TYPE, PyTypeObject *type,
-                                                int size}
-  Macro version of \cfunction{PyObject_NewVar()}, to gain performance
-  at the expense of safety.  This does not check \var{type} for a
-  \NULL{} value.
-\end{cfuncdesc}
-
-\begin{cfuncdesc}{void}{PyObject_DEL}{PyObject *op}
-  Macro version of \cfunction{PyObject_Del()}.
-\end{cfuncdesc}
-
 \begin{cfuncdesc}{PyObject*}{Py_InitModule}{char *name,
                                              PyMethodDef *methods}
   Create a new module object based on a name and table of functions,


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702933&group_id=5470


From noreply@sourceforge.net  Thu Mar 13 21:40:57 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 13 Mar 2003 13:40:57 -0800
Subject: [Patches] [ python-Patches-701907 ] More use of fast_next_opcode
Message-ID: <E18taRd-0000qj-00@sc8-sf-web4.sourceforge.net>

Patches item #701907, was opened at 2003-03-12 02:38
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701907&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: More use of fast_next_opcode

Initial Comment:
Applies "goto fast_next_opcode" instead of continue in 
op codes that don't make intervening C calls.  Makes 
the common tiny quick opcodes just a little quicker.

----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-13 22:40

Message:
Logged In: YES 
user_id=38388

Sorry, not much time to look at this.
From a quick scan, it looks OK.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701907&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 01:51:32 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 13 Mar 2003 17:51:32 -0800
Subject: [Patches] [ python-Patches-701907 ] More use of fast_next_opcode
Message-ID: <E18teM8-0003z0-00@sc8-sf-web3.sourceforge.net>

Patches item #701907, was opened at 2003-03-11 20:38
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701907&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: More use of fast_next_opcode

Initial Comment:
Applies "goto fast_next_opcode" instead of continue in 
op codes that don't make intervening C calls.  Makes 
the common tiny quick opcodes just a little quicker.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-13 20:51

Message:
Logged In: YES 
user_id=80475

Thanks for the second look.  It's a low risk patch, so I'll go 
ahead and load it.  See ceval.c 2.354.


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-13 16:40

Message:
Logged In: YES 
user_id=38388

Sorry, not much time to look at this.
From a quick scan, it looks OK.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701907&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 08:18:44 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 00:18:44 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18tkOq-0004mH-00@sc8-sf-web4.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 17:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Nobody/Anonymous (nobody)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes the contents of recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in response to a hostile query.)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 15:19:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 07:19:21 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18tqxt-0007er-00@sc8-sf-web3.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 03:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
>Priority: 8
Submitted By: Hye-Shik Chang (perky)
Assigned to: Nobody/Anonymous (nobody)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes the contents of recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in response to a hostile query.)


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-14 10:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 16:35:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 08:35:21 -0800
Subject: [Patches] [ python-Patches-669683 ] HTMLParser -- allow comma in unquoted attribute values
Message-ID: <E18ts9R-0003tz-00@sc8-sf-web3.sourceforge.net>

Patches item #669683, was opened at 2003-01-17 06:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=669683&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: j paulson (fantoozler)
Assigned to: Fred L. Drake, Jr. (fdrake)
>Summary: HTMLParser -- allow comma in unquoted attribute values

Initial Comment:
An HTML document in the wild had the tag:

<font size="-1" color=rgb(175,18,3)>

and HTMLParser was choking on the "," after the 
"175".  By adding "," to the list of allowed characters 
in attribute values, HTMLParser accepts the document.
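
The effect of the change can be sketched with a much simplified
attribute matcher.  This is not the regular expression HTMLParser
actually uses; the character class and the helper names
make_attr_pattern() and parse_start_tag() are assumptions made up for
this illustration.

```python
import re


def make_attr_pattern(extra_chars=""):
    """Compile a simplified attribute pattern (hypothetical, for illustration).

    An attribute value is single-quoted, double-quoted, or an unquoted
    run of allowed characters.  extra_chars widens the unquoted set.
    """
    single = r"'[^']*'"
    double = r'"[^"]*"'
    unquoted = r"[-a-zA-Z0-9./:;+*%?!&$()_#=~" + re.escape(extra_chars) + r"]+"
    value = "(" + single + "|" + double + "|" + unquoted + ")"
    return re.compile(r"\s+([a-zA-Z_][-.:a-zA-Z_0-9]*)(?:\s*=\s*" + value + ")?")


def parse_start_tag(tag, pattern):
    """Return a list of (name, value) pairs, or None if the tag does not parse."""
    m = re.match(r"<([a-zA-Z][-.a-zA-Z0-9:_]*)", tag)
    if m is None:
        return None
    pos = m.end()
    attrs = []
    while True:
        a = pattern.match(tag, pos)
        if a is None:
            break
        value = a.group(2)
        if value is not None and value[:1] in ("'", '"'):
            value = value[1:-1]  # strip the surrounding quotes
        attrs.append((a.group(1), value))
        pos = a.end()
    # Anything left over other than the closing ">" means the tag was rejected.
    if tag[pos:].strip() != ">":
        return None
    return attrs


tag = '<font size="-1" color=rgb(175,18,3)>'
strict_result = parse_start_tag(tag, make_attr_pattern())
lenient_result = parse_start_tag(tag, make_attr_pattern(","))
```

Without the comma in the allowed set, matching stops after "rgb(175" and
the tag is rejected; with it, the whole value is accepted.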


----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2003-03-14 11:35

Message:
Logged In: YES 
user_id=3066

The regression test does not appear to have been attached; I
see four copies of essentially the same patch.

That's OK though; I have enough to go on.  I've incorporated
the patch in both Lib/HTMLParser.py 1.12 and Lib/sgmllib.py
1.42 (so it also fixes htmllib).  Regression tests have been
added to Lib/test/test_htmlparser.py 1.10 and
Lib/test/test_sgmllib.py 1.5.

----------------------------------------------------------------------

Comment By: j paulson (fantoozler)
Date: 2003-01-24 19:18

Message:
Logged In: YES 
user_id=690612

Added test case to Lib/test/test_htmlparser.py in addition to the 
HTMLParser.py patch

----------------------------------------------------------------------

Comment By: j paulson (fantoozler)
Date: 2003-01-17 13:46

Message:
Logged In: YES 
user_id=690612

I'll attach the patch file again.  

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-01-17 09:17

Message:
Logged In: YES 
user_id=33168

Was this supposed to be a patch or a bug report?  There is
no patch attached.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=669683&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 16:36:47 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 08:36:47 -0800
Subject: [Patches] [ python-Patches-674448 ] test_htmlparser.py -- "," in attributes
Message-ID: <E18tsAp-0001sw-00@sc8-sf-web4.sourceforge.net>

Patches item #674448, was opened at 2003-01-24 23:00
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=674448&group_id=5470

Category: Library (Lib)
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: j paulson (fantoozler)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: test_htmlparser.py -- "," in attributes

Initial Comment:
Added a test verifying patch #669683 works.

----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2003-03-14 11:36

Message:
Logged In: YES 
user_id=3066

Ok, I've already added a nearly identical test; I didn't see
this patch while I was looking at 669683.  Closing as
accepted, since my patch was almost identical.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=674448&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 16:48:12 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 08:48:12 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18tsLs-0004PL-00@sc8-sf-web2.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 09:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
Assigned to: Nobody/Anonymous (nobody)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes the contents of recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in response to a hostile query.)


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 17:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 16:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 17:05:59 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 09:05:59 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18tsd5-0003HL-00@sc8-sf-web4.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 03:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
Assigned to: Nobody/Anonymous (nobody)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes the contents of recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in response to a hostile query.)


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 11:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 10:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 17:07:41 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 09:07:41 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18tsej-0003Mm-00@sc8-sf-web4.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 09:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
>Assigned to: Thomas Wouters (twouters)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes the contents of recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in response to a hostile query.)


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 17:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 16:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 17:14:34 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 09:14:34 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18tslO-0007qZ-00@sc8-sf-web1.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 03:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
Assigned to: Thomas Wouters (twouters)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes the contents of recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in response to a hostile query.)


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 12:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 11:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 10:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 17:23:26 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 09:23:26 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18tsty-0006BA-00@sc8-sf-web2.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 09:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
Assigned to: Thomas Wouters (twouters)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes the contents of recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in response to a hostile query.)


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:23

Message:
Logged In: YES 
user_id=34209

Hm, I see. I figured it was PyString_FromStringAndSize()'s
fault for not honoring the NULL source-string in the case of
a zero-length request, but I see how that might be intended.

How about this patch instead ?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 17:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 16:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 18:05:59 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 10:05:59 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18ttZ9-0008Gu-00@sc8-sf-web2.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 09:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
Assigned to: Thomas Wouters (twouters)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes the contents of recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in response to a hostile query.)


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 19:05

Message:
Logged In: YES 
user_id=34209

Version of the patch with a test attached. It looks sane to
me, and it seems to work. I'm not sure why binascii isn't
raising an exception when receiving invalid data, but this
is python2.1-and-earlier behaviour, and I'm not about to
break that.
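
For illustration, the lenient behaviour described above can be sketched
against a current Python 3.  This is an assumption-laden sketch, not the
attached test: base64.decodestring() no longer exists there, so
base64.b64decode() stands in for it, and the helper names are made up.

```python
import base64
import binascii


def decode_lenient(data):
    """Decode base64, silently discarding characters outside the alphabet."""
    return base64.b64decode(data)


def decode_strict(data):
    """Decode base64, raising binascii.Error on any non-alphabet character."""
    return base64.b64decode(data, validate=True)


# Input made up entirely of invalid characters decodes to an empty
# result instead of raising, and must never yield leftover memory.
garbage_result = decode_lenient(b"#" * 19)
dotted_result = decode_lenient(b"..YWJj..")

try:
    decode_strict(b".....")
    strict_raised = False
except binascii.Error:
    strict_raised = True
```

Callers that would rather reject bad input than ignore it can opt in to
the strict form.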


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:23

Message:
Logged In: YES 
user_id=34209

Hm, I see. I figured it was PyString_FromStringAndSize()'s
fault for not honoring the NULL source-string in the case of
a zero-length request, but I see how that might be intended.

How about this patch instead ?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 17:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 16:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 18:17:13 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 10:17:13 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18ttk1-0000cu-00@sc8-sf-web3.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 09:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
>Assigned to: Barry A. Warsaw (bwarsaw)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>>
base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes unexpected values from recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password from the offending query.)


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 19:17

Message:
Logged In: YES 
user_id=34209

Assigning to Barry for review, on the 'last urinated'
principle <wink>.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 19:05

Message:
Logged In: YES 
user_id=34209

Version of the patch with a test attached. It looks sane to
me, and it seems to work. I'm not sure why binascii isn't
raising an exception when receiving invalid data, but this
is python2.1-and-earlier behaviour, and I'm not about to
break that.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:23

Message:
Logged In: YES 
user_id=34209

Hm, I see. I figured it was PyString_FromStringAndSize()'s
fault for not honoring the NULL source-string in the case of
a zero-length request, but I see how that might be intended.

How about this patch instead ?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 17:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 16:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 18:19:45 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 10:19:45 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18ttmT-0000Xb-00@sc8-sf-web2.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 03:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
>Assigned to: Thomas Wouters (twouters)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>>
base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes unexpected values from recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password from the offending query.)


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-14 13:19

Message:
Logged In: YES 
user_id=31435

I'd like it fine if

1. It did rv = PyString_FromString("") and then fell thru to the 
existing "return rv;", instead of creating another return point.

2. Add a comment about why this convolution is needed:  this 
part of the function has been implicated in two bugs so far.

The base64 stuff silently skips over garbage characters 
because everything would break now if it didn't <wink>.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 13:17

Message:
Logged In: YES 
user_id=34209

Assigning to Barry for review, on the 'last urinated'
principle <wink>.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 13:05

Message:
Logged In: YES 
user_id=34209

Version of the patch with a test attached. It looks sane to
me, and it seems to work. I'm not sure why binascii isn't
raising an exception when receiving invalid data, but this
is python2.1-and-earlier behaviour, and I'm not about to
break that.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 12:23

Message:
Logged In: YES 
user_id=34209

Hm, I see. I figured it was PyString_FromStringAndSize()'s
fault for not honoring the NULL source-string in the case of
a zero-length request, but I see how that might be intended.

How about this patch instead ?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 12:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 11:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 10:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 19:25:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 11:25:15 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18tunr-00063W-00@sc8-sf-web1.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 03:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
Assigned to: Thomas Wouters (twouters)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>>
base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes unexpected values from recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password from the offending query.)


----------------------------------------------------------------------

>Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-14 14:25

Message:
Logged In: YES 
user_id=12800

why wouldn't calling it on garbage data raise
binascii.Error?  i think i'd feel more comfortable about the
patch if it did that instead (to be consistent with
incomplete padding errors and such).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 13:19

Message:
Logged In: YES 
user_id=31435

I'd like it fine if

1. It did rv = PyString_FromString("") and then fell thru to the 
existing "return rv;", instead of creating another return point.

2. Add a comment about why this convolution is needed:  this 
part of the function has been implicated in two bugs so far.

The base64 stuff silently skips over garbage characters 
because everything would break now if it didn't <wink>.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 13:17

Message:
Logged In: YES 
user_id=34209

Assigning to Barry for review, on the 'last urinated'
principle <wink>.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 13:05

Message:
Logged In: YES 
user_id=34209

Version of the patch with a test attached. It looks sane to
me, and it seems to work. I'm not sure why binascii isn't
raising an exception when receiving invalid data, but this
is python2.1-and-earlier behaviour, and I'm not about to
break that.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 12:23

Message:
Logged In: YES 
user_id=34209

Hm, I see. I figured it was PyString_FromStringAndSize()'s
fault for not honoring the NULL source-string in the case of
a zero-length request, but I see how that might be intended.

How about this patch instead ?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 12:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 11:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 10:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Fri Mar 14 22:03:58 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 14 Mar 2003 14:03:58 -0800
Subject: [Patches] [ python-Patches-675422 ] Add tzset method to time module
Message-ID: <E18txHS-0004P3-00@sc8-sf-web1.sourceforge.net>

Patches item #675422, was opened at 2003-01-27 08:42
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Add tzset method to time module

Initial Comment:
Adds access to the tzset method, allowing you to change your local timezone as required. In addition to invoking the tzset system
call, the code also updates the timezone attributes (time.timezone etc). This lets you do timezone conversions amongst other things.

Also includes changes to configure.in to only build new code if the tzset method correctly switches timezones on your platform. This 
should be for all modern Unixes, and possibly other platforms.

Also includes tests in test_time.py

Docs would be along the lines of:

tzset() -- 
Initialize, or reinitialize, the local timezone to the value stored in os.environ['TZ']. The TZ environment variable should be specified in
standard Unix timezone format as documented in the tzset man page
(eg. 'US/Eastern', 'Europe/Amsterdam'). Unknown timezones will silently fall back to UTC. If the TZ environment variable is not set, the local timezone is set to the system's best guess of wallclock time.
Changing the TZ environment variable without calling tzset *may* change the local timezone used by methods such as localtime, but this behaviour should not be relied on.

eg::

>>> now = time.time()
>>> os.environ['TZ'] = 'Europe/Amsterdam'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 14:35:17 2003'
>>> time.tzname  
('CET', 'CEST')
>>> os.environ['TZ'] = 'US/Eastern'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 08:35:17 2003'
>>> time.tzname
('EST', 'EDT')

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-14 17:03

Message:
Logged In: YES 
user_id=6380

OK, checked in with that line removed.

Thanks!

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-03-07 23:42

Message:
Logged In: YES 
user_id=46639

Leave it commented out or remove that line. It is testing
unimportant behaviour that looks more platform dependent
than I suspected (and now that I look at it again, what tzname
should be set to if the timezone is unknown is unspecified by
the tzset(3) docs). The important behaviour is that:

a) the system silently falls back to UTC if the timezone is
unknown, and this is tested elsewhere 

b) calling tzset resets tzname, which is also tested elsewhere.
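
Point (b) can be illustrated with a minimal sketch. The helper name
tzname_after_tzset is hypothetical and not part of the patch; it assumes
a platform where time.tzset() exists, and it uses a POSIX-style TZ
string so that the result does not depend on an installed timezone
database.

```python
import os
import time


def tzname_after_tzset(tz):
    """Set TZ, call time.tzset(), and return the resulting time.tzname.

    Hypothetical helper for illustration only. The previous TZ value is
    restored (and tzset() called again) before returning.
    """
    if not hasattr(time, "tzset"):
        raise RuntimeError("time.tzset() is not available on this platform")
    saved = os.environ.get("TZ")
    try:
        os.environ["TZ"] = tz
        time.tzset()
        return time.tzname
    finally:
        # Put the environment back the way we found it.
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()


eastern = tzname_after_tzset("EST+05EDT,M4.1.0,M10.5.0")
```

What tzname contains for an unknown zone name is deliberately not
checked here, since that is the platform-dependent part discussed above.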


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 09:25

Message:
Logged In: YES 
user_id=6380

zenzen: when I run the test suite on my Red Hat Linux 7.3
box, I get one failure: the test line
  self.failUnless(time.tzname[0] in ('UTC','GMT'))
fails when the timezone is set to 'Luna/Tycho', because
tzname is in fact set to  ('Luna/Tych', 'Luna/Tych').

If I comment out that one line the tzset test suite passes.

What should I do?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 16:49

Message:
Logged In: YES 
user_id=6380

Sorry, not a chance.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-21 16:45

Message:
Logged In: YES 
user_id=46639

It is a patch to 2.3, but I thought I'd try and sneak this
new feature past people into 2.2.3 as I want to be able to
use it in Zope 2 :-)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 07:56

Message:
Logged In: YES 
user_id=6380

Uh? This is a new feature, so doesn't apply to 2.2.3.

Maybe you meant 2.3?

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-20 23:29

Message:
Logged In: YES 
user_id=46639

Assigning to Guido for consideration of being added to
2.2.3, and since he thought this patch was a good idea in
the first place :-)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470


From noreply@sourceforge.net  Sat Mar 15 13:51:44 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 15 Mar 2003 05:51:44 -0800
Subject: [Patches] [ python-Patches-701743 ] Reloading pseudo modules
Message-ID: <E18uC4e-00034S-00@sc8-sf-web2.sourceforge.net>

Patches item #701743, was opened at 2003-03-11 19:59
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: Reloading pseudo modules

Initial Comment:
Python allows putting something that is not a module in
sys.modules. Unfortunately reload() does not work with
such a pseudo module ("TypeError: reload() argument
must be module" is raised). This patch changes
Python/import.c::PyImport_ReloadModule() so that it
works with anything that has a __name__ attribute that
can be found in sys.modules.keys().
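
A minimal sketch of the premise, written for a current Python 3
interpreter rather than the 2.3 code the patch targets: an arbitrary
object placed in sys.modules is handed back by the import machinery
even though it is not a real module. The class PseudoModule and the
name "pseudo_demo" are made up for illustration. How reload() treats
such an object is exactly what the patch is about, so the sketch does
not call it.

```python
import importlib
import sys
import types


class PseudoModule:
    """Hypothetical stand-in object; it is not a types.ModuleType."""

    def __init__(self, name):
        # Give the instance a __name__ that matches its sys.modules key.
        self.__name__ = name
        self.answer = 42


# Register the object under a module name of our choosing.
sys.modules["pseudo_demo"] = PseudoModule("pseudo_demo")

# The import system returns whatever is stored in sys.modules.
imported = importlib.import_module("pseudo_demo")
is_real_module = isinstance(imported, types.ModuleType)
```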

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-15 14:51

Message:
Logged In: YES 
user_id=21627

I think the exceptions need to be reworked: "must be a
module" now only occurs if m is NULL. Under what
circumstances could that happen? Failure to provide __name__
is passed through; shouldn't this get diagnosed in a better way?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470


From noreply@sourceforge.net  Sun Mar 16 22:02:02 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 16 Mar 2003 14:02:02 -0800
Subject: [Patches] [ python-Patches-704676 ] add direct access to MD5 compression function to md5 module
Message-ID: <E18ugCg-0006qG-00@sc8-sf-web1.sourceforge.net>

Patches item #704676, was opened at 2003-03-17 00:02
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=704676&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Reuben Sumner (rasumner)
Assigned to: Nobody/Anonymous (nobody)
Summary: add direct access to MD5 compression function to md5 module

Initial Comment:
Access to the MD5 compression function allows doing
things like calculating NMAC (see
http://www.cs.ucsd.edu/users/mihir/papers/hmac.html). 
This patch gives such access.  If accepted I am happy
to do the same for SHA-1.  I didn't update the doc
strings very well, and I didn't update the
documentation source at all.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=704676&group_id=5470


From noreply@sourceforge.net  Sun Mar 16 22:24:45 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 16 Mar 2003 14:24:45 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18ugYf-0004ex-00@sc8-sf-web3.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 09:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
Assigned to: Thomas Wouters (twouters)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>>
base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes unexpected values from recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password from the offending query.)


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2003-03-16 23:24

Message:
Logged In: YES 
user_id=34209

Well, the patch restores the behaviour of Python 2.1 and
earlier (at least as far back as 1.5.2.) Also, binascii
ignores any errors *except* padding errors, and 'padding
errors' mean 'valid base64-characters left over'. Invalid
characters inside the base64 stream are silently ignored,
and in fact the base64 test explicitly tests this
behaviour... I think ignoring anything but whitespace in the
first place is the problem here, but that's not the problem
the patch tries to solve, and I don't know enough about the
official specs to say whether this is mandatory or not.

Added the changes Tim wanted to the patch.
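
The lenient decoding described above can be sketched as follows. This
assumes a current Python 3 interpreter with binascii in its default
(non-strict) mode, which is not the 2003 code under discussion; the
variable names are for illustration only.

```python
import binascii

# Input made only of non-alphabet characters: nothing is left to decode,
# so the result is an empty string rather than an exception.
only_garbage = binascii.a2b_base64(b".....")

# A stray non-alphabet character inside otherwise valid data is skipped.
with_noise = binascii.a2b_base64(b"YW#Jj")

# Leftover valid base64 characters (a padding error) are reported.
try:
    binascii.a2b_base64(b"YWJ")
    padding_error = None
except binascii.Error as exc:
    padding_error = exc
```

The first case is the one the patch is concerned with: the decoder ends
up with zero output bytes and has to return a proper empty string.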


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-14 20:25

Message:
Logged In: YES 
user_id=12800

why wouldn't calling it on garbage data raise
binascii.Error?  i think i'd feel more comfortable about the
patch if it did that instead (to be consistent with
incomplete padding errors and such).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 19:19

Message:
Logged In: YES 
user_id=31435

I'd like it fine if

1. It did rv = PyString_FromString("") and then fell thru to the 
existing "return rv;", instead of creating another return point.

2. Add a comment about why this convolution is needed:  this 
part of the function has been implicated in two bugs so far.

The base64 stuff silently skips over garbage characters 
because everything would break now if it didn't <wink>.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 19:17

Message:
Logged In: YES 
user_id=34209

Assigning to Barry for review, on the 'last urinated'
principle <wink>.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 19:05

Message:
Logged In: YES 
user_id=34209

Version of the patch with a test attached. It looks sane to
me, and it seems to work. I'm not sure why binascii isn't
raising an exception when receiving invalid data, but this
is python2.1-and-earlier behaviour, and I'm not about to
break that.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:23

Message:
Logged In: YES 
user_id=34209

Hm, I see. I figured it was PyString_FromStringAndSize()'s
fault for not honoring the NULL source-string in the case of
a zero-length request, but I see how that might be intended.

How about this patch instead ?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 17:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 16:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Sun Mar 16 22:42:28 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 16 Mar 2003 14:42:28 -0800
Subject: [Patches] [ python-Patches-702620 ] AE Inheritance fixes
Message-ID: <E18ugpo-0003cy-00@sc8-sf-web4.sourceforge.net>

Patches item #702620, was opened at 2003-03-13 01:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Inheritance fixes

Initial Comment:
A while ago, I submitted a patch that attempted to make modules generated by gensuitemodule inheritance aware. It was quite a hack, but it did the job. Some patches to cvs in the meantime have made this stop working for me. Here are my attempted fixes.

If for some reason there's some use case besides mine where this implementation doesn't work, I'd like to know about it so we can come up with an implementation that works everywhere :)

1) We don't ever want an _instance_ of ComponentItem to have a personal _propdict and _elemdict. They need to inherit these attributes from the class, which was set up in the __init__.py to have the correct entries. Thus, I moved the initialization of _propdict
and _elemdict out of __init__ and into the class definition.

2) getbaseclasses needs to look through the inheritance tree specified by _superclassnames and for each class in the tree, copy _privpropdict and _privelemdict to _propdict and _elemdict. Then, it needs to copy _propdict and _elemdict from each superclass into its own _propdict and _elemdict, where ComponentItem.__getattr__ will find it. Making these into flat dictionaries on each class that include all of the properties and elements from the superclasses greatly speeds up execution time, since only a single, non-recursive lookup is required, and the only recursion occurs at import time.

Here's a detailed description of what getbaseclasses does:

## v should be a class object.
## Why did I name it 'v'? :(

def getbaseclasses(v):

## Have we already set up the _propdict and _elemdict 
## for this class object? If so, don't do it again.

	if not v._propdict:

## This step is required so we get a fresh dictionary on
## this class object, and don't mutate the one on
## ComponentItem or one of our superclasses

		v._propdict = {}
		v._elemdict = {}

## Run through all of the strings in _superclassnames
## evaluating them to get a class object.

		for superclassname in getattr(v, '_superclassnames', []):
			superclass = eval(superclassname)

## Immediately recurse into getbaseclasses, so that
## the base class _propdict and _elemdict are set up
## properly before we copy their entries into ours.

			getbaseclasses(superclass)

## Copy all of the entries from this base class into
## our _propdict and _elemdict so that we get a flat
## dictionary of all of the elements and properties
## that should be available to instances of this class.

			v._propdict.update(getattr(superclass, '_propdict', {}))
			v._elemdict.update(getattr(superclass, '_elemdict', {}))

## Finally, copy those properties and elements that
## are defined directly on this class object in 
## _privpropdict and _privelemdict into the
## _propdict and _elemdict that
## ComponentItem.__getattr__ looks in.
## Note that if we entered getbaseclasses through the
## recursion above, our subclass will then copy our
## _propdict and _elemdict into its own after we exit
## the recursion, giving it a copy of all the properties 
## and elements defined on the superclass object.

		v._propdict.update(v._privpropdict)
		v._elemdict.update(v._privelemdict)

----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2003-03-16 23:42

Message:
Logged In: YES 
user_id=45365

Donovan,
in as far as I understand the matter (in which area you are clearly my superior:-) I think the idea of the fix is correct, but I have one misgiving: if a class has no properties then v._propdict will still be empty after getbaseclasses(). This will result in the next call of getbaseclasses (if this class is the base class of another) going through the motions again.

Is this a problem?

Also, do we really need _superclassnames, can't we do this with __bases__? I vaguely remember we went through this issue before, but I can't remember fully...
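
One possible way around the empty-dictionary misgiving, as a minimal sketch only: guard on an explicit marker stored on the class itself rather than on whether _propdict is non-empty. The marker name _flattened, the classes Window and Document, and the namespace dictionary used in place of eval() are all assumptions made for illustration; none of them come from the patch.

```python
class ComponentItem(object):
    # Class-level dictionaries, as in item 1 of the initial comment.
    _propdict = {}
    _elemdict = {}


class Window(ComponentItem):          # hypothetical base with no properties
    _superclassnames = []
    _privpropdict = {}
    _privelemdict = {}


class Document(ComponentItem):        # hypothetical subclass
    _superclassnames = ['Window']
    _privpropdict = {'name': 'pnam'}
    _privelemdict = {}


visited = []                          # records which classes were processed


def getbaseclasses(v, namespace):
    # Test an explicit marker stored on this class itself, so a class whose
    # flattened dictionaries end up empty is still processed only once.
    if '_flattened' in vars(v):
        return
    visited.append(v.__name__)
    v._propdict = {}
    v._elemdict = {}
    for superclassname in getattr(v, '_superclassnames', []):
        # A dictionary lookup stands in for eval() in this sketch.
        superclass = namespace[superclassname]
        getbaseclasses(superclass, namespace)
        v._propdict.update(getattr(superclass, '_propdict', {}))
        v._elemdict.update(getattr(superclass, '_elemdict', {}))
    v._propdict.update(v._privpropdict)
    v._elemdict.update(v._privelemdict)
    v._flattened = True


namespace = {'Window': Window, 'Document': Document}
getbaseclasses(Document, namespace)
getbaseclasses(Window, namespace)     # returns at once: already flattened
```

Because the marker is looked up with vars(v), a subclass does not mistake its superclass's marker for its own.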

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-13 01:08

Message:
Logged In: YES 
user_id=111050

Whoops. Have to click the checkbox.

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-13 01:08

Message:
Logged In: YES 
user_id=111050

Attaching diff.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470


From noreply@sourceforge.net  Sun Mar 16 22:49:35 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 16 Mar 2003 14:49:35 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18ugwh-0005cI-00@sc8-sf-web3.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 03:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
>Assigned to: Barry A. Warsaw (bwarsaw)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>>
base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes unexpected values from recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in the offending query.)


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-16 17:49

Message:
Logged In: YES 
user_id=31435

The bug here is exposing recycled memory, so let's fix that 
first.  Changing under which conditions the function raises 
exceptions is a different can of worms, and is probably off 
the table for 2.2 backporting regardless.  Barry, if you're 
happy with the way the patch fixes the reported bug, 
please accept it and assign it back to Thomas; if you want 
to go on to change error-raising behavior for 2.3, better to 
open another report.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-16 17:24

Message:
Logged In: YES 
user_id=34209

Well, the patch restores the behaviour of Python 2.1 and
earlier (at least as far back as 1.5.2.) Also, binascii
ignores any errors *except* padding errors, and 'padding
errors' mean 'valid base64-characters left over'. Invalid
characters inside the base64 stream are silently ignored,
and in fact the base64 test explicitly tests this
behaviour... I think ignoring anything but whitespace in the
first place is the problem here, but that's not the problem
the patch tries to solve, and I don't know enough about the
official specs to say whether this is mandatory or not.

Added the changes Tim wanted to the patch.
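
The behaviour described above can be checked with a short sketch. It assumes a current Python 3 interpreter, where binascii.a2b_base64 takes bytes and skips invalid characters by default; the sample inputs are made up for illustration and are not taken from the patch or its test.

```python
import binascii

# Input made up only of characters outside the base64 alphabet decodes to
# an empty result rather than to leftover memory.
only_garbage = binascii.a2b_base64(b"###################")

# Invalid characters inside otherwise valid data are skipped.
with_garbage = binascii.a2b_base64(b"aGVs#bG8=")

# Valid characters left over without padding are reported as an error.
try:
    binascii.a2b_base64(b"abc")
    padding_error = False
except binascii.Error:
    padding_error = True
```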


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-14 14:25

Message:
Logged In: YES 
user_id=12800

why wouldn't calling it on garbage data raise
binascii.Error?  i think i'd feel more comfortable about the
patch if it did that instead (to be consistent with
incomplete padding errors and such).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 13:19

Message:
Logged In: YES 
user_id=31435

I'd like it fine if

1. It did rv = PyString_FromString("") and then fell thru to the 
existing "return rv;", instead of creating another return point.

2. Add a comment about why this convolution is needed:  this 
part of the function has been implicated in two bugs so far.

The base64 stuff silently skips over garbage characters 
because everything would break now if it didn't <wink>.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 13:17

Message:
Logged In: YES 
user_id=34209

Assigning to Barry for review, on the 'last urinated'
principle <wink>.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 13:05

Message:
Logged In: YES 
user_id=34209

Version of the patch with a test attached. It looks sane to
me, and it seems to work. I'm not sure why binascii isn't
raising an exception when receiving invalid data, but this
is python2.1-and-earlier behaviour, and I'm not about to
break that.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 12:23

Message:
Logged In: YES 
user_id=34209

Hm, I see. I figured it was PyString_FromStringAndSize()'s
fault for not honoring the NULL source-string in the case of
a zero-length request, but I see how that might be intended.

How about this patch instead ?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 12:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 12:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 11:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 10:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Mon Mar 17 05:26:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 16 Mar 2003 21:26:18 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18un8c-0007Ur-00@sc8-sf-web3.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 03:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 8
Submitted By: Hye-Shik Chang (perky)
>Assigned to: Tim Peters (tim_one)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>>
base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes unexpected values from recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in the offending query.)


----------------------------------------------------------------------

>Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-17 00:26

Message:
Logged In: YES 
user_id=12800

I'm happy with the patch for the reported error.  I guess
the inconsistency in behavior is just the price to pay for
the age of the api.  Someday <wink> we'll design a better
interface.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Mon Mar 17 11:48:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 17 Mar 2003 03:48:11 -0800
Subject: [Patches] [ python-Patches-703471 ] (Security Problem) base64.decodestring exposes garbage value
Message-ID: <E18ut6B-0003bj-00@sc8-sf-web2.sourceforge.net>

Patches item #703471, was opened at 2003-03-14 09:18
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 8
Submitted By: Hye-Shik Chang (perky)
>Assigned to: Thomas Wouters (twouters)
Summary: (Security Problem) base64.decodestring exposes garbage value

Initial Comment:
>>> import base64
>>> base64.decodestring("###################")
'\x0cD\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> base64.decodestring(".....")
'ps2\x00\x00t'
>>> base64.decodestring("........................")
'\x0cF\x1a\x08\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>>
base64.decodestring(".................................................")
'.............................."\x00\x00\x00\x00\x00\x00\x00\x00'

This exposes unexpected values from recently deallocated memory.
(One of my CGI scripts showed garbage that contained a
database password in the offending query.)


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2003-03-17 12:48

Message:
Logged In: YES 
user_id=34209

Checked into HEAD and 2.2-maint as:
Modules/binascii.c: 2.39 and 2.33.4.4
Lib/test/test_binascii.py: 1.16 and 1.11.10.1

Also added Hye-Shik Chang to ACKS. Thanks :-)


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-17 06:26

Message:
Logged In: YES 
user_id=12800

I'm happy with the patch for the reported error.  I guess
the inconsistency in behavior is just the price to pay for
the age of the api.  Someday <wink> we'll design a better
interface.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-16 23:49

Message:
Logged In: YES 
user_id=31435

The bug here is exposing recycled memory, so let's fix that 
first.  Changing under which conditions the function raises 
exceptions is a different can of worms, and is probably off 
the table for 2.2 backporting regardless.  Barry, if you're 
happy with the way the patch fixes the reported bug, 
please accept it and assign it back to Thomas; if you want 
to go on to change error-raising behavior for 2.3, better to 
open another report.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-16 23:24

Message:
Logged In: YES 
user_id=34209

Well, the patch restores the behaviour of Python 2.1 and
earlier (at least as far back as 1.5.2.) Also, binascii
ignores any errors *except* padding errors, and 'padding
errors' mean 'valid base64-characters left over'. Invalid
characters inside the base64 stream are silently ignored,
and in fact the base64 test explicitly tests this
behaviour... I think ignoring anything but whitespace in the
first place is the problem here, but that's not the problem
the patch tries to solve, and I don't know enough about the
official specs to say whether this is mandatory or not.

Added the changes Tim wanted to the patch.


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-14 20:25

Message:
Logged In: YES 
user_id=12800

why wouldn't calling it on garbage data raise
binascii.Error?  i think i'd feel more comfortable about the
patch if it did that instead (to be consistent with
incomplete padding errors and such).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 19:19

Message:
Logged In: YES 
user_id=31435

I'd like it fine if

1. It did rv = PyString_FromString("") and then fell thru to the 
existing "return rv;", instead of creating another return point.

2. Add a comment about why this convolution is needed:  this 
part of the function has been implicated in two bugs so far.

The base64 stuff silently skips over garbage characters 
because everything would break now if it didn't <wink>.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 19:17

Message:
Logged In: YES 
user_id=34209

Assigning to Barry for review, on the 'last urinated'
principle <wink>.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 19:05

Message:
Logged In: YES 
user_id=34209

Version of the patch with a test attached. It looks sane to
me, and it seems to work. I'm not sure why binascii isn't
raising an exception when receiving invalid data, but this
is python2.1-and-earlier behaviour, and I'm not about to
break that.


----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:23

Message:
Logged In: YES 
user_id=34209

Hm, I see. I figured it was PyString_FromStringAndSize()'s
fault for not honoring the NULL source-string in the case of
a zero-length request, but I see how that might be intended.

How about this patch instead ?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:14

Message:
Logged In: YES 
user_id=31435

The thing I'm worried about is that _PyString_Resize must 
not be called on a string that's empty to begin with (resizing 
will fail because the empty string is shared, and the resize 
routine checks for that).

The *last* patch to this function inserted the bin_len > 0 test 
for what appears to be that very reason -- but that also 
created the problem we're seeing now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 18:07

Message:
Logged In: YES 
user_id=34209

Ah, it is not. I'll see about fixing it (and writing the
testcase etc etc yahdah yahdah.)


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 18:05

Message:
Logged In: YES 
user_id=31435

I raised the priority so someone would look at it.  That part 
worked <wink>.  I'm unsure about the patch, but don't have 
time to explain that now.

----------------------------------------------------------------------

Comment By: Thomas Wouters (twouters)
Date: 2003-03-14 17:48

Message:
Logged In: YES 
user_id=34209

The patch seems to me to be the correct fix. Did you have a
reason to raise the priority but not check it in, Tim ?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-14 16:19

Message:
Logged In: YES 
user_id=31435

Yikes!  Boosted priority way up.  A quick check shows that 
my Python 2.2.2 also appears to "decode" free'd RAM here 
on Windows.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=703471&group_id=5470


From noreply@sourceforge.net  Mon Mar 17 14:25:14 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 17 Mar 2003 06:25:14 -0800
Subject: [Patches] [ python-Patches-701743 ] Reloading pseudo modules
Message-ID: <E18uvYA-0001i0-00@sc8-sf-web3.sourceforge.net>

Patches item #701743, was opened at 2003-03-11 19:59
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: Reloading pseudo modules

Initial Comment:
Python allows putting something that is not a module in
sys.modules. Unfortunately reload() does not work with
such a pseudo module ("TypeError: reload() argument
must be module" is raised). This patch changes
Python/import.c::PyImport_ReloadModule() so that it
works with anything that has a __name__ attribute that
can be found in sys.modules.keys().
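
For readers unfamiliar with the trick, here is a minimal sketch of what a pseudo module looks like. The class PseudoModule and the name pseudo_demo are hypothetical, and the sketch only shows the import side; it does not exercise the patched reload().

```python
import sys


class PseudoModule(object):
    """Hypothetical stand-in object that is not a module."""

    def __init__(self, name):
        # reload() as patched would look this name up in sys.modules.
        self.__name__ = name
        self.answer = 42


pseudo = PseudoModule("pseudo_demo")
sys.modules["pseudo_demo"] = pseudo

# The import statement finds the entry in sys.modules and binds it as is.
import pseudo_demo
```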

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-17 15:25

Message:
Logged In: YES 
user_id=89016

PyImport_ReloadModule() is only called by the implementation
of the reload builtin, so it seems that m==NULL can only
happen with broken extension modules. I've updated the patch
accordingly (raising a SystemError) and changed the error
case for a missing __name__ attribute to raise a TypeError
when an AttributeError is detected. Unfortunately this might
mask exceptions (e.g. when __name__ is implemented as a
property.)

Another problem is that reload() seems to repopulate the
existing module object when reloading real modules. Example:
Write a simple foo.py which contains "x = 1" and then:
>>> import foo
>>> foo.x
1
[ Now open your editor and change foo.py to "x = 2" ]
>>> foo2 = reload(foo)
>>> foo.x
2
>>> foo2.x
2
>>> print id(foo), id(foo2)
1077466884 1077466884
>>> 

Of course this can't work with pseudo modules. I wonder why
reload() has a return value at all, as it always modifies
its parameter for real modules.
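
The same experiment can be scripted instead of typed at the prompt. This is only a sketch: it assumes a current Python 3, where the builtin reload() lives at importlib.reload, and it uses a throwaway module name (foo_reload_demo) in a temporary directory. Writing bytecode is switched off so that the second version of the file is certain to be recompiled.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True        # avoid picking up stale cached bytecode

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "foo_reload_demo.py")
with open(path, "w") as f:
    f.write("x = 1\n")

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
import foo_reload_demo as foo
first_value = foo.x

# Stand-in for "open your editor and change the file".
with open(path, "w") as f:
    f.write("x = 2\n")

foo2 = importlib.reload(foo)          # repopulates the existing module object
```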

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-15 14:51

Message:
Logged In: YES 
user_id=21627

I think the exceptions need to be reworked: "must be a
module" now only occurs if m is NULL. Under what
circumstances could that happen? Failure to provide __name__
is passed through; shouldn't this get diagnosed in a better way?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470


From noreply@sourceforge.net  Tue Mar 18 00:55:25 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 17 Mar 2003 16:55:25 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18v5O1-0003yG-00@sc8-sf-web3.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 14:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
>Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.
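
"Self-iterator" here means that iter(f) returns f itself, so the object is consumed as it is iterated. A small sketch of that property follows, using io.StringIO from a current Python 3 as a stand-in, since cStringIO itself is a Python 2 module; the sample text is made up.

```python
import io

buf = io.StringIO("first\nsecond\n")

# A self-iterator hands back the very same object from iter().
same_object = iter(buf) is buf

# Lines are consumed from the shared position, so next() and a later
# loop over the object never see the same line twice.
first = next(buf)
rest = list(buf)
```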


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-17 19:55

Message:
Logged In: YES 
user_id=80475

I'm going to unassign this one because the patch makes 
me uncomfortable.  The tp_iter slot was already filled in a 
way that is reasonable and the new code doesn't seem to 
be an improvement.

If you go ahead with it, carefully consider whether some 
negative effects can arise from eliminating the 
get/setattrs.  Also, the call to readline should avoid 
creating a new empty tuple on each call (either make a 
single one and re-use it every time or alter readline to 
accept a NULL for args).

The 0,0,0,0,0,0,0 style in the type definition should be 
spelled-out line by line so that it is maintainable and is 
consistent with other modules.

All that being said, the test cases were nice and code runs 
flawlessly.  

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-11 21:35

Message:
Logged In: YES 
user_id=670441

I prefer that too, but I can't attach patches to
existing bug reports in sourceforge, only
to bug reports or patches I open myself.
Nor can I delete patches I have attached
if I don't like them.

Actually, the advice I read somewhere or
other (python.org developer faq?) recommends
opening a separate patch all the time, but
I'd rather be able to put them with the bug reports.

I used to paste patches directly into the text
of a message, but this is only good for extremely
short patches on sourceforge.  When doing that
I noticed that patches for old bugs that haven't
been discussed in a few months tend to get ignored,
which is another plus for opening a separate patch.
(There seem to be several very old bugs which
have solutions attached or discussion indicates they
should be closed)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 20:44

Message:
Logged In: YES 
user_id=80475

I don't know about the other reviewers but I prefer that the 
patches be attached to the original bug instead on a new 
patch tracker on SF.  This makes it easier to follow the 
dialogue on this issue.

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 17:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470


From noreply@sourceforge.net  Tue Mar 18 14:26:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 18 Mar 2003 06:26:17 -0800
Subject: [Patches] [ python-Patches-696392 ] allow proxy server authentication with pimp
Message-ID: <E18vI2j-0000UY-00@sc8-sf-web2.sourceforge.net>

Patches item #696392, was opened at 2003-03-03 08:38
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696392&group_id=5470

Category: Macintosh
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Andrew Straw (astraw)
Assigned to: Jack Jansen (jackjansen)
Summary: allow proxy server authentication with pimp

Initial Comment:
The urllib module does not support http proxy authentication with passwords.  The urllib2 module does, so I changed pimp.py to use urllib2.  I have tested the patch below after setting my http_proxy environment variable to the form "http://user:pass@proxy.com:1234".

It may be possible to remove the dependency on urllib entirely by substituting a urllib2 work-alike for a call to urllib.url2pathname().

This may affect the exception(s) raised when unable to connect.  For example, PackageManager.py catches an IOError, but I believe urllib2 raises a socket.gaierror when unable to resolve the name of the URL. I have not resolved this issue.
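
A rough sketch of the approach, not the patch itself: it assumes a current Python 3, where urllib2 has become urllib.request, and it uses a made-up proxy URL and a hypothetical helper named fetch. It passes the proxy explicitly rather than reading http_proxy from the environment. On the exception question, urllib.error.URLError derives from OSError in Python 3, so a single except clause covers both cases there; that is not a statement about the 2.3 behaviour.

```python
import urllib.error
import urllib.request

# Hypothetical proxy in the same "user:pass@host:port" form as above.
proxy_url = "http://user:pass@proxy.example.com:1234"

# ProxyHandler picks the credentials out of the proxy URL.
proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
opener = urllib.request.build_opener(proxy_handler)


def fetch(url):
    """Return the response body, or None if the request fails."""
    try:
        with opener.open(url) as response:
            return response.read()
    except OSError:
        # URLError (and socket-level errors) are OSError subclasses here.
        return None
```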

----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2003-03-18 15:26

Message:
Logged In: YES 
user_id=45365

Checked into CVS.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=696392&group_id=5470


From noreply@sourceforge.net  Tue Mar 18 14:38:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 18 Mar 2003 06:38:18 -0800
Subject: [Patches] [ python-Patches-578667 ] Put IDE scripts in ~/Library
Message-ID: <E18vIEM-0004Cv-00@sc8-sf-web1.sourceforge.net>

Patches item #578667, was opened at 2002-07-08 15:40
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578667&group_id=5470

Category: Macintosh
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jack Jansen (jackjansen)
Assigned to: Just van Rossum (jvr)
Summary: Put IDE scripts in ~/Library

Initial Comment:
Just,
here's a patch that was part of a larger set but unrelated to the rest (unfortunately I've forgotten who sent it). The patch moves the IDE scripts folder to ~/Library when running on OSX.

This is a good idea, because it allows people to have their own private set of IDE scripts, even if a sysadmin has installed Python. But: the patch as-is is probably not good enough, as there is no place for system-wide scripts anymore. (Scripts will also be shared between MacPython IDE and MachoPython IDE, which is also nice)

You may want to look at providing two scripts folders, one in the normal location (i.e. somewhere in the Python tree) and one in ~/Library.
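
The two-folder idea could look roughly like the sketch below. The folder names and the script_folders helper are assumptions made for illustration, not what the patch or the IDE actually uses; the point is only that the per-user folder is optional and the system-wide one is always present.

```python
import os


def script_folders(python_root, home=None):
    """Return candidate IDE script folders, per-user folder first."""
    if home is None:
        # HOME may be missing on some platforms; then there is no per-user folder.
        home = os.environ.get("HOME")
    folders = []
    if home:
        # Hypothetical per-user location under ~/Library.
        folders.append(os.path.join(home, "Library", "Python", "IDE-Scripts"))
    # Hypothetical system-wide location inside the Python tree.
    folders.append(os.path.join(python_root, "Mac", "IDE scripts"))
    return folders
```

Searching the list in order lets a user's private script shadow a system-wide one of the same name.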

----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2003-03-18 15:38

Message:
Logged In: YES 
user_id=45365

Just,
shouldn't this be closed?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2002-07-08 16:47

Message:
Logged In: YES 
user_id=92689

It was Tony Lownds. I'm all for the intentions of the patch, but I see it will 
fail on MacPython, which doesn't support os.environ["HOME"]. But I 
guess that statement could simply be replaced by the appropriate 
FindFolder() call.

----------------------------------------------------------------------



From noreply@sourceforge.net  Tue Mar 18 18:00:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 18 Mar 2003 10:00:55 -0800
Subject: [Patches] [ python-Patches-702620 ] AE Inheritance fixes
Message-ID: <E18vLOR-0001VD-00@sc8-sf-web4.sourceforge.net>

Patches item #702620, was opened at 2003-03-12 15:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Inheritance fixes

Initial Comment:
A while ago, I submitted a patch that attempted to make modules generated by gensuitemodule inheritance aware. It was quite a hack, but it did the job. Some patches to cvs in the meantime have made this stop working for me. Here are my attempted fixes.

If for some reason there's some use case besides mine where this implementation doesn't work, I'd like to know about it so we can come up with an implementation that works everywhere :)

1) We don't ever want an _instance_ of ComponentItem to have a personal _propdict and _elemdict. They need to inherit these attributes from the class, which was set up in the __init__.py to have the correct entries. Thus, I moved the initialization of _propdict and _elemdict out of __init__ and into the class definition.

2) getbaseclasses needs to look through the inheritance tree specified by _superclassnames and for each class in the tree, copy _privpropdict and _privelemdict to _propdict and _elemdict. Then, it needs to copy _propdict and _elemdict from each superclass into its own _propdict and _elemdict, where ComponentItem.__getattr__ will find them. Making these into flat dictionaries on each class that include all of the properties and elements from the superclasses greatly speeds up execution time, since only a single, non-recursive lookup is required, and the only recursion occurs at import time.

Here's a detailed description of what getbaseclasses does:

## v should be a class object.
## Why did I name it 'v'? :(

def getbaseclasses(v):

## Have we already set up the _propdict and _elemdict 
## for this class object? If so, don't do it again.

	if not v._propdict:

## This step is required so we get a fresh dictionary on
## this class object, and don't mutate the one on
## ComponentItem or one of our superclasses

		v._propdict = {}
		v._elemdict = {}

## Run through all of the strings in _superclassnames
## evaluating them to get a class object.

		for superclassname in getattr(v, '_superclassnames', []):
			superclass = eval(superclassname)

## Immediately recurse into getbaseclasses, so that
## the base class _propdict and _elemdict are set up
## properly before we copy its entries into ours.

			getbaseclasses(superclass)

## Copy all of the entries from this base class into
## our _propdict and _elemdict so that we get a flat
## dictionary of all of the elements and properties
## that should be available to instances of this class.

			v._propdict.update(getattr(superclass, '_propdict', {}))
			v._elemdict.update(getattr(superclass, '_elemdict', {}))

## Finally, copy those properties and elements that
## are defined directly on this class object in 
## _privpropdict and _privelemdict into the
## _propdict and _elemdict that
## ComponentItem.__getattr__ looks in.
## Note that if we entered getbaseclasses through the
## recursion above, our subclass will then copy our
## _propdict and _elemdict into its own after we exit
## the recursion, giving it a copy of all the properties 
## and elements defined on the superclass object.

		v._propdict.update(v._privpropdict)
		v._elemdict.update(v._privelemdict)

----------------------------------------------------------------------

>Comment By: Donovan Preston (dsposx)
Date: 2003-03-18 09:00

Message:
Logged In: YES 
user_id=111050

Jack,
Thanks for taking a look at this.

You are correct: if a class has no properties then v._propdict will still be empty, and we will do unnecessary work the next time getbaseclasses is called. I suppose it could be "if not v._propdict and not v._elemdict:", which would limit the unnecessary work to the case where a base class has neither properties nor elements. Frankly, the if is not really required at all; it was just an attempt to avoid repeating work that has already been performed. Suggestions welcome.

Re _superclassnames, like everything else done with gensuitemodule, we need to be really careful about circular references, references to things that haven't been defined yet, etc. Everything generated by gensuitemodule is either a ComponentItem or an NProperty, and they don't actually inherit from each other in Python because doing so would be too hairy. So we can't use __bases__ because there is none :-)

The thing about _superclassnames is that it's just what it sounds like; a list of strings that indicate superclasses of the current class. By deferring getbaseclasses to import time, we ensure all of the base classes are defined by then.
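
One possible answer to the "already set up?" question is an explicit per-class marker instead of testing whether the dictionaries are empty. The sketch below is not the attached patch: the _flattened attribute, the namespace argument (used in place of eval) and the stand-in ComponentItem class are all assumptions made for illustration.

```python
class ComponentItem:
    # Shared defaults on the class, so instances never get personal copies.
    _propdict = {}
    _elemdict = {}


def getbaseclasses(v, namespace):
    """Flatten inherited properties/elements onto class v exactly once."""
    # Hypothetical marker; vars(v) only sees attributes set on v itself,
    # so a class with no properties is still recognised as done.
    if '_flattened' in vars(v):
        return
    # Fresh dictionaries, so the shared ones on ComponentItem stay untouched.
    v._propdict = {}
    v._elemdict = {}
    for name in getattr(v, '_superclassnames', []):
        superclass = namespace[name]  # name lookup instead of eval()
        getbaseclasses(superclass, namespace)
        v._propdict.update(getattr(superclass, '_propdict', {}))
        v._elemdict.update(getattr(superclass, '_elemdict', {}))
    # Entries defined directly on this class win over inherited ones.
    v._propdict.update(getattr(v, '_privpropdict', {}))
    v._elemdict.update(getattr(v, '_privelemdict', {}))
    v._flattened = True
```

The marker costs one extra attribute per class but makes the guard independent of whether a class happens to define any properties or elements.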

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-16 13:42

Message:
Logged In: YES 
user_id=45365

Donovan,
insofar as I understand the matter (an area in which you are clearly my superior :-) I think the idea of the fix is correct, but I have one misgiving: if a class has no properties then v._propdict will still be empty after getbaseclasses(). This will result in the next call of getbaseclasses (if this class is the base class of another) going through the motions again.

Is this a problem?

Also, do we really need _superclassnames, can't we do this with __bases__? I vaguely remember we went through this issue before, but I can't remember fully...

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-12 15:08

Message:
Logged In: YES 
user_id=111050

Whoops. Have to click the checkbox.

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-12 15:08

Message:
Logged In: YES 
user_id=111050

Attaching diff.

----------------------------------------------------------------------



From noreply@sourceforge.net  Tue Mar 18 18:14:12 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 18 Mar 2003 10:14:12 -0800
Subject: [Patches] [ python-Patches-578667 ] Put IDE scripts in ~/Library
Message-ID: <E18vLbI-0002pz-00@sc8-sf-web2.sourceforge.net>

Patches item #578667, was opened at 2002-07-08 15:40
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=578667&group_id=5470

Category: Macintosh
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: Jack Jansen (jackjansen)
Assigned to: Just van Rossum (jvr)
Summary: Put IDE scripts in ~/Library

Initial Comment:
Just,
here's a patch that was part of a larger set but unrelated to the rest (unfortunately I've forgotten who sent it). The patch moves the IDE scripts folder to ~/Library when running on OSX.

This is a good idea, because it allows people to have their own private set of IDE scripts, even if a sysadmin has installed Python. But: the patch as-is is probably not good enough, as there is no place for system-wide scripts anymore. (Scripts will also be shared between MacPython IDE and MachoPython IDE, which is also nice)

You may want to look at providing two scripts folders, one in the normal location (i.e. somewhere in the Python tree) and one in ~/Library.

----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-18 19:14

Message:
Logged In: YES 
user_id=92689

I guess -- it's not realistic that I'll look into this anytime soon.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-18 15:38

Message:
Logged In: YES 
user_id=45365

Just,
shouldn't this be closed?

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2002-07-08 16:47

Message:
Logged In: YES 
user_id=92689

It was Tony Lownds. I'm all for the intentions of the patch, but I see it will 
fail on MacPython, which doesn't support os.environ["HOME"]. But I 
guess that statement could simply be replaced by the appropriate 
FindFolder() call.

----------------------------------------------------------------------



From noreply@sourceforge.net  Tue Mar 18 18:22:38 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 18 Mar 2003 10:22:38 -0800
Subject: [Patches] [ python-Patches-681927 ] bundlebuilder: Add dylibs, frameworks to the bundle
Message-ID: <E18vLjS-0003IW-00@sc8-sf-web2.sourceforge.net>

Patches item #681927, was opened at 2003-02-06 22:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=681927&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Robin Dunn (robind)
Assigned to: Just van Rossum (jvr)
Summary: bundlebuilder: Add dylibs, frameworks to the bundle

Initial Comment:
This patch adds the ability to add shared
libraries and Frameworks (the latter is untested as of
yet) to the bundle.  It is mostly by Kevin Ollivier with
some suggestions by me.

In addition to copying the files into the bundle the
launcher script in the bundle is modified to set the
DYLD_LIBRARY_PATH to the right place.
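
What the launcher has to do with DYLD_LIBRARY_PATH can be sketched in a few lines. The Contents/Frameworks location and the launcher_env helper are assumptions for illustration; the patch itself may place the libraries elsewhere.

```python
import os


def launcher_env(bundle_path, base_env):
    """Return a copy of base_env with the bundle's library folder on DYLD_LIBRARY_PATH."""
    # Assumed location of the copied dylibs inside the bundle.
    libdir = os.path.join(bundle_path, "Contents", "Frameworks")
    env = dict(base_env)
    existing = env.get("DYLD_LIBRARY_PATH")
    # Put the bundle first so its copies win over system-wide ones.
    env["DYLD_LIBRARY_PATH"] = libdir + (":" + existing if existing else "")
    return env
```

The resulting mapping would then be handed to whatever call starts the real executable, for example os.execve.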


----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-18 19:22

Message:
Logged In: YES 
user_id=92689

Having a manual option is a fine start. But can any of you rework the patch so it doesn't mess with whitespace, and update it for current CVS?

----------------------------------------------------------------------

Comment By: Kevin Ollivier (kollivier)
Date: 2003-02-07 21:52

Message:
Logged In: YES 
user_id=248468

I'll take a look at otool and see if it does what we need. As Robin mentioned, I think giving both the manual and auto options is the best approach. 

I'll also check into the dependency on Apple's Dev Tools, but even if it is dependent we could just switch off auto-detection if users don't have it and spit out a warning. Another possible way to alleviate this problem may be to integrate with distutils. (i.e. make a 'buildbundle' option) That should at least allow us to find and include any libraries the developer linked against. 

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-07 09:59

Message:
Logged In: YES 
user_id=92689

I use tabs for indentation and use spaces for alignment... So things look nice _and_ won't screw up with different tab settings. But I admit that using such a non-standard way is asking for trouble. I'll convert to spaces after this patch has been done (unless you prefer I do it _before_ ;-).

(Btw. it might be that otool is only available with the apple dev tools, which would be a shame since we otherwise don't depend on dev tools being available. Hm.)

----------------------------------------------------------------------

Comment By: Robin Dunn (robind)
Date: 2003-02-07 00:20

Message:
Logged In: YES 
user_id=53955

Oops, sorry for the whitespace patches.  I noticed that my
lines used spaces but the lines around them were using tabs
so I just ran a tabify on the whole file without taking
another look at the resulting patch file after that.  Looks
like some of the other lines that were added since 2.3a1 have
spaces too and that is where the problem comes from.  I'll
redo the patch but the whole file should probably be either
tabified or untabified after you are done applying it.

I didn't know about otool. I'll pass that on to Kevin.  We
discussed finding libs automatically but didn't
know how to go about it so thought that this would be a good
start.  Also we figured that even if there was a way to do
it, you would probably want a way to include other files
that may not get automatically found, or to exclude some
that were, so there should be command line options for it
anyway.

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-06 23:35

Message:
Logged In: YES 
user_id=92689

Cool. There's a problem with the patch, though: although I apologize for using tabs to begin with, please keep the tab usage consistent. There are quite a few hunks in the patch that only touch whitespace and that's both undesirable as well as blurring the intent of the patch... Could you upload a cleaner one?

Btw. for the --standalone build mode it would be possible to calculate all framework/dylib dependencies with the otool tool. If this were implemented perhaps the --lib option wouldn't even be needed? Another question remains: if we include a framework, is there a way to strip it of redundant files, e.g. headers? If we would use this mechanism to include Python.framework we would definitely need a way to trim it down, e.g. all of lib is taken care of by modulefinder anyway. If you (or Kevin) have any ideas about that, pls contact me off line.
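
A rough sketch of the otool idea: parse the text that `otool -L` prints for a binary and keep the libraries that are not part of the system. The helper names, the system prefixes and the sample text are illustrative assumptions; a real implementation would run otool on each extension module and would have to cope with it being absent.

```python
def parse_otool_output(text):
    """Extract library paths from `otool -L`-style output."""
    libs = []
    for line in text.splitlines()[1:]:  # the first line names the inspected file
        line = line.strip()
        if not line:
            continue
        # Drop the trailing "(compatibility version ..., current version ...)".
        libs.append(line.split(" (compatibility", 1)[0])
    return libs


def bundle_candidates(libs, system_prefixes=("/usr/lib/", "/System/")):
    """Keep only libraries that are not expected to exist on every target system."""
    return [path for path in libs if not path.startswith(system_prefixes)]


# Made-up sample in the shape otool -L prints; not real output.
SAMPLE = (
    "myext.so:\n"
    "\t/usr/local/lib/libfoo.1.dylib (compatibility version 1.0.0, current version 1.2.0)\n"
    "\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 63.0.0)\n"
)
print(bundle_candidates(parse_otool_output(SAMPLE)))
```

Explicit include and exclude options would still be useful on top of this, since automatic detection can miss libraries loaded at run time.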

----------------------------------------------------------------------



From noreply@sourceforge.net  Wed Mar 19 15:55:41 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 07:55:41 -0800
Subject: [Patches] [ python-Patches-706338 ] Fix a few broken links in pydoc
Message-ID: <E18vfun-0000dx-00@sc8-sf-web3.sourceforge.net>

Patches item #706338, was opened at 2003-03-19 06:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706338&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Greg Chapman (glchapman)
Assigned to: Nobody/Anonymous (nobody)
Summary: Fix a few broken links in pydoc

Initial Comment:
Patch to fix a few of the help files references in 
pydoc.Helper.  I'm not sure what was originally 
in 'ref/execframe' (which does not exist in the 2.3 
documentation set), but, since 'ref/naming' seems the 
best file for NAMESPACES, I converted both references 
to 'ref/execframe' to 'ref/naming'.


----------------------------------------------------------------------



From noreply@sourceforge.net  Wed Mar 19 17:46:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 09:46:15 -0800
Subject: [Patches] [ python-Patches-706406 ] fix bug #685846: raw_input defers signals
Message-ID: <E18vhdn-0006O6-00@sc8-sf-web3.sourceforge.net>

Patches item #706406, was opened at 2003-03-19 17:46
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706406&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug #685846: raw_input defers signals

Initial Comment:
This patch attempts to fix raw_input so it
can be interrupted by signals.  In the process
it allows SIGINT handling to be honored by
raw_input.  (right now SIGINT always interrupts
regardless of any installed handlers)

Effects:
Signals are handled with their installed handlers
and when those handlers raise exceptions those
exceptions are raised by raw_input.  If an
exception is not raised, raw_input continues
collecting input as if nothing had happened.
This can be problematic if the signal causes
output to appear on the screen, messing up
the input line, or if someone using the
readline module was in the middle of a
complex operation, like a reverse search,
in which case that operation will be
cancelled.  It would be easy to instead
print a message ("Signal Interruption")
and continue input on a new line for the
readline library, but this couldn't happen
in myreadline.c as we can't retrieve the
partially entered input.

Backwards compatibility:
This patch requires the readline handler
(either call_readline or PyOS_StdioReadline
generally) to be called while holding the
global interpreter lock.  It is then
responsible for releasing the GIL before doing
blocking input. This will cause problems
for anyone who has written an extension
that installs a custom readline handler.
In python code, anyone using signals and
expecting raw_input not to be interrupted
by them will have problems (but this seems
unlikely).

----------------------------------------------------------------------



From noreply@sourceforge.net  Wed Mar 19 17:47:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 09:47:17 -0800
Subject: [Patches] [ python-Patches-706406 ] fix bug #685846: raw_input defers signals
Message-ID: <E18vhen-0006Sf-00@sc8-sf-web3.sourceforge.net>

Patches item #706406, was opened at 2003-03-19 17:46
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706406&group_id=5470

>Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug #685846: raw_input defers signals

Initial Comment:
This patch attempts to fix raw_input so it
can be interrupted by signals.  In the process
it allows SIGINT handling to be honored by
raw_input.  (right now SIGINT always interrupts
regardless of any installed handlers)

Effects:
Signals are handled with their installed handlers
and when those handlers raise exceptions those
exceptions are raised by raw_input.  If an
exception is not raised, raw_input continues
collecting input as if nothing had happened.
This can be problematic if the signal causes
output to appear on the screen, messing up
the input line, or if someone using the
readline module was in the middle of a
complex operation, like a reverse search,
in which case that operation will be
cancelled.  It would be easy to instead
print a message ("Signal Interruption")
and continue input on a new line for the
readline library, but this couldn't happen
in myreadline.c as we can't retrieve the
partially entered input.

Backwards compatibility:
This patch requires the readline handler
(either call_readline or PyOS_StdioReadline
generally) to be called while holding the
global interpreter lock.  It is then
responsible for releasing the GIL before doing
blocking input. This will cause problems
for anyone who has written an extension
that installs a custom readline handler.
In python code, anyone using signals and
expecting raw_input not to be interrupted
by them will have problems (but this seems
unlikely).

----------------------------------------------------------------------



From noreply@sourceforge.net  Wed Mar 19 18:39:01 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 10:39:01 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18viSr-00009U-00@sc8-sf-web3.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 19:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.
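
For readers unfamiliar with the term, a "self-iterator" is an object whose iter() returns the object itself, so that iterating and calling readline() share one position. The sketch below shows the idea in pure Python; LineReader is a hypothetical stand-in, not the cStringIO code, and it uses the present-day __next__ spelling (Python 2.3 spelled the method next).

```python
class LineReader:
    """Minimal file-like object that is its own iterator."""

    def __init__(self, text):
        self._lines = text.splitlines(True)  # keep the line endings
        self._pos = 0

    def readline(self):
        """Return the next line, or '' at end of input."""
        if self._pos >= len(self._lines):
            return ""
        line = self._lines[self._pos]
        self._pos += 1
        return line

    def __iter__(self):
        # Self-iterator: no separate iterator object is created.
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line
```

Because the iterator is the object itself, a loop that breaks early leaves the read position where it stopped, which is the behaviour file objects have.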


----------------------------------------------------------------------

>Comment By: Michael Stone (mbrierst)
Date: 2003-03-19 18:39

Message:
Logged In: YES 
user_id=670441

I'm not sure I understand your concern with the new
tp_iter slot; it just makes cStringIO a self iterator
as requested on python-dev, going for
the analogy with file objects, right?  Actually it should
probably use the still-being-debated GenericGetIter
or whatever it will be called, but not until the debate is
over.

I think the get/setattrs are okay.  Everything they
did is done by the default get/set attrs, once we
set up the appropriate methods and members
(there's just the one member, softspace).  I thought
replacing them by the defaults would be clearer
and easier to maintain.  Also, it is in analogy with
fileobject.c, so I thought making the cStringIO
implementation more like file's would be good.

As for creating a new tuple every time and
the 0,0,0,0 style, you're absolutely right, I've attached
a new patch that fixes those up per your suggestions.
I was creating a new tuple every time in analogy
with iterobject.c's calliter_iternext.  Perhaps that
should be changed as well?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-18 00:55

Message:
Logged In: YES 
user_id=80475

I'm going to unassign this one because the patch makes 
me uncomfortable.  The tp_iter slot was already filled in a 
way that is reasonable and the new code doesn't seem to 
be an improvement.

If you go ahead with it, carefully consider whether some 
negative effects can arise from eliminating the 
get/setattrs.  Also, the call to readline should avoid 
creating a new empty tuple on each call (either make a 
single one and re-use it every time or alter readline to
accept a NULL for args).

The 0,0,0,0,0,0,0 style in the type definition should be 
spelled-out line by line so that it is maintainable and is 
consistent with other modules.

All that being said, the test cases were nice and code runs 
flawlessly.  

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-12 02:35

Message:
Logged In: YES 
user_id=670441

I prefer that too, but I can't attach patches to
existing bug reports in sourceforge, only
to bug reports or patches I open myself.
Nor can I delete patches I have attached
if I don't like them.

Actually, the advice I read somewhere or
other (python.org developer faq?) recommends
opening a separate patch all the time, but
I'd rather be able to put them with the bug reports.

I used to paste patches directly into the text
of a message, but this is only good for extremely
short patches on sourceforge.  When doing that
I noticed that patches for old bugs that haven't
been discussed in a few months tend to get ignored,
which is another plus for opening a separate patch.
(There seem to be several very old bugs which
have solutions attached or discussion indicates they
should be closed)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-12 01:44

Message:
Logged In: YES 
user_id=80475

I don't know about the other reviewers but I prefer that the 
patches be attached to the original bug instead of on a new
patch tracker on SF.  This makes it easier to follow the 
dialogue on this issue.

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 22:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------



From noreply@sourceforge.net  Wed Mar 19 19:54:56 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 11:54:56 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18vjeK-00040p-00@sc8-sf-web2.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 14:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-19 14:54

Message:
Logged In: YES 
user_id=80475

It looks good to me, compiles okay, passes tests, etc.  I do 
prefer that you get one more reviewer to look at it.  Neal or 
MvL might be a good choice.

GvR picked PyObject_SelfIter to be the name of the 
iterator's tp_iter slot filler.  So you can go ahead and use it 
to eliminate IO_getiter.

One nit, when you load the next patch, copy in the 
unchanged lines from the original.  There are many lines 
marked as having a change but the content is the same.  
This means that something changed in the whitespace.  
It's not a big deal but it makes the patch harder to review.


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-19 13:39

Message:
Logged In: YES 
user_id=670441

I'm not sure I understand your concern with the new
tp_iter slot; it just makes cStringIO a self iterator
as requested on python-dev, going for
the analogy with file objects, right?  Actually it should
probably use the still-being-debated GenericGetIter
or whatever it will be called, but not until the debate is
over.

I think the get/setattrs are okay.  Everything they
did is done by the default get/set attrs, once we
set up the appropriate methods and members
(there's just the one member, softspace).  I thought
replacing them by the defaults would be clearer
and easier to maintain.  Also, it is in analogy with
fileobject.c, so I thought making the cStringIO
implementation more like file's would be good.

As for creating a new tuple every time and
the 0,0,0,0 style, you're absolutely right, I've attached
a new patch that fixes those up per your suggestions.
I was creating a new tuple every time in analogy
with iterobject.c's calliter_iternext.  Perhaps that
should be changed as well?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-17 19:55

Message:
Logged In: YES 
user_id=80475

I'm going to unassign this one because the patch makes 
me uncomfortable.  The tp_iter slot was already filled in a 
way that is reasonable and the new code doesn't seem to 
be an improvement.

If you go ahead with it, carefully consider whether some 
negative effects can arise from eliminating the 
get/setattrs.  Also, the call to readline should avoid 
creating a new empty tuple on each call (either make a 
single one and re-use it every time or alter readline to
accept a NULL for args).

The 0,0,0,0,0,0,0 style in the type definition should be 
spelled-out line by line so that it is maintainable and is 
consistent with other modules.

All that being said, the test cases were nice and code runs 
flawlessly.  

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-11 21:35

Message:
Logged In: YES 
user_id=670441

I prefer that too, but I can't attach patches to
existing bug reports in sourceforge, only
to bug reports or patches I open myself.
Nor can I delete patches I have attached
if I don't like them.

Actually, the advice I read somewhere or
other (python.org developer faq?) recommends
opening a separate patch all the time, but
I'd rather be able to put them with the bug reports.

I used to paste patches directly into the text
of a message, but this is only good for extremely
short patches on sourceforge.  When doing that
I noticed that patches for old bugs that haven't
been discussed in a few months tend to get ignored,
which is another plus for opening a separate patch.
(There seem to be several very old bugs which
have solutions attached or discussion indicates they
should be closed)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 20:44

Message:
Logged In: YES 
user_id=80475

I don't know about the other reviewers but I prefer that the 
patches be attached to the original bug instead of on a new
patch tracker on SF.  This makes it easier to follow the 
dialogue on this issue.

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 17:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------



From noreply@sourceforge.net  Wed Mar 19 21:17:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 13:17:18 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18vkw2-00050s-00@sc8-sf-web1.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 19:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.
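
As a rough illustration of what "self-iterator" means here: the object's
iterator is the object itself, as with file objects. The sketch below is
written for modern Python 3, not the 2.3-era C patch. The LineReader class
is hypothetical, and io.StringIO stands in for cStringIO.StringIO.

```python
import io


class LineReader:
    """Hypothetical self-iterator: iter(obj) returns obj itself."""

    def __init__(self, text):
        # Keep line endings, as file-like objects do when iterated.
        self._lines = text.splitlines(True)
        self._pos = 0

    def __iter__(self):
        # The defining property of a self-iterator.
        return self

    def __next__(self):
        if self._pos >= len(self._lines):
            raise StopIteration
        line = self._lines[self._pos]
        self._pos += 1
        return line


reader = LineReader("a\nb\n")
stream = io.StringIO("a\nb\n")

# Both objects hand themselves back from iter().
print(iter(reader) is reader, iter(stream) is stream)
```

One consequence of this design is that iteration consumes the object:
a second loop over the same reader yields nothing unless it is rewound.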


----------------------------------------------------------------------

>Comment By: Michael Stone (mbrierst)
Date: 2003-03-19 21:17

Message:
Logged In: YES 
user_id=670441

Okay, patchstrio4 uses PyObject_SelfIter and
doesn't have as much of my prettification, so
there aren't any whitespace-only diff lines. (I think)
Should I assign this patch to either Neal or MvL
for further review, or would that be impolite?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-19 19:54

Message:
Logged In: YES 
user_id=80475

It looks good to me, compiles okay, passes tests, etc.  I do 
prefer that you get one more reviewer to look at it.  Neal or 
MvL might be a good choice.

GvR picked PyObject_SelfIter to be the name of the 
iterator's tp_iter slot filler.  So you can go ahead and use it 
to eliminate IO_getiter.

One nit, when you load the next patch, copy in the 
unchanged lines from the original.  There are many lines 
marked as having a change but the content is the same.  
This means that something changed in the whitespace.  
It's not a big deal, but it makes the patch harder to review.


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-19 18:39

Message:
Logged In: YES 
user_id=670441

I'm not sure I understand your concern with the new
tp_iter slot, it just makes cStringIO a self iterator
as requested on python-dev, going for
the analogy with file objects, right?  Actually it should
probably use the still-being-debated GenericGetIter
or whatever it will be called, but not until the debate is
over.

I think the get/setattrs are okay.  Everything they
did is done by the default get/set attrs, once we
set up the appropriate methods and members
(there's just the one member, softspace).  I thought
replacing them by the defaults would be clearer
and easier to maintain.  Also, it is in analogy with
fileobject.c, so I thought making the cStringIO
implementation more like file's would be good.

As for the creating a new tuple every time and
the 0,0,0,0 style, you're absolutely right, I've attached
a new patch that fixes those up per your suggestions.
I was creating a new tuple every time in analogy
with iterobject.c's calliter_iternext.  Perhaps that
should be changed as well?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-18 00:55

Message:
Logged In: YES 
user_id=80475

I'm going to unassign this one because the patch makes 
me uncomfortable.  The tp_iter slot was already filled in a 
way that is reasonable and the new code doesn't seem to 
be an improvement.

If you go ahead with it, carefully consider whether some 
negative effects can arise from eliminating the 
get/setattrs.  Also, the call to readline should avoid 
creating a new empty tuple on each call (either make a 
single one and re-use it every time or alter readline to 
accept a NULL for args).

The 0,0,0,0,0,0,0 style in the type definition should be 
spelled-out line by line so that it is maintainable and is 
consistent with other modules.

All that being said, the test cases were nice and code runs 
flawlessly.  

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-12 02:35

Message:
Logged In: YES 
user_id=670441

I prefer that too, but I can't attach patches to
existing bug reports in sourceforge, only
to bug reports or patches I open myself.
Nor can I delete patches I have attached
if I don't like them.

Actually, the advice I read somewhere or
other (python.org developer faq?) recommends
opening a separate patch all the time, but
I'd rather be able to put them with the bug reports.

I used to paste patches directly into the text
of a message, but this is only good for extremely
short patches on sourceforge.  When doing that
I noticed that patches for old bugs that haven't
been discussed in a few months tend to get ignored,
which is another plus for opening a separate patch.
(There seem to be several very old bugs which
have solutions attached or discussion indicates they
should be closed)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-12 01:44

Message:
Logged In: YES 
user_id=80475

I don't know about the other reviewers but I prefer that the 
patches be attached to the original bug instead of on a new 
patch tracker on SF.  This makes it easier to follow the 
dialogue on this issue.

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 22:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470


From noreply@sourceforge.net  Wed Mar 19 21:27:24 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 13:27:24 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18vl5o-00070v-00@sc8-sf-web3.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 14:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-19 16:27

Message:
Logged In: YES 
user_id=33168

I don't think it's impolite.  I'll try to take a look later,
unless someone beats me to it. :-)

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-19 16:17

Message:
Logged In: YES 
user_id=670441

Okay, patchstrio4 uses PyObject_SelfIter and
doesn't have as much of my prettification, so
there aren't any whitespace-only diff lines. (I think)
Should I assign this patch to either Neal or MvL
for further review, or would that be impolite?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-19 14:54

Message:
Logged In: YES 
user_id=80475

It looks good to me, compiles okay, passes tests, etc.  I do 
prefer that you get one more reviewer to look at it.  Neal or 
MvL might be a good choice.

GvR picked PyObject_SelfIter to be the name of the 
iterator's tp_iter slot filler.  So you can go ahead and use it 
to eliminate IO_getiter.

One nit, when you load the next patch, copy in the 
unchanged lines from the original.  There are many lines 
marked as having a change but the content is the same.  
This means that something changed in the whitespace.  
It's not a big deal, but it makes the patch harder to review.


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-19 13:39

Message:
Logged In: YES 
user_id=670441

I'm not sure I understand your concern with the new
tp_iter slot, it just makes cStringIO a self iterator
as requested on python-dev, going for
the analogy with file objects, right?  Actually it should
probably use the still-being-debated GenericGetIter
or whatever it will be called, but not until the debate is
over.

I think the get/setattrs are okay.  Everything they
did is done by the default get/set attrs, once we
set up the appropriate methods and members
(there's just the one member, softspace).  I thought
replacing them by the defaults would be clearer
and easier to maintain.  Also, it is in analogy with
fileobject.c, so I thought making the cStringIO
implementation more like file's would be good.

As for the creating a new tuple every time and
the 0,0,0,0 style, you're absolutely right, I've attached
a new patch that fixes those up per your suggestions.
I was creating a new tuple every time in analogy
with iterobject.c's calliter_iternext.  Perhaps that
should be changed as well?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-17 19:55

Message:
Logged In: YES 
user_id=80475

I'm going to unassign this one because the patch makes 
me uncomfortable.  The tp_iter slot was already filled in a 
way that is reasonable and the new code doesn't seem to 
be an improvement.

If you go ahead with it, carefully consider whether some 
negative effects can arise from eliminating the 
get/setattrs.  Also, the call to readline should avoid 
creating a new empty tuple on each call (either make a 
single one and re-use it every time or alter readline to 
accept a NULL for args).

The 0,0,0,0,0,0,0 style in the type definition should be 
spelled-out line by line so that it is maintainable and is 
consistent with other modules.

All that being said, the test cases were nice and code runs 
flawlessly.  

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-11 21:35

Message:
Logged In: YES 
user_id=670441

I prefer that too, but I can't attach patches to
existing bug reports in sourceforge, only
to bug reports or patches I open myself.
Nor can I delete patches I have attached
if I don't like them.

Actually, the advice I read somewhere or
other (python.org developer faq?) recommends
opening a separate patch all the time, but
I'd rather be able to put them with the bug reports.

I used to paste patches directly into the text
of a message, but this is only good for extremely
short patches on sourceforge.  When doing that
I noticed that patches for old bugs that haven't
been discussed in a few months tend to get ignored,
which is another plus for opening a separate patch.
(There seem to be several very old bugs which
have solutions attached or discussion indicates they
should be closed)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 20:44

Message:
Logged In: YES 
user_id=80475

I don't know about the other reviewers but I prefer that the 
patches be attached to the original bug instead of on a new 
patch tracker on SF.  This makes it easier to follow the 
dialogue on this issue.

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 17:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470


From noreply@sourceforge.net  Wed Mar 19 22:55:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 14:55:21 -0800
Subject: [Patches] [ python-Patches-706590 ] Adds Mock Objet support to unittest.TestCase
Message-ID: <E18vmSv-0001qI-00@sc8-sf-web3.sourceforge.net>

Patches item #706590, was opened at 2003-03-19 22:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Matthew Russell (mattruss)
Assigned to: Nobody/Anonymous (nobody)
Summary: Adds Mock Objet support to unittest.TestCase

Initial Comment:
Mock objects can greatly improve unittests (if used in 
the correct context), especially for code that relies upon 
resource-hungry tests (connections to databases, socket 
servers etc).

The module/patch (to unittest) which I am submitting 
helps to introspect calls to code whilst maintaining 
transparency and functionality with your code.

I had previously written a similar module for my present 
employers, and my fellow XP partners and I agree 
that it has made the XP testing cycle considerably 
easier.  Having googled for alternatives on the web, I 
have not found a solution that provides the same level of 
flexibility. (hope that doesn't sound arrogant)

The tests for this module should highlight usage, but I 
will supply dummy code if this idea is accepted.

If unfamiliar with XP/MockObject ideas, please see:
http://www.xprogramming.com/xpmag/virtualMockObjects.htm#N78
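
For readers unfamiliar with the idea, here is a minimal sketch of a mock
standing in for a resource-hungry dependency. It does not show the API of
the submitted patch, which is not quoted in this message. It assumes the
unittest.mock module from later Python 3 releases, and fetch_username and
its connection.query method are hypothetical names made up for the example.

```python
import unittest
from unittest import mock


def fetch_username(connection, user_id):
    """Hypothetical code under test: looks a user up via a connection."""
    row = connection.query("SELECT name FROM users WHERE id = %s", user_id)
    return row["name"]


class FetchUsernameTest(unittest.TestCase):
    def test_queries_once(self):
        # The mock replaces a real database connection.
        connection = mock.Mock()
        connection.query.return_value = {"name": "alice"}

        self.assertEqual(fetch_username(connection, 7), "alice")

        # Introspect how the code under test called its dependency.
        connection.query.assert_called_once_with(
            "SELECT name FROM users WHERE id = %s", 7
        )
```

The test needs no database, and it records exactly how the dependency was
called, which is the kind of introspection the comment above describes.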


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470


From noreply@sourceforge.net  Wed Mar 19 23:11:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 15:11:09 -0800
Subject: [Patches] [ python-Patches-706590 ] Adds Mock Object support to unittest.TestCase
Message-ID: <E18vmiD-0002Sw-00@sc8-sf-web3.sourceforge.net>

Patches item #706590, was opened at 2003-03-19 22:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Matthew Russell (mattruss)
Assigned to: Nobody/Anonymous (nobody)
>Summary: Adds Mock Object support to unittest.TestCase

Initial Comment:
Mock objects can greatly improve unittests (if used in 
the correct context), especially for code that relies upon 
resource-hungry tests (connections to databases, socket 
servers etc).

The module/patch (to unittest) which I am submitting 
helps to introspect calls to code whilst maintaining 
transparency and functionality with your code.

I had previously written a similar module for my present 
employers, and my fellow XP partners and I agree 
that it has made the XP testing cycle considerably 
easier.  Having googled for alternatives on the web, I 
have not found a solution that provides the same level of 
flexibility. (hope that doesn't sound arrogant)

The tests for this module should highlight usage, but I 
will supply dummy code if this idea is accepted.

If unfamiliar with XP/MockObject ideas, please see:
http://www.xprogramming.com/xpmag/virtualMockObjects.htm#N78


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470


From noreply@sourceforge.net  Thu Mar 20 04:18:50 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 20:18:50 -0800
Subject: [Patches] [ python-Patches-675422 ] Add tzset method to time module
Message-ID: <E18vrVy-0006Kt-00@sc8-sf-web4.sourceforge.net>

Patches item #675422, was opened at 2003-01-27 08:42
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Add tzset method to time module

Initial Comment:
Adds access to the tzset method, allowing you to change your local timezone as required. In addition to invoking the tzset system
call, the code also updates the timezone attributes (time.timezone etc). This lets you do timezone conversions amongst other things.

Also includes changes to configure.in to only build new code if the tzset method correctly switches timezones on your platform. This 
should be for all modern Unixes, and possibly other platforms.

Also includes tests in test_time.py

Docs would be along the lines of:

tzset() -- 
Initialize, or reinitialize, the local timezone to the value stored in os.environ['TZ']. The TZ environment variable should be specified in
standard Unix timezone format as documented in the tzset man page
(eg. 'US/Eastern', 'Europe/Amsterdam'). Unknown timezones will silently fall back to UTC. If the TZ environment variable is not set, the local timezone is set to the system's best guess of wallclock time.
Changing the TZ environment variable without calling tzset *may* change the local timezone used by methods such as localtime, but this behaviour should not be relied on.

eg::

>>> now = time.time()
>>> os.environ['TZ'] = 'Europe/Amsterdam'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 14:35:17 2003'
>>> time.tzname  
('CET', 'CEST')
>>> os.environ['TZ'] = 'US/Eastern'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 08:35:17 2003'
>>> time.tzname
('EST', 'EDT')

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-19 23:18

Message:
Logged In: YES 
user_id=33168

test_time is now failing on Solaris 8.  altzone is -3600,
but should be 0.  Also, is there a reason to compare
timezone to altzone, but then check that each is 0 (line
78)?  Can you provide any suggestions for where to look for
the problem?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-14 17:03

Message:
Logged In: YES 
user_id=6380

OK, checked in with that line removed.

Thanks!

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-03-07 23:42

Message:
Logged In: YES 
user_id=46639

Leave it commented out or remove that line. It is testing
unimportant behaviour that looks more platform-dependent
than I suspected (and now I look at it again, what tzname
should be set to if the timezone is unknown is unspecified by
the tzset(3) docs). The important behaviour is that:

a) the system silently falls back to UTC if the timezone is
unknown, and this is tested elsewhere 

b) calling tzset resets tzname, which is also tested elsewhere.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 09:25

Message:
Logged In: YES 
user_id=6380

zenzen: when I run the test suite on my Red Hat Linux 7.3
box, I get one failure: the test line
  self.failUnless(time.tzname[0] in ('UTC','GMT'))
fails when the timezone is set to 'Luna/Tycho', because
tzname is in fact set to  ('Luna/Tych', 'Luna/Tych').

If I comment out that one line the tzset test suite passes.

What should I do?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 16:49

Message:
Logged In: YES 
user_id=6380

Sorry, not a chance.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-21 16:45

Message:
Logged In: YES 
user_id=46639

It is a patch to 2.3, but I thought I'd try and sneak this
new feature past people into 2.2.3 as I want to be able to
use it in Zope 2 :-)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 07:56

Message:
Logged In: YES 
user_id=6380

Uh? This is a new feature, so doesn't apply to 2.2.3.

Maybe you meant 2.3?

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-20 23:29

Message:
Logged In: YES 
user_id=46639

Assigning to Guido for consideration of being added to
2.2.3, and since he thought this patch was a good idea in
the first place :-)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470


From noreply@sourceforge.net  Thu Mar 20 04:57:47 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 20:57:47 -0800
Subject: [Patches] [ python-Patches-706707 ] time.tzset standards compliance update
Message-ID: <E18vs7f-0003Tk-00@sc8-sf-web1.sourceforge.net>

Patches item #706707, was opened at 2003-03-20 15:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Nobody/Anonymous (nobody)
Summary: time.tzset standards compliance update

Initial Comment:
Update to configure.in and test_time.py to only use TZ
environment variable format documented at
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html
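
A rough sketch of the difference, assuming a Unix platform where
time.tzset is available: the POSIX format spells out the zone names, the
UTC offset and the daylight-saving rules in the TZ string itself, instead
of naming a zoneinfo file such as 'US/Eastern'. The helper
localtime_in_zone and the particular rule string are illustrative only and
are not taken from the patch.

```python
import os
import time


def localtime_in_zone(timestamp, tz):
    """Return (struct_time, tzname) for timestamp under a POSIX TZ string."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()
    try:
        return time.localtime(timestamp), time.tzname
    finally:
        # Restore the caller's timezone so the change does not leak.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


# std name, offset west of UTC in hours, dst name, then start/end rules.
local, names = localtime_in_zone(0, "EST5EDT,M4.1.0,M10.5.0")
print(local.tm_hour, names)
```

Because the rules travel in the string, the result does not depend on
which zoneinfo files happen to be installed on the test machine.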


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470


From noreply@sourceforge.net  Thu Mar 20 05:06:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 19 Mar 2003 21:06:54 -0800
Subject: [Patches] [ python-Patches-675422 ] Add tzset method to time module
Message-ID: <E18vsGU-0006gx-00@sc8-sf-web2.sourceforge.net>

Patches item #675422, was opened at 2003-01-28 00:42
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Add tzset method to time module

Initial Comment:
Adds access to the tzset method, allowing you to change your local timezone as required. In addition to invoking the tzset system
call, the code also updates the timezone attributes (time.timezone etc). This lets you do timezone conversions amongst other things.

Also includes changes to configure.in to only build new code if the tzset method correctly switches timezones on your platform. This 
should be for all modern Unixes, and possibly other platforms.

Also includes tests in test_time.py

Docs would be along the lines of:

tzset() -- 
Initialize, or reinitialize, the local timezone to the value stored in os.environ['TZ']. The TZ environment variable should be specified in
standard Unix timezone format as documented in the tzset man page
(eg. 'US/Eastern', 'Europe/Amsterdam'). Unknown timezones will silently fall back to UTC. If the TZ environment variable is not set, the local timezone is set to the system's best guess of wallclock time.
Changing the TZ environment variable without calling tzset *may* change the local timezone used by methods such as localtime, but this behaviour should not be relied on.

eg::

>>> now = time.time()
>>> os.environ['TZ'] = 'Europe/Amsterdam'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 14:35:17 2003'
>>> time.tzname  
('CET', 'CEST')
>>> os.environ['TZ'] = 'US/Eastern'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 08:35:17 2003'
>>> time.tzname
('EST', 'EDT')

----------------------------------------------------------------------

>Comment By: Stuart Bishop (zenzen)
Date: 2003-03-20 16:06

Message:
Logged In: YES 
user_id=46639

An update to this patch is now available:
http://www.python.org/sf/706707

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-20 15:18

Message:
Logged In: YES 
user_id=33168

test_time is now failing on Solaris 8.  altzone is -3600,
but should be 0.  Also, is there a reason to compare
timezone to altzone, but then check that each is 0 (line
78)?  Can you provide any suggestions for where to look for
the problem?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-15 09:03

Message:
Logged In: YES 
user_id=6380

OK, checked in with that line removed.

Thanks!

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-03-08 15:42

Message:
Logged In: YES 
user_id=46639

Leave it commented out or remove that line. It is testing
unimportant behaviour that looks more platform-dependent
than I suspected (and now I look at it again, what tzname
should be set to if the timezone is unknown is unspecified by
the tzset(3) docs). The important behaviour is that:

a) the system silently falls back to UTC if the timezone is
unknown, and this is tested elsewhere 

b) calling tzset resets tzname, which is also tested elsewhere.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-08 01:25

Message:
Logged In: YES 
user_id=6380

zenzen: when I run the test suite on my Red Hat Linux 7.3
box, I get one failure: the test line
  self.failUnless(time.tzname[0] in ('UTC','GMT'))
fails when the timezone is set to 'Luna/Tycho', because
tzname is in fact set to  ('Luna/Tych', 'Luna/Tych').

If I comment out that one line the tzset test suite passes.

What should I do?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-22 08:49

Message:
Logged In: YES 
user_id=6380

Sorry, not a chance.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-22 08:45

Message:
Logged In: YES 
user_id=46639

It is a patch to 2.3, but I thought I'd try and sneak this
new feature past people into 2.2.3 as I want to be able to
use it in Zope 2 :-)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 23:56

Message:
Logged In: YES 
user_id=6380

Uh? This is a new feature, so doesn't apply to 2.2.3.

Maybe you meant 2.3?

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-21 15:29

Message:
Logged In: YES 
user_id=46639

Assigning to Guido for consideration of being added to
2.2.3, and since he thought this patch was a good idea in
the first place :-)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470


From noreply@sourceforge.net  Thu Mar 20 15:26:53 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 07:26:53 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18w1wT-0002aq-00@sc8-sf-web3.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 13:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (ie 
compileall.compile_dir throws no errors), but please help 
testing that functionality is the same.  I am testing at 
the moment for lib-tk changes.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-20 16:26

Message:
Logged In: YES 
user_id=89016

I've gone over the patch and simplified it a bit (e.g.
replacing f(*(1,2,3) + args) with f(1,2,3, *args)). I've
also removed the patches for distutils, logging and bsddb
(code at the start of bsddb/dbutils.py seems to indicate
that it should be usable with versions prior to 2.3).
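
The simplification mentioned above can be sketched as follows. The
function f and the argument values are placeholders, and the code is
written so that it also runs on modern Python, where apply() no longer
exists.

```python
def f(*args, **kwargs):
    """Placeholder callable that just reports what it received."""
    return args, kwargs


args = (4, 5)
kwargs = {"key": "value"}

# Old spelling being removed by the patch:
#     apply(f, (1, 2, 3) + args, kwargs)

# Mechanical translation: build the tuple, then unpack it.
first = f(*(1, 2, 3) + args, **kwargs)

# Simplified form: fixed arguments written out, the rest unpacked.
second = f(1, 2, 3, *args, **kwargs)

print(first == second)
```

Both calls pass the same positional and keyword arguments. The second
form avoids building an intermediate tuple and reads like a normal call.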

Raymond, do you have time to recheck the patch?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-03-12 09:46

Message:
Logged In: YES 
user_id=539787

Walter: I untargzipped the python-latest.tgz of 2003-03-10 
over an older directory (I think about a month ago), therefore 
the existence of test_b1.py.  All files that exist in the current 
dist were also current.
Raymond: you are correct about my not reading the file 
headers (it was a multifile vi session with a +/"apply(" 
option...)
I just had a little time available for non-creative work, so I 
checked, saw that Guido already had changed most of the 
library files, and offered the change of the rest of them; you 
guys can do whatever you want with it :)
The lib-tk changes seem to be ok, after running some UI 
python scripts I have.  I haven't checked bsddb yet.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-12 02:41

Message:
Logged In: YES 
user_id=80475

Also, be sure to read the PEP on which modules should 
not be modernized.  Sometimes that information is written 
in the file itself rather than the pep.  For instance, the 
logging package is supposed to be kept in a form that 
runs on older pythons.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 19:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Thu Mar 20 21:50:13 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 13:50:13 -0800
Subject: [Patches] [ python-Patches-706707 ] time.tzset standards compliance update
Message-ID: <E18w7vR-0007ar-00@sc8-sf-web4.sourceforge.net>

Patches item #706707, was opened at 2003-03-19 23:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Stuart Bishop (zenzen)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: time.tzset standards compliance update

Initial Comment:
Update to configure.in and test_time.py to only use TZ
environment variable format documented at
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html
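For illustration only, here is a minimal sketch of setting a TZ value in the
"std offset dst,start,end" form described at that URL and then calling
time.tzset().  The helper name set_posix_tz and the sample TZ string are
assumptions made for this example; they are not taken from the patch.
time.tzset() is only available on Unix-like platforms, so the sketch checks
for it first.

```python
import os
import time


def set_posix_tz(tz):
    """Set TZ to a POSIX-format string and re-read it (hypothetical helper).

    Returns time.tzname after the change, or None where tzset() is missing.
    """
    if not hasattr(time, "tzset"):
        return None                 # e.g. Windows: nothing to do
    os.environ["TZ"] = tz
    time.tzset()                    # make the C library re-read TZ
    return time.tzname


if __name__ == "__main__":
    # std=EST, offset=+05, dst=EDT, with explicit DST start/end rules.
    print(set_posix_tz("EST+05EDT,M4.1.0,M10.5.0"))
```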


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-20 16:50

Message:
Logged In: YES 
user_id=31435

Assigned to Guido, as I can't test it.

Two notes:

1. Leaving commented-out code in config and the test suite 
doesn't appear to serve a purpose, although it will serve to 
confuse future readers ("why is this here?  why is it 
commented out?").

2. The Python style guide asks for a blank after commas in 
argument lists and tuples.  We're not really in danger of 
stretching the screen here <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470


From noreply@sourceforge.net  Thu Mar 20 22:13:45 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 14:13:45 -0800
Subject: [Patches] [ python-Patches-707167 ] fix bug #682813: dircache.listdir doesn't signal error
Message-ID: <E18w8ID-0002SF-00@sc8-sf-web1.sourceforge.net>

Patches item #707167, was opened at 2003-03-20 22:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707167&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug #682813: dircache.listdir doesn't signal error

Initial Comment:

The attached small patch makes dircache.listdir
raise OSError when one is encountered in
os.stat or os.listdir.  This certainly seems
like the right thing to do to be consistent
with os.listdir, though there may have been
a reason not to raise the exception I don't know
about, as it is obviously being purposefully caught
right now.  If there is a reason, someone let me
know and I'll submit a patch to change dircache's
documentation to reflect its behavior.

The test case is also updated by the patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707167&group_id=5470


From noreply@sourceforge.net  Thu Mar 20 22:14:38 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 14:14:38 -0800
Subject: [Patches] [ python-Patches-707167 ] fix bug #682813: dircache.listdir doesn't signal error
Message-ID: <E18w8J4-0000Sq-00@sc8-sf-web4.sourceforge.net>

Patches item #707167, was opened at 2003-03-20 22:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707167&group_id=5470

>Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug #682813: dircache.listdir doesn't signal error

Initial Comment:

The attached small patch makes dircache.listdir
raise OSError when one is encountered in
os.stat or os.listdir.  This certainly seems
like the right thing to do to be consistent
with os.listdir, though there may have been
a reason not to raise the exception I don't know
about, as it is obviously being purposefully caught
right now.  If there is a reason, someone let me
know and I'll submit a patch to change dircache's
documentation to reflect its behavior.

The test case is also updated by the patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707167&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 01:03:02 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 17:03:02 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wAw2-0006UZ-00@sc8-sf-web4.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

 3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

 3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
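
The second pass (retargeting a jump that lands on an unconditional jump) can
be illustrated with a toy model.  This is only a sketch, not the C code in
the patch: the dict-of-offsets representation, the name thread_jumps, and
the choice to treat every jump argument as an absolute offset are all
assumptions made for illustration.

```python
# Toy model: code maps an offset to [opname, argument].
# Simplification: every jump argument is an absolute target offset.
UNCONDITIONAL = {"JUMP_ABSOLUTE", "JUMP_FORWARD"}
JUMPS = UNCONDITIONAL | {"JUMP_IF_FALSE"}


def thread_jumps(code):
    """Return a copy in which a jump to an unconditional jump goes
    straight to that second jump's target (single pass, no chaining)."""
    result = {offset: list(instr) for offset, instr in code.items()}
    for offset, (opname, arg) in code.items():
        if opname in JUMPS:
            target_op, target_arg = code[arg]
            if target_op in UNCONDITIONAL and target_arg != arg:
                result[offset][1] = target_arg
    return result


# Modelled on the disassembly above, after the first pass has already
# turned the instruction at offset 3 into a jump to offset 10.
example = {
    3: ["JUMP_FORWARD", 10],
    10: ["LOAD_FAST", "x"],
    38: ["JUMP_ABSOLUTE", 3],
}
optimized = thread_jumps(example)
```

In this model the instruction at offset 38 ends up pointing at 10 instead of
3, and nothing is inserted or removed, so the size and order of the
instructions stay the same.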

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 01:11:42 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 17:11:42 -0800
Subject: [Patches] [ python-Patches-706707 ] time.tzset standards compliance update
Message-ID: <E18wB4Q-0005vJ-00@sc8-sf-web2.sourceforge.net>

Patches item #706707, was opened at 2003-03-19 23:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
>Priority: 7
Submitted By: Stuart Bishop (zenzen)
>Assigned to: Nobody/Anonymous (nobody)
Summary: time.tzset standards compliance update

Initial Comment:
Update to configure.in and test_time.py to only use TZ
environment variable format documented at
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-20 20:11

Message:
Logged In: YES 
user_id=6380

Unassigning, as I won't have time for this. But it is
important - someone else should make sure this goes into 2.3b1!

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-20 16:50

Message:
Logged In: YES 
user_id=31435

Assigned to Guido, as I can't test it.

Two notes:

1. Leaving commented-out code in config and the test suite 
doesn't appear to serve a purpose, although it will serve to 
confuse future readers ("why is this here?  why is it 
commented out?").

2. The Python style guide asks for a blank after commas in 
argument lists and tuples.  We're not really in danger of 
stretching the screen here <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 01:18:13 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 17:18:13 -0800
Subject: [Patches] [ python-Patches-706707 ] time.tzset standards compliance update
Message-ID: <E18wBAj-00067y-00@sc8-sf-web2.sourceforge.net>

Patches item #706707, was opened at 2003-03-19 23:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 7
Submitted By: Stuart Bishop (zenzen)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: time.tzset standards compliance update

Initial Comment:
Update to configure.in and test_time.py to only use TZ
environment variable format documented at
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-20 20:18

Message:
Logged In: YES 
user_id=33168

I'll try to get to this soon.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-20 20:11

Message:
Logged In: YES 
user_id=6380

Unassigning, as I won't have time for this. But it is
important - someone else should make sure this goes into 2.3b1!

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-20 16:50

Message:
Logged In: YES 
user_id=31435

Assigned to Guido, as I can't test it.

Two notes:

1. Leaving commented-out code in config and the test suite 
doesn't appear to serve a purpose, although it will serve to 
confuse future readers ("why is this here?  why is it 
commented out?").

2. The Python style guide asks for a blank after commas in 
argument lists and tuples.  We're not really in danger of 
stretching the screen here <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 01:56:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 17:56:55 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wBmB-0002KU-00@sc8-sf-web3.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 17:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

 3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

 3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 17:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 02:03:40 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 18:03:40 -0800
Subject: [Patches] [ python-Patches-707167 ] fix bug #682813: dircache.listdir doesn't signal error
Message-ID: <E18wBsi-0007yc-00@sc8-sf-web4.sourceforge.net>

Patches item #707167, was opened at 2003-03-20 14:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707167&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug #682813: dircache.listdir doesn't signal error

Initial Comment:

The attached small patch makes dircache.listdir
raise OSError when one is encountered in
os.stat or os.listdir.  This certainly seems
like the right thing to do to be consistent
with os.listdir, though there may have been
a reason not to raise the exception I don't know
about, as it is obviously being purposefully caught
right now.  If there is a reason, someone let me
know and I'll submit a patch to change dircache's
documentation to reflect its behavior.

The test case is also updated by the patch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 18:03

Message:
Logged In: YES 
user_id=357491

Patch looks good.

Don't let the wording in the description mislead you, though.  No exception is specifically raised; it just is not caught anymore.

As for whether this patch should be applied or not I have no clue since I never use the module.
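
The distinction between raising an exception and simply no longer catching it can be sketched as follows.  This is a simplified stand-in, not the dircache source: the name cached_listdir and the mtime-keyed cache layout are assumptions made for illustration.

```python
import os

_cache = {}  # path -> (mtime, sorted names); hypothetical cache layout


def cached_listdir(path):
    """List a directory, reusing the cached result while mtime is unchanged.

    There is no try/except here, so an OSError from os.stat() or
    os.listdir() reaches the caller instead of being swallowed.
    """
    mtime = os.stat(path).st_mtime        # may raise OSError
    cached = _cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]
    names = sorted(os.listdir(path))      # may raise OSError
    _cache[path] = (mtime, names)
    return names
```

A caller that wants the old behavior of silently getting an empty list would now have to wrap the call in its own try/except OSError.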

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707167&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 02:11:25 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 18:11:25 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wC0D-0002fW-00@sc8-sf-web3.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

 3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

 3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 02:47:28 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 18:47:28 -0800
Subject: [Patches] [ python-Patches-681927 ] bundlebuilder: Add dylibs, frameworks to the bundle
Message-ID: <E18wCZ6-0000m4-00@sc8-sf-web4.sourceforge.net>

Patches item #681927, was opened at 2003-02-06 13:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=681927&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Robin Dunn (robind)
Assigned to: Just van Rossum (jvr)
Summary: bundlebuilder: Add dylibs, frameworks to the bundle

Initial Comment:
This patch adds the ability to specify shared
libraries and frameworks (the latter is untested as
yet) to be added to the bundle.  It is mostly by Kevin
Ollivier with some suggestions by me.

In addition to copying the files into the bundle the
launcher script in the bundle is modified to set the
DYLD_LIBRARY_PATH to the right place.


----------------------------------------------------------------------

>Comment By: Robin Dunn (robind)
Date: 2003-03-20 18:47

Message:
Logged In: YES 
user_id=53955

New patch attached

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-18 10:22

Message:
Logged In: YES 
user_id=92689

Having a manual option is a fine start. But can any of you rework the patch so it doesn't mess with whitespace, and update it for current CVS?

----------------------------------------------------------------------

Comment By: Kevin Ollivier (kollivier)
Date: 2003-02-07 12:52

Message:
Logged In: YES 
user_id=248468

I'll take a look at otool and see if it does what we need. As Robin mentioned, I think giving both the manual and auto options is the best approach. 

I'll also check into the dependency on Apple's Dev Tools, but even if it is dependent we could just switch off auto-detection if users don't have it and spit out a warning. Another possible way to alleviate this problem may be to integrate with distutils. (i.e. make a 'buildbundle' option) That should at least allow us to find and include any libraries the developer linked against. 

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-07 00:59

Message:
Logged In: YES 
user_id=92689

I use tabs for indentation and use spaces for alignment... So things look nice _and_ won't screw up with different tab settings. But I admit that using such a non-standard way is asking for trouble. I'll convert to spaces after this patch has been done (unless you prefer I do it _before_ ;-).

(Btw. it might be that otool is only available with the apple dev tools, which would be a shame since we otherwise don't depend on dev tools being available. Hm.)

----------------------------------------------------------------------

Comment By: Robin Dunn (robind)
Date: 2003-02-06 15:20

Message:
Logged In: YES 
user_id=53955

Oops, sorry for the whitespace patches.  I noticed that my
lines used spaces but the lines around them were using tabs
so I just ran a tabify on the whole file without taking
another look at the resulting patch file after that.  Looks
like some of the other lines that were added since 2.3a1 have
spaces too and that is where the problem comes from.  I'll
redo the patch but the whole file should probably be either
tabified or untabified after you are done applying it.

I didn't know about otool. I'll pass that on to Kevin.  We
discussed doing automatic finding of libs but didn't
know how to go about it so thought that this would be a good
start.  Also we figured that even if there was a way to do
it that you would probably want a way to include other files
that may not get automatically found, or to exclude some
that were, so there should be command line options for it
anyway. 

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-06 14:35

Message:
Logged In: YES 
user_id=92689

Cool. There's a problem with the patch, though: although I apologize for using tabs to begin with, please keep the tab usage consistent. There are quite a few hunks in the patch that only touch whitespace and that's both undesirable as well as blurring the intent of the patch... Could you upload a cleaner one?

Btw. for the --standalone build mode it would be possible to calculate all framework/dylib dependencies with the otool tool. If this were implemented perhaps the --lib option wouldn't even be needed? Another question remains: if we include a framework, is there a way to strip it of redundant files, e.g. headers? If we would use this mechanism to include Python.framework we would definitely need a way to trim it down, e.g. all of lib is taken care of by modulefinder anyway. If you (or Kevin) have any ideas about that, please contact me offline.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=681927&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 02:49:26 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 18:49:26 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18wCb0-0003sg-00@sc8-sf-web3.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 04:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Raymond Hettinger (rhettinger)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (ie 
compileall.compile_dir throws no errors), but please help 
testing that functionality is the same.  I am testing at 
the moment for lib-tk changes.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 18:49

Message:
Logged In: YES 
user_id=357491

I went through Walter's diff by hand and found two places where more clean-up could be done and two show-stoppers.  In case I don't get my version of the patch up fast enough for people, the files that have spots that could use some more minor clean-up are Lib/lib-tk/Tix.py and Lib/lib-tk/Tkinter.py .  The showstoppers are in Lib/lib-tk/tkCommonDialog.py (method call that didn't get *'ed) and Lib/test/test_builtin.py (test_builtin.py should not even be patched since the affected lines are in the tests for apply() itself).

I will have my version up before the weekend.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-20 07:26

Message:
Logged In: YES 
user_id=89016

I've gone over the patch and simplified it a bit (e.g.
replacing f(*(1,2,3) + args) with f(1,2,3, *args)). I've
also removed the patches for distutils, logging and bsddb
(code at the start of bsddb/dbutils.py seems to indicate
that it should be usable with versions prior to 2.3).
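
A small sketch of the simplification mentioned above, for readers who have
not seen the star-argument form.  The function f and the tuple extra are
made-up names for illustration; the old apply() call appears only in a
comment because apply() does not exist in current Python.

```python
def f(*args):
    """Return the positional arguments so the call forms can be compared."""
    return args


extra = (4, 5)

# Original style being removed:  apply(f, (1, 2, 3) + extra)
# Direct translation: build the whole tuple, then unpack it.
direct = f(*(1, 2, 3) + extra)

# Simplified form: pass the fixed arguments normally, unpack only the rest.
simplified = f(1, 2, 3, *extra)

# Forgetting the star passes the tuple itself as a single argument.
unstarred = f(extra)
```

The first two calls receive the same five arguments, while the last one
receives a single tuple, which is why a call that is converted without the
star changes behavior.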

Raymond, do you have time to recheck the patch?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-03-12 00:46

Message:
Logged In: YES 
user_id=539787

Walter: I untargzipped the python-latest.tgz of 2003-03-10 
over an older directory (I think about a month ago), therefore 
the existence of test_b1.py.  All files that exist in the current 
dist were also current.
Raymond: you are correct about my not reading the file 
headers (it was a multifile vi session with a +/"apply(" 
option...)
I just had a little time available for non-creative work, so I 
checked, saw that Guido already had changed most of the 
library files, and offered the change of the rest of them; you 
guys can do whatever you want with it :)
The lib-tk changes seem to be ok, after running some UI 
python scripts I have.  I haven't checked bsddb yet.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 17:41

Message:
Logged In: YES 
user_id=80475

Also, be sure to read the PEP on which modules should 
not be modernized.  Sometimes that information is written 
in the file itself rather than the pep.  For instance, the 
logging package is supposed to be kept in a form that 
runs on older pythons.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 10:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 05:14:13 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 21:14:13 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18wEr7-0004eV-00@sc8-sf-web4.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 07:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Raymond Hettinger (rhettinger)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (ie 
compileall.compile_dir throws no errors), but please help 
testing that functionality is the same.  I am testing at 
the moment for lib-tk changes.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 00:14

Message:
Logged In: YES 
user_id=80475

Good job Brett :-)

I'll wait for your next post before going through this one 
with a fine toothed comb.

-- R

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 21:49

Message:
Logged In: YES 
user_id=357491

I went through Walter's diff by hand and found two places where more clean-up could be done and two show-stoppers.  In case I don't get my version of the patch up fast enough for people, the files that have spots that could use some more minor clean-up are Lib/lib-tk/Tix.py and Lib/lib-tk/Tkinter.py .  The showstoppers are in Lib/lib-tk/tkCommonDialog.py (method call that didn't get *'ed) and Lib/test/test_builtin.py (test_builtin.py should not even be patched since the affected lines are in the tests for apply() itself).

I will have my version up before the weekend.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-20 10:26

Message:
Logged In: YES 
user_id=89016

I've gone over the patch and simplified it a bit (e.g.
replacing f(*(1,2,3) + args) with f(1,2,3, *args)). I've
also removed the patches for distutils, logging and bsddb
(code at the start of bsddb/dbutils.py seems to indicate
that it should be usable with versions prior to 2.3).
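
The simplification mentioned above can be checked with a small sketch. The function f and the argument values here are illustrative only and are not taken from the patch:

```python
def f(*received):
    # Illustrative function: returns exactly what it was called with.
    return received

args = (4, 5)

# Mechanical translation of apply(f, (1, 2, 3) + args)
before = f(*(1, 2, 3) + args)

# Simplified spelling: leading arguments written out, the rest starred
after = f(1, 2, 3, *args)

print(before == after)
```

The two calls pass the same arguments. The second form also avoids building a temporary tuple and works when args is any iterable, whereas the concatenation requires a tuple.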

Raymond, do you have time to recheck the patch?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-03-12 03:46

Message:
Logged In: YES 
user_id=539787

Walter: I untargzipped the python-latest.tgz of 2003-03-10 
over an older directory (I think about a month ago), therefore 
the existence of test_b1.py.  All files that exist in the current 
dist were also current.
Raymond: you are correct about my not reading the file 
headers (it was a multifile vi session with a +/"apply(" 
option...)
I just had a little time available for non-creative work, so I 
checked, saw that Guido already had changed most of the 
library files, and offered the change of the rest of them; you 
guys can do whatever you want with it :)
The lib-tk changes seem to be ok, after running some UI 
python scripts I have.  I haven't checked bsddb yet.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 20:41

Message:
Logged In: YES 
user_id=80475

Also, be sure to read the PEP on which modules should 
not be modernized.  Sometimes that information is written 
in the file itself rather than the pep.  For instance, the 
logging package is supposed to be kept in a form that 
runs on older pythons.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 13:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 07:42:23 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 23:42:23 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18wHAV-00057N-00@sc8-sf-web1.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 04:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Raymond Hettinger (rhettinger)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (i.e. 
compileall.compile_dir throws no errors), but please help 
test that the functionality is the same.  I am currently 
testing the lib-tk changes.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 23:42

Message:
Logged In: YES 
user_id=357491

Well, I have now run into my first issue of not having commit privileges; I can't upload my diff.  So you will have to get it from http://www.ocf.berkeley.edu/~bac/apply3.diff .  The only difference between my diff and Walter's is that I changed three files and removed the diff for test_builtin.py .

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:14

Message:
Logged In: YES 
user_id=80475

Good job Brett :-)

I'll wait for your next post before going through this one 
with a fine toothed comb.

-- R

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 18:49

Message:
Logged In: YES 
user_id=357491

I went through Walter's diff by hand and found two places where more clean-up could be done and two show-stoppers.  In case I don't get my version of the patch up fast enough for people, the files that have spots that could use some more minor clean-up are Lib/lib-tk/Tix.py and Lib/lib-tk/Tkinter.py .  The showstoppers are in Lib/lib-tk/tkCommonDialog.py (method call that didn't get *'ed) and Lib/test/test_builtin.py (test_builtin.py should not even be patched since the affected lines are in the tests for apply() itself).

I will have my version up before the weekend.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-20 07:26

Message:
Logged In: YES 
user_id=89016

I've gone over the patch and simplified it a bit (e.g.
replacing f(*(1,2,3) + args) with f(1,2,3, *args)). I've
also removed the patches for distutils, logging and bsddb
(code at the start of bsddb/dbutils.py seems to indicate
that it should be usable with versions prior to 2.3).

Raymond, do you have time to recheck the patch?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-03-12 00:46

Message:
Logged In: YES 
user_id=539787

Walter: I untargzipped the python-latest.tgz of 2003-03-10 
over an older directory (I think about a month ago), therefore 
the existence of test_b1.py.  All files that exist in the current 
dist were also current.
Raymond: you are correct about my not reading the file 
headers (it was a multifile vi session with a +/"apply(" 
option...)
I just had a little time available for non-creative work, so I 
checked, saw that Guido already had changed most of the 
library files, and offered the change of the rest of them; you 
guys can do whatever you want with it :)
The lib-tk changes seem to be ok, after running some UI 
python scripts I have.  I haven't checked bsddb yet.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 17:41

Message:
Logged In: YES 
user_id=80475

Also, be sure to read the PEP on which modules should 
not be modernized.  Sometimes that information is written 
in the file itself rather than the pep.  For instance, the 
logging package is supposed to be kept in a form that 
runs on older pythons.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 10:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 07:43:37 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 20 Mar 2003 23:43:37 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wHBh-0000Bd-00@sc8-sf-web4.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 17:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"
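
The same two measurements can also be run from Python instead of the command line. This is a sketch that assumes a Python version where the timeit module provides the timeit() convenience function; NUMBER is an arbitrary repetition count, and no particular timings are implied:

```python
import timeit

NUMBER = 100000  # arbitrary repetition count for this sketch

# Equivalent of: python timeit.py "while 1: break"
simple = timeit.timeit("while 1: break", number=NUMBER)

# Equivalent of the three-line statement with -s "i=0"
stmt = "while 1:\n    if i==1: break\n    else: i=1"
branchy = timeit.timeit(stmt, setup="i=0", number=NUMBER)

print("while 1: break      %.3f usec/loop" % (simple / NUMBER * 1e6))
print("while 1: if/else    %.3f usec/loop" % (branchy / NUMBER * 1e6))
```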


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

 3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

 3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
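
The two passes can be illustrated with a toy model in Python. This is only a sketch of the idea, not the C function from the patch. It assumes a simplified instruction list of [offset, opname, arg] entries, treats the LOAD_CONST argument as the constant's value, and assumes that instructions with an argument occupy three bytes while POP_TOP occupies one. The instruction list is abbreviated and its jump arguments are illustrative.

```python
def optimize(code):
    """Apply both passes in place to a list of [offset, opname, arg]."""
    by_offset = dict((ins[0], ins) for ins in code)

    # Pass 1: LOAD_CONST 1, JUMP_IF_FALSE xx, POP_TOP -> jump past all three.
    for load, test, pop in zip(code, code[1:], code[2:]):
        if ((load[1], load[2]) == ("LOAD_CONST", 1)
                and test[1] == "JUMP_IF_FALSE" and pop[1] == "POP_TOP"):
            landing = pop[0] + 1               # POP_TOP is one byte long
            load[1] = "JUMP_FORWARD"
            load[2] = landing - (load[0] + 3)  # relative to next instruction

    # Pass 2: a jump that lands on an unconditional jump takes its target.
    for ins in code:
        if ins[1] != "JUMP_ABSOLUTE":
            continue
        target = by_offset.get(ins[2])
        if target is None:
            continue
        if target[1] == "JUMP_FORWARD":
            ins[2] = target[0] + 3 + target[2]
        elif target[1] == "JUMP_ABSOLUTE":
            ins[2] = target[2]
    return code

# Abbreviated model of the "while 1" loop from the example.
loop = [
    [0, "SETUP_LOOP", 38],
    [3, "LOAD_CONST", 1],
    [6, "JUMP_IF_FALSE", 31],
    [9, "POP_TOP", None],
    [10, "LOAD_FAST", 0],
    [38, "JUMP_ABSOLUTE", 3],
]
optimize(loop)
for offset, opname, arg in loop:
    print(offset, opname, arg)
```

Only the arguments and one opcode change; no instruction is added, removed, or moved, which is why the bytecode size and order stay the same.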

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 23:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 18:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 17:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 08:02:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 00:02:15 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wHTj-00010o-00@sc8-sf-web4.sourceforge.net>

Patches item #707257, was opened at 2003-03-21 02:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

 3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

 3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Thomas Heller (theller)
Date: 2003-03-21 09:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 08:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 03:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 08:16:40 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 00:16:40 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18wHhg-0004Gq-00@sc8-sf-web3.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 13:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Raymond Hettinger (rhettinger)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (i.e. 
compileall.compile_dir throws no errors), but please help 
test that the functionality is the same.  I am currently 
testing the lib-tk changes.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 09:16

Message:
Logged In: YES 
user_id=89016

This shouldn't have anything to do with commit privileges.
I'm uploading your apply3.diff so it doesn't get lost. If
test_builtin calls apply it should probably make sure that
both the PendingDeprecationWarning and the
DeprecationWarning that might be issued some day are
switched off.
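
Switching both warning categories off around a call might look like the sketch below. It assumes a Python version that provides warnings.catch_warnings (older versions would call warnings.filterwarnings directly at module level). The function legacy_apply is a hypothetical stand-in for the built-in apply(), and call_quietly is an illustrative helper name.

```python
import warnings

def legacy_apply(func, args=(), kwargs=None):
    # Hypothetical stand-in for apply(): warns, then forwards the call.
    warnings.warn("use func(*args, **kwargs) instead",
                  PendingDeprecationWarning, stacklevel=2)
    return func(*args, **(kwargs or {}))

def call_quietly(func, args=()):
    # Silence both the current and the possible future warning category,
    # restoring the previous filters when the block exits.
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=PendingDeprecationWarning)
        warnings.filterwarnings("ignore", category=DeprecationWarning)
        return legacy_apply(func, args)

print(call_quietly(max, (1, 5, 3)))
```

Filtering by category rather than by message keeps the test working if the warning text changes later.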

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 08:42

Message:
Logged In: YES 
user_id=357491

Well, I have now run into my first issue of not having commit privileges; I can't upload my diff.  So you will have to get it from http://www.ocf.berkeley.edu/~bac/apply3.diff .  The only difference between my diff and Walter's is that I changed three files and removed the diff for test_builtin.py .

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 06:14

Message:
Logged In: YES 
user_id=80475

Good job Brett :-)

I'll wait for your next post before going through this one 
with a fine toothed comb.

-- R

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 03:49

Message:
Logged In: YES 
user_id=357491

I went through Walter's diff by hand and found two places where more clean-up could be done and two show-stoppers.  In case I don't get my version of the patch up fast enough for people, the files that have spots that could use some more minor clean-up are Lib/lib-tk/Tix.py and Lib/lib-tk/Tkinter.py .  The showstoppers are in Lib/lib-tk/tkCommonDialog.py (method call that didn't get *'ed) and Lib/test/test_builtin.py (test_builtin.py should not even be patched since the affected lines are in the tests for apply() itself).

I will have my version up before the weekend.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-20 16:26

Message:
Logged In: YES 
user_id=89016

I've gone over the patch and simplified it a bit (e.g.
replacing f(*(1,2,3) + args) with f(1,2,3, *args)). I've
also removed the patches for distutils, logging and bsddb
(code at the start of bsddb/dbutils.py seems to indicate
that it should be usable with versions prior to 2.3).

Raymond, do you have time to recheck the patch?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-03-12 09:46

Message:
Logged In: YES 
user_id=539787

Walter: I untargzipped the python-latest.tgz of 2003-03-10 
over an older directory (I think about a month ago), therefore 
the existence of test_b1.py.  All files that exist in the current 
dist were also current.
Raymond: you are correct about my not reading the file 
headers (it was a multifile vi session with a +/"apply(" 
option...)
I just had a little time available for non-creative work, so I 
checked, saw that Guido already had changed most of the 
library files, and offered the change of the rest of them; you 
guys can do whatever you want with it :)
The lib-tk changes seem to be ok, after running some UI 
python scripts I have.  I haven't checked bsddb yet.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-12 02:41

Message:
Logged In: YES 
user_id=80475

Also, be sure to read the PEP on which modules should 
not be modernized.  Sometimes that information is written 
in the file itself rather than the pep.  For instance, the 
logging package is supposed to be kept in a form that 
runs on older pythons.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 19:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 08:21:44 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 00:21:44 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wHma-0006jK-00@sc8-sf-web1.sourceforge.net>

Patches item #707257, was opened at 2003-03-21 02:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

 3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

 3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 09:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 09:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 08:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 03:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 08:32:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 00:32:15 -0800
Subject: [Patches] [ python-Patches-701494 ] more apply removals
Message-ID: <E18wHwl-0002Bm-00@sc8-sf-web4.sourceforge.net>

Patches item #701494, was opened at 2003-03-11 04:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Raymond Hettinger (rhettinger)
Summary: more apply removals

Initial Comment:
More apply() removals from the following files:
./compiler/transformer.py
./curses/wrapper.py
./distutils/command/build_ext.py
./distutils/command/build_py.py
./distutils/archive_util.py
./distutils/dir_util.py
./distutils/filelist.py
./distutils/util.py
./bsddb/test/test_basics.py
./bsddb/test/test_dbobj.py
./bsddb/dbobj.py
./bsddb/dbshelve.py
./lib-tk/Canvas.py
./lib-tk/Dialog.py
./lib-tk/ScrolledText.py
./lib-tk/Tix.py
./lib-tk/Tkinter.py
./lib-tk/tkColorChooser.py
./lib-tk/tkCommonDialog.py
./lib-tk/tkFont.py
./lib-tk/tkMessageBox.py
./lib-tk/tkSimpleDialog.py
./lib-tk/turtle.py
./test/reperf.py
./test/test_b1.py
./test/test_builtin.py
./test/test_curses.py
./logging/__init__.py
./logging/config.py
./xml/dom/minidom.py
./plat-mac/Carbon/MediaDescr.py
./plat-mac/EasyDialogs.py
./plat-mac/FrameWork.py
./plat-mac/MiniAEFrame.py
./plat-mac/argvemulator.py
./plat-mac/icopen.py

I know that the edited files are syntactically correct (i.e. 
compileall.compile_dir throws no errors), but please help 
test that the functionality is the same.  I am currently 
testing the lib-tk changes.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 00:32

Message:
Logged In: YES 
user_id=357491

Well, then SF is broken right now because I don't have an option to upload.

As for the PendingDeprecationWarning check, I think that is a good idea.  Shouldn't that be a separate patch, though?  I personally can't do it any time soon because of PyCon plus I have updating test_urllib on my todo list (thanks, Raymond  =).

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 00:16

Message:
Logged In: YES 
user_id=89016

This shouldn't have anything to do with commit privileges.
I'm uploading your apply3.diff so it doesn't get lost. If
test_builtin calls apply it should probably make sure that
both the PendingDeprecationWarning and the
DeprecationWarning that might be issued some day are
switched off.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 23:42

Message:
Logged In: YES 
user_id=357491

Well, I have now run into my first issue of not having commit privileges; I can't upload my diff.  So you will have to get it from http://www.ocf.berkeley.edu/~bac/apply3.diff .  The only difference between my diff and Walter's is that I changed three files and removed the diff for test_builtin.py .

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:14

Message:
Logged In: YES 
user_id=80475

Good job Brett :-)

I'll wait for your next post before going through this one 
with a fine toothed comb.

-- R

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 18:49

Message:
Logged In: YES 
user_id=357491

I went through Walter's diff by hand and found two places where more clean-up could be done and two show-stoppers.  In case I don't get my version of the patch up fast enough for people, the files that have spots that could use some more minor clean-up are Lib/lib-tk/Tix.py and Lib/lib-tk/Tkinter.py .  The showstoppers are in Lib/lib-tk/tkCommonDialog.py (method call that didn't get *'ed) and Lib/test/test_builtin.py (test_builtin.py should not even be patched since the affected lines are in the tests for apply() itself).

I will have my version up before the weekend.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-20 07:26

Message:
Logged In: YES 
user_id=89016

I've gone over the patch and simplified it a bit (e.g.
replacing f(*(1,2,3) + args) with f(1,2,3, *args)). I've
also removed the patches for distutils, logging and bsddb
(code at the start of bsddb/dbutils.py seems to indicate
that it should be usable with versions prior to 2.3).

Raymond, do you have time to recheck the patch?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-03-12 00:46

Message:
Logged In: YES 
user_id=539787

Walter: I untargzipped the python-latest.tgz of 2003-03-10 
over an older directory (I think about a month ago), therefore 
the existence of test_b1.py.  All files that exist in the current 
dist were also current.
Raymond: you are correct about my not reading the file 
headers (it was a multifile vi session with a +/"apply(" 
option...)
I just had a little time available for non-creative work, so I 
checked, saw that Guido already had changed most of the 
library files, and offered the change of the rest of them; you 
guys can do whatever you want with it :)
The lib-tk changes seem to be ok, after running some UI 
python scripts I have.  I haven't checked bsddb yet.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 17:41

Message:
Logged In: YES 
user_id=80475

Also, be sure to read the PEP on which modules should 
not be modernized.  Sometimes that information is written 
in the file itself rather than the pep.  For instance, the 
logging package is supposed to be kept in a form that 
runs on older pythons.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-11 10:34

Message:
Logged In: YES 
user_id=89016

There is no longer a test/test_b1.py in current CVS, so it
seems you've done the diff against an older version. Could
you update the patch for current CVS?

Also according to PEP 291
(http://www.python.org/peps/pep-0291.html) both distutils
and logging should remain 1.5.2 compatible.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701494&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 09:41:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 01:41:31 -0800
Subject: [Patches] [ python-Patches-681927 ] bundlebuilder: Add dylibs, frameworks to the bundle
Message-ID: <E18wJ1n-0007lI-00@sc8-sf-web3.sourceforge.net>

Patches item #681927, was opened at 2003-02-06 22:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=681927&group_id=5470

Category: Macintosh
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Robin Dunn (robind)
Assigned to: Just van Rossum (jvr)
Summary: bundlebuilder: Add dylibs, frameworks to the bundle

Initial Comment:
This patch adds the ability to specify shared libraries
and Frameworks (the latter is as yet untested) to be added
to the bundle.  It is mostly by Kevin Ollivier with
some suggestions by me.

In addition to copying the files into the bundle the
launcher script in the bundle is modified to set the
DYLD_LIBRARY_PATH to the right place.


----------------------------------------------------------------------

>Comment By: Just van Rossum (jvr)
Date: 2003-03-21 10:41

Message:
Logged In: YES 
user_id=92689

Thanks Robin, this is perfect. It's in CVS.

----------------------------------------------------------------------

Comment By: Robin Dunn (robind)
Date: 2003-03-21 03:47

Message:
Logged In: YES 
user_id=53955

New patch attached

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-03-18 19:22

Message:
Logged In: YES 
user_id=92689

Having a manual option is a fine start. But can any of you rework the patch so it doesn't mess with whitespace, and update it for current CVS?

----------------------------------------------------------------------

Comment By: Kevin Ollivier (kollivier)
Date: 2003-02-07 21:52

Message:
Logged In: YES 
user_id=248468

I'll take a look at otool and see if it does what we need. As Robin mentioned, I think giving both the manual and auto options is the best approach. 

I'll also check into the dependency on Apple's Dev Tools, but even if it is dependent we could just switch off auto-detection if users don't have it and spit out a warning. Another possible way to alleviate this problem may be to integrate with distutils. (i.e. make a 'buildbundle' option) That should at least allow us to find and include any libraries the developer linked against. 

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-07 09:59

Message:
Logged In: YES 
user_id=92689

I use tabs for indentation and use spaces for alignment... So things look nice _and_ won't screw up with different tab settings. But I admit that using such a non-standard way is asking for trouble. I'll convert to spaces after this patch has been done (unless you prefer I do it _before_ ;-).

(Btw. it might be that otool is only available with the apple dev tools, which would be a shame since we otherwise don't depend on dev tools being available. Hm.)

----------------------------------------------------------------------

Comment By: Robin Dunn (robind)
Date: 2003-02-07 00:20

Message:
Logged In: YES 
user_id=53955

Oops, sorry for the whitespace patches.  I noticed that my
lines used spaces but the lines around them were using tabs
so I just ran a tabify on the whole file without taking
another look at the resulting patch file after that.  Looks
like some of the other lines that were added since 2.3a1 have
spaces too and that is where the problem comes from.  I'll
redo the patch but the whole file should probably be either
tabified or untabified after you are done applying it.

I didn't know about otool. I'll pass that on to Kevin.  We
discussed doing automatic finding of libs but didn't
know how to go about it so thought that this would be a good
start.  Also we figured that even if there was a way to do
it that you would probably want a way to include other files
that may not get automatically found, or to exclude some
that were, so there should be command line options for it
anyway. 

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2003-02-06 23:35

Message:
Logged In: YES 
user_id=92689

Cool. There's a problem with the patch, though: although I apologize for using tabs to begin with, please keep the tab usage consistent. There are quite a few hunks in the patch that only touch whitespace and that's both undesirable as well as blurring the intent of the patch... Could you upload a cleaner one?

Btw. for the --standalone build mode it would be possible to calculate all framework/dylib dependencies with the otool tool. If this were implemented perhaps the --lib option wouldn't even be needed? Another question remains: if we include a framework, is there a way to strip it of redundant files, eg. headers? If we would use this mechanism to include Python.framework we would definitely need a way to trim it down, eg. all of lib is taken care of by modulefinder anyway. If you (or Kevin) have any ideas about that, pls contact me off line.
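
A rough sketch of how dependency detection based on otool might look. It only parses text in the shape that "otool -L" prints, so it runs without the tool installed. The sample paths, the SYSTEM_PREFIXES filter, and both function names are assumptions for illustration and are not part of bundlebuilder:

```python
import re

# Sample text in the shape printed by `otool -L <binary>`; the paths are made up.
SAMPLE_OTOOL_OUTPUT = """\
/tmp/MyApp.app/Contents/MacOS/MyApp:
\t/usr/local/lib/libfoo.1.dylib (compatibility version 1.0.0, current version 1.2.0)
\t/Library/Frameworks/Bar.framework/Versions/A/Bar (compatibility version 1.0.0, current version 1.0.0)
\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 63.0.0)
"""

# Locations assumed to ship with the OS and therefore not copied into the bundle.
SYSTEM_PREFIXES = ("/usr/lib/", "/System/")

def parse_dependencies(otool_output):
    """Return the library paths listed in `otool -L` style output."""
    paths = []
    for line in otool_output.splitlines()[1:]:  # first line names the binary itself
        match = re.match(r"\s+(\S.*?) \(compatibility version", line)
        if match:
            paths.append(match.group(1))
    return paths

def bundle_candidates(otool_output):
    """Keep only the dependencies that live outside the assumed system locations."""
    return [path for path in parse_dependencies(otool_output)
            if not path.startswith(SYSTEM_PREFIXES)]

candidates = bundle_candidates(SAMPLE_OTOOL_OUTPUT)
```

In a real build the text would come from running otool on the executable, and a manual include/exclude option would still be useful for files the scan misses.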

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=681927&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 10:57:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 02:57:54 -0800
Subject: [Patches] [ python-Patches-707427 ] Allow range() to return long integer values
Message-ID: <E18wKDi-00061q-00@sc8-sf-web1.sourceforge.net>

Patches item #707427, was opened at 2003-03-21 02:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707427&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Chad Netzer (chadn)
Assigned to: Nobody/Anonymous (nobody)
Summary: Allow range() to return long integer values

Initial Comment:
Extend range() builtin so that long integers may be
generated.
ie. range(10**20, 10**20 + 5)

New code path is only executed when normal code path
fails, to avoid slowing down the existing run path.
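
The patch itself is C code; the intended result can be sketched in pure Python. This is only an illustration of the semantics, assuming a list-returning range() as in Python 2. The name long_range is hypothetical:

```python
def long_range(start, stop=None, step=1):
    """Return a list like a list-building range(), with arbitrary-precision integers."""
    if stop is None:  # single-argument form: long_range(n)
        start, stop = 0, start
    if step == 0:
        raise ValueError("long_range() step argument must not be zero")
    # Number of elements, computed with integer arithmetic only.
    if step > 0:
        length = max(0, (stop - start + step - 1) // step)
    else:
        length = max(0, (start - stop - step - 1) // -step)
    return [start + i * step for i in range(length)]

values = long_range(10**20, 10**20 + 5)
```

The element count is computed first, so the loop itself only ever counts over a small ordinary integer.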


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707427&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 16:53:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 08:53:11 -0800
Subject: [Patches] [ python-Patches-702620 ] AE Inheritance fixes
Message-ID: <E18wPlX-0006zB-00@sc8-sf-web2.sourceforge.net>

Patches item #702620, was opened at 2003-03-13 01:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Inheritance fixes

Initial Comment:
A while ago, I submitted a patch that attempted to make modules generated by gensuitemodule inheritance aware. It was quite a hack, but it did the job. Some patches to cvs in the meantime have made this stop working for me. Here are my attempted fixes.

If for some reason there's some use case besides mine where this implementation doesn't work, I'd like to know about it so we can come up with an implementation that works everywhere :)

1) We don't ever want an _instance_ of ComponentItem to have a personal _propdict and _elemdict. They need to inherit these attributes from the class, which was set up in the __init__.py to have the correct entries. Thus, I moved the initialization of _propdict
and _elemdict out of __init__ and into the class definition.

2) getbaseclasses needs to look through the inheritance tree specified by _superclassnames and for each class in the tree, copy _privpropdict and _privelemdict to _propdict and _elemdict. Then, it needs to copy _propdict and _elemdict from each superclass into its own _propdict and _elemdict, where ComponentItem.__getattr__ will find it. Making these into flat dictionaries on each class that include all of the properties and elements from the superclasses greatly speeds up execution time, since only a single, non-recursive lookup is required, and the only recursion occurs at import time.

Here's a detailed description of what getbaseclasses does:

## v should be a class object.
## Why did I name it 'v'? :(

def getbaseclasses(v):

## Have we already set up the _propdict and _elemdict 
## for this class object? If so, don't do it again.

	if not v._propdict:

## This step is required so we get a fresh dictionary on
## this class object, and don't mutate the one on
## ComponentItem or one of our superclasses

		v._propdict = {}
		v._elemdict = {}

## Run through all of the strings in _superclassnames
## evaluating them to get a class object.

		for superclassname in getattr(v, '_superclassnames', []):
			superclass = eval(superclassname)

## Immediately recurse into getbaseclasses, so that
## the base class _propdict and _elemdict is set up
## properly before we copy its entries into ours.

			getbaseclasses(superclass)

## Copy all of the entries from this base class into
## our _propdict and _elemdict so that we get a flat
## dictionary of all of the elements and properties
## that should be available to instances of this class.

			v._propdict.update(getattr(superclass, '_propdict', {}))
			v._elemdict.update(getattr(superclass, '_elemdict', {}))

## Finally, copy those properties and elements that
## are defined directly on this class object in 
## _privpropdict and _privelemdict into the
## _propdict and _elemdict that
## ComponentItem.__getattr__ looks in.
## Note that if we entered getbaseclasses through the
## recursion above, our subclass will then copy our
## _propdict and _elemdict into its own after we exit
## the recursion, giving it a copy of all the properties 
## and elements defined on the superclass object.

		v._propdict.update(v._privpropdict)
		v._elemdict.update(v._privelemdict)
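
A self-contained sketch of the flattening described above, run against toy classes. The classes Item and File, their property codes, and the namespace dictionary (used here in place of eval) are made up for illustration; real gensuitemodule output looks different:

```python
class ComponentItem:
    # Shared, empty defaults on the class (point 1 above): instances never
    # get their own copies.
    _propdict = {}
    _elemdict = {}

# Hypothetical generated classes; names and property codes are made up.
class Item(ComponentItem):
    _privpropdict = {"name": "pnam"}
    _privelemdict = {}

class File(ComponentItem):
    _superclassnames = ["Item"]
    _privpropdict = {"size": "ptsz"}
    _privelemdict = {}

def getbaseclasses(cls, namespace):
    """Flatten inherited properties and elements into cls._propdict/_elemdict."""
    if not cls._propdict:
        cls._propdict = {}  # fresh dicts, so ComponentItem's stay empty
        cls._elemdict = {}
        for name in getattr(cls, "_superclassnames", []):
            superclass = namespace[name]  # dictionary lookup instead of eval()
            getbaseclasses(superclass, namespace)
            cls._propdict.update(getattr(superclass, "_propdict", {}))
            cls._elemdict.update(getattr(superclass, "_elemdict", {}))
        cls._propdict.update(cls._privpropdict)
        cls._elemdict.update(cls._privelemdict)

getbaseclasses(File, {"Item": Item, "File": File})
```

After the call, File._propdict holds both its own entry and the one copied from Item, while the dictionary on ComponentItem is left untouched.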

----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2003-03-21 17:53

Message:
Logged In: YES 
user_id=45365

Donovan,
I checked your fixes in, but possibly a bit prematurely: things broke. For example, running findertools.py as main program (a simple test of the scripting infrastructure) will now fail for me, in getbaseclasses(writing_code). And that seems correct: writing_code is an NProperty, not a ComponentItem.

Before you fix things: please check out a fresh tree. I seriously hacked gensuitemodule after applying your mods (it can now run non-interactive on MacOSX).

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-18 19:00

Message:
Logged In: YES 
user_id=111050

Jack,
Thanks for taking a look at this.

You are correct, if a class has no properties then v._propdict will still be empty, and we will do unnecessary work the next time getbaseclasses is called. I suppose it could be "if not v._propdict and not v._elemdict:" which would reduce the unnecessary work down to when a base class has neither properties nor elements; frankly the if is not really required at all; it was just an attempt to prevent work that has already been performed from being performed again unnecessarily. Suggestions welcome.
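
One possible alternative, sketched under assumptions: record completion with an explicit marker on the class instead of testing whether the dictionaries are empty. The _flattened attribute, the Empty class, the calls list, and the namespace argument are all hypothetical and not part of the patch:

```python
class ComponentItem:
    _propdict = {}
    _elemdict = {}

class Empty(ComponentItem):
    # Hypothetical class with neither properties nor elements.
    _privpropdict = {}
    _privelemdict = {}

calls = []  # records each time the flattening work is actually done

def getbaseclasses(cls, namespace):
    # Look in the class's own __dict__ so that a marker on a base class
    # is not mistaken for one on this class.
    if "_flattened" in vars(cls):
        return
    calls.append(cls.__name__)
    cls._propdict = {}
    cls._elemdict = {}
    for name in getattr(cls, "_superclassnames", []):
        superclass = namespace[name]
        getbaseclasses(superclass, namespace)
        cls._propdict.update(superclass._propdict)
        cls._elemdict.update(superclass._elemdict)
    cls._propdict.update(cls._privpropdict)
    cls._elemdict.update(cls._privelemdict)
    cls._flattened = True

getbaseclasses(Empty, {})
getbaseclasses(Empty, {})  # second call returns immediately
```

With the marker, a class that has no properties and no elements is still processed only once.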

Re _superclassnames, like everything else done with gensuitemodule, we need to be really careful about circular references, references to things that haven't been defined yet, etc. Everything generated by gensuitemodule is either a ComponentItem or an NProperty, and they don't actually inherit from each other in Python because doing so would be too hairy. So we can't use __bases__ because there is none :-)

The thing about _superclassnames is that it's just what it sounds like; a list of strings that indicate superclasses of the current class. By deferring getbaseclasses to import time, we ensure all of the base classes are defined by then.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-16 23:42

Message:
Logged In: YES 
user_id=45365

Donovan,
in as far as I understand the matter (in which area you are clearly my superior:-) I think the idea of the fix is correct, but I have one misgiving: if a class has no properties then v._propdict will still be empty after getbaseclasses(). This will result in the next call of getbaseclasses (if this class is the base class of another) going through the motions again.

Is this a problem?

Also, do we really need _superclassnames, can't we do this with __bases__? I vaguely remember we went through this issue before, but I can't remember fully...

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-13 01:08

Message:
Logged In: YES 
user_id=111050

Whoops. Have to click the checkbox.

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-13 01:08

Message:
Logged In: YES 
user_id=111050

Attaching diff.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 19:18:41 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 11:18:41 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wS2L-000548-00@sc8-sf-web2.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
 38 JUMP_ABSOLUTE            3

and improving to:

  3 JUMP_FORWARD             4 (to 10)
 38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
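
The second pass can be illustrated with a toy model. This is not the C code from the patch: instructions are kept in a dictionary keyed by offset, all jump targets are stored as absolute offsets for simplicity, and the name thread_jumps is hypothetical:

```python
# Toy model: offset -> (opname, jump target or None). Not real CPython bytecode.
UNCONDITIONAL = {"JUMP_FORWARD", "JUMP_ABSOLUTE"}

def thread_jumps(code):
    """Retarget every jump that lands on an unconditional jump."""
    result = dict(code)
    for offset, (opname, target) in code.items():
        if target is None:
            continue  # not a jump instruction
        landing_op, landing_target = code.get(target, (None, None))
        if landing_op in UNCONDITIONAL:
            # Skip the intermediate hop and go straight to its destination.
            result[offset] = (opname, landing_target)
    return result

toy = {
    3: ("JUMP_FORWARD", 10),   # already rewritten by the first pass
    10: ("LOAD_FAST", None),
    38: ("JUMP_ABSOLUTE", 3),  # loop back-edge lands on the jump at 3
}
optimized = thread_jumps(toy)
```

As in the disassembly above, the jump at offset 38 ends up pointing at offset 10. The sketch follows only one hop per jump and leaves instruction sizes and order unchanged.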

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 19:36:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 11:36:39 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18wSJj-0005p1-00@sc8-sf-web2.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 19:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixes the failing conversions in _substitute. Use
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 19:37:19 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 11:37:19 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18wSKN-0005r5-00@sc8-sf-web2.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 19:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
>Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixes the failing conversions in _substitute. Use
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 20:09:03 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 12:09:03 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wSp5-00069S-00@sc8-sf-web4.sourceforge.net>

Patches item #707257, was opened at 2003-03-21 02:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
 38 JUMP_ABSOLUTE            3

and improving to:

  3 JUMP_FORWARD             4 (to 10)
 38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Thomas Heller (theller)
Date: 2003-03-21 21:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows to pass the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 20:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 09:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 09:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 08:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 03:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 20:28:43 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 12:28:43 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wT87-0000C4-00@sc8-sf-web1.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
 38 JUMP_ABSOLUTE            3

and improving to:

  3 JUMP_FORWARD             4 (to 10)
 38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.
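
The difference can be observed with the dis module. A small sketch, assuming a current Python where True has become a keyword, so an ordinary module-level name (the hypothetical flag) stands in for it; the function names are likewise made up:

```python
import dis

flag = True  # an ordinary module-level name, standing in for the old global True

def uses_global():
    return flag  # name resolved at run time through the globals dictionary

def uses_constant():
    return 2  # literal stored with the code object

def opnames(func):
    """Return the opcode names of a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]

global_ops = opnames(uses_global)
constant_ops = opnames(uses_constant)
```

The first list contains LOAD_GLOBAL and the second does not. The exact opcode used for the literal varies between Python versions.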

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows to pass the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 20:59:45 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 12:59:45 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18wTc9-0004Dr-00@sc8-sf-web3.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 11:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixing the failing conversions in _substitute. Use
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 12:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 21:14:03 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 13:14:03 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18wTpz-0000Go-00@sc8-sf-web4.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 19:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixing the failing conversions in _substitute. Use
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

>Comment By: Matthias Klose (doko)
Date: 2003-03-21 21:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only; hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 20:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 21:28:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 13:28:21 -0800
Subject: [Patches] [ python-Patches-706707 ] time.tzset standards compliance update
Message-ID: <E18wU3p-0002e5-00@sc8-sf-web1.sourceforge.net>

Patches item #706707, was opened at 2003-03-19 23:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 7
Submitted By: Stuart Bishop (zenzen)
Assigned to: Neal Norwitz (nnorwitz)
Summary: time.tzset standards compliance update

Initial Comment:
Update to configure.in and test_time.py to only use TZ
environment variable format documented at
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-21 16:28

Message:
Logged In: YES 
user_id=33168

After patching, the test fails:

  File "/home/neal/build/python/2_3/Lib/test/test_time.py",
line 115, in test_tzset
    self.failUnlessEqual(time.daylight,1)
  File "/home/neal/build/python/2.3/Lib/unittest.py", line
292, in failUnlessEqual
    raise self.failureException, \
AssertionError: 0 != 1


Also, why is the code commented out (via a string) on lines
120-144?  Should these be removed?  I see the comment about
wallclock time, but don't understand why the code should be
left in if we can't test it.  I can understand a comment
describing generally the issue.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-20 20:18

Message:
Logged In: YES 
user_id=33168

I'll try to get to this soon.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-20 20:11

Message:
Logged In: YES 
user_id=6380

Unassigning, as I won't have time for this. But it is
important - someone else should make sure this goes into 2.3b1!

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-20 16:50

Message:
Logged In: YES 
user_id=31435

Assigned to Guido, as I can't test it.

Two notes:

1. Leaving commented-out code in config and the test suite 
doesn't appear to serve a purpose, although it will serve to 
confuse future readers ("why is this here?  why is it 
commented out?").

2. The Python style guide asks for a blank after commas in 
argument lists and tuples.  We're not really in danger of 
stretching the screen here <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 21:31:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 13:31:21 -0800
Subject: [Patches] [ python-Patches-675422 ] Add tzset method to time module
Message-ID: <E18wU6j-00030O-00@sc8-sf-web2.sourceforge.net>

Patches item #675422, was opened at 2003-01-27 08:42
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Add tzset method to time module

Initial Comment:
Adds access to the tzset method, allowing you to change your local timezone as required. In addition to invoking the tzset system
call, the code also updates the timezone attributes (time.timezone etc). This lets you do timezone conversions amongst other things.

Also includes changes to configure.in to only build new code if the tzset method correctly switches timezones on your platform. This 
should be for all modern Unixes, and possibly other platforms.

Also includes tests in test_time.py

Docs would be along the lines of:

tzset() -- 
Initialize, or reinitialize, the local timezone to the value stored in os.environ['TZ']. The TZ environment variable should be specified in
standard Unix timezone format as documented in the tzset man page
(e.g. 'US/Eastern', 'Europe/Amsterdam'). Unknown timezones will silently fall back to UTC. If the TZ environment variable is not set, the local timezone is set to the system's best guess of wallclock time.
Changing the TZ environment variable without calling tzset *may* change the local timezone used by methods such as localtime, but this behaviour should not be relied on.

eg::

>>> now = time.time()
>>> os.environ['TZ'] = 'Europe/Amsterdam'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 14:35:17 2003'
>>> time.tzname  
('CET', 'CEST')
>>> os.environ['TZ'] = 'US/Eastern'
>>> time.tzset()
>>> time.ctime(now)
'Mon Jan 27 08:35:17 2003'
>>> time.tzname
('EST', 'EDT')
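
The same call sequence can be wrapped so that the caller's TZ setting is restored afterwards. This is only a sketch: the helper name zone_info is made up for illustration, and the TZ value uses the POSIX rule format rather than a zone file name, so the result does not depend on which zone files are installed. It assumes a platform where time.tzset exists.

```python
import os
import time


def zone_info(tz):
    """Return (tzname, timezone, altzone, daylight) for a TZ string.

    zone_info is a hypothetical helper, not part of the time module.
    """
    if not hasattr(time, 'tzset'):
        raise NotImplementedError('time.tzset is only available on Unix')
    saved = os.environ.get('TZ')
    os.environ['TZ'] = tz
    time.tzset()
    try:
        # The module attributes are refreshed by tzset().
        return time.tzname, time.timezone, time.altzone, time.daylight
    finally:
        # Put the previous setting back so other code is not affected.
        if saved is None:
            del os.environ['TZ']
        else:
            os.environ['TZ'] = saved
        time.tzset()


names, timezone, altzone, daylight = zone_info('EST+05EDT,M4.1.0,M10.5.0')
```

Restoring TZ in the finally clause matters because the setting is process-wide; anything else calling localtime would otherwise see the changed zone.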

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-21 16:31

Message:
Logged In: YES 
user_id=33168

Closing again, the problem can be addressed thru the other
patch.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-03-20 00:06

Message:
Logged In: YES 
user_id=46639

An update to this patch is now available:
http://www.python.org/sf/706707

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-19 23:18

Message:
Logged In: YES 
user_id=33168

test_time is now failing on Solaris 8.  altzone is -3600,
but should be 0.  Also, is there a reason to compare
timezone to altzone, but then check that each is 0 (line
78)?  Can you provide any suggestions for where to look for
the problem?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-14 17:03

Message:
Logged In: YES 
user_id=6380

OK, checked in with that line removed.

Thanks!

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-03-07 23:42

Message:
Logged In: YES 
user_id=46639

Leave it commented out or remove that line. It is testing
unimportant behaviour that looks more platform dependent
than I suspected (and now I look at it again, what tzname
should be set to if the timezone is unknown is unspecified by
the tzset(3) docs). The important behaviour is that:

a) the system silently falls back to UTC if the timezone is
unknown, and this is tested elsewhere 

b) calling tzset resets tzname, which is also tested elsewhere.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-07 09:25

Message:
Logged In: YES 
user_id=6380

zenzen: when I run the test suite on my Red Hat Linux 7.3
box, I get one failure: the test line
  self.failUnless(time.tzname[0] in ('UTC','GMT'))
fails when the timezone is set to 'Luna/Tycho', because
tzname is in fact set to  ('Luna/Tych', 'Luna/Tych').

If I comment out that one line the tzset test suite passes.

What should I do?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 16:49

Message:
Logged In: YES 
user_id=6380

Sorry, not a chance.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-21 16:45

Message:
Logged In: YES 
user_id=46639

It is a patch to 2.3, but I thought I'd try and sneak this
new feature past people into 2.2.3 as I want to be able to
use it in Zope 2 :-)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-21 07:56

Message:
Logged In: YES 
user_id=6380

Uh? This is a new feature, so doesn't apply to 2.2.3.

Maybe you meant 2.3?

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-02-20 23:29

Message:
Logged In: YES 
user_id=46639

Assigning to Guido for consideration of being added to
2.2.3, and since he thought this patch was a good idea in
the first place :-)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=675422&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 22:10:05 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 14:10:05 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18wUiD-0006jU-00@sc8-sf-web3.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 11:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixing the failing conversions in _substitute. Use
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 14:10

Message:
Logged In: YES 
user_id=40145

Hmmm, you are right.  Your approach will be quicker, due to
local namespace function lookup speed (try/except is fast in
non-exception path).

But, then again, a lot more exception paths will be executed
with the new Tk (with "??" fields), anyway, so the speed
issues may not be that important.
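
For comparison, the two approaches discussed here can be sketched side by side. This is not the attached diff: the function names substitute_inline and substitute_helper and the sample field values are invented for illustration, and the real _substitute handles many more fields.

```python
def getint(s):
    """Helper form: return int(s), or s unchanged if it is not an integer."""
    try:
        return int(s)
    except ValueError:
        return s


def substitute_inline(fields):
    """Inlined form: a local alias for int plus a try/except per field."""
    getint_local = int  # local name lookup, as in the original getint = int
    result = []
    for field in fields:
        try:
            result.append(getint_local(field))
        except ValueError:
            result.append(field)
    return result


def substitute_helper(fields):
    """Helper form: every field goes through the module-level getint()."""
    return [getint(field) for field in fields]


sample = ['12', '??', '7']  # '??' stands for a field the event does not support
inline_result = substitute_inline(sample)
helper_result = substitute_helper(sample)
```

Both forms give the same values; the difference is only where the lookup and the exception handling happen.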


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 13:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only; hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 12:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 22:15:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 14:15:39 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18wUnb-0002WZ-00@sc8-sf-web4.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 19:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixing the failing conversions in _substitute. Use
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

>Comment By: Matthias Klose (doko)
Date: 2003-03-21 22:15

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 22:10

Message:
Logged In: YES 
user_id=40145

Hmmm, you are right.  Your approach will be quicker, due to
local namespace function lookup speed (try/except is fast in
non-exception path).

But, then again, a lot more exception paths will be executed
with the new Tk (with "??" fields), anyway, so the speed
issues may not be that important.


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 21:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only; hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 20:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 22:36:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 14:36:15 -0800
Subject: [Patches] [ python-Patches-702620 ] AE Inheritance fixes
Message-ID: <E18wV7X-0007r1-00@sc8-sf-web3.sourceforge.net>

Patches item #702620, was opened at 2003-03-12 15:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470

Category: Macintosh
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Inheritance fixes

Initial Comment:
A while ago, I submitted a patch that attempted to make modules generated by gensuitemodule inheritance aware. It was quite a hack, but it did the job. Some patches to cvs in the meantime have made this stop working for me. Here are my attempted fixes.

If for some reason there's some use case besides mine where this implementation doesn't work, I'd like to know about it so we can come up with an implementation that works everywhere :)

1) We don't ever want an _instance_ of ComponentItem to have a personal _propdict and _elemdict. They need to inherit these attributes from the class, which was set up in the __init__.py to have the correct entries. Thus, I moved the initialization of _propdict
and _elemdict out of __init__ and into the class definition.

2) getbaseclasses needs to look through the inheritance tree specified by _superclassnames and for each class in the tree, copy _privpropdict and _privelemdict to _propdict and _elemdict. Then, it needs to copy _propdict and _elemdict from each superclass into its own _propdict and _elemdict, where ComponentItem.__getattr__ will find them. Making these into flat dictionaries on each class that include all of the properties and elements from the superclasses greatly speeds up execution time, since only a single, non-recursive lookup is required, and the only recursion occurs at import time.

Here's a detailed description of what getbaseclasses does:

## v should be a class object.
## Why did I name it 'v'? :(

def getbaseclasses(v):

## Have we already set up the _propdict and _elemdict 
## for this class object? If so, don't do it again.

	if not v._propdict:

## This step is required so we get a fresh dictionary on
## this class object, and don't mutate the one on
## ComponentItem or one of our superclasses

		v._propdict = {}
		v._elemdict = {}

## Run through all of the strings in _superclassnames
## evaluating them to get a class object.

		for superclassname in getattr(v, '_superclassnames', []):
			superclass = eval(superclassname)

## Immediately recurse into getbaseclasses, so that
## the base class _propdict and _elemdict is set up
## properly before we copy its entries into ours.

			getbaseclasses(superclass)

## Copy all of the entries from this base class into
## our _propdict and _elemdict so that we get a flat
## dictionary of all of the elements and properties
## that should be available to instances of this class.

			v._propdict.update(getattr(superclass, '_propdict', {}))
			v._elemdict.update(getattr(superclass, '_elemdict', {}))

## Finally, copy those properties and elements that
## are defined directly on this class object in 
## _privpropdict and _privelemdict into the
## _propdict and _elemdict that
## ComponentItem.__getattr__ looks in.
## Note that if we entered getbaseclasses through the
## recursion above, our subclass will then copy our
## _propdict and _elemdict into its own after we exit
## the recursion, giving it a copy of all the properties 
## and elements defined on the superclass object.

		v._propdict.update(v._privpropdict)
		v._elemdict.update(v._privelemdict)

----------------------------------------------------------------------

>Comment By: Donovan Preston (dsposx)
Date: 2003-03-21 13:36

Message:
Logged In: YES 
user_id=111050

I am surprised I didn't have the same problem -- I should have. I suppose that's why I had the if hasattr in the first version of getbaseclasses. Changing

if not v._propdict:

to

if not getattr(v, '_propdict', None):

would probably work. I am going to PyCon next week, so I will have more free time to work on items not directly related to work. I'll do a fresh checkout and build of framework python, and experiment with the latest gensuitemodule etc.

Donovan

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-21 07:53

Message:
Logged In: YES 
user_id=45365

Donovan,
I checked your fixes in, but possibly a bit premature: things broke. For example, running findertools.py as main program (a simple test of the scripting infrastructure) will now fail for me, in getbaseclasses(writing_code). And that seems correct: writing_code is an NProperty, not a ComponentItem.

Before you fix things: please check out a fresh tree. I seriously hacked gensuitemodule after applying your mods (it can now run non-interactive on MacOSX).

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-18 09:00

Message:
Logged In: YES 
user_id=111050

Jack,
Thanks for taking a look at this.

You are correct, if a class has no properties then v._propdict will still be empty, and we will do unnecessary work the next time getbaseclasses is called. I suppose it could be "if not v._propdict and not v._elemdict:" which would reduce the unnecessary work down to when a base class has neither properties nor elements; frankly the if is not really required at all; it was just an attempt to prevent work that has already been performed from being performed again unnecessarily. Suggestions welcome.

Re _superclassnames, like everything else done with gensuitemodule, we need to be really careful about circular references, references to things that haven't been defined yet, etc. Everything generated by gensuitemodule is either a ComponentItem or an NProperty, and they don't actually inherit from each other in Python because doing so would be too hairy. So we can't use __bases__ because there is none :-)

The thing about _superclassnames is that it's just what it sounds like; a list of strings that indicate superclasses of the current class. By deferring getbaseclasses to import time, we ensure all of the base classes are defined by then.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-16 13:42

Message:
Logged In: YES 
user_id=45365

Donovan,
in as far as I understand the matter (in which area you are clearly my superior:-) I think the idea of the fix is correct, but I have one misgiving: if a class has no properties then v._propdict will still be empty after getbaseclasses(). This will result in the next call of getbaseclasses (if this class is the base class of another) going through the motions again.

Is this a problem?

Also, do we really need _superclassnames, can't we do this with __bases__? I vaguely remember we went through this issue before, but I can't remember fully...

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-12 15:08

Message:
Logged In: YES 
user_id=111050

Whoops. Have to click the checkbox.

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-12 15:08

Message:
Logged In: YES 
user_id=111050

Attaching diff.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 22:43:16 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 14:43:16 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wVEK-0003Sx-00@sc8-sf-web4.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 17:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: 
break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
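
The two passes can be illustrated in Python on a toy instruction list. This is a simplified model and not the C function in the patch: instructions are [opname, arg] pairs, every jump argument is an absolute list index instead of a byte offset, and the opcode names JUMP, BODY and RETURN are placeholders chosen for the sketch.

```python
def peephole(code):
    """Return an optimized copy of a toy instruction list.

    Each instruction is [opname, arg]; jump args are absolute list
    indexes (an assumption of this sketch, real bytecode uses offsets).
    """
    code = [list(ins) for ins in code]

    # Pass 1: LOAD_CONST 1, JUMP_IF_FALSE xx, POP_TOP is an always-true
    # test, so replace the first instruction with a jump past all three.
    for i in range(len(code) - 2):
        if (code[i] == ['LOAD_CONST', 1]
                and code[i + 1][0] == 'JUMP_IF_FALSE'
                and code[i + 2][0] == 'POP_TOP'):
            code[i] = ['JUMP', i + 3]

    # Pass 2: a jump that lands on an unconditional jump is retargeted
    # to that second jump's destination.
    for ins in code:
        if ins[0] in ('JUMP', 'JUMP_IF_FALSE'):
            target = code[ins[1]]
            if target[0] == 'JUMP':
                ins[1] = target[1]
    return code


# Toy equivalent of "while 1: <body>".
loop = [
    ['LOAD_CONST', 1],      # 0: loop test
    ['JUMP_IF_FALSE', 5],   # 1: exit if the test is false
    ['POP_TOP', None],      # 2: discard the test value
    ['BODY', None],         # 3: placeholder for the loop body
    ['JUMP', 0],            # 4: back to the loop test
    ['POP_TOP', None],      # 5: exit path
    ['RETURN', None],       # 6
]
optimized = peephole(loop)
```

As in the description above, the instruction count and order are unchanged; only the first instruction and the backward jump's target differ.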

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 14:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled; a list kept in sys that contains functions that get passed opcode about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up to writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 12:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 12:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows to pass the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 11:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".
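
The shadowing problem can be shown with any module-level name. In the sketch below FLAG is a hypothetical global standing in for True as it behaves in 2.3 (current Python versions make True a keyword, so it cannot be rebound there). The loop test is a run-time global lookup, so rebinding the name changes what the loop does, which is why the compiler cannot treat it as a constant.

```python
import dis

FLAG = 1  # hypothetical stand-in for a rebindable global such as True in 2.3


def spin_once():
    """Count how many times a 'while FLAG' loop body runs (at most once)."""
    count = 0
    while FLAG:
        count += 1
        break
    return count


# The loop test is compiled as a global lookup, not a constant load.
uses_global = any(
    ins.opname == 'LOAD_GLOBAL' and ins.argval == 'FLAG'
    for ins in dis.get_instructions(spin_once)
)

before = spin_once()  # FLAG is true, the body runs once
FLAG = 0              # rebinding the global after the function was compiled
after = spin_once()   # the same bytecode now skips the body
```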

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 00:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 00:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 23:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 18:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when -
O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 17:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 23:04:32 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 15:04:32 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wVYu-0004Dv-00@sc8-sf-web4.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
 38 JUMP_ABSOLUTE            3

and improving to:

  3 JUMP_FORWARD             4 (to 10)
 38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
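
The second pass described above (retargeting a jump that lands on an
unconditional jump) can be sketched on a toy instruction list. This is
only an illustration, not the code in the patch: the tuple layout, the
name thread_jumps, and the use of absolute targets for every jump are
assumptions made for the example.

```python
# Toy model: instructions are (offset, opname, arg) tuples, not real
# CPython bytecode. Every jump argument is an absolute target offset here.
UNCONDITIONAL = {"JUMP_ABSOLUTE"}
JUMPS = {"JUMP_ABSOLUTE", "JUMP_IF_FALSE"}


def thread_jumps(code):
    """Retarget any jump whose target is itself an unconditional jump."""
    by_offset = {offset: (op, arg) for offset, op, arg in code}
    result = []
    for offset, op, arg in code:
        if op in JUMPS:
            seen = {offset}
            # Follow chains of unconditional jumps, guarding against cycles.
            while (arg in by_offset
                   and by_offset[arg][0] in UNCONDITIONAL
                   and arg not in seen):
                seen.add(arg)
                arg = by_offset[arg][1]
        result.append((offset, op, arg))
    return result


code = [
    (0, "LOAD_FAST", 0),
    (3, "JUMP_ABSOLUTE", 10),   # stands in for the rewritten first instruction
    (6, "POP_TOP", None),
    (10, "LOAD_FAST", 0),
    (38, "JUMP_ABSOLUTE", 3),   # jumps to another unconditional jump
]
optimized = thread_jumps(code)
```

As in the description, the instruction count and order are unchanged;
only the jump at offset 38 gets a new target.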

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled: a list kept in sys that contains functions that get passed the opcodes about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up for writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here that allows passing the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Fri Mar 21 23:50:51 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 15:50:51 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18wWHj-0007cZ-00@sc8-sf-web1.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 17:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
 38 JUMP_ABSOLUTE            3

and improving to:

  3 JUMP_FORWARD             4 (to 10)
 38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 15:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 14:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled: a list kept in sys that contains functions that get passed the opcodes about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up for writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 12:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 12:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here that allows passing the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 11:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 00:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 00:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 23:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 18:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 17:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Sat Mar 22 04:08:59 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 20:08:59 -0800
Subject: [Patches] [ python-Patches-707900 ] bug fix 702858: deepcopying reflexive objects
Message-ID: <E18waJX-00049G-00@sc8-sf-web4.sourceforge.net>

Patches item #707900, was opened at 2003-03-21 21:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707900&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Steven Taschuk (staschuk)
Assigned to: Nobody/Anonymous (nobody)
Summary: bug fix 702858: deepcopying reflexive objects

Initial Comment:
A fix for bug 702858, which concerns the inability of
copy.deepcopy to correctly process reflexive
new-style class instances, that is, instances referring
to themselves.

The fix is one line; the other 51 lines in the patch
are altered and enhanced tests in
test_copy.py for this kind of thing.
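
For readers unfamiliar with the term, a minimal sketch of a reflexive
instance follows. The class name Node and the attribute name me are
made up for illustration and are not taken from the patch or from
test_copy.py.

```python
import copy


class Node(object):
    """A new-style class whose instances may refer to themselves."""
    pass


original = Node()
original.me = original          # reflexive reference

clone = copy.deepcopy(original)

# A correct deep copy preserves the cycle inside the clone
# instead of pointing back at the original or recursing forever.
assert clone is not original
assert clone.me is clone
```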

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707900&group_id=5470


From noreply@sourceforge.net  Sat Mar 22 07:26:01 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 21 Mar 2003 23:26:01 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18wdOD-0001yK-00@sc8-sf-web1.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 19:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixes the failing conversions in _substitute by using
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

>Comment By: Matthias Klose (doko)
Date: 2003-03-22 07:26

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 22:15

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 22:10

Message:
Logged In: YES 
user_id=40145

Hmmm, you are right.  Your approach will be quicker, due to
local namespace function lookup speed (try/except is fast in
non-exception path).

But, then again, a lot more exception paths will be executed
with the new Tk (with "??" fields), anyway, so the speed
issues may not be that important.


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 21:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only; hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 20:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Sat Mar 22 13:34:34 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 22 Mar 2003 05:34:34 -0800
Subject: [Patches] [ python-Patches-708007 ] TelnetPopen3, TelnetBase, Expect split
Message-ID: <E18wj8s-0003IM-00@sc8-sf-web1.sourceforge.net>

Patches item #708007, was opened at 2003-03-22 13:34
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708007&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Luke Kenneth Casson Leighton (lkcl)
Assigned to: Nobody/Anonymous (nobody)
Summary: TelnetPopen3, TelnetBase, Expect split

Initial Comment:
A reordering / code-split of Telnet in telnetlib.py
into Expect (the lowest base class), TelnetBase, Telnet
and TelnetPopen4.

Reason: Expect contains all of the read_xxx(),
expect(), write() and select() functions (and the
interact() and mt_interact())

TelnetPopen4 and Telnet derive from the same TelnetBase
class, and there is nothing stopping anyone from
writing a TelnetHTTP or TelnetURL class which will all
have the same interface: expect() and write() and even
interact()!

weird, huh - typing in URLs and getting the content
back, interactively :)

these TelnetXXX classes are all incredibly useful for
"remote host management" purposes; also the principle
of the TelnetHTTP class is very useful for doing
automated testing of web sites.  send URL, expect text
in it before proceeding with next URL (e.g. login,
check to see if login failed or succeeded; react
accordingly).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708007&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 00:15:12 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 22 Mar 2003 16:15:12 -0800
Subject: [Patches] [ python-Patches-708201 ] unchecked return value in import.c
Message-ID: <E18wt8q-0006ay-00@sc8-sf-web3.sourceforge.net>

Patches item #708201, was opened at 2003-03-22 17:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708201&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Harper (jasonharper)
Assigned to: Nobody/Anonymous (nobody)
Summary: unchecked return value in import.c

Initial Comment:
In Python/import.c, routine PyImport_ImportModule, a 
call to PyString_AsString is not checked for errors.  A 
possibly NULL return value gets passed to another 
routine, and DECREFed.  It's not a particularly likely 
place for an error to occur, but I did manage to get a 
MemoryError at exactly that point, resulting in a Python 
crash.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708201&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 11:59:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 03:59:55 -0800
Subject: [Patches] [ python-Patches-612627 ] Allow more Unicode on sys.stdout
Message-ID: <E18x48p-0004Xr-00@sc8-sf-web2.sourceforge.net>

Patches item #612627, was opened at 2002-09-21 22:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=612627&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Allow more Unicode on sys.stdout

Initial Comment:
This patch extends the set of Unicode strings that can
be printed to sys.stdout, to support all strings that
the terminal will likely support. It also adds an
encoding attribute to sys.std{in,out}.

To do that:
- it adds a .encoding attribute to all file objects,
which is normally None
- initializes the encoding of sys.stdin and sys.stdout
if either is a terminal.
- adds a wrapper object around sys.stdout in site.py
that encodes all Unicode objects according to the
detected encoding, if that encoding is known to Python

To find the encoding of the terminal, it
- uses GetConsoleCP and GetConsoleOutputCP on Windows,
- uses nl_langinfo(CODESET) on Unix, if available.

The primary rationale for this change is that people
should be able to print Unicode in an interactive
session. A parallel change needs to be added for IDLE,
so that it adds the .encoding attribute to the emulated
stdout (it already supports printing of Unicode on stdout).
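
The wrapper idea in the list above can be sketched roughly as follows.
This is written in present-day Python 3 syntax (where str is the
Unicode type) rather than against Python 2.3, and the class name
EncodingWriter is hypothetical; it is not the wrapper from the patch.

```python
import io


class EncodingWriter:
    """Hypothetical wrapper: encodes text before writing to a byte stream."""

    def __init__(self, raw, encoding):
        self.raw = raw
        self.encoding = encoding   # mirrors the proposed .encoding attribute

    def write(self, text):
        # Encode Unicode text with the detected encoding; pass bytes through.
        if isinstance(text, str):
            text = text.encode(self.encoding, "replace")
        return self.raw.write(text)

    def __getattr__(self, name):
        # Delegate everything else (flush, close, ...) to the wrapped stream.
        return getattr(self.raw, name)


buffer = io.BytesIO()              # stands in for the terminal's byte stream
out = EncodingWriter(buffer, "utf-8")
out.write("L\u00f6wis\n")
written = buffer.getvalue()
```

With a terminal encoding that cannot represent a character, the
"replace" error handler used here substitutes a placeholder instead of
raising; whether that is the right policy is a design choice the sketch
does not settle.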

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 12:59

Message:
Logged In: YES 
user_id=21627

Is the patch now acceptable?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-10-26 19:47

Message:
Logged In: YES 
user_id=21627

I've attached a revised version which implements your
proposal; this version works without modification of site.py.

In its current form, the file encoding is only applied in
print; for sys.stdout.write, it is ignored. For print, it is
applied independent of whether this is a script or
interactive mode.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2002-10-25 14:09

Message:
Logged In: YES 
user_id=38388

I think it could work by adding a special case to 
PyFile_WriteObject() instead of calling PyObject_Print().
You first encode the Unicode object and then let
PyFile_WriteString() take care of the writing to the
FILE* object.

I see no other way, since you can't place the .encoding 
information into the FILE* object.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-09-24 11:02

Message:
Logged In: YES 
user_id=21627

I have considered implementing it in the file object.
However, it becomes quite involved, and heavy C code:
PyFile_WriteObject calls PyObject_Print. Since Unicode does
not implement a tp_print, this calls str/repr, which
converts using the default encoding.

It is not clear at which point the file encoding should be
taken into account.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-09-24 10:10

Message:
Logged In: NO 

I like the .encoding concept. 

I don't really like the sys.stdout wrapper. Wouldn't it be 
better to add the functionality to the file object .write() and 
.writelines() methods and then only use the wrapper in case 
sys.stdout is not a true file object ?


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=612627&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 12:07:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 04:07:11 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18x4Fr-0001ee-00@sc8-sf-web4.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 20:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
>Assigned to: Martin v. Löwis (loewis)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixes the failing conversions in _substitute by using
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 13:07

Message:
Logged In: YES 
user_id=21627

What is the problem that this patch solves?

----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-22 08:26

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 23:15

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 23:10

Message:
Logged In: YES 
user_id=40145

Hmmm, you are right.  Your approach will be quicker, due to
local namespace function lookup speed (try/except is fast in
non-exception path).

But, then again, a lot more exception paths will be executed
with the new Tk (with "??" fields), anyway, so the speed
issues may not be that important.


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 22:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only; hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 21:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 13:35:58 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 05:35:58 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18x5dm-00064z-00@sc8-sf-web4.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 19:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Martin v. Löwis (loewis)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixes the failing conversions in _substitute by using
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

>Comment By: Matthias Klose (doko)
Date: 2003-03-23 13:35

Message:
Logged In: YES 
user_id=60903

> What is the problem that this patch solves?

As the subject says: Provide a patch for #698517.

For undefined event fields, tk8.4.2 returns empty strings or
'??' strings, on which the int conversions fail.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 12:07

Message:
Logged In: YES 
user_id=21627

What is the problem that this patch solves?

----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-22 07:26

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 22:15

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 22:10

Message:
Logged In: YES 
user_id=40145

Hmmm, you are right.  Your approach will be quicker, due to
local namespace function lookup speed (try/except is fast in
non-exception path).

But, then again, a lot more exception paths will be executed
with the new Tk (with "??" fields), anyway, so the speed
issues may not be that important.


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 21:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only; hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 20:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 14:33:46 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 06:33:46 -0800
Subject: [Patches] [ python-Patches-708374 ] add offset to mmap
Message-ID: <E18x6Xi-0007m0-00@sc8-sf-web4.sourceforge.net>

Patches item #708374, was opened at 2003-03-23 09:33
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708374&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Nobody/Anonymous (nobody)
Summary: add offset to mmap

Initial Comment:
This patch is from Yotam Medini <yotamm at
mellanox.co.il> sent to me in mail.

It adds support for the offset parameter to mmap.

It ignores the check for mmap size "if the file is
character device.  Some device drivers (which I happen
to use) have zero size in fstat buffer, but still one
can seek() read() and tell()."
I added minimal doc and tests.
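
As a rough illustration of what an offset parameter allows, the
sketch below maps only the second allocation-sized chunk of a file.
It uses the offset keyword of the current standard-library
mmap.mmap, which is an assumption here: the signature in the
attached patch may differ.  The helper name read_at_offset and the
temporary file are hypothetical.

```python
import mmap
import os
import tempfile

def read_at_offset(path, offset, length):
    """Map `length` bytes of `path` starting at `offset` and return them."""
    # offset must be a multiple of mmap.ALLOCATIONGRANULARITY.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), length,
                       access=mmap.ACCESS_READ, offset=offset) as m:
            return m[:length]

gran = mmap.ALLOCATIONGRANULARITY

# Build a throwaway file: one chunk of "a" followed by one chunk of "b".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a" * gran + b"b" * gran)
    demo_path = tmp.name

second_chunk = read_at_offset(demo_path, gran, 16)
os.remove(demo_path)
print(second_chunk)
```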

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708374&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 14:46:08 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 06:46:08 -0800
Subject: [Patches] [ python-Patches-708201 ] unchecked return value in import.c
Message-ID: <E18x6jg-00032d-00@sc8-sf-web2.sourceforge.net>

Patches item #708201, was opened at 2003-03-22 19:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708201&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Jason Harper (jasonharper)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: unchecked return value in import.c

Initial Comment:
In Python/import.c, routine PyImport_ImportModule, a 
call to PyString_AsString is not checked for errors.  A 
possibly NULL return value gets passed to another 
routine, and DECREFed.  It's not a particularly likely 
place for an error to occur, but I did manage to get a 
MemoryError at exactly that point, resulting in a Python 
crash.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 09:46

Message:
Logged In: YES 
user_id=33168

Thanks!

Checked in as: Python/import.c 2.220 and 2.192.6.4

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708201&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 15:23:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 07:23:39 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18x7Jz-0004Lf-00@sc8-sf-web2.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
>Assigned to: Tim Peters (tim_one)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.
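
The two passes described above can be sketched on a toy
instruction list.  This is not the code from the patch: the
(op, arg) tuples, the single "JUMP" opcode, and the use of list
indexes as jump targets (instead of byte offsets) are all
simplifying assumptions, and any truthy constant is treated like
the constant 1.

```python
def optimize(code, consts):
    """Return a peephole-optimized copy of `code`, a list of (op, arg) tuples.

    Toy model: jump arguments are absolute indexes into the list.
    """
    out = list(code)
    # Pass 1: LOAD_CONST <true constant>, JUMP_IF_FALSE, POP_TOP
    # -> replace the first instruction with a jump past all three.
    for i in range(len(out) - 2):
        op, arg = out[i]
        if (op == "LOAD_CONST" and consts[arg]
                and out[i + 1][0] == "JUMP_IF_FALSE"
                and out[i + 2][0] == "POP_TOP"):
            out[i] = ("JUMP", i + 3)
    # Pass 2: a jump that lands on an unconditional jump is
    # retargeted to that jump's own target.
    for i, (op, arg) in enumerate(out):
        if op in ("JUMP", "JUMP_IF_FALSE") and out[arg][0] == "JUMP":
            out[i] = (op, out[arg][1])
    return out

# Toy encoding of: while 1: <body>
program = [
    ("LOAD_CONST", 0),      # 0: push consts[0]
    ("JUMP_IF_FALSE", 5),   # 1: leave the loop if false
    ("POP_TOP", None),      # 2
    ("BODY", None),         # 3: stands in for the loop body
    ("JUMP", 0),            # 4: back to the loop test
    ("POP_TOP", None),      # 5
    ("RETURN", None),       # 6
]
optimized = optimize(program, consts=[1])
for index, instr in enumerate(optimized):
    print(index, instr)
```

The list keeps its length and order, mirroring the stated goal of
not changing bytecode size; the test sequence simply becomes
unreachable.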

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"
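
The same two measurements can be sketched with the timeit module's
Python API.  The repetition count N is a hypothetical value chosen
only to keep the run short, and no particular timings are implied.

```python
import timeit

# Hypothetical repetition count, not taken from the original report.
N = 100_000

simple = timeit.timeit("while 1: break", number=N)
branchy = timeit.timeit(
    "while 1:\n    if i == 1: break\n    else: i = 1",
    setup="i = 0",
    number=N,
)
print(f"while 1: break        {simple:.4f}s")
print(f"while 1 with branch   {branchy:.4f}s")
```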


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
 38 JUMP_ABSOLUTE            3

and improving to:

  3 JUMP_FORWARD             4 (to 10)
 38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into its
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARG.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.

Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming.
I'm a little (very little) concerned about the speed penalty
for compiling.  I realize this is a one-time (at most) cost,
so it's almost definitely insignificant.

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.
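
To make the 64k concern above concrete, here is a minimal sketch
of a guarded argument store.  It assumes the three-byte
instruction layout of that era (one opcode byte followed by a
16-bit little-endian argument); the helper names get_arg and
set_arg are hypothetical and only loosely modelled on the
GETARG/SETARG macros mentioned in the review, and the opcode value
is a placeholder.

```python
MAX_ARG = 0xFFFF  # largest value that fits in the two argument bytes

def get_arg(code, i):
    """Read the 16-bit little-endian argument of the instruction at offset i."""
    return code[i + 1] | (code[i + 2] << 8)

def set_arg(code, i, value):
    """Store value as the argument at offset i.

    Returns False and leaves the bytes untouched when the value does
    not fit, since that case would need an EXTENDED_ARG prefix.
    """
    if not 0 <= value <= MAX_ARG:
        return False
    code[i + 1] = value & 0xFF
    code[i + 2] = value >> 8
    return True

# Placeholder opcode byte followed by a jump target of 3.
code = bytearray([113, 3, 0])
fits = set_arg(code, 0, 10)         # small target: stored
too_big = set_arg(code, 0, 70000)   # does not fit: refused
print(fits, too_big, get_arg(code, 0))
```

Bailing out in the oversized case keeps the optimization safe at
the cost of skipping a rare opportunity.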

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 18:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled; a list kept in sys that contains functions that get passed opcode about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up to writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows passing the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 15:35:19 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 07:35:19 -0800
Subject: [Patches] [ python-Patches-695710 ] fix bug 678519: cStringIO self iterator
Message-ID: <E18x7VH-0005Uz-00@sc8-sf-web1.sourceforge.net>

Patches item #695710, was opened at 2003-03-01 14:49
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
>Assigned to: Nobody/Anonymous (nobody)
Summary: fix bug 678519: cStringIO self iterator

Initial Comment:

StringIO.StringIO already appears to be
a self-iterator.  This patch makes cStringIO.StringIO
a self-iterator as well.

It also does a tiny bit of cleanup to cStringIO.
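
For readers unfamiliar with the term, a self-iterator is an object
whose iterator is the object itself, as with file objects.  The
sketch below shows the idea in pure Python using the current
iterator protocol; the class name LineSource is hypothetical and
this is not the C code from the patch.

```python
import io

class LineSource:
    """Toy self-iterator: iter(obj) returns obj, each step yields one line."""

    def __init__(self, text):
        self._buf = io.StringIO(text)

    def __iter__(self):
        # Returning self is what makes this a self-iterator.
        return self

    def __next__(self):
        line = self._buf.readline()
        if not line:
            raise StopIteration
        return line

src = LineSource("one\ntwo\n")
same_object = iter(src) is src
lines = list(src)
print(same_object, lines)
```

One consequence of this design is that iteration state is shared:
a second loop over the same object continues where the first one
stopped instead of starting over.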


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-19 16:27

Message:
Logged In: YES 
user_id=33168

I don't think it's impolite.  I'll try to take a look later,
unless someone beats me to it. :-)

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-19 16:17

Message:
Logged In: YES 
user_id=670441

Okay, patchstrio4 uses PyObject_SelfIter and
doesn't have as much of my prettification, so
there aren't any whitespace-only diff lines. (I think)
Should I assign this patch to either Neal or MvL
for further review, or would that be impolite?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-19 14:54

Message:
Logged In: YES 
user_id=80475

It looks good to me, compiles okay, passes tests, etc.  I do 
prefer that you get one more reviewer to look at it.  Neal or 
MvL might be a good choice.

GvR picked PyObject_SelfIter to be the name of the 
iterator's tp_iter slot filler.  So you can go ahead and use it 
to eliminate IO_getiter.

One nit: when you load the next patch, copy in the 
unchanged lines from the original.  There are many lines 
marked as having a change but the content is the same.  
This means that something changed in the whitespace.  
It's not a big deal but it makes the patch harder to review.


----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-19 13:39

Message:
Logged In: YES 
user_id=670441

I'm not sure I understand your concern with the new
tp_iter slot, it just makes cStringIO a self iterator
as requested on python-dev, going for
the analogy with file objects, right?  Actually it should
probably use the still-being-debated GenericGetIter
or whatever it will be called, but not until the debate is
over.

I think the get/setattrs are okay.  Everything they
did is done by the default get/set attrs, once we
set up the appropriate methods and members
(there's just the one member, softspace).  I thought
replacing them by the defaults would be clearer
and easier to maintain.  Also, it is in analogy with
fileobject.c, so I thought making the cStringIO
implementation more like file's would be good.

As for creating a new tuple every time and
the 0,0,0,0 style, you're absolutely right.  I've attached
a new patch that fixes those up per your suggestions.
I was creating a new tuple every time in analogy
with iterobject.c's calliter_iternext.  Perhaps that
should be changed as well?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-17 19:55

Message:
Logged In: YES 
user_id=80475

I'm going to unassign this one because the patch makes 
me uncomfortable.  The tp_iter slot was already filled in a 
way that is reasonable and the new code doesn't seem to 
be an improvement.

If you go ahead with it, carefully consider whether some 
negative effects can arise from eliminating the 
get/setattrs.  Also, the call to readline should avoid 
creating a new empty tuple on each call (either make a 
single one and re-use it every time or alter readline to 
accept a NULL for args).

The 0,0,0,0,0,0,0 style in the type definition should be 
spelled-out line by line so that it is maintainable and is 
consistent with other modules.

All that being said, the test cases were nice and code runs 
flawlessly.  

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-11 21:35

Message:
Logged In: YES 
user_id=670441

I prefer that too, but I can't attach patches to
existing bug reports in sourceforge, only
to bug reports or patches I open myself.
Nor can I delete patches I have attached
if I don't like them.

Actually, the advice I read somewhere or
other (python.org developer faq?) recommends
opening a separate patch all the time, but
I'd rather be able to put them with the bug reports.

I used to paste patches directly into the text
of a message, but this is only good for extremely
short patches on sourceforge.  When doing that
I noticed that patches for old bugs that haven't
been discussed in a few months tend to get ignored,
which is another plus for opening a separate patch.
(There seem to be several very old bugs that
have solutions attached, or whose discussion indicates they
should be closed.)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-11 20:44

Message:
Logged In: YES 
user_id=80475

I don't know about the other reviewers but I prefer that the 
patches be attached to the original bug instead of to a new 
patch tracker item on SF.  This makes it easier to follow the 
dialogue on this issue.

----------------------------------------------------------------------

Comment By: Michael Stone (mbrierst)
Date: 2003-03-05 17:16

Message:
Logged In: YES 
user_id=670441

patchcstrio2 is a better version, more
cleaned up.  Use it instead.


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695710&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 15:37:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 07:37:17 -0800
Subject: [Patches] [ python-Patches-708374 ] add offset to mmap
Message-ID: <E18x7XB-0005ZK-00@sc8-sf-web1.sourceforge.net>

Patches item #708374, was opened at 2003-03-23 09:33
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708374&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Nobody/Anonymous (nobody)
Summary: add offset to mmap

Initial Comment:
This patch is from Yotam Medini <yotamm at
mellanox.co.il> sent to me in mail.

It adds support for the offset parameter to mmap.

It ignores the check for mmap size "if the file is
character device.  Some device drivers (which I happen
to use) have zero size in fstat buffer, but still one
can seek() read() and tell()."
I added minimal doc and tests.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:37

Message:
Logged In: YES 
user_id=33168

Email received from Yotam:

I have downloaded and patched the 2.3a source, compiled
just this module locally, and it worked fine for my
application (with an offset for a character device file).
I did not run the released test, though.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708374&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 20:01:46 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 12:01:46 -0800
Subject: [Patches] [ python-Patches-708495 ] OpenVMS complementary patches
Message-ID: <E18xBf8-0004Ra-00@sc8-sf-web2.sourceforge.net>

Patches item #708495, was opened at 2003-03-23 21:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Piéronne Jean-François (pieronne)
Assigned to: Nobody/Anonymous (nobody)
Summary: OpenVMS complementary patches

Initial Comment:
Explanations of the various patches:

fcntlmodule.c
Under VMS the third argument is declared as void *

expat.h
The VMS C compiler can optionally mangle names longer
than 31 characters, so it is not necessary to change
long names

fileobject.c
As the comment indicates, this solves a problem in
test_fileinput, but I don't understand why...

fpectlmodule.c
Enable the SIGFPE handler

import.c
Support for the VMS filesystem ODS-5

mmapmodule.c
VMS needs an fsync before a call to fstat for it to return
accurate information

myreadline.c
Use vms__StdioReadline

posixmodule.c
I have moved some of the initialisation to a VMS-specific
file, so I have removed it from posixmodule.c

pyexpat.c
Convert VMS filename to a UNIX style filename.

socketmodule.c
This patch is the only one which is not delimited by
#ifdef __VMS
#endif
because IMHO it fixes a bug in the original code


socketmodule.h
Needs to include socket.h and not sys/socket.h

sysmodule.c
Convert VMS filename to a UNIX style filename.


Regards,

Jean-François

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 20:04:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 12:04:15 -0800
Subject: [Patches] [ python-Patches-708495 ] OpenVMS complementary patches
Message-ID: <E18xBhX-0004XL-00@sc8-sf-web2.sourceforge.net>

Patches item #708495, was opened at 2003-03-23 21:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Piéronne Jean-François (pieronne)
>Assigned to: Martin v. Löwis (loewis)
Summary: OpenVMS complementary patches

Initial Comment:
Explanations of the various patches:

fcntlmodule.c
Under VMS the third argument is declared as void *

expat.h
The VMS C compiler can optionally mangle names longer
than 31 characters, so it is not necessary to change
long names

fileobject.c
As the comment indicates, this solves a problem in
test_fileinput, but I don't understand why...

fpectlmodule.c
Enable the SIGFPE handler

import.c
Support for the VMS filesystem ODS-5

mmapmodule.c
VMS needs an fsync before a call to fstat for it to return
accurate information

myreadline.c
Use vms__StdioReadline

posixmodule.c
I have moved some of the initialisation to a VMS-specific
file, so I have removed it from posixmodule.c

pyexpat.c
Convert VMS filename to a UNIX style filename.

socketmodule.c
This patch is the only one which is not delimited by
#ifdef __VMS
#endif
because IMHO it fixes a bug in the original code


socketmodule.h
Needs to include socket.h and not sys/socket.h

sysmodule.c
Convert VMS filename to a UNIX style filename.


Regards,

Jean-François

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 20:09:13 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 12:09:13 -0800
Subject: [Patches] [ python-Patches-708495 ] OpenVMS complementary patches
Message-ID: <E18xBmL-0004ia-00@sc8-sf-web2.sourceforge.net>

Patches item #708495, was opened at 2003-03-23 21:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
>Priority: 9
Submitted By: Piéronne Jean-François (pieronne)
Assigned to: Martin v. Löwis (loewis)
Summary: OpenVMS complementary patches

Initial Comment:
Explanations of the various patches:

fcntlmodule.c
Under VMS the third argument is declared as void *

expat.h
The VMS C compiler can optionally mangle names longer
than 31 characters, so it is not necessary to change
long names

fileobject.c
As the comment indicates, this solves a problem in
test_fileinput, but I don't understand why...

fpectlmodule.c
Enable the SIGFPE handler

import.c
Support for the VMS filesystem ODS-5

mmapmodule.c
VMS needs an fsync before a call to fstat for it to return
accurate information

myreadline.c
Use vms__StdioReadline

posixmodule.c
I have moved some of the initialisation to a VMS-specific
file, so I have removed it from posixmodule.c

pyexpat.c
Convert VMS filename to a UNIX style filename.

socketmodule.c
This patch is the only one which is not delimited by
#ifdef __VMS
#endif
because IMHO it fixes a bug in the original code


socketmodule.h
Needs to include socket.h and not sys/socket.h

sysmodule.c
Convert VMS filename to a UNIX style filename.


Regards,

Jean-François

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 20:10:13 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 12:10:13 -0800
Subject: [Patches] [ python-Patches-708495 ] OpenVMS complementary patches
Message-ID: <E18xBnJ-0004jw-00@sc8-sf-web2.sourceforge.net>

Patches item #708495, was opened at 2003-03-23 21:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
>Priority: 5
Submitted By: Pi�ronne Jean-Fran�ois (pieronne)
Assigned to: Martin v. L�wis (loewis)
Summary: OpenVMS complementary patches

Initial Comment:
Explanations of the various patches:

fcntlmodule.c
Under VMS the third argument is declared as void *

expat.h
The VMS C compiler can optionally mangle names longer
than 31 characters, so it is not necessary to change
long names.

fileobject.c
As the comment indicates, this solves a problem in
test_fileinput, but I don't understand why...

fpectlmodule.c
Enable SIGFPE handler

import.c
Support of VMS filesystem ODS-5

mmapmodule.c
VMS needs an fsync before a call to fstat so that fstat
returns accurate information.

myreadline.c
Use of vms__StdioReadline

posixmodule.c
I have moved some of the initialisation to a
VMS-specific file, so I have removed it from posixmodule.c.

pyexpat.c
Convert VMS filename to a UNIX style filename.

socketmodule.c
This patch is the only one which is not delimited by
#ifdef __VMS
#endif
because IMHO it fixes a bug in the original code


socketmodule.h
Needs to include socket.h and not sys/socket.h

sysmodule.c
Convert VMS filename to a UNIX style filename.


Regards,

Jean-François

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 20:28:49 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 12:28:49 -0800
Subject: [Patches] [ python-Patches-708495 ] OpenVMS complementary patches
Message-ID: <E18xC5J-0005OL-00@sc8-sf-web2.sourceforge.net>

Patches item #708495, was opened at 2003-03-23 21:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Piéronne Jean-François (pieronne)
Assigned to: Martin v. Löwis (loewis)
Summary: OpenVMS complementary patches

Initial Comment:
Explanations of the various patches:

fcntlmodule.c
Under VMS the third argument is declared as void *

expat.h
The VMS C compiler can optionally mangle names longer
than 31 characters, so it is not necessary to change
long names.

fileobject.c
As the comment indicates, this solves a problem in
test_fileinput, but I don't understand why...

fpectlmodule.c
Enable SIGFPE handler

import.c
Support of VMS filesystem ODS-5

mmapmodule.c
VMS needs an fsync before a call to fstat so that fstat
returns accurate information.

myreadline.c
Use of vms__StdioReadline

posixmodule.c
I have moved some of the initialisation to a
VMS-specific file, so I have removed it from posixmodule.c.

pyexpat.c
Convert VMS filename to a UNIX style filename.

socketmodule.c
This patch is the only one which is not delimited by
#ifdef __VMS
#endif
because IMHO it fixes a bug in the original code


socketmodule.h
Needs to include socket.h and not sys/socket.h

sysmodule.c
Convert VMS filename to a UNIX style filename.


Regards,

Jean-François

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 21:28

Message:
Logged In: YES 
user_id=21627

Can you please combine the patches into a single patch
that can be applied using "patch -p0"? You can use
"diff -ur" or "cvs diff" to create a recursive patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 22:20:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 14:20:21 -0800
Subject: [Patches] [ python-Patches-702620 ] AE Inheritance fixes
Message-ID: <E18xDpF-00013h-00@sc8-sf-web2.sourceforge.net>

Patches item #702620, was opened at 2003-03-13 01:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470

Category: Macintosh
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Donovan Preston (dsposx)
Assigned to: Jack Jansen (jackjansen)
Summary: AE Inheritance fixes

Initial Comment:
A while ago, I submitted a patch that attempted to make modules generated by gensuitemodule inheritance aware. It was quite a hack, but it did the job. Some patches to cvs in the meantime have made this stop working for me. Here are my attempted fixes.

If for some reason there's some use case besides mine where this implementation doesn't work, I'd like to know about it so we can come up with an implementation that works everywhere :)

1) We don't ever want an _instance_ of ComponentItem to have a personal _propdict and _elemdict. They need to inherit these attributes from the class, which was set up in the __init__.py to have the correct entries. Thus, I moved the initialization of _propdict
and _elemdict out of __init__ and into the class definition.

2) getbaseclasses needs to look through the inheritance tree specified by _superclassnames and for each class in the tree, copy _privpropdict and _privelemdict to _propdict and _elemdict. Then, it needs to copy _propdict and _elemdict from each superclass into its own _propdict and _elemdict, where ComponentItem.__getattr__ will find it. Making these into flat dictionaries on each class that include all of the properties and elements from the superclasses greatly speeds up execution time, since only a single, non-recursive lookup is required, and the only recursion occurs at import time.

Here's a detailed description of what getbaseclasses does:

## v should be a class object.
## Why did I name it 'v'? :(

def getbaseclasses(v):

## Have we already set up the _propdict and _elemdict 
## for this class object? If so, don't do it again.

	if not v._propdict:

## This step is required so we get a fresh dictionary on
## this class object, and don't mutate the one on
## ComponentItem or one of our superclasses

		v._propdict = {}
		v._elemdict = {}

## Run through all of the strings in _superclassnames
## evaluating them to get a class object.

		for superclassname in getattr(v, '_superclassnames', []):
			superclass = eval(superclassname)

## Immediately recurse into getbaseclasses, so that
## the base class _propdict and _elemdict are set up
## properly before we copy its entries into ours.

			getbaseclasses(superclass)

## Copy all of the entries from this base class into
## our _propdict and _elemdict so that we get a flat
## dictionary of all of the elements and properties
## that should be available to instances of this class.

			v._propdict.update(getattr(superclass, '_propdict', {}))
			v._elemdict.update(getattr(superclass, '_elemdict', {}))

## Finally, copy those properties and elements that
## are defined directly on this class object in 
## _privpropdict and _privelemdict into the
## _propdict and _elemdict that
## ComponentItem.__getattr__ looks in.
## Note that if we entered getbaseclasses through the
## recursion above, our subclass will then copy our
## _propdict and _elemdict into its own after we exit
## the recursion, giving it a copy of all the properties 
## and elements defined on the superclass object.

		v._propdict.update(v._privpropdict)
		v._elemdict.update(v._privelemdict)
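
For readers who want to run the flattening step in isolation, here is a
self-contained sketch. The classes Item and FileItem, their dictionary
contents, and the explicit namespace argument (used here instead of
eval) are made up for illustration; only the body of getbaseclasses
follows the description above.

```python
class ComponentItem:
    # Shared class-level defaults; getbaseclasses gives each subclass
    # its own dictionaries instead of mutating these.
    _propdict = {}
    _elemdict = {}
    _privpropdict = {}
    _privelemdict = {}

class Item(ComponentItem):                 # hypothetical generated class
    _privpropdict = {"name": "prop-name"}

class FileItem(ComponentItem):             # hypothetical generated class
    _superclassnames = ["Item"]            # superclasses are named by string
    _privpropdict = {"size": "prop-size"}

def getbaseclasses(v, namespace):
    if not v._propdict:
        v._propdict = {}                   # fresh dictionaries for this class
        v._elemdict = {}
        for superclassname in getattr(v, "_superclassnames", []):
            superclass = namespace[superclassname]   # stand-in for eval()
            getbaseclasses(superclass, namespace)    # set up the base first
            v._propdict.update(getattr(superclass, "_propdict", {}))
            v._elemdict.update(getattr(superclass, "_elemdict", {}))
        v._propdict.update(v._privpropdict)          # then add our own entries
        v._elemdict.update(v._privelemdict)

getbaseclasses(FileItem, {"Item": Item, "FileItem": FileItem})
```

After the call, FileItem._propdict holds both its own entry and the one
copied from Item, while ComponentItem._propdict is still empty.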

----------------------------------------------------------------------

>Comment By: Jack Jansen (jackjansen)
Date: 2003-03-23 23:20

Message:
Logged In: YES 
user_id=45365

That fixed it, with a similar fix for _privpropdict.

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-21 23:36

Message:
Logged In: YES 
user_id=111050

I am surprised I didn't have the same problem -- I should have. I suppose that's why I had the if hasattr in the first version of getbaseclasses. Changing

if not v._propdict:

to

if not getattr(v, '_propdict', None):

would probably work. I am going to PyCon next week, so I will have more free time to work on items not directly related to work. I'll do a fresh checkout and build of framework python, and experiment with the latest gensuitemodule etc.
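
A small sketch of the difference between the two guards. The two
classes are empty stand-ins, not the real aetypes classes, and the
helper names are hypothetical.

```python
class NProperty:
    """Stand-in for a generated property class with no _propdict attribute."""

class ComponentItem:
    """Stand-in for the element base class, which does define _propdict."""
    _propdict = {}

def needs_setup_strict(v):
    return not v._propdict                      # raises AttributeError for NProperty

def needs_setup_tolerant(v):
    return not getattr(v, "_propdict", None)    # a missing attribute counts as empty

try:
    needs_setup_strict(NProperty)
    strict_failed = False
except AttributeError:
    strict_failed = True
```

The tolerant form only gets past the guard; any later access to
attributes that an NProperty lacks needs the same treatment.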

Donovan

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-21 17:53

Message:
Logged In: YES 
user_id=45365

Donovan,
I checked your fixes in, but possibly a bit prematurely: things broke. For example, running findertools.py as the main program (a simple test of the scripting infrastructure) will now fail for me, in getbaseclasses(writing_code). And that seems correct: writing_code is an NProperty, not a ComponentItem.

Before you fix things: please check out a fresh tree. I seriously hacked gensuitemodule after applying your mods (it can now run non-interactive on MacOSX).

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-18 19:00

Message:
Logged In: YES 
user_id=111050

Jack,
Thanks for taking a look at this.

You are correct, if a class has no properties then v._propdict will still be empty, and we will do unnecessary work the next time getbaseclasses is called. I suppose it could be "if not v._propdict and not v._elemdict:" which would reduce the unnecessary work down to when a base class has neither properties nor elements; frankly the if is not really required at all; it was just an attempt to prevent work that has already been performed from being performed again unnecessarily. Suggestions welcome.
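
One possible way around the empty-dictionary problem, sketched under
assumptions: the _flattened marker and both helper functions are
hypothetical and are not part of the patch. The sketch only shows that
a truthiness test cannot tell "already done" from "nothing to copy",
while an explicit per-class flag can.

```python
class NoProps:
    """Stand-in for a generated class that defines no properties."""
    _propdict = {}
    _privpropdict = {}

work_done = []

def flatten_by_truthiness(cls):
    if not cls._propdict:                      # an empty dict is falsy every time
        work_done.append("truthiness")
        cls._propdict = dict(cls._privpropdict)

def flatten_by_flag(cls):
    # Look in the class's own __dict__ so a flag on a base class is not inherited.
    if not cls.__dict__.get("_flattened", False):
        work_done.append("flag")
        cls._propdict = dict(cls._privpropdict)
        cls._flattened = True                  # hypothetical "already done" marker

for _ in range(3):
    flatten_by_truthiness(NoProps)
    flatten_by_flag(NoProps)
```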

Re _superclassnames, like everything else done with gensuitemodule, we need to be really careful about circular references, references to things that haven't been defined yet, etc. Everything generated by gensuitemodule is either a ComponentItem or an NProperty, and they don't actually inherit from each other in Python because doing so would be too hairy. So we can't use __bases__ because there is none :-)

The thing about _superclassnames is that it's just what it sounds like; a list of strings that indicate superclasses of the current class. By deferring getbaseclasses to import time, we ensure all of the base classes are defined by then.

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-16 23:42

Message:
Logged In: YES 
user_id=45365

Donovan,
in as far as I understand the matter (in which area you are clearly my superior:-) I think the idea of the fix is correct, but I have one misgiving: if a class has no properties then v._propdict will still be empty after getbaseclasses(). This will result in the next call of getbaseclasses (if this class is the base class of another) going through the motions again.

Is this a problem?

Also, do we really need _superclassnames, can't we do this with __bases__? I vaguely remember we went through this issue before, but I can't remember fully...

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-13 01:08

Message:
Logged In: YES 
user_id=111050

Whoops. Have to click the checkbox.

----------------------------------------------------------------------

Comment By: Donovan Preston (dsposx)
Date: 2003-03-13 01:08

Message:
Logged In: YES 
user_id=111050

Attaching diff.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=702620&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 22:57:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 14:57:54 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18xEPa-0002Mi-00@sc8-sf-web2.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Tim Peters (tim_one)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

 3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

 3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
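
The first pass can be modelled in a few lines of Python. This is a toy
model, not the C function from the patch: the opcode numbers are
placeholders, the helper names are made up, and the bytecode layout
(one-byte opcode, optional two-byte little-endian argument) is assumed
to match the interpreter of the time.

```python
# Placeholder opcode numbers for this sketch.
POP_TOP, POP_BLOCK = 1, 87
HAVE_ARGUMENT = 90      # opcodes at or above this value carry a 2-byte argument
LOAD_CONST, JUMP_FORWARD, JUMP_IF_FALSE = 100, 110, 111
JUMP_ABSOLUTE, SETUP_LOOP = 113, 120

def getarg(code, i):
    """Read the little-endian 2-byte argument of the instruction at offset i."""
    return code[i + 1] | (code[i + 2] << 8)

def setarg(code, i, value):
    """Overwrite the 2-byte argument of the instruction at offset i."""
    code[i + 1] = value & 0xFF
    code[i + 2] = (value >> 8) & 0xFF

def skip_constant_true_test(code, consts):
    """Turn LOAD_CONST <true>, JUMP_IF_FALSE, POP_TOP into a leading JUMP_FORWARD +4."""
    code = bytearray(code)
    i = 0
    while i < len(code):
        op = code[i]
        if (op == LOAD_CONST and i + 6 < len(code)
                and code[i + 3] == JUMP_IF_FALSE
                and code[i + 6] == POP_TOP
                and consts[getarg(code, i)]):
            code[i] = JUMP_FORWARD
            setarg(code, i, 4)   # hop over the 3-byte jump and the 1-byte POP_TOP
        i += 3 if op >= HAVE_ARGUMENT else 1
    return bytes(code)

loop = bytes([
    SETUP_LOOP, 13, 0,       # offset 0
    LOAD_CONST, 1, 0,        # offset 3: push consts[1]
    JUMP_IF_FALSE, 5, 0,     # offset 6
    POP_TOP,                 # offset 9
    POP_TOP,                 # offset 10: stand-in for the loop body
    JUMP_ABSOLUTE, 3, 0,     # offset 11
    POP_TOP,                 # offset 14
    POP_BLOCK,               # offset 15
])
optimized = skip_constant_true_test(loop, consts=(None, 1))
```

Only the three bytes at offset 3 change, so the code size and every
other jump target stay as they were.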

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 17:57

Message:
Logged In: YES 
user_id=80475

Thanks for the thorough code review and for being positive 
on the inclusion of the patch.  Attached is a revision that 
delays PyString_Size and bypasses situations with 
extended arguments.

For the dead code fragment, I'm more comfortable with 
the DUP_TOP POP_TOP than use of STOP_CODE but it is 
probably a matter of taste.  A more sophisticated 
approach would not have any dead code but I've aimed for 
the simplest thing that could possibly work.

The unconditional jump test is performed on a different 
opcode than the test for equality to JUMP_ABSOLUTE, so
the two tests cannot be combined.  The first operates on 
codestr[tgt] and the second on codestr[i].
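
A toy Python model of the jump-to-jump pass may make the two tests
easier to see. It is a sketch under assumptions, not the patch itself:
the opcode numbers are placeholders, the helper names are made up, and
relative jumps are retargeted by recomputing the offset, which may
differ from what the C code does.

```python
# Placeholder opcode numbers for this sketch.
POP_TOP, POP_BLOCK = 1, 87
HAVE_ARGUMENT = 90      # opcodes at or above this value carry a 2-byte argument
JUMP_FORWARD, JUMP_IF_FALSE, JUMP_ABSOLUTE, SETUP_LOOP = 110, 111, 113, 120
UNCONDITIONAL = (JUMP_FORWARD, JUMP_ABSOLUTE)

def getarg(code, i):
    return code[i + 1] | (code[i + 2] << 8)

def setarg(code, i, value):
    code[i + 1] = value & 0xFF
    code[i + 2] = (value >> 8) & 0xFF

def jump_target(code, i):
    """Absolute offset that the jump at offset i transfers control to."""
    arg = getarg(code, i)
    return arg if code[i] == JUMP_ABSOLUTE else i + 3 + arg

def thread_jumps(code):
    code = bytearray(code)
    i = 0
    while i < len(code):
        op = code[i]
        if op in (JUMP_FORWARD, JUMP_IF_FALSE, JUMP_ABSOLUTE):
            tgt = jump_target(code, i)
            # First test: is the instruction we land on (code[tgt]) an unconditional jump?
            if tgt < len(code) and code[tgt] in UNCONDITIONAL:
                tgttgt = jump_target(code, tgt)
                # Second test: how does the jumping instruction (code[i]) encode its target?
                new_arg = tgttgt if op == JUMP_ABSOLUTE else tgttgt - (i + 3)
                if 0 <= new_arg <= 0xFFFF:      # must fit without EXTENDED_ARG
                    setarg(code, i, new_arg)
        i += 3 if op >= HAVE_ARGUMENT else 1
    return bytes(code)

loop = bytes([
    SETUP_LOOP, 13, 0,       # offset 0
    JUMP_FORWARD, 4, 0,      # offset 3: jumps to offset 10
    JUMP_IF_FALSE, 5, 0,     # offset 6: dead code left by the first pass
    POP_TOP,                 # offset 9
    POP_TOP,                 # offset 10: stand-in for the loop body
    JUMP_ABSOLUTE, 3, 0,     # offset 11: jumps to the jump at offset 3
    POP_TOP,                 # offset 14
    POP_BLOCK,               # offset 15
])
threaded = thread_jumps(loop)
```

In this model the JUMP_ABSOLUTE at offset 11 ends up pointing straight
at offset 10, mirroring the "38 JUMP_ABSOLUTE 10" line in the example.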

I had tried a single big loop instead of three little loops but 
there was a loss of clarity.  Since the recognizers quickly 
skip over mismatches, the total loop time is very small.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into its
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARG.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.

Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming. 
I'm a little (very little) concerned about the speed penalty for
compiling.  I realize this is a one-time (at most) cost, so
it's almost definitely insignificant.
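
The EXTENDED_ARG concern raised above comes down to two checks, which
can be sketched in Python as follows. The opcode numbers are
placeholders and both helper names are hypothetical; the sketch assumes
a one-byte opcode followed by a two-byte little-endian argument.

```python
# Placeholder opcode numbers for this sketch.
HAVE_ARGUMENT = 90      # opcodes at or above this value carry a 2-byte argument
JUMP_ABSOLUTE = 113
EXTENDED_ARG = 143

def uses_extended_arg(code):
    """Return True if any instruction in code is EXTENDED_ARG."""
    i = 0
    while i < len(code):
        op = code[i]
        if op == EXTENDED_ARG:
            return True          # simplest policy: caller leaves this code alone
        i += 3 if op >= HAVE_ARGUMENT else 1
    return False

def set_arg_checked(code, i, value):
    """Store value in the 2-byte argument at offset i, or refuse if it does not fit."""
    if not 0 <= value <= 0xFFFF:
        return False             # would need EXTENDED_ARG, so do not touch the bytes
    code[i + 1] = value & 0xFF
    code[i + 2] = value >> 8
    return True

plain = bytearray([JUMP_ABSOLUTE, 0, 0])
fits = set_arg_checked(plain, 0, 0x1234)
too_big = set_arg_checked(plain, 0, 70000)
wide = bytes([EXTENDED_ARG, 1, 0, JUMP_ABSOLUTE, 0, 0])
```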

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 18:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled; a list kept in sys that contains functions that get passed opcode about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up to writing a PEP and trying to get this to work.
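
A rough sketch of what such a hook list could look like. Nothing here
exists in Python: the names peephole_hooks and run_peephole_hooks are
invented for illustration, and the list lives in this module rather
than in sys.

```python
# Hypothetical registry; the idea above is only "a list kept in sys".
peephole_hooks = []

def run_peephole_hooks(code):
    """Pass the bytecode string through each registered hook, in order."""
    for hook in peephole_hooks:
        code = hook(code)        # each hook returns the (possibly rewritten) bytecode
    return code

seen_lengths = []

def record_length(code):
    """Example hook: observe the bytecode without changing it."""
    seen_lengths.append(len(code))
    return code

peephole_hooks.append(record_length)
result = run_peephole_hooks(b"\x64\x01\x00")
```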

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows passing the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".
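
The shadowing problem can be shown with any rebindable global. In the
sketch below the name FOREVER is made up and stands in for True, which
was an ordinary rebindable name at the time of this thread.

```python
FOREVER = True      # ordinary global name, standing in for a rebindable True

def count_iterations(limit):
    n = 0
    while FOREVER:              # the name is looked up again on every iteration
        n += 1
        if n >= limit:
            break
    return n

before = count_iterations(3)    # loop runs until the break
FOREVER = False                 # rebinding the name changes the loop's behaviour
after = count_iterations(3)     # loop body never runs
FOREVER = True
```

A compiler that replaced the test with an unconditional jump would make
the second call return 3 as well, which is why the constant 1 is safe
to optimize and a global name is not.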

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Sun Mar 23 23:11:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 15:11:11 -0800
Subject: [Patches] [ python-Patches-708495 ] OpenVMS complementary patches
Message-ID: <E18xEcR-0003QL-00@sc8-sf-web3.sourceforge.net>

Patches item #708495, was opened at 2003-03-23 21:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Piéronne Jean-François (pieronne)
Assigned to: Martin v. Löwis (loewis)
Summary: OpenVMS complementary patches

Initial Comment:
Explanations of the various patches:

fcntlmodule.c
Under VMS the third argument is declared as void *

expat.h
The VMS C compiler can optionally mangle names longer
than 31 characters, so it is not necessary to change
long names.

fileobject.c
As the comment indicates, this solves a problem in
test_fileinput, but I don't understand why...

fpectlmodule.c
Enable SIGFPE handler

import.c
Support of VMS filesystem ODS-5

mmapmodule.c
VMS needs an fsync before a call to fstat so that fstat
returns accurate information.

myreadline.c
Use of vms__StdioReadline

posixmodule.c
I have moved some of the initialisation to a
VMS-specific file, so I have removed it from posixmodule.c.

pyexpat.c
Convert VMS filename to a UNIX style filename.

socketmodule.c
This patch is the only one which is not delimited by
#ifdef __VMS
#endif
because IMHO it fixes a bug in the original code


socketmodule.h
Needs to include socket.h and not sys/socket.h

sysmodule.c
Convert VMS filename to a UNIX style filename.


Regards,

Jean-François

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-24 00:11

Message:
Logged In: YES 
user_id=21627

Can you please explain the expat.h change? This is an
imported source, so I don't want to modify it unless there
is a really good reason.

The fileobject.c modification needs better analysis.
"corrects a test case problem" is not enough reason to make
such a change. Does the test case make assumptions that are
not supported by the relevant standards? Is there a bug in
VMS? etc.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 21:28

Message:
Logged In: YES 
user_id=21627

Can you please combine the patches into a single patch
that can be applied using "patch -p0"? You can use
"diff -ur" or "cvs diff" to create a recursive patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708495&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 00:24:34 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 16:24:34 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18xFlS-0004nL-00@sc8-sf-web1.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Tim Peters (tim_one)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

 3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

 3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 19:24

Message:
Logged In: YES 
user_id=33168

I need to think if there's any way to break the EXTENDED_ARG
with the way you did it (checking backwards, vs skipping it
by incrementing over).  I think it's ok and I have no other
issues with the patch.  

After thinking about the DUP_TOP, POP_TOP, it's not a big
deal, but probably a comment should be added indicating why
you do the JUMP+2, DUP, POP.  

Couldn't you also implement ROT_THREE and ROT_FOUR pretty
easily?  Not sure if it's worth it, though.

You are correct about the unconditional jump test, I didn't
notice the [tgt] vs [i].


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 17:57

Message:
Logged In: YES 
user_id=80475

Thanks for the thorough code review and for being positive 
on the inclusion of the patch.  Attached is a revision that 
delays PyString_Size and bypasses situations with 
extended arguments.

For the dead code fragment, I'm more comfortable with 
the DUP_TOP POP_TOP than use of STOP_CODE but it is 
probably a matter of taste.  A more sophisticated 
approach would not have any dead code but I've aimed for 
the simplest thing that could possibly work.

The unconditional jump test is performed on a different 
opcode than the test for equality to JUMP_ABSOLUTE, so
the two tests cannot be combined.  The first operates on 
codestr[tgt] and the second on codestr[i].

I had tried a single big loop instead of three little loops but 
there was a loss of clarity.  Since the recognizers quickly 
skip over mismatches, the total loop time is very small.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into its
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARG.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.

Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming. 
I'm a little (very little) concerned about the speed penalty for
compiling.  I realize this is a one-time (at most) cost, so
it's almost definitely insignificant.

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 18:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled; a list kept in sys that contains functions that get passed opcode about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up to writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows passing the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".
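
The shadowing concern can be illustrated with any global name. In Python 2.3, True was an ordinary builtin name that a module could rebind, so the compiler could not treat it as a constant (in Python 3 it is a keyword). The sketch below uses a stand-in name, FLAG, and helper names chosen only for illustration:

```python
SOURCE = """
def spin(limit):
    n = 0
    while FLAG:          # FLAG is looked up in the globals on every test
        n += 1
        if n == limit:
            break
    return n
"""


def make_namespace(flag):
    """Compile SOURCE into a fresh namespace where FLAG is bound to flag."""
    namespace = {"FLAG": flag}
    exec(SOURCE, namespace)
    return namespace


ns = make_namespace(1)
before = ns["spin"](3)   # loop runs until n reaches the limit

ns["FLAG"] = 0           # rebinding the global after compilation
after = ns["spin"](3)    # same compiled function, loop never entered
```

Because the rebinding happens after spin was compiled, folding the test into an unconditional jump at compile time would change the program's behavior.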

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 02:17:47 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 18:17:47 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18xHX1-000769-00@sc8-sf-web2.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Tim Peters (tim_one)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

   3 LOAD_CONST               1 (1)
  38 JUMP_ABSOLUTE            3

and improving to:

   3 JUMP_FORWARD             4 (to 10)
  38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
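
The two rewrites described above can be sketched in Python on a raw byte string. This is a simplified illustration, not the patch itself: the opcode numbers follow the CPython 2.x style but should be treated as assumed values, the function names are hypothetical, and the second rewrite only handles JUMP_ABSOLUTE as the source jump.

```python
# Opcode numbers in the style of CPython 2.x; treat the exact values as
# illustrative assumptions.
POP_TOP = 1
HAVE_ARGUMENT = 90   # opcodes >= this are followed by a 2-byte argument
LOAD_CONST = 100
JUMP_FORWARD = 110
JUMP_IF_FALSE = 111
JUMP_ABSOLUTE = 113
EXTENDED_ARG = 143


def get_arg(code, i):
    """Read the little-endian 16-bit argument of the instruction at i."""
    return code[i + 1] | (code[i + 2] << 8)


def set_arg(code, i, value):
    """Overwrite the 16-bit argument of the instruction at i."""
    code[i + 1] = value & 0xFF
    code[i + 2] = (value >> 8) & 0xFF


def instruction_offsets(code):
    """Yield the starting offset of each instruction."""
    i = 0
    while i < len(code):
        yield i
        i += 3 if code[i] >= HAVE_ARGUMENT else 1


def optimize(code, consts):
    """Return a same-length copy of code with both rewrites applied."""
    out = bytearray(code)
    offsets = list(instruction_offsets(out))
    # Bail out rather than reason about arguments wider than 16 bits.
    if any(out[i] == EXTENDED_ARG for i in offsets):
        return bytes(out)

    # Rewrite 1: LOAD_CONST 1, JUMP_IF_FALSE xx, POP_TOP -> JUMP_FORWARD +4
    for i in offsets:
        if (out[i] == LOAD_CONST and i + 6 < len(out)
                and out[i + 3] == JUMP_IF_FALSE and out[i + 6] == POP_TOP
                and consts[get_arg(out, i)] == 1):
            out[i] = JUMP_FORWARD
            set_arg(out, i, 4)   # skip the 3-byte jump and 1-byte POP_TOP

    # Rewrite 2: a JUMP_ABSOLUTE whose target is itself an unconditional jump
    for i in offsets:
        if out[i] != JUMP_ABSOLUTE:
            continue
        tgt = get_arg(out, i)
        if tgt + 2 >= len(out):
            continue
        if out[tgt] == JUMP_ABSOLUTE:
            final = get_arg(out, tgt)
        elif out[tgt] == JUMP_FORWARD:
            final = tgt + 3 + get_arg(out, tgt)
        else:
            continue
        if final < 0x10000:      # must still fit in a 16-bit argument
            set_arg(out, i, final)
    return bytes(out)


sample = bytes([
    LOAD_CONST, 0, 0,       #  0: push consts[0], the constant 1
    JUMP_IF_FALSE, 8, 0,    #  3: exit the loop (to 14) if false
    POP_TOP,                #  6
    LOAD_CONST, 0, 0,       #  7: loop body
    POP_TOP,                # 10: loop body
    JUMP_ABSOLUTE, 0, 0,    # 11: back to the loop test
    POP_TOP,                # 14: loop exit
])
optimized = optimize(sample, (1,))
```

With this sample, offset 0 becomes JUMP_FORWARD 4 (landing on the body at offset 7) and the JUMP_ABSOLUTE at offset 11 is retargeted from 0 to 7. The byte string keeps its length, so no other jump target needs adjusting.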

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 21:17

Message:
Logged In: YES 
user_id=80475

Great.  Will add the dup/pop comment to the final version.

I had tried and dropped a couple of other substitutions.  
Timeit.py showed gains but my real apps were unaffected:
   build 3 unpack 3 --> rot3 rot2 jmp+1 dup
   build 4 unpack 4 --> rot4 rot2 rot3 jmp+0

Another desirable substitution was omitted because it 
needed a NOP if it were going to be implemented with the 
current, simple approach:
    unary_not  jump_if_false tgt --> jump_if_true tgt
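
The stack effect behind the build/unpack substitutions can be checked with a toy model. The sketch below is an illustration only: it models the value stack as a Python list (top of stack last), uses helper names invented for the purpose, and covers the pair and triple cases. It does not model the padding instructions (jmp, dup) that keep the bytecode length unchanged.

```python
def build_unpack(stack, n):
    """Model BUILD_TUPLE n followed by UNPACK_SEQUENCE n."""
    items = tuple(stack[-n:])      # BUILD_TUPLE pops n items into a tuple
    del stack[-n:]
    stack.extend(reversed(items))  # UNPACK_SEQUENCE leaves item 0 on top
    return stack


def rot_two(stack):
    """Swap the two topmost items."""
    stack[-1], stack[-2] = stack[-2], stack[-1]
    return stack


def rot_three(stack):
    """Lift the second and third items up; move the top down to third."""
    top = stack.pop()
    stack.insert(len(stack) - 2, top)
    return stack


# Pair case: build 2, unpack 2 has the same effect as a single ROT_TWO.
pair_reference = build_unpack(["z", "a", "b"], 2)
pair_rewritten = rot_two(["z", "a", "b"])

# Triple case: build 3, unpack 3 has the same effect as ROT_THREE, ROT_TWO.
triple_reference = build_unpack(["z", "a", "b", "c"], 3)
triple_rewritten = rot_two(rot_three(["z", "a", "b", "c"]))
```

In both cases the rewritten sequence leaves the stack in the same state as the original pair of instructions, without allocating a tuple.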


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 19:24

Message:
Logged In: YES 
user_id=33168

I need to think if there's any way to break the EXTENDED_ARG
with the way you did it (checking backwards, vs skipping it
by incrementing over).  I think it's ok and I have no other
issues with the patch.  

After thinking about the DUP_TOP, POP_TOP, it's not a big
deal, but probably a comment should be added indicating why
you do the JUMP+2, DUP, POP.  

Couldn't you also implement ROT_THREE and ROT_FOUR pretty
easily?  Not sure if it's worth it, though.

You are correct about the unconditional jump test, I didn't
notice the [tgt] vs [i].


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 17:57

Message:
Logged In: YES 
user_id=80475

Thanks for the thorough code review and for being positive 
on the inclusion of the patch.  Attached is a revision that 
delays PyString_Size and bypasses situations with 
extended arguments.

For the dead code fragment, I'm more comfortable with 
the DUP_TOP POP_TOP than use of STOP_CODE but it is 
probably a matter of taste.  A more sophisticated 
approach would not have any dead code but I've aimed for 
the simplest thing that could possibly work.

The unconditional jump test is performed on a different 
opcode than the test for equality to JUMP_ABSOLUTE, so
the two tests cannot be combined.  The first operates on 
codestr[tgt] and the second on codestr[i].

I had tried a single big loop instead of three little loops but 
there was a loss of clarity.  Since the recognizers quickly 
skip over mismatches, the total loop time is very small.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into its
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARG.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.

Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming. 
I'm a little (very little) concerned about the speed penalty for
compiling.  I realize this is a one-time (at most) cost, so
it's almost definitely insignificant.

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 03:01:40 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 19:01:40 -0800
Subject: [Patches] [ python-Patches-708604 ] unchecked return values - compile.c
Message-ID: <E18xIDU-0000Je-00@sc8-sf-web1.sourceforge.net>

Patches item #708604, was opened at 2003-03-23 20:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Harper (jasonharper)
Assigned to: Nobody/Anonymous (nobody)
Summary: unchecked return values - compile.c

Initial Comment:
Various cleanups in Python/compile.c - mainly 
unchecked return values.  Also an unchecked memory 
allocation in PyList_SetSlice that's called by compile.c.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 03:05:08 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 19:05:08 -0800
Subject: [Patches] [ python-Patches-708604 ] unchecked return values - compile.c
Message-ID: <E18xIGq-0000Rq-00@sc8-sf-web1.sourceforge.net>

Patches item #708604, was opened at 2003-03-23 20:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Harper (jasonharper)
Assigned to: Nobody/Anonymous (nobody)
Summary: unchecked return values - compile.c

----------------------------------------------------------------------

>Comment By: Jason Harper (jasonharper)
Date: 2003-03-23 20:05

Message:
Logged In: YES 
user_id=392021

 
----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 03:19:32 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 19:19:32 -0800
Subject: [Patches] [ python-Patches-708604 ] unchecked return values - compile.c
Message-ID: <E18xIUm-0001Kv-00@sc8-sf-web3.sourceforge.net>

Patches item #708604, was opened at 2003-03-23 20:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Harper (jasonharper)
Assigned to: Nobody/Anonymous (nobody)
Summary: unchecked return values - compile.c

----------------------------------------------------------------------

>Comment By: Jason Harper (jasonharper)
Date: 2003-03-23 20:19

Message:
Logged In: YES 
user_id=392021

aaarrrrggghhh.... SF isn't letting me attach the files, clicking 
Submit simply clears the entered filename???  Will try later 
from another system.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 03:18:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 23 Mar 2003 19:18:48 -0800
Subject: [Patches] [ python-Patches-708604 ] unchecked return values - compile.c
Message-ID: <E18xIU4-0008T8-00@sc8-sf-web2.sourceforge.net>

Patches item #708604, was opened at 2003-03-23 20:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Harper (jasonharper)
Assigned to: Nobody/Anonymous (nobody)
Summary: unchecked return values - compile.c

----------------------------------------------------------------------

>Comment By: Jason Harper (jasonharper)
Date: 2003-03-23 20:19

Message:
Logged In: YES 
user_id=392021

aaarrrrggghhh.... SF isn't letting me attach the files, clicking 
Submit simply clears the entered filename???  Will try later 
from another system.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 08:36:28 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 24 Mar 2003 00:36:28 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18xNRU-0001TS-00@sc8-sf-web3.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Tim Peters (tim_one)
Summary: Improve code generation

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-24 03:36

Message:
Logged In: YES 
user_id=80475

Neal, attached is a revision that puts it all under a single 
loop.  By adding a switch-case, it became more readable 
and a little faster.

Your comment on the extended args worried me, so I now 
bail-out if any extending is present.


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 21:17

Message:
Logged In: YES 
user_id=80475

Great.  Will add the dup/pop comment to the final version.

I had tried and dropped a couple of other substitutions.  
Timeit.py showed gains but my real apps were unaffected:
   build 3 unpack 3 --> rot3 rot2 jmp+1 dup
   build 4 unpack 4 --> rot4 rot2 rot3 jmp+0

Another desirable substitution was omitted because it 
needed a NOP if it were going to be implemented with the 
current, simple approach:
    unary_not  jump_if_false tgt --> jump_if_true tgt


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 19:24

Message:
Logged In: YES 
user_id=33168

I need to think if there's any way to break the EXTENDED_ARG
with the way you did it (checking backwards, vs skipping it
by incrementing over).  I think it's ok and I have no other
issues with the patch.  

Aftering thinking about the DUP_TOP, POP_TOP, it's not a big
deal, but probably a comment should be added indicating why
you do the JUMP+2, DUP, POP.  

Couldn't you also implement ROT_THREE and ROT_FOUR pretty
easily?  Not sure if it's worth it, though.

You are correct about the unconditional jump test, I didn't
notice the [tgt] vs [i].


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 17:57

Message:
Logged In: YES 
user_id=80475

Thanks for the thorough code review and for being positive 
on the inclusion of the patch.  Attached is a revision that 
delays PyString_Size and bypasses situations with 
extended arguments.

For the dead code fragment, I'm more comfortable with 
the DUP_TOP POP_TOP than use of STOP_CODE but it is 
probably a matter of taste.  A more sophisticated 
approach would not have any dead code but I've aimed for 
the simplest thing that could possibly work.

The unconditional jump test is performed on a different 
opcode than the test for equality to JUMP_ABSOLUTE, so
the two tests cannot be combined.  The first operates on 
codestr[tgt] and the second on codestr[i].

I had tried a single big loop instead of three little loops but 
there was a loss of clarity.  Since the recognizers quickly 
skip over mismatches, the total loop time is very small.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into it's
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARGS.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.
Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming. 
I'm a little (very little) concerned the speed penalty for
compiling.  I realize this is a one-time (at most) cost, so
it's almost definitely insignificant.

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 18:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled; a list kept in sys that contains functions that get passed opcode about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up to writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In constast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows to pass the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter D�rwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 20:14:59 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 24 Mar 2003 12:14:59 -0800
Subject: [Patches] [ python-Patches-706590 ] Adds Mock Object support to unittest.TestCase
Message-ID: <E18xYLT-0006dm-00@sc8-sf-web1.sourceforge.net>

Patches item #706590, was opened at 2003-03-19 22:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Matthew Russell (mattruss)
>Assigned to: Steve Purcell (purcell)
Summary: Adds Mock Object support to unittest.TestCase

Initial Comment:
Mock objects can greatly improve unit tests (if used in 
the correct context), especially for code that relies upon 
resource-hungry tests (connections to databases, socket 
servers, etc.).

The module/patch (to unittest) which I am submitting 
helps to introspect calls to code whilst maintaining 
transparency and functionality with your code.

I had previously written a similar module for my present 
employers, and my fellow XP partners and I agree 
that it has made the XP testing cycle considerably 
easier.  Having googol-ed-out alternatives on the web, I 
have not found a solution that provides the same level of 
flexibility. (hope that doesn't sound arrogant)

The tests for this module should highlight usage, but I 
will supply dummy code if this idea is accepted.

If unfamiliar with XP/MockObject ideas, please see :
http://www.xprogramming.com/xpmag/virtualMockObjects.htm#N78
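
For readers unfamiliar with the idea, here is a minimal sketch of a mock standing in for a database connection.  It is not the submitted patch: it uses unittest.mock from current Python, and count_rows, the SQL string and fake_conn are hypothetical names made up for the illustration.

```python
from unittest import mock

def count_rows(connection):
    """Code under test: it would normally need a live database connection."""
    cursor = connection.cursor()
    cursor.execute("SELECT COUNT(*) FROM items")
    return cursor.fetchone()[0]

# The mock replaces the resource-hungry connection and records every call.
fake_conn = mock.Mock()
fake_conn.cursor.return_value.fetchone.return_value = (3,)

rows = count_rows(fake_conn)

# Introspect the calls the code made against the stand-in object.
fake_conn.cursor.return_value.execute.assert_called_once_with(
    "SELECT COUNT(*) FROM items"
)
print(rows)
```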


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470


From noreply@sourceforge.net  Mon Mar 24 22:47:25 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 24 Mar 2003 14:47:25 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18xaiz-0000HH-00@sc8-sf-web4.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Tim Peters (tim_one)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"
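
The same two measurements can also be taken through the timeit module's Python interface.  A small sketch, with the repetition count of 100000 chosen arbitrarily for the illustration:

```python
import timeit

# Same statements as the two command lines above, run through the module API.
simple = timeit.timeit("while 1: break", number=100000)
branching = timeit.timeit(
    "while 1:\n    if i==1: break\n    else: i=1",
    setup="i=0",
    number=100000,
)
print("while 1: break       %.4f s" % simple)
print("while 1 with branch  %.4f s" % branching)
```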


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
 38 JUMP_ABSOLUTE            3

and improving to:

  3 JUMP_FORWARD             4 (to 10)
 38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
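
The two passes can be modelled in a few lines of Python.  This is only a toy sketch, not the C code in the patch: the bytecode is represented as a dict of byte offset to (opname, arg), LOAD_CONST carries the constant's value instead of an index, the 3-byte instruction size is an assumption about the bytecode layout of the time, and the JUMP_IF_FALSE argument is a placeholder.

```python
ARG_SIZE = 3  # assumed layout: one opcode byte plus a two-byte argument

def optimize(code):
    """Toy peephole pass over {byte offset: (opname, arg)}; returns a new dict."""
    out = dict(code)

    # Pass 1: LOAD_CONST 1, JUMP_IF_FALSE xx, POP_TOP -> rewrite only the first op.
    for off, (op, arg) in code.items():
        nxt = code.get(off + ARG_SIZE, (None, None))[0]
        after = code.get(off + 2 * ARG_SIZE, (None, None))[0]
        if (op == "LOAD_CONST" and arg == 1
                and nxt == "JUMP_IF_FALSE" and after == "POP_TOP"):
            # Jump over the remaining 3 + 1 bytes of the test sequence.
            out[off] = ("JUMP_FORWARD", 4)

    # Pass 2: a jump that lands on an unconditional jump goes to the final target.
    for off, (op, arg) in list(out.items()):
        if op != "JUMP_ABSOLUTE":
            continue
        tgt_op, tgt_arg = out.get(arg, (None, None))
        if tgt_op == "JUMP_FORWARD":
            out[off] = (op, arg + ARG_SIZE + tgt_arg)
        elif tgt_op == "JUMP_ABSOLUTE":
            out[off] = (op, tgt_arg)
    return out

# Only the instructions that matter for the example; the loop body is elided.
toy = {
    3: ("LOAD_CONST", 1),
    6: ("JUMP_IF_FALSE", 28),   # placeholder argument
    9: ("POP_TOP", None),
    38: ("JUMP_ABSOLUTE", 3),
}
result = optimize(toy)
for off in sorted(result):
    print(off, *result[off])
```

The size and order of the code are untouched; only the instruction at offset 3 and the argument at offset 38 differ, matching the two changed lines in the disassembly.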

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-24 17:47

Message:
Logged In: YES 
user_id=6380

Hmm...   How do you know that you aren't optimizing away
something that's a jump target?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-24 03:36

Message:
Logged In: YES 
user_id=80475

Neal, attached is a revision that puts it all under a single 
loop.  By adding a switch-case, it became more readable 
and a little faster.

Your comment on the extended args worried me, so I now 
bail-out if any extending is present.


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 21:17

Message:
Logged In: YES 
user_id=80475

Great.  Will add the dup/pop comment to the final version.

I had tried and dropped a couple of other substitutions.  
Timeit.py showed gains but my real apps were unaffected:
   build 3 unpack 3 --> rot3 rot2 jmp+1 dup
   build 4 unpack 4 --> rot4 rot2 rot3 jmp+0

Another desirable substitution was omitted because it 
needed a NOP if it were going to be implemented with the 
current, simple approach:
    unary_not  jump_if_false tgt --> jump_if_true tgt


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 19:24

Message:
Logged In: YES 
user_id=33168

I need to think if there's any way to break the EXTENDED_ARG
with the way you did it (checking backwards, vs skipping it
by incrementing over).  I think it's ok and I have no other
issues with the patch.  

After thinking about the DUP_TOP, POP_TOP, it's not a big
deal, but probably a comment should be added indicating why
you do the JUMP+2, DUP, POP.  

Couldn't you also implement ROT_THREE and ROT_FOUR pretty
easily?  Not sure if it's worth it, though.

You are correct about the unconditional jump test, I didn't
notice the [tgt] vs [i].


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 17:57

Message:
Logged In: YES 
user_id=80475

Thanks for the thorough code review and for being positive 
on the inclusion of the patch.  Attached is a revision that 
delays PyString_Size and bypasses situations with 
extended arguments.

For the dead code fragment, I'm more comfortable with 
the DUP_TOP POP_TOP than use of STOP_CODE but it is 
probably a matter of taste.  A more sophisticated 
approach would not have any dead code but I've aimed for 
the simplest thing that could possibly work.

The unconditional jump test is performed on a different 
opcode than the test for equality to JUMP_ABSOLUTE, so
the two tests cannot be combined.  The first operates on 
codestr[tgt] and the second on codestr[i].

I had tried a single big loop instead of three little loops but 
there was a loss of clarity.  Since the recognizers quickly 
skip over mismatches, the total loop time is very small.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into its
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARGS.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.

Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming. 
I'm a little (very little) concerned about the speed penalty for
compiling.  I realize this is a one-time (at most) cost, so
it's almost definitely insignificant.

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.
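
The two EXTENDED_ARG precautions suggested above (bail out when one is present, and check that a new target still fits before setting it) might look like this in a toy model.  A sketch only: the dict-of-offsets representation and the names retarget_jump and MAX_ARG are hypothetical, not taken from the patch.

```python
MAX_ARG = 0xFFFF  # a two-byte argument tops out just under 64k

def retarget_jump(code, offset, new_target):
    """Return a copy of code with the jump at offset retargeted.

    code maps byte offset -> (opname, arg).  The original object is
    returned unchanged whenever the rewrite is not known to be safe.
    """
    if any(op == "EXTENDED_ARG" for op, _ in code.values()):
        return code              # bail out: arguments may span two instructions
    if new_target > MAX_ARG:
        return code              # the new target would itself need EXTENDED_ARG
    out = dict(code)
    op, _ = out[offset]
    out[offset] = (op, new_target)
    return out

plain = {0: ("JUMP_ABSOLUTE", 3), 3: ("JUMP_ABSOLUTE", 9)}
extended = {0: ("EXTENDED_ARG", 1), 3: ("JUMP_ABSOLUTE", 9)}

rewritten = retarget_jump(plain, 0, 9)
untouched = retarget_jump(extended, 3, 12)
too_far = retarget_jump(plain, 0, 70000)
print(rewritten[0])
```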

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 18:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled; a list kept in sys that contains functions that get passed opcode about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up to writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here that allows passing the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Tue Mar 25 00:24:21 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 24 Mar 2003 16:24:21 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18xcEn-0007TJ-00@sc8-sf-web3.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Tim Peters (tim_one)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
 38 JUMP_ABSOLUTE            3

and improving to:

  3 JUMP_FORWARD             4 (to 10)
 38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-24 19:24

Message:
Logged In: YES 
user_id=80475

In the sequence LOAD_CONST, JUMP_IF_FALSE, 
POP_TOP, only the first instruction is changed and it is 
changed to a JUMP+4 which gives the same effect as the 
whole sequence.  If either of the following two opcodes is 
a jump target, it will function normally since it is 
unchanged.

In the jump to jump optimization, only the jump target is 
changed, so it works fine if it is itself a jump target.

The sequence BUILD_SEQN, UNPACK_SEQN is replaced 
by a two instruction block that performs the same 
function as the original block, so the only remaining case 
is where the unpack instruction is a jump target.  Review 
of compile's code generator shows no way that the 
unpack can be a jump target if the preceding instruction is a 
build_seqn.  Essentially, the build/unpack pair can only 
occur in an assignment and there are no possible jumps 
into the middle of an assignment.
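
The claim that the replacement performs the same function as the build/unpack pair can be checked with a toy stack model.  A sketch under stated assumptions: the stack is a Python list with the top at the end, and the helper names are made up for the illustration; the 3-item case follows the "rot3 rot2" substitution mentioned earlier in this thread.

```python
def build_then_unpack(stack, n):
    """Model BUILD_TUPLE n followed by UNPACK_SEQUENCE n."""
    stack = list(stack)
    items = tuple(stack[-n:])        # BUILD_TUPLE pops n items, keeping their order
    del stack[-n:]
    stack.extend(reversed(items))    # UNPACK_SEQUENCE pushes right-to-left
    return stack

def rot_two(stack):
    """Model ROT_TWO: swap the two topmost items."""
    stack = list(stack)
    stack[-1], stack[-2] = stack[-2], stack[-1]
    return stack

def rot_three(stack):
    """Model ROT_THREE: the top item moves down to third position."""
    stack = list(stack)
    stack[-3:] = [stack[-1], stack[-3], stack[-2]]
    return stack

pair = ["base", "a", "b"]
triple = ["base", "a", "b", "c"]

swap_by_tuple = build_then_unpack(pair, 2)
swap_by_rot = rot_two(pair)
rotate_by_tuple = build_then_unpack(triple, 3)
rotate_by_rot = rot_two(rot_three(triple))
print(swap_by_tuple, swap_by_rot)
print(rotate_by_tuple, rotate_by_rot)
```

The model only covers the stack effect; the argument about jump targets still rests on the review of the code generator described above.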


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-24 17:47

Message:
Logged In: YES 
user_id=6380

Hmm...   How do you know that you aren't optimizing away
something that's a jump target?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-24 03:36

Message:
Logged In: YES 
user_id=80475

Neal, attached is a revision that puts it all under a single 
loop.  By adding a switch-case, it became more readable 
and a little faster.

Your comment on the extended args worried me, so I now 
bail-out if any extending is present.


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 21:17

Message:
Logged In: YES 
user_id=80475

Great.  Will add the dup/pop comment to the final version.

I had tried and dropped a couple of other substitutions.  
Timeit.py showed gains but my real apps were unaffected:
   build 3 unpack 3 --> rot3 rot2 jmp+1 dup
   build 4 unpack 4 --> rot4 rot2 rot3 jmp+0

Another desirable substitution was omitted because it 
needed a NOP if it were going to be implemented with the 
current, simple approach:
    unary_not  jump_if_false tgt --> jump_if_true tgt


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 19:24

Message:
Logged In: YES 
user_id=33168

I need to think if there's any way to break the EXTENDED_ARG
with the way you did it (checking backwards, vs skipping it
by incrementing over).  I think it's ok and I have no other
issues with the patch.  

After thinking about the DUP_TOP, POP_TOP, it's not a big
deal, but probably a comment should be added indicating why
you do the JUMP+2, DUP, POP.  

Couldn't you also implement ROT_THREE and ROT_FOUR pretty
easily?  Not sure if it's worth it, though.

You are correct about the unconditional jump test, I didn't
notice the [tgt] vs [i].


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 17:57

Message:
Logged In: YES 
user_id=80475

Thanks for the thorough code review and for being positive 
on the inclusion of the patch.  Attached is a revision that 
delays PyString_Size and bypasses situations with 
extended arguments.

For the dead code fragment, I'm more comfortable with 
the DUP_TOP POP_TOP than use of STOP_CODE but it is 
probably a matter of taste.  A more sophisticated 
approach would not have any dead code but I've aimed for 
the simplest thing that could possibly work.

The unconditional jump test is performed on a different 
opcode than the test for equality to JUMP_ABSOLUTE, so
the two tests cannot be combined.  The first operates on 
codestr[tgt] and the second on codestr[i].

I had tried a single big loop instead of three little loops but 
there was a loss of clarity.  Since the recognizers quickly 
skip over mismatches, the total loop time is very small.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into its
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARGS.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.

Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming. 
I'm a little (very little) concerned about the speed penalty for
compiling.  I realize this is a one-time (at most) cost, so
it's almost definitely insignificant.

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 18:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled; a list kept in sys that contains functions that get passed opcode about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up to writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here that allows passing the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Tue Mar 25 02:55:06 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 24 Mar 2003 18:55:06 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E18xeag-0006F7-00@sc8-sf-web1.sourceforge.net>

Patches item #709178, was opened at 2003-03-24 21:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Nobody/Anonymous (nobody)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Tue Mar 25 12:45:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 25 Mar 2003 04:45:54 -0800
Subject: [Patches] [ python-Patches-706590 ] Adds Mock Object support to unittest.TestCase
Message-ID: <E18xnoQ-000832-00@sc8-sf-web2.sourceforge.net>

Patches item #706590, was opened at 2003-03-19 22:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Matthew Russell (mattruss)
>Assigned to: Nobody/Anonymous (nobody)
Summary: Adds Mock Object support to unittest.TestCase

Initial Comment:
Mock objects can greatly improve unit tests (if used in 
the correct context), especially for code that relies upon 
resource-hungry tests (connections to databases, socket 
servers, etc.).

The module/patch (to unittest) which I am submitting 
helps to introspect calls to code whilst maintaining 
transparency and functionality with your code.

I had previously written a similar module for my present 
employers, and my fellow XP partners and I agree 
that it has made the XP testing cycle considerably 
easier.  Having googled for alternatives on the web, I 
have not found a solution that provides the same level of 
flexibility. (hope that doesn't sound arrogant)

The tests for this module should highlight usage, but I 
will supply dummy code if this idea is accepted.
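
As a rough illustration of the general idea (this is not the code in the
attached patch), a mock object can stand in for an expensive resource and
record the calls made on it.  The names RecordingMock and fetch_user are
hypothetical and exist only for this sketch.

```python
class RecordingMock:
    """Hypothetical stand-in that records every method call made on it."""

    def __init__(self, **canned_results):
        # Map of method name -> value to return when that method is called.
        self._canned_results = canned_results
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. mocked methods.
        def method(*args, **kwargs):
            self.calls.append((name, args, kwargs))
            return self._canned_results.get(name)
        return method


def fetch_user(db, user_id):
    """Code under test: would normally talk to a real database connection."""
    db.connect()
    row = db.query("SELECT name FROM users WHERE id = %s", user_id)
    db.close()
    return row


# The test passes the mock instead of a real connection, then inspects
# which calls were made and in what order.
mock_db = RecordingMock(query="alice")
result = fetch_user(mock_db, 42)
```

The test can then assert on both the return value and the recorded call
sequence without ever opening a database connection.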

If unfamiliar with XP/MockObject ideas, please see:
http://www.xprogramming.com/xpmag/virtualMockObjects.htm#N78


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470


From noreply@sourceforge.net  Tue Mar 25 14:33:10 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 25 Mar 2003 06:33:10 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18xpUE-0002E7-00@sc8-sf-web3.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"
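
For readers who prefer the module API to the command line, the two
measurements above can be approximated as follows.  This is only a sketch:
the repetition count is an arbitrary assumption, and the timings depend
entirely on the interpreter and machine.

```python
import timeit

# Roughly equivalent to: python timeit.py "while 1: break"
simple = timeit.timeit("while 1: break", number=100000)

# Roughly equivalent to the second command: the separate statement
# arguments are joined with newlines, and -s supplies the setup code.
stmt = "while 1:\n    if i==1: break\n    else: i=1"
with_branch = timeit.timeit(stmt, setup="i=0", number=100000)

print("while 1: break       %.4f s" % simple)
print("while 1: if/else     %.4f s" % with_branch)
```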


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.
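
The two passes described above can be sketched on a toy instruction list.
This is not the C code from the patch: instructions here are (opname, arg)
tuples indexed by position rather than by byte offset, so the forward jump
is +2 instructions instead of +4 bytes, and the helper names are made up
for the illustration.

```python
def jump_target(code, i):
    """Return the index an unconditional jump at i lands on, else None."""
    op, arg = code[i]
    if op == "JUMP_ABSOLUTE":
        return arg                # absolute index
    if op == "JUMP_FORWARD":
        return i + 1 + arg        # relative to the next instruction
    return None


def peephole(code, consts):
    """Return an optimized copy; the length and order never change."""
    out = list(code)

    # Pass 1: LOAD_CONST <true value>, JUMP_IF_FALSE, POP_TOP
    # Only the first instruction is rewritten, into a jump over the other two.
    for i in range(len(out) - 2):
        op, arg = out[i]
        if (op == "LOAD_CONST" and consts[arg]
                and out[i + 1][0] == "JUMP_IF_FALSE"
                and out[i + 2][0] == "POP_TOP"):
            out[i] = ("JUMP_FORWARD", 2)

    # Pass 2: a JUMP_ABSOLUTE whose target is itself an unconditional jump
    # is retargeted to the final destination.
    for i in range(len(out)):
        op, arg = out[i]
        if op == "JUMP_ABSOLUTE":
            final = jump_target(out, arg)
            if final is not None:
                out[i] = ("JUMP_ABSOLUTE", final)
    return out


consts = (None, 1)
code = [
    ("LOAD_CONST", 1),      # 0: while 1:
    ("JUMP_IF_FALSE", 3),   # 1: never taken
    ("POP_TOP", None),      # 2
    ("LOAD_FAST", 0),       # 3: loop body
    ("JUMP_ABSOLUTE", 0),   # 4: back to the loop test
    ("POP_TOP", None),      # 5: loop exit
]
optimized = peephole(code, consts)
```

In this toy, instruction 0 becomes ("JUMP_FORWARD", 2) and instruction 4
becomes ("JUMP_ABSOLUTE", 3), mirroring the two changed lines in the
disassembly above.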

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-25 09:33

Message:
Logged In: YES 
user_id=6380

OK, then it's ok with me. I suggest that you put that
response into a comment for the edification of future
generations.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-24 19:24

Message:
Logged In: YES 
user_id=80475

In the sequence LOAD_CONST, JUMP_IF_FALSE, 
POP_TOP, only the first instruction is changed and it is 
changed to a JUMP+4 which gives the same effect as the 
whole sequence.  If either of the other two opcodes is a 
jump target, it will function normally since it is 
unchanged.

In the jump to jump optimization, only the jump target is 
changed, so it works fine if it is itself a jump target.

The sequence BUILD_SEQN, UNPACK_SEQN is replaced 
by a two instruction block that performs the same 
function as the original block, so the only remaining case 
is where the unpack instruction is a jump target.  Review 
of compile's code generator shows no way that the 
unpack can be a jump target if the preceding instruction is a 
build_seqn.  Essentially, the build/unpack pair can only 
occur in an assignment and there are no possible jumps 
into the middle of an assignment.
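
The claim that a build/unpack pair of size 2 can be replaced by a swap can
be checked on a toy value stack.  This sketch models the stack as a Python
list with the top at the end; the function names are hypothetical and the
model ignores the filler instructions needed to keep the bytecode size
unchanged.

```python
def build_then_unpack(stack, n):
    """Toy model of BUILD_TUPLE n followed by UNPACK_SEQUENCE n."""
    items = tuple(stack[-n:])        # build: pop n values into a tuple
    del stack[-n:]
    # unpack: push items right to left, so the first item ends up on top
    stack.extend(reversed(items))


def rot_two(stack):
    """Toy model of ROT_TWO: swap the two topmost stack entries."""
    stack[-1], stack[-2] = stack[-2], stack[-1]


original = ["x", "a", "b"]
via_tuple = list(original)
build_then_unpack(via_tuple, 2)

via_rot = list(original)
rot_two(via_rot)
```

Both paths leave the stack as ["x", "b", "a"], which is why the pair can be
swapped for the cheaper instruction when nothing jumps into the middle.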


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-24 17:47

Message:
Logged In: YES 
user_id=6380

Hmm...   How do you know that you aren't optimizing away
something that's a jump target?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-24 03:36

Message:
Logged In: YES 
user_id=80475

Neal, attached is a revision that puts it all under a single 
loop.  By adding a switch-case, it became more readable 
and a little faster.

Your comment on the extended args worried me, so I now 
bail-out if any extending is present.
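
A minimal sketch of the bail-out strategy, using a toy (opname, arg)
instruction list rather than the real bytecode string.  The function names
and the tiny jump-chain optimizer are assumptions made for the example, not
the patch's code.

```python
def has_extended_arg(code):
    """True if any instruction in the toy listing is EXTENDED_ARG."""
    return any(op == "EXTENDED_ARG" for op, _arg in code)


def collapse_jump_chain(code):
    """Toy optimizer: retarget a jump that lands on another jump."""
    out = list(code)
    for i, (op, arg) in enumerate(code):
        if op == "JUMP_ABSOLUTE" and code[arg][0] == "JUMP_ABSOLUTE":
            out[i] = ("JUMP_ABSOLUTE", code[arg][1])
    return out


def safe_optimize(code, optimizer):
    """Return the code untouched when extended arguments are present."""
    if has_extended_arg(code):
        return list(code)
    return optimizer(code)


plain = [("JUMP_ABSOLUTE", 1), ("JUMP_ABSOLUTE", 2), ("RETURN_VALUE", None)]
wide = [("EXTENDED_ARG", 1), ("JUMP_ABSOLUTE", 2),
        ("JUMP_ABSOLUTE", 3), ("RETURN_VALUE", None)]

plain_result = safe_optimize(plain, collapse_jump_chain)
wide_result = safe_optimize(wide, collapse_jump_chain)
```

Giving up on the whole code object is conservative, but it avoids having to
reason about 16-bit argument overflow when a jump target is rewritten.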


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 21:17

Message:
Logged In: YES 
user_id=80475

Great.  Will add the dup/pop comment to the final version.

I had tried and dropped a couple of other substitutions.  
Timeit.py showed gains but my real apps were unaffected:
   build 3 unpack 3 --> rot3 rot2 jmp+1 dup
   build 4 unpack 4 --> rot4 rot2 rot3 jmp+0

Another desirable substitution was omitted because it 
needed a NOP if it were going to be implemented with the 
current, simple approach:
    unary_not  jump_if_false tgt --> jump_if_true tgt


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 19:24

Message:
Logged In: YES 
user_id=33168

I need to think if there's any way to break the EXTENDED_ARG
with the way you did it (checking backwards, vs skipping it
by incrementing over).  I think it's ok and I have no other
issues with the patch.  

After thinking about the DUP_TOP, POP_TOP, it's not a big
deal, but probably a comment should be added indicating why
you do the JUMP+2, DUP, POP.  

Couldn't you also implement ROT_THREE and ROT_FOUR pretty
easily?  Not sure if it's worth it, though.

You are correct about the unconditional jump test, I didn't
notice the [tgt] vs [i].


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 17:57

Message:
Logged In: YES 
user_id=80475

Thanks for the thorough code review and for being positive 
on the inclusion of the patch.  Attached is a revision that 
delays PyString_Size and bypasses situations with 
extended arguments.

For the dead code fragment, I'm more comfortable with 
the DUP_TOP POP_TOP than use of STOP_CODE but it is 
probably a matter of taste.  A more sophisticated 
approach would not have any dead code but I've aimed for 
the simplest thing that could possibly work.

The unconditional jump test is performed on a different 
opcode than the test for equality to JUMP_ABSOLUTE, so
the two tests cannot be combined.  The first operates on 
codestr[tgt] and the second on codestr[i].

I had tried a single big loop instead of three little loops but 
there was a loss of clarity.  Since the recognizers quickly 
skip over mismatches, the total loop time is very small.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into its
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARG.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.

Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming. 
I'm a little (very little) concerned about the speed penalty
for compiling.  I realize this is a one-time (at most) cost, so
it's almost definitely insignificant.

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 18:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled; a list kept in sys that contains functions that get passed opcode about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up to writing a PEP and trying to get this to work.
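
One possible shape for such a hook, sketched in pure Python.  Everything
here is hypothetical: no such list exists in sys, the instructions are toy
(opname, arg) tuples, and NOP is a made-up filler used only to keep the
listing the same length.

```python
# Hypothetical registry, analogous in spirit to a list of import hooks.
peephole_hooks = []


def run_peephole_hooks(code):
    """Pass the instruction list through each registered hook in order."""
    for hook in peephole_hooks:
        code = hook(code)
    return code


def swap_via_rot_two(code):
    """Example hook: replace BUILD_TUPLE 2 / UNPACK_SEQUENCE 2 with ROT_TWO."""
    out = list(code)
    for i in range(len(out) - 1):
        if (out[i] == ("BUILD_TUPLE", 2)
                and out[i + 1] == ("UNPACK_SEQUENCE", 2)):
            out[i] = ("ROT_TWO", None)
            out[i + 1] = ("NOP", None)   # toy filler, keeps the length fixed
    return out


peephole_hooks.append(swap_via_rot_two)

before = [
    ("LOAD_FAST", 0),
    ("LOAD_FAST", 1),
    ("BUILD_TUPLE", 2),
    ("UNPACK_SEQUENCE", 2),
    ("STORE_FAST", 1),
    ("STORE_FAST", 0),
]
after = run_peephole_hooks(before)
```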

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows passing the code to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when 
-O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Tue Mar 25 23:15:08 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 25 Mar 2003 15:15:08 -0800
Subject: [Patches] [ python-Patches-709743 ] os.setpgrp function failed to build
Message-ID: <E18xxdM-0006ht-00@sc8-sf-web4.sourceforge.net>

Patches item #709743, was opened at 2003-03-25 16:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470

Category: Build
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Gary H. Loechelt (loechelt)
Assigned to: Nobody/Anonymous (nobody)
Summary: os.setpgrp function failed to build

Initial Comment:
The os.setpgrp function failed to build on HP-UX
B.10.20 for Python 2.3a2.  Comparing the build with
Python 2.2.1, I noticed a missing line in the
pyconfig.h.in file.  I added the appropriate line to
the file and rebuilt the executable.  Note that I did
NOT check the configure script to ensure that the
appropriate compiler macro (HAVE_SETPGRP) was set.  I
just manually set the macro in the pyconfig.h file
directly.  The person who has responsibility for
configure should probably check it as well to make sure
that it is not broken.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470


From noreply@sourceforge.net  Tue Mar 25 23:16:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 25 Mar 2003 15:16:09 -0800
Subject: [Patches] [ python-Patches-709744 ] CALL_ATTR opcode
Message-ID: <E18xxeL-0006l8-00@sc8-sf-web4.sourceforge.net>

Patches item #709744, was opened at 2003-03-26 00:16
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709744&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Thomas Wouters (twouters)
Assigned to: Nobody/Anonymous (nobody)
Summary: CALL_ATTR opcode

Initial Comment:
The result of the PyCore sprint of me and Brett: the CALL_ATTR opcode (LOAD_ATTR and CALL_FUNCTION combined) that skips the PyMethod creation and destruction for classic classes (but not newstyle classes, yet.)

The code is still somewhat rough: it needs commenting, some renaming, and most importantly testing. It seems to work, however, and provides between a 5% and 35% speedup. (5% in 'average' code, up to 35% in instance method calls and instance creation alone.) It also needs to be updated to include newstyle classes. I will likely work on this on the flight home.
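
The cost being avoided can be observed from Python itself.  The sketch
below is not the patch and does not use the proposed opcode; it only shows,
on a current CPython, that each attribute lookup of a method creates a
fresh bound method object, and that calling the underlying function
directly with the instance gives the same result.  The class and names are
made up for the illustration.

```python
class Greeter:
    def greet(self, name):
        return "hello " + name


g = Greeter()

# Attribute lookup followed by a call: the lookup wraps the function
# and the instance in a temporary bound method object.
bound = g.greet
via_bound = bound("bob")

# Two lookups give two distinct bound method objects.
distinct_objects = g.greet is not g.greet

# Calling the plain function with the instance passed explicitly
# produces the same result without creating a bound method.
func = type(g).__dict__["greet"]
via_function = func(g, "bob")
```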


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709744&group_id=5470


From noreply@sourceforge.net  Tue Mar 25 23:18:01 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 25 Mar 2003 15:18:01 -0800
Subject: [Patches] [ python-Patches-709744 ] CALL_ATTR opcode
Message-ID: <E18xxg9-0006rT-00@sc8-sf-web4.sourceforge.net>

Patches item #709744, was opened at 2003-03-26 00:16
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709744&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Thomas Wouters (twouters)
Assigned to: Nobody/Anonymous (nobody)
Summary: CALL_ATTR opcode

Initial Comment:
The result of the PyCore sprint of me and Brett: the CALL_ATTR opcode (LOAD_ATTR and CALL_FUNCTION combined) that skips the PyMethod creation and destruction for classic classes (but not newstyle classes, yet.)

The code is still somewhat rough: it needs commenting, some renaming, and most importantly testing. It seems to work, however, and provides between a 5% and 35% speedup. (5% in 'average' code, up to 35% in instance method calls and instance creation alone.) It also needs to be updated to include newstyle classes. I will likely work on this on the flight home.


----------------------------------------------------------------------

>Comment By: Thomas Wouters (twouters)
Date: 2003-03-26 00:18

Message:
Logged In: YES 
user_id=34209

attaching patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709744&group_id=5470


From noreply@sourceforge.net  Wed Mar 26 01:30:47 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Tue, 25 Mar 2003 17:30:47 -0800
Subject: [Patches] [ python-Patches-707257 ] Improve code generation
Message-ID: <E18xzkd-0003Fh-00@sc8-sf-web4.sourceforge.net>

Patches item #707257, was opened at 2003-03-20 20:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Raymond Hettinger (rhettinger)
Summary: Improve code generation

Initial Comment:
Adds a single function to improve generated bytecode.

Has a two line attachment point, so it is completely 
de-coupled from both the compiler and ceval.c.

The first pass looks for the sequence LOAD_CONST 1, 
JUMP_IF_FALSE xx, POP_TOP.  It replaces the first 
instruction with JUMP_FORWARD +4.

The second pass looks for jumps to an unconditional 
jump.  The first jump target is replaced with the 
second jump target.

Both are safe, general purpose optimizations.  
Together, they eliminate 100% of the "while 1" loop 
overhead.

The structure of the code allows for other code 
improvements to be easily added.  This one focuses 
on low hanging fruit. It takes a simple, safe approach 
that does not change bytecode size or order and does 
not need a basic block analysis.

Improves timings on pybench, pystone, and two of 
my real applications.  timeit.py shows dramatic 
improvement to code using "while 1".

python timeit.py "while 1: break"

python timeit.py -s "i=0" "while 1:" "    if i==1: break" "    else: i=1"


----- Example -----

Disassembly of

def f(x):
    while 1:
        x -= 1
        if x == 0:
            break

shows two lines changing from:

  3 LOAD_CONST               1 (1)
38 JUMP_ABSOLUTE            3

and improving to:

3 JUMP_FORWARD             4 (to 10)
38 JUMP_ABSOLUTE           10

All of the other lines are left unchanged.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-25 20:30

Message:
Logged In: YES 
user_id=80475

Added the clarifying comments.
Loaded patch as:  Python/compile.c 2.277
Closing patch.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-25 09:33

Message:
Logged In: YES 
user_id=6380

OK, then it's ok with me. I suggest that you put that
response into a comment for the edification of future
generations.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-24 19:24

Message:
Logged In: YES 
user_id=80475

In the sequence LOAD_CONST, JUMP_IF_FALSE, 
POP_TOP, only the first instruction is changed and it is 
changed to a JUMP+4 which gives the same effect as the 
whole sequence.  If either of the other two opcodes is a 
jump target, it will function normally since it is 
unchanged.

In the jump to jump optimization, only the jump target is 
changed, so it works fine if it is itself a jump target.

The sequence BUILD_SEQN, UNPACK_SEQN is replaced 
by a two instruction block that performs the same 
function as the original block, so the only remaining case 
is where the unpack instruction is a jump target.  Review 
of compile's code generator shows no way that the 
unpack can be a jump target if the preceding instruction is a 
build_seqn.  Essentially, the build/unpack pair can only 
occur in an assignment and there are no possible jumps 
into the middle of an assignment.


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-24 17:47

Message:
Logged In: YES 
user_id=6380

Hmm...   How do you know that you aren't optimizing away
something that's a jump target?

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-24 03:36

Message:
Logged In: YES 
user_id=80475

Neal, attached is a revision that puts it all under a single 
loop.  By adding a switch-case, it became more readable 
and a little faster.

Your comment on the extended args worried me, so I now 
bail-out if any extending is present.


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 21:17

Message:
Logged In: YES 
user_id=80475

Great.  Will add the dup/pop comment to the final version.

I had tried and dropped a couple of other substitutions.  
Timeit.py showed gains but my real apps were unaffected:
   build 3 unpack 3 --> rot3 rot2 jmp+1 dup
   build 4 unpack 4 --> rot4 rot2 rot3 jmp+0

Another desirable substitution was omitted because it 
needed a NOP if it were going to be implemented with the 
current, simple approach:
    unary_not  jump_if_false tgt --> jump_if_true tgt


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 19:24

Message:
Logged In: YES 
user_id=33168

I need to think if there's any way to break the EXTENDED_ARG
with the way you did it (checking backwards, vs skipping it
by incrementing over).  I think it's ok and I have no other
issues with the patch.  

After thinking about the DUP_TOP, POP_TOP, it's not a big
deal, but probably a comment should be added indicating why
you do the JUMP+2, DUP, POP.  

Couldn't you also implement ROT_THREE and ROT_FOUR pretty
easily?  Not sure if it's worth it, though.

You are correct about the unconditional jump test, I didn't
notice the [tgt] vs [i].


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-23 17:57

Message:
Logged In: YES 
user_id=80475

Thanks for the thorough code review and for being positive 
on the inclusion of the patch.  Attached is a revision that 
delays PyString_Size and bypasses situations with 
extended arguments.

For the dead code fragment, I'm more comfortable with 
the DUP_TOP POP_TOP than use of STOP_CODE but it is 
probably a matter of taste.  A more sophisticated 
approach would not have any dead code but I've aimed for 
the simplest thing that could possibly work.

The unconditional jump test is performed on a different 
opcode than the test for equality to JUMP_ABSOLUTE, so
the two tests cannot be combined.  The first operates on 
codestr[tgt] and the second on codestr[i].

I had tried a single big loop instead of three little loops but 
there was a loss of clarity.  Since the recognizers quickly 
skip over mismatches, the total loop time is very small.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:23

Message:
Logged In: YES 
user_id=33168

Generally I think I'd like to see only one loop over the
code (should scale better than having N loops--1 per
optimization).  Perhaps making each optimization into its
own function--e.g., opt_while_1, opt_swap, opt_jump_jump.

* In optimize_code, PyString_Size() is called before
verifying code is a string.  If the code isn't a string, an
exception will be left-over.  Suggest setting clen after the
string check.

* I don't think the code works with EXTENDED_ARG.  This can
happen if there are more than 64k variables etc.  Perhaps if
you get an EXTENDED_ARG you should just bail?

* the DUP_TOP and POP_TOP are never supposed to be executed,
right?  I would use STOP_CODE to indicate the ops were
invalid.  I can also see where others would find this
suggestion objectionable.  There is no NOP though.  Ideally,
we would remove the dead code, rather than have the JUMP,
etc.  This would mean possibly changing all subsequent
JUMP_ABSOLUTEs though.  I don't recommend changing this,
just lamenting.

(I particularly like the BUILD/UNPACK of 2 becoming a
ROT_TWO, BTW :-)

* Why in the jumps to jumps loop don't you set codestr[i] =
opcode if opcode == JUMP_FORWARD, then do away with the if
(opcode != JUMP_ABSOLUTE)?  The check for UNCONDITIONAL_JUMP
already guarantees you have either JUMP_FORWARD or
JUMP_ABSOLUTE.

* same problem with EXTENDED_ARG for SETARG though.  You
probably need a check before the SETARG to make sure tgttgt
< 64k.
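
The jumps-to-jumps point above can be sketched on a toy instruction
list.  This is only an illustration, not the patch's C code: the
(opname, arg) tuples, the opcode sets and the thread_jumps name are all
assumptions made for the example, and unlike the patch (which, as far as
the review suggests, looks one level deep) this version follows whole
chains of unconditional jumps.

```python
# Toy instruction set: (opname, arg).  For jump instructions, arg is the
# absolute index of the target instruction.  This is NOT CPython's real
# bytecode layout, only an illustration of the idea.
UNCONDITIONAL = {"JUMP_ABSOLUTE"}
CONDITIONAL = {"JUMP_IF_FALSE", "JUMP_IF_TRUE"}


def thread_jumps(code):
    """Retarget every jump that lands on an unconditional jump."""
    result = list(code)
    for i, (op, arg) in enumerate(result):
        if op not in UNCONDITIONAL and op not in CONDITIONAL:
            continue
        target = arg
        seen = {i}  # guards against jump cycles
        while result[target][0] in UNCONDITIONAL and target not in seen:
            seen.add(target)
            target = result[target][1]
        result[i] = (op, target)
    return result


program = [
    ("JUMP_IF_FALSE", 2),
    ("LOAD", 0),
    ("JUMP_ABSOLUTE", 4),
    ("LOAD", 1),
    ("JUMP_ABSOLUTE", 6),
    ("LOAD", 2),
    ("RETURN", None),
]
threaded = thread_jumps(program)
```

Because the toy uses absolute indices, no range check on the new target
is needed; a real pass over 16-bit arguments would need the < 64k guard
mentioned above.
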

Other than the EXTENDED_ARG and string size issues, the code
looks fine and makes sense.  In general, I'm positive on the
idea of doing this.  However, I'm not sure this change is
appropriate for 2.3, partially because the beta is coming. 
I'm a little (very little) concerned about the speed penalty for
compiling.  I realize this is a one-time (at most) cost, so
it's almost definitely insignificant.
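
The "just bail on EXTENDED_ARG" suggestion can be illustrated with the
dis module of a current CPython 3.  This is a sketch under assumptions:
has_extended_arg and optimize_if_safe are hypothetical helper names, the
optimizer argument is any callable, and in modern CPython the prefix
appears once an argument exceeds 255 rather than the 64k limit that
applied in 2.3.

```python
import dis


def has_extended_arg(code):
    """Return True if any instruction in the code object is EXTENDED_ARG."""
    return any(
        ins.opname == "EXTENDED_ARG" for ins in dis.get_instructions(code)
    )


def optimize_if_safe(code, optimizer):
    """Apply optimizer(code) unless wide arguments are present."""
    if has_extended_arg(code):
        return code  # bail out: leave the code object untouched
    return optimizer(code)


small = compile("x = 1", "<small>", "exec")

# Several hundred distinct names push name indices past one byte, which
# forces the compiler to emit EXTENDED_ARG prefixes.
many_names = "\n".join(f"v{i} = None" for i in range(400))
large = compile(many_names, "<large>", "exec")
```

Bailing out keeps the optimizer simple at the cost of skipping very
large code objects, which is the trade-off the review proposes.
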

I'd like Tim or Guido to approve the approach for
acceptance.  Assigning to Tim.  Regardless of whether this
patch is accepted for 2.3, I think all of these should be
implemented in 2.4!  Hopefully at that time there will be
the new AST compiler which we can modify more easily and
make even more optimizations.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 18:50

Message:
Logged In: YES 
user_id=357491

Ah, forgot about the planned refactoring for 2.4.  Oops.  =)

OK, I will keep this in the back of my head until the refactor gets done.

And in case it wasn't clear, I am all for getting this patch in.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 18:04

Message:
Logged In: YES 
user_id=80475

Not really.  There is no need to go wild before the compiler 
is refactored.

Loading another update that includes theller's idea to 
handle all constants evaluating to true.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 17:43

Message:
Logged In: YES 
user_id=357491

Do I hear a PEP coming?  =)

If anyone is serious about coming up with a hook for peephole optimizing (I am thinking of something similar to how import hooks are handled: a list kept in sys that contains functions that get passed the opcodes about to be written out to a .pyc file) then email me (unless starting a feature request would be better?).  I am up for writing a PEP and trying to get this to work.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 15:28

Message:
Logged In: YES 
user_id=80475

Right, it takes a LOAD_GLOBAL to fetch True using a 
dictionary lookup.  In contrast, 2 is quickly fetched with 
LOAD_CONST.

Adding a hook is easy enough, but I'll leave that for 
another day (I've already exceeded my quota of API 
change requests).  This patch focuses on "the simplest 
thing that could possibly work".
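
The difference between a name lookup and a constant can still be seen
with the dis module.  A sketch, assuming a current CPython 3: True is a
keyword there, so the builtin name len stands in for the way True was
looked up in 2.3, and the opnames helper is made up for the example.
Only the presence or absence of LOAD_GLOBAL is examined, since the
opcode used for constants differs between releases.

```python
import dis


def uses_name():
    # `len` is an ordinary name, resolved at run time through the
    # globals and then the builtins dictionaries.
    return len


def uses_constant():
    # A literal is stored with the code object; no dictionary lookup.
    return 2


def opnames(func):
    """List the opcode names of a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


name_ops = opnames(uses_name)
const_ops = opnames(uses_constant)
```
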

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 15:09

Message:
Logged In: YES 
user_id=11105

Looks better now.
So it seems 'while True:' or 'while 2:' is worse than 'while
1:' ;-) ?

I like Brett's suggestion about adding an (additional) hook
here which allows passing the bytecode to Python (?) code for
further peephole optimizing.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-21 14:18

Message:
Logged In: YES 
user_id=80475

Attached a revised patch:
* Adds PyMem_Free   (theller's review comment)
* Applies macro form of string/tuple operations
* All exits now return a new reference
* Attach point is now a single line

Walter, until GvR moves to prevent shadowing of globals, 
it would be unsafe to optimize "while True".
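
Why shadowing makes such an optimization unsafe can be shown with any
name that is resolved at run time.  A minimal sketch, assuming a current
CPython 3 where True can no longer be rebound: the builtin len plays the
role that True played in 2.3, and the spin function and ns namespace are
invented for the example.

```python
source = """
def spin():
    return len("abc")
"""

ns = {}
exec(source, ns)            # define spin() with ns as its globals

before = ns["spin"]()       # resolved to the builtin len
ns["len"] = lambda obj: 0   # a module-level global now shadows the builtin
after = ns["spin"]()        # same bytecode, different result
```

A compiler that had folded the lookup into a constant would keep
returning the first value, which is the hazard described above.
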

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-21 03:21

Message:
Logged In: YES 
user_id=89016

"while True:" should be optimized too.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2003-03-21 03:02

Message:
Logged In: YES 
user_id=11105

Isn't there a PyMem_Free missing at the end?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-21 02:43

Message:
Logged In: YES 
user_id=357491

OK, fair enough.  I buy the argument.  =)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-20 21:11

Message:
Logged In: YES 
user_id=80475

The -O option was useful when the optimization involved a 
trade-off.  It used to be that you lost line numbering when -
O was turned on.  In contrast, this patch is a pure win and 
does not affect anything else including dis and pdb.

Other bytecode optimizations have been implemented 
directly in the compiler code (for instance, negatives 
before a constant) and those were not linked to the -O 
option.  IOW, I recommend against attaching this to a 
command line switch.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2003-03-20 20:56

Message:
Logged In: YES 
user_id=357491

Perhaps this should be made something that is done with the -O option?  Since this is changing the outputted bytecode from what the parser spits out I think it is classified as an optimization and thus should be made an optional optimization instead of a required one.

Love the idea, though.  Personally, I would love to see some pluggable system developed for -O that allows for easy adding of peephole optimizations.  This patch seems to be taking the initial steps toward a setup like that.

Besides, the poor -O option isn't worth much of anything these days thanks to Michael.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707257&group_id=5470


From noreply@sourceforge.net  Wed Mar 26 16:08:40 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 26 Mar 2003 08:08:40 -0800
Subject: [Patches] [ python-Patches-710127 ] Make "%c" % u"a" work
Message-ID: <E18yDSC-0002Vi-00@sc8-sf-web1.sourceforge.net>

Patches item #710127, was opened at 2003-03-26 17:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: Make "%c" % u"a" work

Initial Comment:
Currently "%c" % u"a" fails, while "%s" % u"a" works. 
This patch fixes this problem.
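
For comparison, this is how %c behaves in a current Python 3, where
every str is Unicode.  It is not the patch itself, only a sketch of the
behaviour the report asks for; format_char is a hypothetical helper
name.

```python
def format_char(value):
    """Format a single character or a code point with %c."""
    return "%c" % value


from_text = format_char("a")    # a one-character string is accepted
from_code = format_char(97)     # so is an integer code point
```

A string longer than one character is rejected with a TypeError.
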

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470


From noreply@sourceforge.net  Wed Mar 26 16:09:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Wed, 26 Mar 2003 08:09:09 -0800
Subject: [Patches] [ python-Patches-710127 ] Make "%c" % u"a" work
Message-ID: <E18yDSf-0002YD-00@sc8-sf-web1.sourceforge.net>

Patches item #710127, was opened at 2003-03-26 17:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
>Summary: Make "%c" % u"a" work

Initial Comment:
Currently "%c" % u"a" fails, while "%s" % u"a" works. 
This patch fixes this problem.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470


From noreply@sourceforge.net  Thu Mar 27 08:09:47 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 00:09:47 -0800
Subject: [Patches] [ python-Patches-710576 ] Backport to 2.2.2 of codec registry fix
Message-ID: <E18ySSJ-0003hg-00@sc8-sf-web4.sourceforge.net>

Patches item #710576, was opened at 2003-03-27 09:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Geert Jansen (geertj)
Assigned to: Nobody/Anonymous (nobody)
Summary: Backport to 2.2.2 of codec registry fix

Initial Comment:
Hi, 
 
attached is a backport to Python 2.2.2 of the patch that 
fixes bug: 
 
  #663074: codec registry and Python embedding problem 
 
which is discussed here: 
 
http://sourceforge.net/tracker/index.php?func=detail&aid=663074&group_id=5470&atid=105470 
 
If there will be a Python 2.2.3 release, I suggest this patch 
is applied. Currently, mod_python programs cannot use 
encodings, because mod_python is one of the (few?) 
programs that uses multiple subinterpreters. 
 
About the patch: it is a backport of Gustavo Niemeyer's 
patch for 2.3 CVS. I had to adapt it a little bit because in 
2.2 there is no codec error registry. 
 
Greetings, 
Geert Jansen 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470


From noreply@sourceforge.net  Thu Mar 27 08:25:49 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 00:25:49 -0800
Subject: [Patches] [ python-Patches-710576 ] Backport to 2.2.2 of codec registry fix
Message-ID: <E18yShp-0007Fz-00@sc8-sf-web3.sourceforge.net>

Patches item #710576, was opened at 2003-03-27 09:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Geert Jansen (geertj)
Assigned to: Nobody/Anonymous (nobody)
Summary: Backport to 2.2.2 of codec registry fix

Initial Comment:
Hi, 
 
attached is a backport to Python 2.2.2 of the patch that 
fixes bug: 
 
  #663074: codec registry and Python embedding problem 
 
which is discussed here: 
 
http://sourceforge.net/tracker/index.php?func=detail&aid=663074&group_id=5470&atid=105470 
 
If there will be a Python 2.2.3 release, I suggest this patch 
is applied. Currently, mod_python programs cannot use 
encodings, because mod_python is one of the (few?) 
programs that uses multiple subinterpreters. 
 
About the patch: it is a backport of Gustavo Niemeyer's 
patch for 2.3 CVS. I had to adapt it a little bit because in 
2.2 there is no codec error registry. 
 
Greetings, 
Geert Jansen 

----------------------------------------------------------------------

>Comment By: Geert Jansen (geertj)
Date: 2003-03-27 09:25

Message:
Logged In: YES 
user_id=537938

Here is the patch. It is tested and verified to fix the problem by 
two people. I also verified that it passes the test suite. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470


From noreply@sourceforge.net  Thu Mar 27 19:31:49 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 11:31:49 -0800
Subject: [Patches] [ python-Patches-710931 ] iconv codec-NG and Korean Codecs
Message-ID: <E18yd6L-0003j6-00@sc8-sf-web2.sourceforge.net>

Patches item #710931, was opened at 2003-03-28 04:31
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710931&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Nobody/Anonymous (nobody)
Summary: iconv codec-NG and Korean Codecs

Initial Comment:
This patch includes an update for iconv_codec, new 
sources for Korean codecs and a MultibyteCodec 
supplemental library.
I split out the common parts of codecs for usual 
multibyte encodings into multibytecodec.c, and this 
iconv codec and the Korean codecs are using it.
The Korean codecs are only 58K in stripped i386 ELF and 
62K in stripped i386 PECOFF binary, and I think that's 
small enough to be incorporated into Python.

Files:

Lib/encodings/aliases.py
   adds aliases for Korean encodings and removes 
comments that are no longer true.

Lib/encodings/cp949.py
Lib/encodings/euc_kr.py
   codecs for korean encodings

Lib/encodings/iconv_codec.py
   updated for new _iconv_codec implementation

Lib/test/test_ko_codecs.py
   unit test for cp949, euc_kr codec

Lib/test/test_ko_codecs_mapping.py
   unit test to test cp949 mapping

Lib/test/test_iconv_codec_euc_kr.py
   another iconv_codec unit test, because a non-Unicode 
multibyte encoding is required to test both 
iconv_codec and multibytecodec.

Lib/test/test_multibytecodec_support.py
   common part for above test units

Modules/_iconv_codec.c
   new implementation of _iconv_codec.
   this resolves numerous problems that the previous 
implementation had, and iconv_codec has a sane 
StreamReader now! :)

Modules/_ko_codec.c
Modules/_ko_codec.h
   korean codecs module

Modules/multibytecodec.c
Modules/multibytecodec.h
   common multibyte codec supplement. I think that this 
can be used for any usual multibyte encoding.
   I'll submit Chinese codecs in a few days using this.

Tools/unicode/genmap_ko_codecs.py
   code generator for _ko_codecs.h

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710931&group_id=5470


From noreply@sourceforge.net  Thu Mar 27 20:12:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 12:12:54 -0800
Subject: [Patches] [ python-Patches-706707 ] time.tzset standards compliance update
Message-ID: <E18ydk6-0005bb-00@sc8-sf-web2.sourceforge.net>

Patches item #706707, was opened at 2003-03-20 15:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 7
Submitted By: Stuart Bishop (zenzen)
Assigned to: Neal Norwitz (nnorwitz)
Summary: time.tzset standards compliance update

Initial Comment:
Update to configure.in and test_time.py to only use TZ
environment variable format documented at
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html


----------------------------------------------------------------------

>Comment By: Stuart Bishop (zenzen)
Date: 2003-03-28 07:12

Message:
Logged In: YES 
user_id=46639

tzset3.diff is an updated diff against the CVS head.

Fixes:
   -Don't test time.altzone for UTC - non-DST means altzone
    is undefined
   -Make sure dst timezone name is not the same as non-dst
    timezone name in TZ environment variable, to work around
    an apparent Solaris bug.
   -Extraneous cruft removed from test_time.py and
    configure.in - no more irrelevant comments.
   -More whitespace as per Tim's comments.
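
A TZ value in the standard format with distinct standard and DST names
can be tried out as follows.  This is a sketch, not the code from
tzset3.diff: tz_snapshot is a hypothetical helper, the TZ string is the
one used in the time module documentation, and time.tzset() exists only
on Unix, so the helper returns None elsewhere.

```python
import os
import time


def tz_snapshot(tz):
    """Apply a TZ string and report what the time module then sees."""
    if not hasattr(time, "tzset"):
        return None  # tzset() is not available on this platform
    saved = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()
    try:
        return time.tzname, time.timezone, time.altzone, bool(time.daylight)
    finally:
        # Restore the previous setting so later code is unaffected.
        if saved is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = saved
        time.tzset()


# Standard name, offset, DST name, then the DST transition rules.
snapshot = tz_snapshot("EST+05EDT,M4.1.0,M10.5.0")
```
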

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-22 08:28

Message:
Logged In: YES 
user_id=33168

After patching, the test fails:

  File "/home/neal/build/python/2_3/Lib/test/test_time.py",
line 115, in test_tzset
    self.failUnlessEqual(time.daylight,1)
  File "/home/neal/build/python/2.3/Lib/unittest.py", line
292, in failUnlessEqual
    raise self.failureException, \
AssertionError: 0 != 1


Also, why is the code commented out (via a string) on lines
120-144?  Should these be removed?  I see the comment about
wallclock time, but don't understand why the code should be
left in if we can't test it.  I can understand a comment
describing generally the issue.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-21 12:18

Message:
Logged In: YES 
user_id=33168

I'll try to get to this soon.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-21 12:11

Message:
Logged In: YES 
user_id=6380

Unassigning, as I won't have time for this. But it is
important - someone else should make sure this goes into 2.3b1!

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-21 08:50

Message:
Logged In: YES 
user_id=31435

Assigned to Guido, as I can't test it.

Two notes:

1. Leaving commented-out code in config and the test suite 
doesn't appear to serve a purpose, although it will serve to 
confuse future readers ("why is this here?  why is it 
commented out?").

2. The Python style guide asks for a blank after commas in 
argument lists and tuples.  We're not really in danger of 
stretching the screen here <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470


From noreply@sourceforge.net  Thu Mar 27 21:09:20 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 13:09:20 -0800
Subject: [Patches] [ python-Patches-711002 ] new test_urllib and patch for found urllib bug
Message-ID: <E18yeci-0003xJ-00@sc8-sf-web4.sourceforge.net>

Patches item #711002, was opened at 2003-03-27 13:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711002&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: new test_urllib and patch for found urllib bug

Initial Comment:
Free time at PyCon led to me writing a new test_urllib (happy, Raymond?  =).  Since I have no guarantee that there would be a net connection (and didn't want to use it without user permission since I view using the 'network' resource as using sockets and not the Net) I wrote all tests using temporary files.

Doing this turned up a bug, sort of.  The docs and docstring for urlretrieve() say the second value of the returned tuple should be None when a local file is passed as an argument.  Well, it wasn't; it was returning an rfc2822.Message object like it does for remote files.  So I patched it to match the docs.
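
The temporary-file approach to testing without a network connection
might look like this in a current Python 3.  It is a sketch, not the
submitted test_urllib: it uses urllib.request (the Python 3 successor
of the 2.x urllib module), and read_via_file_url is a hypothetical
helper.  It deliberately checks only the content read back, not the
headers value discussed above.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def read_via_file_url(payload):
    """Write payload to a temp file and read it back through a file: URL."""
    fd, name = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        url = Path(name).resolve().as_uri()  # file:///... form of the path
        with urllib.request.urlopen(url) as response:
            return response.read()
    finally:
        os.remove(name)  # always clean up the temporary file


echoed = read_via_file_url(b"no sockets needed")
```
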

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711002&group_id=5470


From noreply@sourceforge.net  Thu Mar 27 23:31:59 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 15:31:59 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18ygql-0006jK-00@sc8-sf-web1.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 12:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Martin v. Löwis (loewis)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixing the failing conversions in _substitute. Use
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

Comment By: Jeremy Moore (jmoore_calaway)
Date: 2003-03-27 16:31

Message:
Logged In: YES 
user_id=744000

(Apologies if this is the inappropriate place to ask)
I'm porting an app to Mac OS X 10.2 (begrudgingly) and ran straight into 
this bug. Nothing like changing versions of python (2.2.2 to 2.3a2) and 
tcl/tk (8.3.4 to 8.4.2) while using a platform you're unfamiliar with! 
Anyway, I have successfully applied the patch; however, it has simply 
propagated the problem elsewhere. Specifically, the pmw rev 1.1 
widgets library. The problem is, pmw does additional processing that 
chokes on the '??' now returned by the try:/except: statements. 

Perhaps, if anyone knows, it would be better to mimic what tcl/tk 8.3.x 
returned with the except statements. Pmw may not be the only library 
out there that will get choked up on this.

I will submit a bug in the pmw for this as well, but I'm looking for a least 
resistance path to get things up and running. (And not really wanting to 
rewrite all my GUI construction code...)

Thanks

Jeremy Moore

----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-23 06:35

Message:
Logged In: YES 
user_id=60903

> What is the problem that this patch solves?

As the subject says: Provide a patch for #698517.

tk8.4.2 returns for the undefined fields in events empty
strings or '??' strings, on which the int conversions fail.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 05:07

Message:
Logged In: YES 
user_id=21627

What is the problem that this patch solves?

----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-22 00:26

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 15:15

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 15:10

Message:
Logged In: YES 
user_id=40145

Hmmm, you are right.  Your approach will be quicker, due to
local namespace function lookup speed (try/except is fast in
non-exception path).

But, then again, a lot more exception paths will be executed
with the new Tk (with "??" fields), anyway, so the speed
issues may not be that important.


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 14:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only.  Hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 13:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)
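
The getint() helper proposed above can be exercised on its own.  The
definition repeats the one from this comment (spacing normalized); the
sample inputs are assumptions standing in for event fields as described
elsewhere in this item, namely '??' or an empty string for fields an
event does not define.

```python
def getint(s):
    """Convert s to int, passing unconvertible strings through unchanged."""
    try:
        return int(s)
    except ValueError:
        return s


fields = ["42", "??", "", "-7"]
converted = [getint(field) for field in fields]
```

Callers then receive a mix of integers and strings, which is what the
later Pmw report in this item runs into.
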


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Thu Mar 27 23:58:14 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 15:58:14 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18yhGA-00034P-00@sc8-sf-web4.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 20:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Martin v. Löwis (loewis)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixing the failing conversions in _substitute. Use
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 00:58

Message:
Logged In: YES 
user_id=21627

Well, no. The Tk change was made for a reason, and it is
unlikely that Tk people will back it out, so we should not
bypass this change.

If you want to get up and running, I recommend using Tcl 8.3.

----------------------------------------------------------------------

Comment By: Jeremy Moore (jmoore_calaway)
Date: 2003-03-28 00:31

Message:
Logged In: YES 
user_id=744000

(Apologies if this is the inappropriate place to ask)
I'm porting an app to Mac OS X 10.2 (begrudgingly) and ran straight into 
this bug. Nothing like changing versions of python (2.2.2 to 2.3a2) and 
tcl/tk (8.3.4 to 8.4.2) while using a platform you're unfamiliar with! 
Anyway, I have successfully applied the patch; however, it has simply 
propagated the problem elsewhere. Specifically, the pmw rev 1.1 
widgets library. The problem is, pmw does additional processing that 
chokes on the '??' now returned by the try:/except: statements. 

Perhaps, if anyone knows, it would be better to mimic what tcl/tk 8.3.x 
returned with the except statements. Pmw may not be the only library 
out there that will get choked up on this.

I will submit a bug in the pmw for this as well, but I'm looking for a least 
resistance path to get things up and running. (And not really wanting to 
rewrite all my GUI construction code...)

Thanks

Jeremy Moore

----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-23 14:35

Message:
Logged In: YES 
user_id=60903

> What is the problem that this patch solves?

As the subject says: Provide a patch for #698517.

tk8.4.2 returns for the undefined fields in events empty
strings or '??' strings, on which the int conversions fail.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 13:07

Message:
Logged In: YES 
user_id=21627

What is the problem that this patch solves?

----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-22 08:26

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 23:15

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 23:10

Message:
Logged In: YES 
user_id=40145

Hmmm, you are right.  Your approach will be quicker, due to
local namespace function lookup speed (try/except is fast in
non-exception path).

But, then again, a lot more exception paths will be executed
with the new Tk (with "??" fields), anyway, so the speed
issues may not be that important.


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 22:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only.  Hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 21:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Thu Mar 27 23:59:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 15:59:39 -0800
Subject: [Patches] [ python-Patches-710931 ] iconv codec-NG and Korean Codecs
Message-ID: <E18yhHX-00036Y-00@sc8-sf-web4.sourceforge.net>

Patches item #710931, was opened at 2003-03-27 20:31
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710931&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
>Assigned to: Martin v. Löwis (loewis)
Summary: iconv codec-NG and Korean Codecs

Initial Comment:
This patch includes an update for iconv_codec, new 
sources for Korean codecs and a MultibyteCodec 
supplemental library.
I split out the common parts of codecs for usual 
multibyte encodings into multibytecodec.c, and this 
iconv codec and the Korean codecs are using it.
The Korean codecs are only 58K in stripped i386 ELF and 
62K in stripped i386 PECOFF binary, and I think that's 
small enough to be incorporated into Python.

Files:

Lib/encodings/aliases.py
   adds aliases for Korean encodings and removes 
comments that are no longer true.

Lib/encodings/cp949.py
Lib/encodings/euc_kr.py
   codecs for korean encodings

Lib/encodings/iconv_codec.py
   updated for new _iconv_codec implementation

Lib/test/test_ko_codecs.py
   unit test for cp949, euc_kr codec

Lib/test/test_ko_codecs_mapping.py
   unit test to test cp949 mapping

Lib/test/test_iconv_codec_euc_kr.py
   another iconv_codec unit test, because a non-Unicode 
multibyte encoding is required to test both 
iconv_codec and multibytecodec.

Lib/test/test_multibytecodec_support.py
   common part for above test units

Modules/_iconv_codec.c
   new implementation of _iconv_codec.
   this resolves numerous problems that the previous 
implementation had, and iconv_codec has a sane 
StreamReader now! :)

Modules/_ko_codec.c
Modules/_ko_codec.h
   korean codecs module

Modules/multibytecodec.c
Modules/multibytecodec.h
   common multibyte codec supplement. I think that this 
can be used for any usual multibyte encoding.
   I'll submit Chinese codecs in a few days using this.

Tools/unicode/genmap_ko_codecs.py
   code generator for _ko_codecs.h

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710931&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 00:00:25 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 16:00:25 -0800
Subject: [Patches] [ python-Patches-710576 ] Backport to 2.2.2 of codec registry fix
Message-ID: <E18yhIH-00038A-00@sc8-sf-web4.sourceforge.net>

Patches item #710576, was opened at 2003-03-27 09:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Geert Jansen (geertj)
>Assigned to: M.-A. Lemburg (lemburg)
Summary: Backport to 2.2.2 of codec registry fix

Initial Comment:
Hi, 
 
attached is a backport to Python 2.2.2 of the patch that 
fixes bug: 
 
  #663074: codec registry and Python embedding problem 
 
which is discussed here: 
 
http://sourceforge.net/tracker/index.php?func=detail&aid=663074&group_id=5470&atid=105470 
 
If there will be a Python 2.2.3 release, I suggest this patch 
is applied. Currently, mod_python programs cannot use 
encodings, because mod_python is one of the (few?) 
programs that uses multiple subinterpreters. 
 
About the patch: it is a backport of Gustavo Niemeyer's 
patch for 2.3 CVS. I had to adapt it a little bit because in 
2.2 there is no codec error registry. 
 
Greetings, 
Geert Jansen 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:00

Message:
Logged In: YES 
user_id=21627

Marc-Andre, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

Comment By: Geert Jansen (geertj)
Date: 2003-03-27 09:25

Message:
Logged In: YES 
user_id=537938

Here is the patch. It is tested and verified to fix the problem by 
two people. I also verified that it passes the test suite. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 00:01:01 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 16:01:01 -0800
Subject: [Patches] [ python-Patches-710127 ] Make "%c" % u"a" work
Message-ID: <E18yhIr-00039B-00@sc8-sf-web4.sourceforge.net>

Patches item #710127, was opened at 2003-03-26 17:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
>Assigned to: Martin v. Löwis (loewis)
>Summary: Make "%c" % u"a" work

Initial Comment:
Currently "%c" % u"a" fails, while "%s" % u"a" works. 
This patch fixes this problem.
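
For reference, a minimal sketch of the behavior the patch aims for. It assumes a modern Python 3 interpreter, where every string is Unicode and the u"" prefix is accepted; the report itself concerns Python 2.x, where str and unicode were separate types. The helper name format_char is hypothetical.

```python
def format_char(ch):
    """Format a single character (or an integer code point) with %c."""
    # %c accepts either a one-character string or an integer code point.
    return "%c" % ch


if __name__ == "__main__":
    print(format_char(u"a"))  # one-character Unicode string
    print(format_char(97))    # integer code point for "a"
```

The same %c conversion also accepts an integer, which is why both calls are shown.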

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 00:02:23 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 16:02:23 -0800
Subject: [Patches] [ python-Patches-709743 ] os.setpgrp function failed to build
Message-ID: <E18yhKB-0003Cf-00@sc8-sf-web4.sourceforge.net>

Patches item #709743, was opened at 2003-03-26 00:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470

Category: Build
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Gary H. Loechelt (loechelt)
>Assigned to: Martin v. Löwis (loewis)
Summary: os.setpgrp function failed to build

Initial Comment:
The os.setpgrp function failed to build on HP-UX
B.10.20 for Python 2.3a2.  Comparing the build with
Python 2.2.1, I noticed a missing line in the
pyconfig.h.in file.  I added the appropriate line to
the file and rebuilt the executable.  Note that I did
NOT check the configure script to ensure that the
appropriate compiler macro (HAVE_SETPGRP) was set.  I
just manually set the macro in the pyconfig.h file
directly.  The person who has responsibility for
configure should probably check it too, to make sure
that it is not broken as well.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:01

Message:
Logged In: YES 
user_id=21627

Can you please report precisely how it fails?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 00:01:58 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 16:01:58 -0800
Subject: [Patches] [ python-Patches-709743 ] os.setpgrp function failed to build
Message-ID: <E18yhJm-0003BA-00@sc8-sf-web4.sourceforge.net>

Patches item #709743, was opened at 2003-03-26 00:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470

Category: Build
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Gary H. Loechelt (loechelt)
Assigned to: Nobody/Anonymous (nobody)
Summary: os.setpgrp function failed to build

Initial Comment:
The os.setpgrp function failed to build on HP-UX
B.10.20 for Python 2.3a2.  Comparing the build with
Python 2.2.1, I noticed a missing line in the
pyconfig.h.in file.  I added the appropriate line to
the file and rebuilt the executable.  Note that I did
NOT check the configure script to ensure that the
appropriate compiler macro (HAVE_SETPGRP) was set.  I
just manually set the macro in the pyconfig.h file
directly.  The person who has responsibility for
configure should probably check it too, to make sure
that it is not broken as well.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:01

Message:
Logged In: YES 
user_id=21627

Can you please report precisely how it fails?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 00:03:32 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 16:03:32 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E18yhLI-0003Fk-00@sc8-sf-web4.sourceforge.net>

Patches item #709178, was opened at 2003-03-25 03:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
>Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 00:12:52 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 27 Mar 2003 16:12:52 -0800
Subject: [Patches] [ python-Patches-708374 ] add offset to mmap
Message-ID: <E18yhUK-0007zC-00@sc8-sf-web1.sourceforge.net>

Patches item #708374, was opened at 2003-03-23 15:33
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708374&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Nobody/Anonymous (nobody)
Summary: add offset to mmap

Initial Comment:
This patch is from Yotam Medini <yotamm at
mellanox.co.il> sent to me in mail.

It adds support for the offset parameter to mmap.

It ignores the check for mmap size "if the file is
character device.  Some device drivers (which I happen
to use) have zero size in fstat buffer, but still one
can seek() read() and tell()."
I added minimal doc and tests.
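
A minimal sketch of how an offset-based mapping can be used from Python. It assumes the mmap.mmap(fileno, length, access=..., offset=...) signature of current Python releases, which may differ from the interface in the attached patch. The helper names read_at_offset and demo are hypothetical.

```python
import mmap
import os
import tempfile


def read_at_offset(path, offset, length):
    """Map `length` bytes of `path` starting at `offset` and return them."""
    # The offset has to be a multiple of the platform's allocation granularity.
    if offset % mmap.ALLOCATIONGRANULARITY:
        raise ValueError("offset must be a multiple of ALLOCATIONGRANULARITY")
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), length,
                       access=mmap.ACCESS_READ, offset=offset) as m:
            return m[:length]


def demo():
    """Write a small file and read back the bytes past the first granule."""
    gran = mmap.ALLOCATIONGRANULARITY
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * gran + b"0123456789abcdef")
        os.close(fd)
        return read_at_offset(path, gran, 16)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The sketch maps a regular file only; the character-device case described above, where fstat reports a zero size, is not covered here.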

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:12

Message:
Logged In: YES 
user_id=21627

I think non-zero offsets need to be supported for Windows as
well.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 16:37

Message:
Logged In: YES 
user_id=33168

Email received from Yotam:

I have downloaded and patched the 2.3a source, compiled
just this module locally, and it worked fine for my
application (with an offset for a character device file).
I did not run the released test, though.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708374&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 08:40:20 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 00:40:20 -0800
Subject: [Patches] [ python-Patches-710576 ] Backport to 2.2.2 of codec registry fix
Message-ID: <E18ypPQ-0005sZ-00@sc8-sf-web1.sourceforge.net>

Patches item #710576, was opened at 2003-03-27 09:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Geert Jansen (geertj)
>Assigned to: Martin v. Löwis (loewis)
Summary: Backport to 2.2.2 of codec registry fix

Initial Comment:
Hi, 
 
attached is a backport to Python 2.2.2 of the patch that 
fixes bug: 
 
  #663074: codec registry and Python embedding problem 
 
which is discussed here: 
 
http://sourceforge.net/tracker/index.php?func=detail&aid=663074&group_id=5470&atid=105470 
 
If there is a Python 2.2.3 release, I suggest that this patch 
be applied. Currently, mod_python programs cannot use 
encodings, because mod_python is one of the (few?) 
programs that use multiple subinterpreters. 
 
About the patch: it is a backport of Gustavo Niemeyer's 
patch for 2.3 CVS. I had to adapt it a little bit because in 
2.2 there is no codec error registry. 
 
Greetings, 
Geert Jansen 

----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-28 09:40

Message:
Logged In: YES 
user_id=38388

Looks ok.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:00

Message:
Logged In: YES 
user_id=21627

Marc-Andre, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

Comment By: Geert Jansen (geertj)
Date: 2003-03-27 09:25

Message:
Logged In: YES 
user_id=537938

Here is the patch. It is tested and verified to fix the problem by 
two people. I also verified that it passes the test suite. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 08:44:42 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 00:44:42 -0800
Subject: [Patches] [ python-Patches-612627 ] Allow more Unicode on sys.stdout
Message-ID: <E18ypTe-00065y-00@sc8-sf-web1.sourceforge.net>

Patches item #612627, was opened at 2002-09-21 22:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=612627&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Allow more Unicode on sys.stdout

Initial Comment:
This patch extends the set of Unicode strings that can
be printed to sys.stdout, to support all strings that
the terminal will likely support. It also adds an
encoding attribute to sys.std{in,out}.

To do that:
- it adds a .encoding attribute to all file objects,
which is normally None
- initializes the encoding of sys.stdin and sys.stdout
if either is a terminal.
- adds a wrapper object around sys.stdout in site.py
that encodes all Unicode objects according to the
detected encoding, if that encoding is known to Python

To find the encoding of the terminal, it
- uses GetConsoleCP and GetConsoleOutputCP on Windows,
- uses nl_langinfo(CODESET) on Unix, if available.

The primary rationale for this change is that people
should be able to print Unicode in an interactive
session. A parallel change needs to be added for IDLE,
so that it adds the .encoding attribute to the emulated
stdout (it already supports printing of Unicode on stdout).
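
A minimal sketch of the idea of a per-file .encoding attribute that defaults to None. It assumes a modern Python 3 interpreter and is not the patch's C implementation. The helper name encode_for_stream and the ASCII fallback with "replace" error handling are illustrative assumptions.

```python
import sys


def encode_for_stream(text, stream):
    """Encode `text` using the stream's .encoding, if it has one."""
    # Streams without a usable .encoding fall back to ASCII (an assumption
    # made for this sketch, not something the patch specifies).
    encoding = getattr(stream, "encoding", None) or "ascii"
    return text.encode(encoding, errors="replace")


if __name__ == "__main__":
    print(sys.stdout.encoding)
    print(encode_for_stream("caf\u00e9", sys.stdout))
```

Characters the target encoding cannot represent are replaced rather than raising an error, which keeps interactive printing from failing.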

----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-28 09:44

Message:
Logged In: YES 
user_id=38388

Looks ok except for the direct hacking
of f_encoding in the sys module. Please add
either a macro or a new API to make changing
the encoding from C possible without tapping
directly into the implementation.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 12:59

Message:
Logged In: YES 
user_id=21627

Is the patch now acceptable?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-10-26 19:47

Message:
Logged In: YES 
user_id=21627

I've attached a revised version which implements your
proposal; this version works without modification of site.py.

In its current form, the file encoding is only applied in
print; for sys.stdout.write, it is ignored. For print, it is
applied independent of whether this is a script or
interactive mode.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2002-10-25 14:09

Message:
Logged In: YES 
user_id=38388

I think it could work by adding a special case to 
PyFile_WriteObject() instead of calling PyObject_Print().
You first encode the Unicode object and then let
PyFile_WriteString() take care of the writing to the
FILE* object.

I see no other way, since you can't place the .encoding 
information into the FILE* object.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-09-24 11:02

Message:
Logged In: YES 
user_id=21627

I have considered implementing it in the file object.
However, it becomes quite involved, and heavy C code:
PyFile_WriteObject calls PyObject_Print. Since Unicode does
not implement a tp_print, this calls str/repr, which
converts using the default encoding.

It is not clear at which point the file encoding should be
taken into account.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-09-24 10:10

Message:
Logged In: NO 

I like the .encoding concept. 

I don't really like the sys.stdout wrapper. Wouldn't it be 
better to add the functionality to the file object .write() and 
.writelines() methods and then only use the wrapper in case 
sys.stdout is not a true file object ?


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=612627&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 15:34:14 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 07:34:14 -0800
Subject: [Patches] [ python-Patches-709743 ] os.setpgrp function failed to build
Message-ID: <E18yvry-0004kI-00@sc8-sf-web2.sourceforge.net>

Patches item #709743, was opened at 2003-03-25 16:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470

Category: Build
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Gary H. Loechelt (loechelt)
Assigned to: Martin v. Löwis (loewis)
Summary: os.setpgrp function failed to build

Initial Comment:
The os.setpgrp function failed to build on HP-UX
B.10.20 for Python 2.3a2.  Comparing the build with
Python 2.2.1, I noticed a missing line in the
pyconfig.h.in file.  I added the appropriate line to
the file and rebuilt the executable.  Note that I did
NOT check the configure script to ensure that the
appropriate compiler macro (HAVE_SETPGRP) was set.  I
just manually set the macro in the pyconfig.h file
directly.  The person who has responsibility for
configure should probably check it too, to make sure
that it is not broken as well.

----------------------------------------------------------------------

>Comment By: Gary H. Loechelt (loechelt)
Date: 2003-03-28 08:34

Message:
Logged In: YES 
user_id=142817

The build failed because the HAVE_SETPGRP compiler macro was
never set.  Consequently, the code for the os.setpgrp
function (posix_setpgrp) in the posixmodule.c file never
compiled.  Even though the rest of the posixmodule.c
compiled, the os.setpgrp function was not available in the
os module.  Once I manually set the HAVE_SETPGRP compiler
macro in the pyconfig.h header file and rebuilt
posixmodule.c, everything worked and I was able to call the
os.setpgrp function.  I began to track down why the
HAVE_SETPGRP compiler macro never got set during my
configuration.  Realizing that pyconfig.h is generated from
pyconfig.h.in, I checked to see if HAVE_SETPGRP was even in
pyconfig.h.in to start with.  It was not.  I compared
pyconfig.h.in in python version 2.3a2 with version 2.2.1 and
confirmed that HAVE_SETPGRP is indeed missing from
pyconfig.h.in.  Consequently, it never gets passed on to
pyconfig.h during configuration, and posix_setpgrp never
gets compiled in posixmodule.c because the macro is never
defined.  That was why I could not import the setpgrp
function from the os module in my build of python 2.3a2,
even though the rest of the os module was fine.
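
A minimal sketch of how the symptom described above can be checked from a running interpreter. It assumes a modern Python with the sysconfig module, which did not exist in 2.3. The helper name setpgrp_status is hypothetical, and the HAVE_SETPGRP lookup may return None on platforms where the build configuration is not exposed.

```python
import os
import sysconfig


def setpgrp_status():
    """Report whether os.setpgrp exists and what the build config recorded."""
    return {
        # True only if posixmodule.c was compiled with HAVE_SETPGRP defined.
        "os_has_setpgrp": hasattr(os, "setpgrp"),
        # Value recorded at configure time, or None if unavailable.
        "HAVE_SETPGRP": sysconfig.get_config_var("HAVE_SETPGRP"),
    }


if __name__ == "__main__":
    print(setpgrp_status())
```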

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 17:01

Message:
Logged In: YES 
user_id=21627

Can you please report precisely how it fails?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 17:12:49 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 09:12:49 -0800
Subject: [Patches] [ python-Patches-711448 ] Warn about inter-module assignments shadowing builtins
Message-ID: <E18yxPN-0003qM-00@sc8-sf-web3.sourceforge.net>

Patches item #711448, was opened at 2003-03-28 17:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711448&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Nobody/Anonymous (nobody)
Summary: Warn about inter-module assignments shadowing builtins

Initial Comment:
The attached patch modifies module tp_setattro to warn about
code that adds a name to the globals of another module that
shadows a builtin.  Unfortunately, there are other ways to
modify module globals (e.g. using vars() and mutating the
dictionary).

There are a few issues with module objects that I'm not clear
about.  For example, do modules always have a md_dict that
is a PyDictObject?
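
A rough pure-Python analogue of the idea, for illustration only. The attached patch changes the C-level tp_setattro of module objects; this sketch instead uses a hypothetical ModuleType subclass named WarningModule, assumes a modern Python 3 interpreter, and warns on any attribute assignment rather than only on assignments made from another module.

```python
import builtins
import types
import warnings


class WarningModule(types.ModuleType):
    """Module type that warns when a new global shadows a builtin."""

    def __setattr__(self, name, value):
        # Skip dunder names such as __name__, which builtins also defines.
        shadows = hasattr(builtins, name) and not name.startswith("__")
        if shadows and name not in self.__dict__:
            warnings.warn(
                "assignment shadows builtin %r" % name,
                RuntimeWarning,
                stacklevel=2,
            )
        super().__setattr__(name, value)


if __name__ == "__main__":
    mod = WarningModule("demo")
    mod.helper = 1              # no warning
    mod.len = lambda obj: 0     # RuntimeWarning: shadows builtin 'len'
    vars(mod)["open"] = None    # bypasses __setattr__, so no warning
```

The last line shows the limitation noted above: mutating the dictionary returned by vars() never reaches the attribute hook.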

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711448&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 17:15:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 09:15:11 -0800
Subject: [Patches] [ python-Patches-711448 ] Warn about inter-module assignments shadowing builtins
Message-ID: <E18yxRf-0003zn-00@sc8-sf-web3.sourceforge.net>

Patches item #711448, was opened at 2003-03-28 17:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711448&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neil Schemenauer (nascheme)
Assigned to: Nobody/Anonymous (nobody)
Summary: Warn about inter-module assignments shadowing builtins

Initial Comment:
The attached patch modifies module tp_setattro to warn about
code that adds a name to the globals of another module that
shadows a builtin.  Unfortunately, there are other ways to
modify module globals (e.g. using vars() and mutating the
dictionary).

There are a few issues with module objects that I'm not clear
about.  For example, do modules always have a md_dict that
is a PyDictObject?

----------------------------------------------------------------------

>Comment By: Neil Schemenauer (nascheme)
Date: 2003-03-28 17:15

Message:
Logged In: YES 
user_id=35752

Attaching patch.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711448&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 18:41:25 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 10:41:25 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E18yyn7-0007k7-00@sc8-sf-web3.sourceforge.net>

Patches item #709178, was opened at 2003-03-24 17:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

>Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 09:41

Message:
Logged In: YES 
user_id=86216

Note that I only have minimal experience building
Win32 extension modules...

This patch works "fine" with my *very* limited testing.
Specifically, I successfully rebuilt the Win32 readline
module with it applied.

BTW, this area of Distutils probably should be revisited
to bring it up to date. For example, the "-mdll --entry
_DllMain@12" options could be replaced by "-shared".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 15:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 18:52:10 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 10:52:10 -0800
Subject: [Patches] [ python-Patches-709743 ] os.setpgrp function failed to build
Message-ID: <E18yyxW-00056I-00@sc8-sf-web2.sourceforge.net>

Patches item #709743, was opened at 2003-03-26 00:15
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470

Category: Build
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Gary H. Loechelt (loechelt)
Assigned to: Martin v. Löwis (loewis)
Summary: os.setpgrp function failed to build

Initial Comment:
The os.setpgrp function failed to build on HP-UX
B.10.20 for Python 2.3a2.  Comparing the build with
Python 2.2.1, I noticed a missing line in the
pyconfig.h.in file.  I added the appropriate line to
the file and rebuilt the executable.  Note that I did
NOT check the configure script to ensure that the
appropriate compiler macro (HAVE_SETPGRP) was set.  I
just manually set the macro in the pyconfig.h file
directly.  The person who has responsibility for
configure should probably check it too, to make sure
that it is not broken as well.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 19:52

Message:
Logged In: YES 
user_id=21627

I see. Thanks for the report; this is now fixed in

configure 1.385
configure.in 1.396
pyconfig.h.in 1.75

(notice that configure.in is the only file to change here;
both configure and pyconfig.h.in are generated).

----------------------------------------------------------------------

Comment By: Gary H. Loechelt (loechelt)
Date: 2003-03-28 16:34

Message:
Logged In: YES 
user_id=142817

The build failed because the HAVE_SETPGRP compiler macro was
never set.  Consequently, the code for the os.setpgrp
function (posix_setpgrp) in the posixmodule.c file never
compiled.  Even though the rest of the posixmodule.c
compiled, the os.setpgrp function was not available in the
os module.  Once I manually set the HAVE_SETPGRP compiler
macro in the pyconfig.h header file and rebuilt
posixmodule.c, everything worked and I was able to call the
os.setpgrp function.  I began to track down why the
HAVE_SETPGRP compiler macro never got set during my
configuration.  Realizing that pyconfig.h is generated from
pyconfig.h.in, I checked to see if HAVE_SETPGRP was even in
pyconfig.h.in to start with.  It was not.  I compared
pyconfig.h.in in python version 2.3a2 with version 2.2.1 and
confirmed that HAVE_SETPGRP is indeed missing from
pyconfig.h.in.  Consequently, it never gets passed on to
pyconfig.h during configuration, and posix_setpgrp never
gets compiled in posixmodule.c because the macro is never
defined.  That was why I could not import the setpgrp
function from the os module in my build of python 2.3a2,
even though the rest of the os module was fine.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:01

Message:
Logged In: YES 
user_id=21627

Can you please report precisely how it fails?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709743&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 20:56:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 12:56:48 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E18z0u8-0001Lb-00@sc8-sf-web2.sourceforge.net>

Patches item #709178, was opened at 2003-03-24 21:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

>Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 15:56

Message:
Logged In: YES 
user_id=87160

The -mdll --entry DllMain@12  option is guarded for an old
version of gcc that did not have the correct specs to accept
-shared.
I didn't touch it, even though it's crazy if anyone is using
such an old and buggy toolchain.

--shared and --dll are equivalent as far as ld is concerned. 

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 13:41

Message:
Logged In: YES 
user_id=86216

Note that I only have minimal experience building
Win32 extension modules...

This patch works "fine" with my *very* limited testing.
Specifically, I successfully rebuilt the Win32 readline
module with it applied.

BTW, this area of Distutils probably should be revisited
to bring it up to date. For example, the "-mdll --entry
_DllMain@12" options could be replaced by "-shared".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 19:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 21:16:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 13:16:18 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E18z1D0-0005vW-00@sc8-sf-web3.sourceforge.net>

Patches item #709178, was opened at 2003-03-24 17:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

>Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 12:16

Message:
Logged In: YES 
user_id=86216

John, would you be willing to help test or supply me with
test cases? I have built exactly one Win32 extension.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 11:56

Message:
Logged In: YES 
user_id=87160

The -mdll --entry DllMain@12  option is guarded for an old
version of gcc that did not have the correct specs to accept
-shared.
I didn't touch it, even though it's crazy if anyone is using
such an old and buggy toolchain.

--shared and --dll are equivalent as far as ld is concerned. 

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 09:41

Message:
Logged In: YES 
user_id=86216

Note that I only have minimal experience building
Win32 extension modules...

This patch works "fine" with my *very* limited testing.
Specifically, I successfully rebuilt the Win32 readline
module with it applied.

BTW, this area of Distutils probably should be revisited
to bring it up to date. For example, the "-mdll --entry
_DllMain@12" options could be replaced by "-shared".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 15:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 22:15:34 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 14:15:34 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E18z28M-0002Rc-00@sc8-sf-web4.sourceforge.net>

Patches item #709178, was opened at 2003-03-25 03:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 23:15

Message:
Logged In: YES 
user_id=21627

I'm in favour of applying this patch, and also of patches
that mandate recent Cygwin releases; if such patches are
implemented, the minimum required Cygwin version should be
stated somewhere.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 22:16

Message:
Logged In: YES 
user_id=86216

John, would you be willing to help test or supply me with
test cases? I have built exactly one Win32 extension.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 21:56

Message:
Logged In: YES 
user_id=87160

The -mdll --entry DllMain@12  option is guarded for an old
version of gcc that did not have the correct specs to accept
-shared.
I didn't touch it, even though it's crazy if anyone is using
such an old and buggy toolchain.

--shared and --dll are equivalent as far as ld is concerned. 

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 19:41

Message:
Logged In: YES 
user_id=86216

Note that I only have minimal experience building
Win32 extension modules...

This patch works "fine" with my *very* limited testing.
Specifically, I successfully rebuilt the Win32 readline
module with it applied.

BTW, this area of Distutils probably should be revisited
to bring it up to date. For example, the "-mdll --entry
_DllMain@12" options could be replaced by "-shared".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 23:24:19 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 15:24:19 -0800
Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass
Message-ID: <E18z3Ct-0001oH-00@sc8-sf-web1.sourceforge.net>

Patches item #536883, was opened at 2002-03-29 20:52
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
>Assigned to: Martin v. Löwis (loewis)
Summary: SimpleXMLRPCServer auto-docing subclass

Initial Comment:
This SimpleXMLRPCServer subclass automatically serves 
HTML documentation, generated using pydoc, in response 
to an HTTP GET request (XML-RPC always uses POST).

Here are some examples:
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py


----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-02-10 21:25

Message:
Logged In: YES 
user_id=108973

Patch 473586 has been accepted so this patch can be 
accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 21:26

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-04 19:31

Message:
Logged In: YES 
user_id=6380

Looks cute to me. Fredrik, any problem if I just check this
in?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 23:25:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 15:25:54 -0800
Subject: [Patches] [ python-Patches-532180 ] fix xmlrpclib float marshalling bug
Message-ID: <E18z3EQ-0001re-00@sc8-sf-web1.sourceforge.net>

Patches item #532180, was opened at 2002-03-19 23:28
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=532180&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Fredrik Lundh (effbot)
Summary: fix xmlrpclib float marshalling bug

Initial Comment:
As it stands now, xmlrpclib can send doubles, such as 
1.#INF, that are not part of the XML-RPC standard. 
This patch causes a ValueError to be raised instead.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 00:25

Message:
Logged In: YES 
user_id=21627

I'll conclude that it is a lot of tedious work for no
reason, and close this patch.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-21 00:55

Message:
Logged In: YES 
user_id=31435

Python's internal format buffers are too small to use C %f 
in its full generality, so you're suggesting something 
there that's much harder to get done than you suspect.  
Note that %f isn't a cureall anyway, as in either Python or 
C, e.g., '%f' % 1e-10 throws away all information, 
producing a string of zeroes.  What you did is usually much 
better than that.

Let's wait to hear what /F wants to do.  If he's inclined 
to take this part of the spec at face value, I can work 
with him to write a "conforming" float->string that's 
numerically sound.  Else it's a lot of tedious work for no 
reason.
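
The information loss mentioned above is easy to reproduce. The following is a
minimal sketch, not code from the patch; `roundtrips` is a hypothetical helper
name. It compares the fixed six-decimal output of '%f' with repr() for the
same value.

```python
def roundtrips(text, original):
    """True if parsing the text gives back exactly the original float."""
    return float(text) == original


x = 1e-10
fixed = '%f' % x     # fixed-point with six decimals
shortest = repr(x)   # string that reads back to the same float

print(fixed, roundtrips(fixed, x))        # 0.000000 False
print(shortest, roundtrips(shortest, x))  # 1e-10 True
```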

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-21 00:24

Message:
Logged In: YES 
user_id=108973

OK, this floating point stuff is over my head.

Is it OK that it loses accuracy?  
- No
Is it OK that it produces 16 trailing zeroes for 1e-250?
- Yes
Is it OK that it raises OverflowError for the normal double 
1e-300?  
- No

Would exposing and using the C %f specifier, along with 
repr, make for identical roundtrips?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 23:53

Message:
Logged In: YES 
user_id=31435

I don't use XML-RPC, so I'm assigning this to /F (it was 
his code at the start, and he wants to keep it in synch 
with his company's version).

Formatting floats is a difficult job if you pay attention 
to accuracy.  The original code had the property that 
converting a Python float to an XML-RPC string, then back 
to a float again, reproduced the original input exactly.  
The code in the patch enjoys that property only by 
accident; much of the time a roundtrip conversion using it 
won't reproduce the number that was passed in.  Is that 
OK?  There's no way to tell, since the XML-RPC spec has 
scant idea what it's doing here, so leaves important 
questions unanswered.  OTOH, it seems to me that the 
*point* of this protocol is to transport values across 
boxes, so of course it should move heaven and earth to 
transport them faithfully.

Is it OK that it loses accuracy?  Is it OK that it produces 
16 trailing zeroes for 1e-250?  Is it OK that it raises 
OverflowError for the normal double 1e-300?  No matter 
what's asked, the spec has no answers.
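
One way to get exponent-free output that still round-trips is sketched below.
It is not the approach taken in the patch, and it relies on the decimal
module, which was not part of Python when this thread was written;
`to_plain_decimal` is a hypothetical name. The idea is to take the repr()
string, which reads back to the same float, and merely rewrite it without an
exponent.

```python
import math
from decimal import Decimal


def to_plain_decimal(value):
    """Format a finite float without an exponent, preserving round trips."""
    if not math.isfinite(value):
        # Infinities and NaN have no plain decimal representation.
        raise ValueError("cannot represent %r as a decimal string" % (value,))
    # Decimal(repr(value)) holds exactly the digits repr() printed;
    # the 'f' format only moves the decimal point, it does not round.
    return format(Decimal(repr(value)), 'f')


for x in (1.5, 1e-10, 1e20, 1e300):
    text = to_plain_decimal(x)
    print(len(text), float(text) == x)
```

Large values such as 1e300 produce strings of several hundred characters,
and whole numbers given in exponent form (1e20) come out without a decimal
point, so a strict reading of the spec would need one appended.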

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-20 21:48

Message:
Logged In: YES 
user_id=108973

Ooops, I already wrote the converter (see new patch). I'm 
not very concerned about sending 300 character strings for 
large doubles, but I guess someone might be. I am concerned 
about how large and ugly the code is.

XML-RPC is very poorly specified but the grammar for 
doubles seems reasonably clear (silly, but clear).

If you don't like my double marshalling code, could you 
please just check in your infinity/NaN detection code (also 
part of my patch)?


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 21:13

Message:
Logged In: YES 
user_id=31435

If you think XML-RPC users are keen to see multi-hundred 
character strings produced for ordinary doubles, Python 
isn't going to be much help (you'll have to write your own 
float -> string conversion); or if you think they're happy 
to get an exception if they want to pass (e.g.) 1e20, you 
can keep using repr() and complain because repr(1e20) 
produces an exponent.

"decimal format" is simply two extremely common words 
pasted together <+.9 wink>.  I expect the Python docs here 
ended up so vague because whoever wrote this part of the 
docs didn't know the full story and didn't have time to 
figure it out.

But I expect the same is true of the part of this spec 
dealing with doubles (it doesn't define what it means 
by "double-precision", and then goes on to say stuff that 
doesn't make sense for what C or Java mean by double, or by 
what IEEE-754 means by double precision -- it's off in its 
own world, so if you take it at face value you'll have to 
guess what the world is, and implement it yourself).

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-20 20:32

Message:
Logged In: YES 
user_id=108973

I think that we should be flexible about the data that we 
accept but rigorous about the data that we generate. So the 
sign should always be sent but not required. 

"decimal format" appears in the Python documentation 
(http://www.python.org/doc/current/lib/typesseq-strings.html) 
so it is probably a documentation bug if the meaning is not 
widely known.

I parsed it as "not exponential format".

My question was whether the %f Python format specifier 
simply mapped to the C %f format specifier. But, based on 
the output of a simple C program, that does not appear to 
be the case.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 20:04

Message:
Logged In: YES 
user_id=31435

Well, Brian, the spec clearly disallows 1.0 too -- if you 
want to take that spec seriously, you can implement what it 
says and we'll redirect the complaints to your personal 
email account <wink>.

I can't parse your question about the C library (like, I 
don't know what you mean by "decimal format").


----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-20 19:57

Message:
Logged In: YES 
user_id=108973

Whether it was intended or not, the spec clearly disallows 
it. 

I noticed the %f behavior too, which is interesting because 
the Python docs say: 
f Floating point decimal format

I wonder if it is the underlying C library refusing to 
write large float values in decimal format.


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 19:08

Message:
Logged In: YES 
user_id=31435

Ack, I take part of that back:  it's Python's 
implementation of '%f' that can produce exponent notation.  
There's no simple way to get the effect of C's %f from 
Python.  It's clear as mud whether "the spec" *intended* to 
outlaw exponent notation.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 18:53

Message:
Logged In: YES 
user_id=31435

"%f" can produce exponent notation too, which is also not 
allowed by this pseudo-spec.

r = repr(some_double)
if 'n' in r or 'N' in r:
    raise ValueError(...)

is robust, will work fine x-platform, and isn't insane 
<wink>.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-20 18:31

Message:
Logged In: YES 
user_id=108973

Eric Kidd's XML-RPC C uses sprintf("%f") for marshalling 
and strtod for unmarshalling.

Let me design a more robust patch. 

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 17:23

Message:
Logged In: YES 
user_id=31435

The spec appears worse than useless to me here -- whoever 
wrote it just made stuff up.  They don't appear to know 
anything about floats or about grammar specification.  Do 
you really want to allow "+." and disallow "1.0"?  This 
seems a case where the spec is so braindead that nobody (in 
their mind <wink>) will implement it as given.  What do 
other implementations do?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-20 17:03

Message:
Logged In: YES 
user_id=21627

You are right. An even better patch would check for
compliance with the protocol. Currently, the xmlrpc spec says

#  There is no representation for infinity or negative 
# infinity or "not a number". At this time, only decimal
# point notation is allowed, a plus or a minus, followed by
# any number of numeric characters, followed by a period 
# and any number of numeric characters. Whitespace is not 
# allowed. The range of allowable values is 
# implementation-dependent, is not specified.

That would be best validated with a regular expression.
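
A minimal sketch of such a check follows. Both patterns are assumptions about
how to read the quoted paragraph, not part of the spec or of xmlrpclib, and
the helper names are hypothetical. The strict pattern follows the wording
literally (mandatory sign, "any number" of digits taken to include zero); the
lenient one makes the sign optional and requires at least one digit.

```python
import re

# Literal reading: sign, digits, period, digits.
STRICT_DOUBLE = re.compile(r"[+-][0-9]*\.[0-9]*")

# Lenient reading (an assumption): optional sign, at least one digit,
# optional fractional part, still no exponent and no whitespace.
LENIENT_DOUBLE = re.compile(r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)")


def is_strict_double(text):
    return STRICT_DOUBLE.fullmatch(text) is not None


def is_lenient_double(text):
    return LENIENT_DOUBLE.fullmatch(text) is not None


for sample in ("+1.5", "1.0", "+.", "1e+20", "inf"):
    print(sample, is_strict_double(sample), is_lenient_double(sample))
```

Under the literal reading "+." is accepted and "1.0" is rejected, which shows
how much depends on the interpretation chosen.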


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 16:02

Message:
Logged In: YES 
user_id=31435

Note that the patch only catches "the problem" on a 
platform whose C library can't read back its own float 
output.  Windows is in that class, but many other platforms 
aren't.

It would be better to see whether 'n' or 'N' appear in the 
repr() (that would catch variations of 'inf', 'INF', 'NaN' 
and 'IND', while no "normal" float contains n).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-20 08:28

Message:
Logged In: YES 
user_id=21627

It seems repr of the float is computed twice in every case.
I recommend saving the result of the first computation.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=532180&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 23:27:04 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 15:27:04 -0800
Subject: [Patches] [ python-Patches-545300 ] sgmllib support for additional tag forms
Message-ID: <E18z3FY-0001wH-00@sc8-sf-web1.sourceforge.net>

Patches item #545300, was opened at 2002-04-17 20:16
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Steven F. Lott (slott56)
>Assigned to: Martin v. Löwis (loewis)
Summary: sgmllib support for additional tag forms

Initial Comment:
MS-word generated HTML includes declaration 
tags of the form: 
<![if !supportEmptyParas]>&nbsp;<![endif]>
scattered throughout the body of an HTML 
document.

The current sgmllib parse_declaration routine 
rejects these as invalid syntax, where browsers 
tolerate these embedded declarations.

This patch accepts these declaration forms.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-11-22 10:23

Message:
Logged In: YES 
user_id=21627

I now recommend approving this patch. It improves SGML
correctness, and, while supporting an MS extension,
explicitly points out that it is doing so.

----------------------------------------------------------------------

Comment By: Steven F. Lott (slott56)
Date: 2002-04-22 20:50

Message:
Logged In: YES 
user_id=328067

My suggestion for handling this MS extension syntax is 
to (1) tolerate the extension without an error, (2) treat it 
as an SGML marked section, using the 
unknown_decl() call-back.  Since this is a separate 
function, subclasses can override to alter this behavior.  

The content hidden in these MS-specific marked 
sections appears to always be a &nbsp;.  While it might 
be expedient to completely skip over this junk, it makes it 
difficult to handle marked sections in a future version of 
markupbase.

Attached is a revised patch against V1.39 of sgmllib.py 
and 1.4 of markupbase.py

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-04-21 17:11

Message:
Logged In: YES 
user_id=3066

This is the same as bug #505747.

These "tags" are not legal HTML in any form, but are some
Microsoft invention.  It's not entirely clear what the right
thing to do is, but it is clear that we need to deal with
these in some different way.

Changed group to indicate that such changes can only go into
the trunk; feature changes in maintenance versions are not
allowed.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-18 19:23

Message:
Logged In: YES 
user_id=21627

That patch looks wrong: You are changing what a tag is,
removing the underscore, however, underscores are allowed in
tag names.

Also, could you please generate the patch against the CVS
version of the code? Your patch doesn't apply for the
current code, which has changed significantly compared to
the version you appear to be using.

There is no way that this can go into 2.1 IMO.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 23:31:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 15:31:11 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E18z3JX-000571-00@sc8-sf-web4.sourceforge.net>

Patches item #709178, was opened at 2003-03-24 21:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

>Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 18:31

Message:
Logged In: YES 
user_id=87160

I can help with testing. I have access to W2K and Win98
(ugh) boxen. I don't mind installing a few older toolchains
if you think that's necessary.

I think any C/C++ Python extension that uses plain distutils
(no fancy hacks added on) and has one or more DLL dependencies
is a good test case.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 17:15

Message:
Logged In: YES 
user_id=21627

I'm in favour of applying this patch, and also of patches
that mandate recent Cygwin releases; if such patches are
implemented, the minimum required Cygwin version should be
stated somewhere.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 16:16

Message:
Logged In: YES 
user_id=86216

John, would you be willing to help test or supply me with
test cases? I have built exactly one Win32 extension.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 15:56

Message:
Logged In: YES 
user_id=87160

The -mdll --entry DllMain@12  option is guarded for an old
version of gcc that did not have the correct specs to accept
-shared.
I didn't touch it, even though it's crazy if anyone is using
such an old and buggy toolchain.

--shared and --dll are equivalent as far as ld is concerned. 

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 13:41

Message:
Logged In: YES 
user_id=86216

Note that I only have minimal experience building
Win32 extension modules...

This patch works "fine" with my *very* limited testing.
Specifically, I successfully rebuilt the Win32 readline
module with it applied.

BTW, this area of Distutils probably should be revisited
to bring it up to date. For example, the "-mdll --entry
_DllMain@12" options could be replaced by "-shared".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 19:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 23:31:57 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 15:31:57 -0800
Subject: [Patches] [ python-Patches-554807 ] Add _winreg support for Cygwin
Message-ID: <E18z3KH-00028n-00@sc8-sf-web1.sourceforge.net>

Patches item #554807, was opened at 2002-05-11 14:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554807&group_id=5470

Category: Windows
Group: Python 2.3
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Gerald S. Williams (gsw_agere)
Assigned to: Mark Hammond (mhammond)
Summary: Add _winreg support for Cygwin

Initial Comment:
This adds _winreg support to Cygwin Python without 
dependencies on other Windows modules.

For platforms in which MS_WINDOWS isn't defined, this 
reports the OSError exception instead of WindowsError. 
It also uses the non-MBCS versions of registry access 
in this case.

Some minor changes to _winreg.c were made to clean up 
compiler warnings from GCC.

setup.py was changed to create a dynamic _winreg 
module under cygwin. There are also some earlier 
changes in the patch file to skip the import test (due 
to Cygwin fork issues), and to require libintl when 
building _locale under Cygwin.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 00:31

Message:
Logged In: YES 
user_id=21627

I'm rejecting that patch, since no updates are happening. If
somebody wants to deal with _winreg support for Cygwin
again, please submit a new patch.

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2002-09-18 04:57

Message:
Logged In: YES 
user_id=14198

I'll take this on.  I have a number of other patches and
bugs to look at, so if someone wants to beat me to it, be my
guest.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-09-16 19:57

Message:
Logged In: YES 
user_id=21627

If a convincing patch comes along, I'd happily apply it.
Supporting _winreg is still reasonable even if
/proc/registry exists, for compatibility with other Win32 ports.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-09-16 19:19

Message:
Logged In: NO 

I'm prepared to try to help if there's still energy here, and there
are specific things to do.

However I agree that _if_ the cygwin /proc/registry story is
going to become writeable, then there's not much point.

----------------------------------------------------------------------

Comment By: Gerald S. Williams (gsw_agere)
Date: 2002-07-30 16:04

Message:
Logged In: YES 
user_id=329402

I plan to get back to this eventually, although I held off for three 
reasons:
 - Cygwin is incorporating a registry file system that may be a 
better way to implement this
 - saw some posts about possible Unicode changes
 - Real Life (job priorities, vacation)

I probably won't get back to this until the middle of August.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-07-28 12:02

Message:
Logged In: YES 
user_id=21627

Is any kind of tweaking forthcoming?

----------------------------------------------------------------------

Comment By: Gerald S. Williams (gsw_agere)
Date: 2002-05-15 15:30

Message:
Logged In: YES 
user_id=329402

It sounds like the patches need some tweaking (my testing 
had passed but was certainly limited).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-05-15 14:57

Message:
Logged In: YES 
user_id=21627

Yes, but you are wrong in assuming that the *A functions expect
Latin-1. Instead, they expect char* encoded as CP_ACP, which
is known as "mbcs" in Python.

The *W functions do *not* expect multi-byte strings, but
Unicode strings.

Notice that _winreg also calls the *A functions, even in
MSVC builds.

So I think converting Unicode to Latin-1 is definitely
incorrect.
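
The difference can be shown without touching the registry. The sketch below
assumes, purely for illustration, that the ANSI code page (CP_ACP) is cp1252;
the real value depends on the system locale, and Python's "mbcs" codec is only
available on Windows, so cp1252 stands in for it here.

```python
name = "caf\u00e9 \u20ac"  # e-acute and the euro sign

# What an *A function would expect if CP_ACP were cp1252 (assumption).
ansi_bytes = name.encode("cp1252")
print(ansi_bytes)  # b'caf\xe9 \x80'

# Latin-1 cannot encode the euro sign at all.
try:
    name.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)  # False

# Decoding ANSI bytes as Latin-1 silently yields a different string.
print(ansi_bytes.decode("latin-1") == name)  # False
print(ansi_bytes.decode("cp1252") == name)   # True
```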

----------------------------------------------------------------------

Comment By: Gerald S. Williams (gsw_agere)
Date: 2002-05-15 14:48

Message:
Logged In: YES 
user_id=329402

Windows supplies two versions of the relevant functions. 
The Cygwin version (at least as built) uses the ANSI 
versions, as indicated by the A at the end of the symbol 
names:
  $ nm _winreg.o | grep RegQueryValue
           U _RegQueryValueA@16
           U _RegQueryValueExA@24

As opposed to the "Windows Unicode/wide-char" functions, 
which end in W and require MBCS functions to decode.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-05-15 00:23

Message:
Logged In: YES 
user_id=21627

Can you please explain why not using MBCS is the right thing?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=554807&group_id=5470


From noreply@sourceforge.net  Fri Mar 28 23:34:26 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 15:34:26 -0800
Subject: [Patches] [ python-Patches-590682 ] New codecs: html, asciihtml
Message-ID: <E18z3Mg-0002F4-00@sc8-sf-web1.sourceforge.net>

Patches item #590682, was opened at 2002-08-04 06:58
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470

Category: None
Group: None
>Status: Closed
>Resolution: Rejected
Priority: 3
Submitted By: Oren Tirosh (orenti)
Assigned to: M.-A. Lemburg (lemburg)
Summary: New codecs: html, asciihtml

Initial Comment:
These codecs translate HTML character &entity; 
references.

The html codec may be applied after other codecs such 
as utf-8 or iso8859_X and preserves their encoding.  The 
asciihtml encoder produces 7-bit ascii and its output is 
therefore safe for insertion into almost any document 
regardless of its encoding.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 00:34

Message:
Logged In: YES 
user_id=21627

Apparently, this patch is not needed anymore, so I'm
rejecting it.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-12 11:11

Message:
Logged In: YES 
user_id=21627

Oren, is this patch still needed, as we now have the 
xmlcharrefreplace error handler?

----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-09 17:38

Message:
Logged In: YES 
user_id=562624

Case insensitivity fixed. General cleanup.  Codecs renamed to 
htmlescape and htmlescape8bit.  Improved error handling. 
Update unicode_test.


----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-05 14:11

Message:
Logged In: YES 
user_id=562624

Yes, entities are supposed to be case sensitive but I'm 
working with manually-generated html in which &GT; is not so 
uncommon...  I guess life is different in XML world.
Case-smashing loses the distinction between some entities. I 
guess I need a more intelligent solution.

> If you apply it to an 8-bit UTF-8 encoded string you'll get 
> garbage!

Actually, it works great. The html codec passes characters 
128-255 unmodified and therefore can be chained with other 
codecs.  But I now have a more elegant and high-performance 
approach than codec chaining. See my python-dev posting. 
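
The case-smashing problem mentioned above can be checked against the
standard entity table. This is a minimal sketch, assuming a current
Python 3 where the table lives in html.entities (the module discussed in
this thread was htmlentitydefs); case_collisions is a hypothetical helper
name, not part of the patch.

```python
from html.entities import name2codepoint


def case_collisions(table):
    """Group entity names that become identical once lowercased."""
    groups = {}
    for name in table:
        groups.setdefault(name.lower(), []).append(name)
    # Keep only the groups where lowercasing merges distinct entities.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


collisions = case_collisions(name2codepoint)
print(collisions["eacute"])  # ['Eacute', 'eacute']
print(chr(name2codepoint["Eacute"]), chr(name2codepoint["eacute"]))
print("GT" in name2codepoint)  # a plain dictionary lookup misses &GT;
```

Any case-insensitive lookup would need a rule for these colliding names,
for example trying the exact name first and folding case only on a miss.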

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2002-08-05 09:59

Message:
Logged In: YES 
user_id=38388

On the htmlentitydefs: yes, these are in use as they are 
defined now. If you want a mapping from and to Unicode,
I'd suggest providing this as a new table. About the cased
key in the entitydefs dict: AFAIK, these have to be cased since
entities are case-sensitive. Could be wrong though.

On PEP 293: this is going in the final round now. Your patch 
doesn't compete with it though, since PEP 293 is a much more 
general approach.

On the general idea: I think the codecs are misnamed. They
should be called htmlescape and asciihtmlescape since they don't
provide "real" HTML encoding/decoding as Martin already mentioned. 
There's something wrong with your approach, BTW: the codec
should only operate on Unicode (taking only Unicode input
and generating Unicode). If you apply it to an 8-bit
UTF-8 encoded string you'll get garbage!

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 17:54

Message:
Logged In: YES 
user_id=21627

I'm in favour of exposing this via a search function, for
generated codec names, on top of PEP 293 (I would not like
your codec to compete with the alternative mechanism). My
dislike for the current patch also comes from the fact that
it singles out ASCII, which the search function would not.

You could implement two forms: html.codecname and
xml.codecname. The html form would do HTML entity references
in both directions, and fall back to character references
only if necessary; the XML form would use character
references all the time, and entity references only for the
builtin entities.

And yes, I do recommend users to use codecs.charmap_encode
directly, as this is probably the most efficient, yet most
compact way to convert Unicode to a less-than-7-bit form.

In any case, I'd encourage you to contribute to the progress
of PEP 293 first - this has been an issue for several years
now, and I would be sorry if it failed.

While you are waiting for PEP 293 to complete, please do
consider cleaning up htmlentitydefs to provide mappings from
and to Unicode characters.

----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 17:07

Message:
Logged In: YES 
user_id=562624

>People may be tricked into believing that they can 
>decode arbitrary HTML with your codec - when your 
>codec would incorrectly deal with CDATA sections.

You don't even need to go as far as CDATA to see that tags 
must be parsed first and only then tag bodies and attribute 
values can be individually decoded. If you do it in the reverse 
order the tag parser will try to parse &lt; as a tag. It should be 
documented, though.

For encoding it's also obvious that encoding must be done 
first and then the encoded strings can be inserted into tags - 
< in strings is encoded into &lt; preventing it from being 
interpreted as a tag. This is a good thing! it prevents insertion 
attacks.
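
The ordering argument above (escape the string first, then insert it into
the markup) can be sketched as follows. This assumes Python 3's
html.escape as a stand-in for the proposed codec; render_comment is a
hypothetical helper used only for illustration.

```python
import html


def render_comment(user_text):
    """Escape untrusted text first, then place it inside the markup."""
    return "<p>" + html.escape(user_text) + "</p>"


# The angle brackets in the input can no longer open a tag.
print(render_comment('<script>alert("x")</script>'))
print(render_comment("a < b"))  # <p>a &lt; b</p>
```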

> You can easily enough arrange to get errors on <, >, 
> and &, by using codecs.charmap_encode with an 
> appropriate encoding map.

If you mean to use this as some internal implementation 
detail it's ok. Are you actually proposing that this is the way end 
users should use it?

How about this:

Install an encoder registry function that responds to any 
codec name matching "xmlcharref.SPAM" and does all the 
internal magic you describe to create a codec instance that 
combines xmlcharref translation including <,>,& and the 
SPAM encoding. This dynamically-generated codec will do 
both encoding and decoding and be cached, of course.
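
A rough sketch of such a registry function, written against Python 3's
codecs module rather than the 2.3 code base. The "xmlcharref." prefix
follows the proposal above; the function name and the use of
html.escape/html.unescape for the entity handling are assumptions made
for illustration, not the patch's implementation.

```python
import codecs
import html

PREFIX = "xmlcharref."  # codec-name prefix from the proposal above


def xmlcharref_search(name):
    """Return a CodecInfo for names like 'xmlcharref.ascii', else None."""
    if not name.startswith(PREFIX):
        return None
    try:
        inner = codecs.lookup(name[len(PREFIX):])
    except LookupError:
        return None

    def encode(text, errors="strict"):
        # Escape <, > and & first, then let the inner codec turn anything
        # it cannot represent into numeric character references.
        escaped = html.escape(text, quote=False)
        data, _ = inner.encode(escaped, "xmlcharrefreplace")
        return data, len(text)

    def decode(data, errors="strict"):
        text, _ = inner.decode(bytes(data), errors)
        return html.unescape(text), len(data)

    return codecs.CodecInfo(encode, decode, name=name)


codecs.register(xmlcharref_search)

encoded = "a<\u00e9".encode("xmlcharref.ascii")
decoded = encoded.decode("xmlcharref.ascii")
print(encoded, decoded)
```

The codecs module caches successful lookups itself, so the search
function does not need its own cache.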

"Namespaces are one honking great idea -- let's do more of 
those!"


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 13:50

Message:
Logged In: YES 
user_id=21627

You can easily enough arrange to get errors on <, >, and &,
by using codecs.charmap_encode with an appropriate encoding map.

In fact, with that, you can easily get all entity references
into the encoded data, without any need for an explicit
iteration.

However, I am concerned that you offer decoding as well.
People may be tricked into believing that they can decode
arbitrary HTML with your codec - when your codec would
incorrectly deal with CDATA sections.
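
The charmap_encode suggestion above can be illustrated with a small
sketch, assuming Python 3's codecs.charmap_encode. ENCODING_MAP and
encode_plain are hypothetical names; the map simply leaves out <, > and &
so that they count as unencodable.

```python
import codecs

# ASCII identity map with <, > and & deliberately left out.
ENCODING_MAP = {cp: cp for cp in range(128) if chr(cp) not in "<>&"}


def encode_plain(text):
    """Encode text, raising UnicodeEncodeError on <, >, & and non-ASCII."""
    data, _ = codecs.charmap_encode(text, "strict", ENCODING_MAP)
    return data


print(encode_plain("plain text"))
try:
    encode_plain("a < b")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

With the "strict" handler the unmapped characters raise; an error
callback of the kind PEP 293 proposes could instead supply entity
references at that point.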


----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 13:10

Message:
Logged In: YES 
user_id=562624

PEP 293 and patch #432401 are not a replacement for these 
codecs: this patch does decoding as well as encoding and also 
translates <, >, and &, which are valid in all encodings and 
therefore won't get translated by error callbacks.
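
A quick check of that point, assuming Python 3 string semantics: the
xmlcharrefreplace handler only fires for characters the target encoding
cannot represent, so <, > and & pass through unchanged.

```python
text = "caf\u00e9 < th\u00e9 & co"
encoded = text.encode("ascii", errors="xmlcharrefreplace")
print(encoded)  # b'caf&#233; < th&#233; & co'
```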


----------------------------------------------------------------------

Comment By: Oren Tirosh (orenti)
Date: 2002-08-04 13:00

Message:
Logged In: YES 
user_id=562624

Yes, the error callback approach handles strange mixes 
better than my method of chaining codecs. But it only does 
encoding - this patch also provides full decoding of named, 
decimal and hexadecimal character entity references.

Assuming PEP 293 is accepted, I'd like to see the asciihtml 
codec stay for its decoding ability and renamed to xmlcharref. 
The encoding part of this codec can just call .encode("ascii", 
errors="xmlcharrefreplace") to make it a full two-way codec.

I'd prefer htmlentitydefs.py to use unicode, too. It's not so 
useful the way it is.  Another problem is that it uses mixed 
case names as keys. The dictionary lookup is likely to miss 
incoming entities with arbitrary case so it's more-or-less 
broken. Does anyone actually use it the way it is? Can it be 
changed to use unicode without breaking anyone's code?
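
For comparison, the decoding side described above (named, decimal and
hexadecimal references) looks like this with Python 3's html.unescape.
That function is only a stand-in here; it is not the codec from this
patch.

```python
import html

samples = ["&lt;", "&eacute;", "&#233;", "&#xE9;"]
decoded = [html.unescape(s) for s in samples]
print(decoded)  # ['<', 'é', 'é', 'é']
```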


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-08-04 10:54

Message:
Logged In: YES 
user_id=21627

This patch is superseded by PEP 293 and patch #432401, which
allows you to write

unitext.encode("ascii", errors = "xmlcharrefreplace")

This probably should be left open until PEP 293 is
pronounced upon, and then either rejected or reviewed in detail.

I'd encourage a patch that uses Unicode in htmlentitydefs
directly, and computes entitydefs from that, instead of
vice-versa (or at least exposes a unicode_entitydefs, perhaps
even lazily) - perhaps also with a reverse mapping.
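
A sketch of what a Unicode-based table with a reverse mapping might look
like, assuming Python 3's html.entities.name2codepoint as the source
data. The name unicode_entitydefs comes from the suggestion above;
reverse_entitydefs is a hypothetical name for the reverse mapping.

```python
from html.entities import codepoint2name, name2codepoint

# Entity name -> Unicode character, computed from the code point table.
unicode_entitydefs = {name: chr(cp) for name, cp in name2codepoint.items()}

# Unicode character -> entity name (the reverse mapping).
reverse_entitydefs = {char: name for name, char in unicode_entitydefs.items()}

print(unicode_entitydefs["eacute"])      # é
print(reverse_entitydefs["\u00e9"])      # eacute
print(codepoint2name[0xE9])              # eacute
```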


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=590682&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 06:47:47 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Fri, 28 Mar 2003 22:47:47 -0800
Subject: [Patches] [ python-Patches-711722 ] Cache lookup of __builtins__
Message-ID: <E18zA83-0005hD-00@sc8-sf-web3.sourceforge.net>

Patches item #711722, was opened at 2003-03-29 01:47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cache lookup of __builtins__

Initial Comment:
Rather than performing a bytecode optimization of 
LOAD_GLOBAL, this patch takes an alternative approach of 
caching the lookup of builtins.

To be safe, it checks the cache only after trying a 
lookup in globals().  I can think of only one way to 
break this approach:  run the function accessing a 
builtin, then poke a new value into the builtins 
module, and then re-run the function:

def f(x):
    return oct(x)
print f(20)
__builtins__.oct = hex
print f(20)  # doesn't notice new def of oct()

This gives about a 2% speed-up to average programs, 
0% to programs that don't use builtins, and higher 
percentages to those with heavier use of builtins.  The 
speedup is limited by 1) having to still check globals 
and 2) the relative insignificance of builtin access time 
in most programs.  Still, it pretty much solves the 
problem of access time for builtins.
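
The same experiment can be written for a current Python 3 interpreter,
where the module is named builtins and print is a function. This sketch
only shows the uncached behaviour that the proposed cache would change;
it does not exercise the patch itself.

```python
import builtins


def f(x):
    return oct(x)


original_oct = builtins.oct
try:
    before = f(20)      # resolves oct through the builtins module
    builtins.oct = hex  # poke a new value into the builtins module
    after = f(20)       # an uncached lookup picks up hex() here
finally:
    builtins.oct = original_oct  # always restore the real builtin

print(before, after)  # 0o24 0x14
```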


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 10:04:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 02:04:11 -0800
Subject: [Patches] [ python-Patches-707701 ] fix for #698517, Tkinter and tk8.4.2
Message-ID: <E18zDC7-00044F-00@sc8-sf-web4.sourceforge.net>

Patches item #707701, was opened at 2003-03-21 20:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470

Category: Tkinter
Group: Python 2.2.x
>Status: Closed
>Resolution: Accepted
Priority: 7
Submitted By: Matthias Klose (doko)
Assigned to: Martin v. Löwis (loewis)
Summary: fix for #698517, Tkinter and tk8.4.2

Initial Comment:
[all Python versions that can be built with tk8.4.2]

Fixes the failing conversions in _substitute by using
try/except for each integer field that is not
supported by all events.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 11:04

Message:
Logged In: YES 
user_id=21627

Thanks for the patch. Applied as Tkinter.py 1.170 and
1.160.10.3.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 00:58

Message:
Logged In: YES 
user_id=21627

Well, no. The Tk change was made for a reason, and it is
unlikely that Tk people will back it out, so we should not
bypass this change.

If you want to get up and running, I recommend using Tcl 8.3.

----------------------------------------------------------------------

Comment By: Jeremy Moore (jmoore_calaway)
Date: 2003-03-28 00:31

Message:
Logged In: YES 
user_id=744000

(Apologies if this is the inappropriate place to ask)
I'm porting an app to Mac OS X 10.2 (begrudgingly) and ran straight into 
this bug. Nothing like changing versions of python (2.2.2 to 2.3a2) and 
tcl/tk (8.3.4 to 8.4.2) while using a platform you're unfamiliar with! 
Anyway, I have successfully applied the patch; however, it has simply 
propagated the problem elsewhere. Specifically, the pmw rev 1.1 
widgets library. The problem is, pmw does additional processing that 
chokes on the '??' now returned by the try: except: statements. 

Perhaps, if anyone knows, it would be better to mimic what tcl/tk 8.3.x 
returned with the except statements. Pmw may not be the only library 
out there that will get choked up on this.

I will submit a bug in the pmw for this as well, but I'm looking for a least 
resistance path to get things up and running. (And not really wanting to 
rewrite all my GUI construction code...)

Thanks

Jeremy Moore

----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-23 14:35

Message:
Logged In: YES 
user_id=60903

> What is the problem that this patch solves?

As the subject says: Provide a patch for #698517.

For undefined fields in events, tk8.4.2 returns empty
strings or '??' strings, on which the int conversions fail.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 13:07

Message:
Logged In: YES 
user_id=21627

What is the problem that this patch solves?

----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-22 08:26

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 23:15

Message:
Logged In: YES 
user_id=60903

Attach alternate patch by Chad


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 23:10

Message:
Logged In: YES 
user_id=40145

Hmmm, you are right.  Your approach will be quicker, due to
local namespace function lookup speed (try/except is fast in
non-exception path).

But, then again, a lot more exception paths will be executed
with the new Tk (with "??" fields), anyway, so the speed
issues may not be that important.


----------------------------------------------------------------------

Comment By: Matthias Klose (doko)
Date: 2003-03-21 22:14

Message:
Logged In: YES 
user_id=60903

I thought the whole point of defining getint = int was to do
local lookups only; hence the inlined try/excepts.


----------------------------------------------------------------------

Comment By: Chad Netzer (chadn)
Date: 2003-03-21 21:59

Message:
Logged In: YES 
user_id=40145

Would it be better to simply define getint() as:

def getint( s ):
    try:
        return int( s )
    except ValueError:
        return s

Rather than add lots of try/excepts in the codebase?
I'm attaching an example diff (btw - I kept your field
explanations in the code; I liked them there)

These patches are important, BTW, since 8.4.1 has a few bugs
that would require other patches to Tkinter (returning ""
for getboolean for example, which seems to be fixed)
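
As a usage sketch of the getint() fallback proposed above: the field
values below are made up for illustration and are not taken from a real
Tk event.

```python
def getint(s):
    """Convert to int when possible; otherwise hand the string back."""
    try:
        return int(s)
    except ValueError:
        return s


# Hypothetical raw field values, including the '??' and empty strings
# reported for undefined event fields.
fields = ["12", "??", "", "640"]
converted = [getint(value) for value in fields]
print(converted)  # [12, '??', '', 640]
```

Callers then receive either an int or the original string, so code that
does arithmetic on these fields still has to check the type.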


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=707701&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 11:37:22 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 03:37:22 -0800
Subject: [Patches] [ python-Patches-711722 ] Cache lookup of __builtins__
Message-ID: <E18zEeI-0006gR-00@sc8-sf-web4.sourceforge.net>

Patches item #711722, was opened at 2003-03-29 01:47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cache lookup of __builtins__

Initial Comment:
Rather than performing a bytecode optimization of 
LOAD_GLOBAL, this patch takes an alternative approach of 
caching the lookup of builtins.

To be safe, it checks the cache only after trying a 
lookup in globals().  I can think of only one way to 
break this approach:  run the function accessing a 
builtin, then poke a new value into the builtins 
module, and then re-run the function:

def f(x):
    return oct(x)
print f(20)
__builtins__.oct = hex
print f(20)  # doesn't notice new def of oct()

This gives about a 2% speed-up to average programs, 
0% to programs that don't use builtins, and higher 
percentages to those with heavier use of builtins.  The 
speedup is limited by 1) having to still check globals 
and 2) the relative insignificance of builtin access time 
in most programs.  Still, it pretty much solves the 
problem of access time for builtins.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-29 06:37

Message:
Logged In: YES 
user_id=6380

-1. It changes semantics in an ad-hoc way.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 14:16:37 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 06:16:37 -0800
Subject: [Patches] [ python-Patches-710576 ] Backport to 2.2.2 of codec registry fix
Message-ID: <E18zH8P-0005eX-00@sc8-sf-web2.sourceforge.net>

Patches item #710576, was opened at 2003-03-27 09:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Geert Jansen (geertj)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: Backport to 2.2.2 of codec registry fix

Initial Comment:
Hi, 
 
attached is a backport to Python 2.2.2 of the patch that 
fixes bug: 
 
  #663074: codec registry and Python embedding problem 
 
which is discussed here: 
 
http://sourceforge.net/tracker/index.php?func=detail&aid=663074&group_id=5470&atid=105470 
 
If there will be a Python 2.2.3 release, I suggest this patch 
is applied. Currently, mod_python programs cannot use 
encodings, because mod_python is one of the (few?) 
programs that uses multiple subinterpreters. 
 
About the patch: it is a backport of Gustavo Niemeyer's 
patch for 2.3 CVS. I had to adapt it a little bit because in 
2.2 there is no codec error registry. 
 
Greetings, 
Geert Jansen 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 15:16

Message:
Logged In: YES 
user_id=21627

This patch breaks binary compatibility, as it changes the
layout of PyInterpreterState. We could reduce the risk of
breakage by moving the new members at the end of the struct.

Assigning to Guido for pronouncement: Should this
a) be rejected?
b) be accepted as is? (arguing that nobody uses the
interpreter state, anyway)
c) be accepted with the proposed change (i.e.
sizeof(PyInterpreterState) still changes, but the offset of
the existing members doesn't).
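
The layout argument can be illustrated with ctypes. The three structures
below are simplified stand-ins, not the real PyInterpreterState, and the
field names are only illustrative.

```python
import ctypes

PTR = ctypes.c_void_p


class OldState(ctypes.Structure):
    _fields_ = [("next", PTR), ("tstate_head", PTR), ("modules", PTR)]


class InsertedState(ctypes.Structure):
    # New member placed in the middle: later members shift.
    _fields_ = [("next", PTR), ("codec_search_path", PTR),
                ("tstate_head", PTR), ("modules", PTR)]


class AppendedState(ctypes.Structure):
    # New member placed at the end: existing offsets are unchanged.
    _fields_ = [("next", PTR), ("tstate_head", PTR), ("modules", PTR),
                ("codec_search_path", PTR)]


for cls in (OldState, InsertedState, AppendedState):
    print(cls.__name__, ctypes.sizeof(cls), cls.modules.offset)
```

Both new layouts change the size of the structure, but only the appended
variant keeps the offset of modules that already-compiled code expects.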

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-28 09:40

Message:
Logged In: YES 
user_id=38388

Looks ok.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:00

Message:
Logged In: YES 
user_id=21627

Marc-Andre, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

Comment By: Geert Jansen (geertj)
Date: 2003-03-27 09:25

Message:
Logged In: YES 
user_id=537938

Here is the patch. It is tested and verified to fix the problem by 
two people. I also verified that it passes the test suite. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 14:18:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 06:18:48 -0800
Subject: [Patches] [ python-Patches-710127 ] Make "%c" % u"a" work
Message-ID: <E18zHAW-0000Wx-00@sc8-sf-web1.sourceforge.net>

Patches item #710127, was opened at 2003-03-26 17:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470

Category: Core (C code)
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
>Assigned to: Walter Dörwald (doerwalter)
>Summary: Make "%c" % u"a" work

Initial Comment:
Currently "%c" % u"a" fails, while "%s" % u"a" works. 
This patch fixes this problem.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 15:18

Message:
Logged In: YES 
user_id=21627

Looks fine, please apply it. Also add a test case that fails
now but passes with the change, and add a NEWS entry.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 14:40:56 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 06:40:56 -0800
Subject: [Patches] [ python-Patches-612627 ] Allow more Unicode on sys.stdout
Message-ID: <E18zHVw-00036Z-00@sc8-sf-web4.sourceforge.net>

Patches item #612627, was opened at 2002-09-21 22:32
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=612627&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Martin v. Löwis (loewis)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Allow more Unicode on sys.stdout

Initial Comment:
This patch extends the set of Unicode strings that can
be printed to sys.stdout, to support all strings that
the terminal will likely support. It also adds an
encoding attribute to sys.std{in,out}.

To do that:
- it adds a .encoding attribute to all file objects,
which is normally None
- initializes the encoding of sys.stdin and sys.stdout
if either is a terminal.
- adds a wrapper object around sys.stdout in site.py
that encodes all Unicode objects according to the
detected encoding, if that encoding is known to Python

To find the encoding of the terminal, it
- uses GetConsoleCP and GetConsoleOutputCP on Windows,
- uses nl_langinfo(CODESET) on Unix, if available.

The primary rationale for this change is that people
should be able to print Unicode in an interactive
session. A parallel change needs to be added for IDLE,
so that it adds the .encoding attribute to the emulated
stdout (it already supports printing of Unicode on stdout).
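
A rough Python 3 sketch of the two ideas above: detecting a likely
terminal encoding, and wrapping a byte stream so that text is encoded on
write. terminal_encoding and encoding_writer are hypothetical helper
names, and the Windows console calls are left out.

```python
import codecs
import io
import locale
import sys


def terminal_encoding():
    """Best-effort guess at the terminal encoding, or None if unknown."""
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        return locale.nl_langinfo(locale.CODESET) or None
    return getattr(sys.stdout, "encoding", None)


def encoding_writer(byte_stream, encoding):
    """Wrap a byte stream so that text written to it is encoded first."""
    return codecs.getwriter(encoding)(byte_stream)


buffer = io.BytesIO()
writer = encoding_writer(buffer, "latin-1")
writer.write("caf\u00e9\n")
print(buffer.getvalue())  # b'caf\xe9\n'
print(terminal_encoding())
```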

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 15:40

Message:
Logged In: YES 
user_id=21627

In stdout3.txt, PyFile_SetEncoding has been added, wrapping
the creation and assignment of the string object f_encoding.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-28 09:44

Message:
Logged In: YES 
user_id=38388

Looks ok except for the direct hacking
of f_encoding in the sys module. Please add
either a macro or a new API to make changing
the encoding from C possible without tapping
directly into the implementation.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-23 12:59

Message:
Logged In: YES 
user_id=21627

Is the patch now acceptable?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-10-26 19:47

Message:
Logged In: YES 
user_id=21627

I've attached a revised version which implements your
proposal; this version works without modification of site.py.

In its current form, the file encoding is only applied in
print; for sys.stdout.write, it is ignored. For print, it is
applied independent of whether this is a script or
interactive mode.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2002-10-25 14:09

Message:
Logged In: YES 
user_id=38388

I think it could work by adding a special case to 
PyFile_WriteObject() instead of calling PyObject_Print().
You first encode the Unicode object and then let
PyFile_WriteString() take care of the writing to the
FILE* object.

I see no other way, since you can't place the .encoding 
information into the FILE* object.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-09-24 11:02

Message:
Logged In: YES 
user_id=21627

I have considered implementing it in the file object.
However, it becomes quite involved, and heavy C code:
PyFile_WriteObject calls PyObject_Print. Since Unicode does
not implement a tp_print, this calls str/repr, which
converts using the default encoding.

It is not clear at which point the file encoding should be
taken into account.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2002-09-24 10:10

Message:
Logged In: NO 

I like the .encoding concept. 

I don't really like the sys.stdout wrapper. Wouldn't it be 
better to add the functionality to the file object .write() and 
.writelines() methods and then only use the wrapper in case 
sys.stdout is not a true file object ?


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=612627&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 15:00:07 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 07:00:07 -0800
Subject: [Patches] [ python-Patches-710931 ] iconv codec-NG and Korean Codecs
Message-ID: <E18zHoV-00073G-00@sc8-sf-web2.sourceforge.net>

Patches item #710931, was opened at 2003-03-27 20:31
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710931&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Martin v. Löwis (loewis)
Summary: iconv codec-NG and Korean Codecs

Initial Comment:
This patch includes an update for iconv_codec, new 
sources for Korean codecs and a MultibyteCodec 
supplemental library.
I split out the common parts of codecs for usual 
multibyte encodings into multibytecodec.c, and this 
iconv codec and the Korean codecs use it.
The Korean codecs are only 58K in stripped i386 ELF and 
62K in stripped i386 PECOFF binary, and I think that is 
small enough to be incorporated into Python.

Files:

Lib/encodings/aliases.py
   adds aliases for Korean encodings and removes 
comments that are no longer true.

Lib/encodings/cp949.py
Lib/encodings/euc_kr.py
   codecs for korean encodings

Lib/encodings/iconv_codec.py
   updated for new _iconv_codec implementation

Lib/test/test_ko_codecs.py
   unit test for cp949, euc_kr codec

Lib/test/test_ko_codecs_mapping.py
   unit test to test cp949 mapping

Lib/test/test_iconv_codec_euc_kr.py
   another iconv_codec unit test, because a non-Unicode 
multibyte encoding is required to test both 
iconv_codec and multibytecodec.

Lib/test/test_multibytecodec_support.py
   common part for above test units

Modules/_iconv_codec.c
   new implementation of _iconv_codec.
   this resolves numerous problems that the previous 
implementation had, and iconv_codec has a sane 
StreamReader now! :)

Modules/_ko_codec.c
Modules/_ko_codec.h
   korean codecs module

Modules/multibytecodec.c
Modules/multibytecodec.h
   common multibyte codec supplement. I think that this 
can be used for any usual multibyte encoding.
   I'll submit Chinese codecs in a few days using this.

Tools/unicode/genmap_ko_codecs.py
   code generator for _ko_codecs.h

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 16:00

Message:
Logged In: YES 
user_id=21627

Please submit an individual patch for each single bug fix or
new feature; it appears that this patch deals with
completely unrelated things. Therefore, I'm rejecting this
patch, encouraging you to submit new separate patches.

I have a few specific comments you may want to consider:

- What is the rationale for adding alias processing to
the iconv codecs?

- It is unclear how you expect reuse of the
multibytecodec.c. Currently, this is incorporated into
_ko_codecs. How would this cooperate with other usages of
multibytecodecs? In particular, why is that needed in
iconv_codec?

- "complete reimplementation" is insufficient reason to
accept a change. What specific problems does the old iconv
codec have, and how specifically have they been corrected?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710931&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 15:02:01 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 07:02:01 -0800
Subject: [Patches] [ python-Patches-710931 ] iconv codec-NG and Korean Codecs
Message-ID: <E18zHqL-0003oz-00@sc8-sf-web4.sourceforge.net>

Patches item #710931, was opened at 2003-03-27 20:31
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710931&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Martin v. Löwis (loewis)
Summary: iconv codec-NG and Korean Codecs

Initial Comment:
This patch includes an update for iconv_codec, new 
sources for Korean codecs and a MultibyteCodec 
supplemental library.
I split out the common parts of codecs for usual 
multibyte encodings into multibytecodec.c, and this 
iconv codec and the Korean codecs use it.
The Korean codecs are only 58K in stripped i386 ELF and 
62K in stripped i386 PECOFF binary, and I think that is 
small enough to be incorporated into Python.

Files:

Lib/encodings/aliases.py
   adds aliases for Korean encodings and removes 
comments that are no longer true.

Lib/encodings/cp949.py
Lib/encodings/euc_kr.py
   codecs for korean encodings

Lib/encodings/iconv_codec.py
   updated for new _iconv_codec implementation

Lib/test/test_ko_codecs.py
   unit test for cp949, euc_kr codec

Lib/test/test_ko_codecs_mapping.py
   unit test to test cp949 mapping

Lib/test/test_iconv_codec_euc_kr.py
   another iconv_codec unit test, because a non-Unicode 
multibyte encoding is required to test both 
iconv_codec and multibytecodec.

Lib/test/test_multibytecodec_support.py
   common part for above test units

Modules/_iconv_codec.c
   new implementation of _iconv_codec.
   This resolves numerous problems that the previous 
implementation had, and iconv_codec has a sane 
StreamReader now! :)

Modules/_ko_codec.c
Modules/_ko_codec.h
   korean codecs module

Modules/multibytecodec.c
Modules/multibytecodec.h
   common multibyte codec supplement. I think that this 
can be used for any typical multibyte encoding.
   I'll submit Chinese codecs in a few days using this.

Tools/unicode/genmap_ko_codecs.py
   code generator for _ko_codecs.h

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 16:00

Message:
Logged In: YES 
user_id=21627

Please submit an individual patch for each single bug fix or
new feature; it appears that this patch deals with
completely unrelated things. Therefore, I'm rejecting this
patch, encouraging you to submit new separate patches.

I have a few specific comments you may want to consider:

- What is the rationale for adding alias processing to
the iconv codecs?

- It is unclear how you expect reuse of the
multibytecodec.c. Currently, this is incorporated into
_ko_codecs. How would this cooperate with other usages of
multibytecodecs? In particular, why is that needed in
iconv_codec?

- "complete reimplementation" is insufficient reason to
accept a change. What specific problems does the old iconv
codec have, and how specifically have they been corrected?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710931&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 16:12:45 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 08:12:45 -0800
Subject: [Patches] [ python-Patches-711835 ] Removing unnecessary lock operations
Message-ID: <E18zIwn-0006A0-00@sc8-sf-web4.sourceforge.net>

Patches item #711835, was opened at 2003-03-29 11:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
Assigned to: Nobody/Anonymous (nobody)
Summary: Removing unnecessary lock operations

Initial Comment:
PyThread_acquire_lock can be further optimized to do
less locking on the global lock mutex.

Original patch location:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=86281

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 16:25:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 08:25:48 -0800
Subject: [Patches] [ python-Patches-711838 ] urllib2 doesn't support non-anonymous ftp
Message-ID: <E18zJ9Q-0004dJ-00@sc8-sf-web3.sourceforge.net>

Patches item #711838, was opened at 2003-03-29 11:25
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711838&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2 doesn't support non-anonymous ftp

Initial Comment:
urllib2 doesn't support non-anonymous ftp.  Added
support based on how urllib did it.

More details about this bug in Red Hat's bugzilla:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=78168
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=80676
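
A minimal sketch of what non-anonymous FTP support has to do first: pull
the user name and password out of the URL before connecting. This is not
the code from the patch. It uses Python 3's urllib.parse, and the helper
name ftp_credentials and the "anonymous" fallback are assumptions made
for illustration only.

```python
from urllib.parse import urlsplit, unquote

def ftp_credentials(url):
    """Return (user, password, host, port) for an ftp:// URL.

    Hypothetical helper: falls back to an anonymous login when the URL
    carries no user information.
    """
    parts = urlsplit(url)
    # Credentials may be percent-encoded inside the URL, so decode them.
    user = unquote(parts.username) if parts.username else "anonymous"
    password = unquote(parts.password) if parts.password else ""
    # 21 is the standard FTP control port.
    return user, password, parts.hostname, parts.port or 21

with_login = ftp_credentials("ftp://alice:s%40cret@ftp.example.com/pub/file.txt")
anonymous = ftp_credentials("ftp://ftp.example.com/pub/file.txt")
print(with_login)
print(anonymous)
```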

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711838&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 16:32:53 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 08:32:53 -0800
Subject: [Patches] [ python-Patches-711722 ] Cache lookup of __builtins__
Message-ID: <E18zJGH-0006vi-00@sc8-sf-web4.sourceforge.net>

Patches item #711722, was opened at 2003-03-29 01:47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cache lookup of __builtins__

Initial Comment:
Rather than performing a bytecode optimization of 
LOAD_GLOBAL, this patch takes the alternative approach of 
caching the lookup of builtins.

To be safe, it checks the cache only after trying a 
lookup in globals().  I can think of only one way to 
break this approach:  run the function accessing a 
builtin, then poke a new value into the builtins 
module, and then re-run the function:

def f(x):
    return oct(x)
print f(20)
__builtins__.oct = hex
print f(20)  # doesn't notice new def of oct()

This gives about a 2% speed-up to average programs, 
0% to programs that don't use builtins, and higher 
percentages to those with heavier use of builtins.  The 
speedup is limited by 1) having to still check globals 
and 2) the relative insignificance of builtin access time 
in most programs.  Still, it pretty much solves the 
problem of access time for builtins.
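
For reference, a minimal sketch of the uncached behaviour that the
example above relies on, rewritten for Python 3 (where the module is
named builtins and print is a function). It is an illustration only and
does not exercise the patch; with a cache of the kind described, the
second call would presumably keep returning the first result.

```python
import builtins

def f(x):
    # 'oct' is neither local nor global here, so it is found in builtins.
    return oct(x)

original_oct = builtins.oct
before = f(20)
builtins.oct = hex              # poke a new value into the builtins module
try:
    after = f(20)               # an uncached lookup sees the replacement
finally:
    builtins.oct = original_oct  # always restore the real builtin

print(before, after)
```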


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-29 06:37

Message:
Logged In: YES 
user_id=6380

-1. It changes semantics in an ad-hoc way.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 17:11:14 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 09:11:14 -0800
Subject: [Patches] [ python-Patches-711722 ] Cache lookup of __builtins__
Message-ID: <E18zJrO-0006FG-00@sc8-sf-web3.sourceforge.net>

Patches item #711722, was opened at 2003-03-29 01:47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cache lookup of __builtins__

Initial Comment:
Rather than performing a bytecode optimization of 
LOAD_GLOBAL, this patch takes the alternative approach of 
caching the lookup of builtins.

To be safe, it checks the cache only after trying a 
lookup in globals().  I can think of only one way to 
break this approach:  run the function accessing a 
builtin, then poke a new value into the builtins 
module, and then re-run the function:

def f(x):
    return oct(x)
print f(20)
__builtins__.oct = hex
print f(20)  # doesn't notice new def of oct()

This gives about a 2% speed-up to average programs, 
0% to programs that don't use builtins, and higher 
percentages to those with heavier use of builtins.  The 
speedup is limited by 1) having to still check globals 
and 2) the relative insignificance of builtin access time 
in most programs.  Still, it pretty much solves the 
problem of access time for builtins.


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 12:11

Message:
Logged In: YES 
user_id=80475

Arghh, I don't see what the problem is.  The co_names 
cache variable is private and not part of the public 
interface for code objects.  The only way to see a change in 
behavior is for a program to violate the prohibition of 
sticking a name in another module's globals that affects a 
builtin (and, even then, it would have to occur between 
calls to the function).  Normal shadowing (using globals) 
would continue to work just fine.

While it gives only a minor timing gain, the big win would 
be removing the incentive to create python code like this:
  def f(x, y, int=int, True=True, chr=chr):
      . . .
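
A small sketch of the idiom being discouraged, in Python 3 syntax (True
is a keyword there, so only int is rebound). The function names are
made up for illustration. Binding a builtin as a default argument turns
it into a local variable of the function, which skips the globals and
builtins lookups at call time.

```python
def parse_global(text):
    # 'int' is looked up on every call: globals first, then builtins.
    return int(text)

def parse_local(text, int=int):
    # 'int' was bound once, when the def statement ran; it is now a local.
    return int(text)

# co_names lists names resolved via globals/builtins,
# co_varnames lists arguments and other locals.
print(parse_global.__code__.co_names)
print(parse_local.__code__.co_varnames)
```

The cost of the idiom is a polluted signature: a caller can pass a third
positional argument by accident and silently replace int.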


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-29 06:37

Message:
Logged In: YES 
user_id=6380

-1. It changes semantics in an ad-hoc way.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 17:27:12 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 09:27:12 -0800
Subject: [Patches] [ python-Patches-711861 ] Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None
Message-ID: <E18zK6q-0003OR-00@sc8-sf-web2.sourceforge.net>

Patches item #711861, was opened at 2003-03-29 12:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None

Initial Comment:
Okay, here's one __builtin__ that's guaranteed not to 
change or be shadowed.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 17:28:11 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 09:28:11 -0800
Subject: [Patches] [ python-Patches-711861 ] Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None
Message-ID: <E18zK7n-0003Qn-00@sc8-sf-web2.sourceforge.net>

Patches item #711861, was opened at 2003-03-29 12:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
>Summary: Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None

Initial Comment:
Okay, here's one __builtin__ that's guaranteed not to 
change or be shadowed.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 17:37:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 09:37:55 -0800
Subject: [Patches] [ python-Patches-711861 ] Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None
Message-ID: <E18zKHD-0006xq-00@sc8-sf-web1.sourceforge.net>

Patches item #711861, was opened at 2003-03-29 18:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
>Summary: Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None

Initial Comment:
Okay, here's one __builtin__ that's guaranteed not to 
change or be shadowed.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 18:37

Message:
Logged In: YES 
user_id=21627

Where do you get this guarantee from?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 18:05:53 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 10:05:53 -0800
Subject: [Patches] [ python-Patches-711861 ] Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None
Message-ID: <E18zKiH-0004iZ-00@sc8-sf-web2.sourceforge.net>

Patches item #711861, was opened at 2003-03-29 12:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Neal Norwitz (nnorwitz)
>Summary: Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None

Initial Comment:
Okay, here's one __builtin__ that's guaranteed not to 
change or be shadowed.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 13:05

Message:
Logged In: YES 
user_id=80475

Python 2.3a2 (#39, Feb 19 2003, 17:58:58) [MSC v.1200 
32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> None = 1
SyntaxError: assignment to None (<pyshell#0>, line 1)


In addition, the compiler already makes this assumption 
elsewhere.  Every function ends with:
  2           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE 
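
Both observations can be reproduced with a short script. This sketch
targets current Python 3, where assignment to None is always a
SyntaxError; the exact opcodes printed by dis differ between
interpreter versions, so no particular listing is assumed here.

```python
import dis
import io

def f():
    pass

# Assigning to None is rejected when the source is compiled.
try:
    compile("None = 1", "<example>", "exec")
    assignment_rejected = False
except SyntaxError:
    assignment_rejected = True

# Capture the disassembly to see how the implicit "return None" is emitted.
buffer = io.StringIO()
dis.dis(f, file=buffer)
listing = buffer.getvalue()

print(assignment_rejected)
print(listing)
```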

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 12:37

Message:
Logged In: YES 
user_id=21627

Where do you get this guarantee from?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 19:01:00 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 11:01:00 -0800
Subject: [Patches] [ python-Patches-711902 ] Cause pydoc to show data descriptor __doc__ strings
Message-ID: <E18zLZc-0001CC-00@sc8-sf-web3.sourceforge.net>

Patches item #711902, was opened at 2003-03-29 10:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711902&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Greg Chapman (glchapman)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cause pydoc to show data descriptor __doc__ strings

Initial Comment:
Data descriptors (descriptors having both a __get__ and 
a __set__ method) often have __doc__ strings.  Pydoc 
displays these for descriptors of type property, but not 
for other types (e.g., getsets).  The attached patch will 
display __doc__ strings for data descriptors (if available) 
in the "Data and non-method functions" section of the 
type description.

This patch is intended to be a minimal change.  It's 
possible that inspect.classify_class_attrs should return 
a new kind for data descriptors (or possibly 
the "property" kind should include all data descriptors 
(not just properties)), which could then be handled 
differently from other non-classified data.
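
To make the terminology concrete, here is a minimal sketch of a data
descriptor that carries its own __doc__ string, plus a helper that
collects such strings the way a documentation tool might. The class and
function names are hypothetical and this is not the pydoc patch itself;
it only relies on inspect.isdatadescriptor and inspect.getdoc.

```python
import inspect

class Celsius:
    """Hypothetical data descriptor: it defines both __get__ and __set__."""

    def __init__(self, doc):
        self.__doc__ = doc          # per-attribute documentation

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self             # accessed on the class itself
        return obj._celsius

    def __set__(self, obj, value):
        obj._celsius = float(value)

class Thermometer:
    temperature = Celsius("Temperature in degrees Celsius.")

def data_descriptor_docs(cls):
    """Map public data-descriptor names to their doc strings."""
    docs = {}
    for name, attr in vars(cls).items():
        if name.startswith("__"):
            continue                # skip __dict__, __weakref__ and friends
        if inspect.isdatadescriptor(attr):
            docs[name] = inspect.getdoc(attr)
    return docs

docs = data_descriptor_docs(Thermometer)
print(docs)
```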


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711902&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 21:19:50 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 13:19:50 -0800
Subject: [Patches] [ python-Patches-711861 ] Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None
Message-ID: <E18zNjy-0002iC-00@sc8-sf-web2.sourceforge.net>

Patches item #711861, was opened at 2003-03-29 12:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
>Resolution: Later
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
>Assigned to: Raymond Hettinger (rhettinger)
>Summary: Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None

Initial Comment:
Okay, here's one __builtin__ that's guaranteed not to 
change or be shadowed.

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 16:19

Message:
Logged In: YES 
user_id=80475

Hmm, in Py2.3a2+, it only gives a warning.
Putting this one on hold until I can find out
why it was safe for the compiler to return
a None constant at the end of a function.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 13:05

Message:
Logged In: YES 
user_id=80475

Python 2.3a2 (#39, Feb 19 2003, 17:58:58) [MSC v.1200 
32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> None = 1
SyntaxError: assignment to None (<pyshell#0>, line 1)


In addition, the compiler already makes this assumption 
elsewhere.  Every function ends with:
  2           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE 

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 12:37

Message:
Logged In: YES 
user_id=21627

Where do you get this guarantee from?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 21:28:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 13:28:48 -0800
Subject: [Patches] [ python-Patches-708374 ] add offset to mmap
Message-ID: <E18zNse-0007ap-00@sc8-sf-web4.sourceforge.net>

Patches item #708374, was opened at 2003-03-23 09:33
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708374&group_id=5470

Category: Modules
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Nobody/Anonymous (nobody)
Summary: add offset to mmap

Initial Comment:
This patch is from Yotam Medini <yotamm at
mellanox.co.il> sent to me in mail.

It adds support for the offset parameter to mmap.

It ignores the check for mmap size "if the file is
character device.  Some device drivers (which I happen
to use) have zero size in fstat buffer, but still one
can seek() read() and tell()."
I added minimal doc and tests.
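
A sketch of what mapping with an offset looks like from Python. It uses
the offset keyword and the mmap.ALLOCATIONGRANULARITY constant as they
exist in later Python releases; the interface in this particular patch
may differ. The helper name read_with_offset is made up, and the offset
must be a multiple of the allocation granularity.

```python
import mmap
import os
import tempfile

def read_with_offset(path, offset, length):
    """Map `length` bytes of `path` starting at `offset` and return them."""
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), length,
                       access=mmap.ACCESS_READ, offset=offset) as mapped:
            return mapped[:]

granule = mmap.ALLOCATIONGRANULARITY

# Build a two-granule scratch file: the first part is A's, the second B's.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as fh:
        fh.write(b"A" * granule + b"B" * granule)
    tail = read_with_offset(path, granule, 16)
finally:
    os.remove(path)

print(tail)
```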

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-29 16:28

Message:
Logged In: YES 
user_id=33168

Sounds fair.  Attached is an updated patch which includes
windows support (I think).  I cannot test on Windows. 
Tested on Linux.  Includes updates for doc, src, and test.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 19:12

Message:
Logged In: YES 
user_id=21627

I think non-zero offsets need to be supported for Windows as
well.


----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-23 10:37

Message:
Logged In: YES 
user_id=33168

Email received from Yotam:

I have downloaded and patched the 2.3a source, compiled
locally just this module, and it worked fine for my
application (with offset for a character device file). I did
not run the released test, though.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708374&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 21:46:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 13:46:55 -0800
Subject: [Patches] [ python-Patches-706707 ] time.tzset standards compliance update
Message-ID: <E18zOAB-0007zG-00@sc8-sf-web4.sourceforge.net>

Patches item #706707, was opened at 2003-03-19 23:57
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 7
Submitted By: Stuart Bishop (zenzen)
Assigned to: Neal Norwitz (nnorwitz)
Summary: time.tzset standards compliance update

Initial Comment:
Update to configure.in and test_time.py to only use TZ
environment variable format documented at
http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-29 16:46

Message:
Logged In: YES 
user_id=33168

In the last chunk added, there is a bare except when calling
time.tzset().  What are the possible exceptions?  I don't
want to have a bare except since this can mask a real error.

The patch still fails for me on Linux (Redhat):
 * line 107: self.failUnless(time.tzname[1] == 'AEDT')
       - tzname has:  ('AEST', 'AEST')
 * line 109: self.failUnlessEqual(time.daylight, 1)
 * line 111: self.failUnlessEqual(time.altzone, -39600)
Haven't tried on other Unixes.
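
A minimal sketch of the narrower handling asked for above, assuming the
only expected failure is that time.tzset does not exist on the platform
(it is Unix-only), which surfaces as AttributeError. The helper name and
the TZ string are assumptions chosen to match the AEST/AEDT values
quoted above; they are not taken from the patch.

```python
import os
import time

def probe_timezone(tz):
    """Apply TZ temporarily and report what the time module then sees."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = tz
    try:
        try:
            time.tzset()
        except AttributeError:
            # No tzset on this platform; any other error still propagates.
            return None
        return time.tzname, time.timezone, time.daylight, time.altzone
    finally:
        # Put the environment back and re-read it where possible.
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        if hasattr(time, "tzset"):
            time.tzset()

result = probe_timezone("AEST-10AEDT-11,M10.5.0,M3.5.0")
print(result)
```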

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-03-27 15:12

Message:
Logged In: YES 
user_id=46639

tzset3.diff is an updated diff against the CVS head.

Fixes:
   - Don't test time.altzone for UTC: non-DST means altzone is
     undefined.
   - Make sure the DST timezone name is not the same as the non-DST
     timezone name in the TZ environment variable, to work around
     an apparent Solaris bug.
   - Extraneous cruft removed from test_time.py and configure.in;
     no more irrelevant comments.
   - More whitespace, as per Tim's comments.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-21 16:28

Message:
Logged In: YES 
user_id=33168

After patching, the test fails:

  File "/home/neal/build/python/2_3/Lib/test/test_time.py", line 115, in test_tzset
    self.failUnlessEqual(time.daylight,1)
  File "/home/neal/build/python/2.3/Lib/unittest.py", line 292, in failUnlessEqual
    raise self.failureException, \
AssertionError: 0 != 1


Also, why is the code commented out (via a string) on lines
120-144?  Should these be removed?  I see the comment about
wallclock time, but don't understand why the code should be
left in if we can't test it.  I can understand a comment
describing generally the issue.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-20 20:18

Message:
Logged In: YES 
user_id=33168

I'll try to get to this soon.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-20 20:11

Message:
Logged In: YES 
user_id=6380

Unassigning, as I won't have time for this. But it is
important - someone else should make sure this goes into 2.3b1!

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-20 16:50

Message:
Logged In: YES 
user_id=31435

Assigned to Guido, as I can't test it.

Two notes:

1. Leaving commented-out code in config and the test suite 
doesn't appear to serve a purpose, although it will serve to 
confuse future readers ("why is this here?  why is it 
commented out?").

2. The Python style guide asks for a blank after commas in 
argument lists and tuples.  We're not really in danger of 
stretching the screen here <wink>.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706707&group_id=5470


From noreply@sourceforge.net  Sat Mar 29 22:40:58 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 14:40:58 -0800
Subject: [Patches] [ python-Patches-659834 ] Check for readline 2.2 features
Message-ID: <E18zP0U-0007hv-00@sc8-sf-web3.sourceforge.net>

Patches item #659834, was opened at 2002-12-29 20:22
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=659834&group_id=5470

Category: Build
Group: Python 2.2.x
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Magnus Lie Hetland (mlh)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: Check for readline 2.2 features

Initial Comment:
This patch adds a snippet to configure.in to check whether
rl_completion_append_character (which is used in Python 2.3)
is available.

rl_prep_terminal is assumed to co-exist with
rl_completion_append_character.

It is assumed that HAVE_RL_COMPLETION_APPEND_CHARACTER will be
used in readline.c to make it compatible with older versions of
the readline library.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-29 17:40

Message:
Logged In: YES 
user_id=33168

Magnus, it would be great if you could test 2.2.3 from CVS
too.  I have checked in a change that builds and works with
newer versions of readline.  I don't have readline v2.2.

Checked in as:
 * configure 1.279.6.19
 * configure.in 1.288.6.19
 * Modules/readline.c 2.41.6.7

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-12 17:16

Message:
Logged In: YES 
user_id=6380

I need a volunteer to backport this to 2.2 who can run an
older version of autoconf; the autoconf that I have
installed is too new to process the 2.2 configure.in file.
(The 2.3 version is already checked in.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-12-30 11:27

Message:
Logged In: YES 
user_id=6380

Checked in. Thanks!

Hm, this should be backported to 2.2.3 too! So I'll keep it
open.

----------------------------------------------------------------------

Comment By: Magnus Lie Hetland (mlh)
Date: 2002-12-29 21:11

Message:
Logged In: YES 
user_id=20535

New patch for configure.in (added a comment) and a patch for
readline.c that uses HAVE_RL_COMPLETION_APPEND_CHARACTER.

Tested on Gentoo Linux with new readline (the new completion
behaviour was preserved) and on Solaris with old readline (now
compiles, with old completion behaviour in place).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=659834&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 01:13:44 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 17:13:44 -0800
Subject: [Patches] [ python-Patches-659834 ] Check for readline 2.2 features
Message-ID: <E18zROK-0005ox-00@sc8-sf-web4.sourceforge.net>

Patches item #659834, was opened at 2002-12-30 02:22
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=659834&group_id=5470

Category: Build
Group: Python 2.2.x
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Magnus Lie Hetland (mlh)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Check for readline 2.2 features

Initial Comment:
This patch adds a snippet to configure.in to check whether
rl_completion_append_character (which is used in Python 2.3)
is available.

rl_prep_terminal is assumed to co-exist with
rl_completion_append_character.

It is assumed that HAVE_RL_COMPLETION_APPEND_CHARACTER will be
used in readline.c to make it compatible with older versions of
the readline library.

----------------------------------------------------------------------

>Comment By: Magnus Lie Hetland (mlh)
Date: 2003-03-30 03:13

Message:
Logged In: YES 
user_id=20535

I've now tested it with 2.2.3 (using the 2.2 maintenance branch,
which had the revision numbers you cited) and it works nicely.
That is, my old readline works (readline 2.2, I think, although
I couldn't find the version number this time around; at least it
doesn't have the completer character functionality). There is
one thing I find a bit odd, though. With the 2.3 version of this
check, the following ends up in pyconfig.h:

/* Define if you have readline 2.2 */
/* #undef HAVE_RL_COMPLETION_APPEND_CHARACTER */

However, it isn't there when I use the 2.2 branch version. I
guess it shouldn't matter either way (it's commented out anyway),
but it seems that the two versions behave differently. Since it
all works, it's a bit hard to find out what's "wrong", if
anything.

Anyway, the (tentative) verdict from me is that it works.

And just a final note: this check is really sort of a "band aid"
solution, since the behaviour of the completer will differ
depending on which readline version you have. Making the default
the same for readline 2.2 and readline 4.* and making it
configurable from Python for the newer versions might be better,
although possibly not important enough to warrant the work.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-29 23:40

Message:
Logged In: YES 
user_id=33168

Magnus, it would be great if you could test 2.2.3 from CVS
too.  I have checked in a change that builds and works with
newer versions of readline.  I don't have readline v2.2.

Checked in as:
 * configure 1.279.6.19
 * configure.in 1.288.6.19
 * Modules/readline.c 2.41.6.7

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-12 23:16

Message:
Logged In: YES 
user_id=6380

I need a volunteer to backport this to 2.2 who can run an
older version of autoconf; the autoconf that I have
installed is too new to process the 2.2 configure.in file.
(The 2.3 version is already checked in.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-12-30 17:27

Message:
Logged In: YES 
user_id=6380

Checked in. Thanks!

Hm, this should be backported to 2.2.3 too! So I'll keep it
open.

----------------------------------------------------------------------

Comment By: Magnus Lie Hetland (mlh)
Date: 2002-12-30 03:11

Message:
Logged In: YES 
user_id=20535

New patch for configure.in (added a comment) and a patch for
readline.c that uses HAVE_RL_COMPLETION_APPEND_CHARACTER.

Tested on Gentoo Linux with new readline (the new completion
behaviour was preserved) and on Solaris with old readline (now
compiles, with old completion behaviour in place).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=659834&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 01:32:35 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sat, 29 Mar 2003 17:32:35 -0800
Subject: [Patches] [ python-Patches-711861 ] Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None
Message-ID: <E18zRgZ-0001j8-00@sc8-sf-web2.sourceforge.net>

Patches item #711861, was opened at 2003-03-29 12:27
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470

Category: Core (C code)
Group: Python 2.3
>Status: Closed
Resolution: Later
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Raymond Hettinger (rhettinger)
>Summary: Replace LOAD_GLOBAL "None" with LOAD_CONST Py_None

Initial Comment:
Okay, here's one __builtin__ that's guaranteed not to 
change or be shadowed.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 16:19

Message:
Logged In: YES 
user_id=80475

Hmm, in Py2.3a2+, it only gives a warning.
Putting this one on hold until I can find out
why it was safe for the compiler to return
a None constant at the end of a function.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 13:05

Message:
Logged In: YES 
user_id=80475

Python 2.3a2 (#39, Feb 19 2003, 17:58:58) [MSC v.1200 
32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> None = 1
SyntaxError: assignment to None (<pyshell#0>, line 1)


In addition, the compiler already makes this assumption 
elsewhere.  Every function ends with:
  2           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE 
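
A minimal sketch of the two observations above, written for a current
Python 3 interpreter (an assumption; the session above is Python 2.3a2).
The names returns_nothing and assignment_rejected are made up for
illustration. It only checks that assigning to None is rejected at
compile time and that a function without a return statement yields None.

```python
def returns_nothing():
    # No explicit return: the compiler supplies an implicit "return None".
    pass


try:
    # Compile (do not run) an assignment to None.
    compile("None = 1", "<example>", "exec")
    assignment_rejected = False
except SyntaxError:
    assignment_rejected = True

print(assignment_rejected, returns_nothing())
```

The exact bytecode emitted for the implicit return differs between
interpreter versions, so the sketch deliberately does not inspect it.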

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 12:37

Message:
Logged In: YES 
user_id=21627

Where do you get this guarantee from?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711861&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 10:40:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 02:40:09 -0800
Subject: [Patches] [ python-Patches-712124 ] Obsolete comment in urlparse.py
Message-ID: <E18zaET-0000Rr-00@sc8-sf-web1.sourceforge.net>

Patches item #712124, was opened at 2003-03-30 03:40
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712124&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Steven Taschuk (staschuk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Obsolete comment in urlparse.py

Initial Comment:
urlparse.py contains a comment to the effect
that urljoin('http://foo/bar', '//g') returns 'http://g/',
contrary to the RFC 1808 example, which calls
for 'http://g' (with no trailing slash).

But this is false, and has been since at least 2.2.2;
urljoin correctly returns 'http://g' in this case, as the
test suite in fact verifies.

The patch simply removes this bogus comment.
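
The behaviour described above can be checked directly. This is a small
sketch using urllib.parse from Python 3 (an assumption; in the 2.x series
discussed here the function lives in the urlparse module).

```python
from urllib.parse import urljoin

# A network-path reference ("//g") replaces the authority of the base URL.
joined = urljoin('http://foo/bar', '//g')
print(joined)
```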

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712124&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 14:54:36 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 06:54:36 -0800
Subject: [Patches] [ python-Patches-545300 ] sgmllib support for additional tag forms
Message-ID: <E18zeCi-0000Pi-00@sc8-sf-web1.sourceforge.net>

Patches item #545300, was opened at 2002-04-17 20:16
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Steven F. Lott (slott56)
Assigned to: Martin v. Löwis (loewis)
Summary: sgmllib support for additional tag forms

Initial Comment:
MS-word generated HTML includes declaration 
tags of the form: 
<![if !supportEmptyParas]>&nbsp;<![endif]>
scattered throughout the body of an HTML 
document.

The current sgmllib parse_declaration routine 
rejects these as invalid syntax, where browsers 
tolerate these embedded declarations.

This patch accepts these declaration forms.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 16:54

Message:
Logged In: YES 
user_id=21627

Thanks for the patch, I have installed it as

markupbase.py 1.7
sgmllib.py 1.43
test_htmllib.py 1.3
NEWS 1.706

This also fixes bugs 505747 and 704996.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-11-22 10:23

Message:
Logged In: YES 
user_id=21627

I now recommend approving this patch. It improves SGML
correctness, and, while supporting an MS extension,
explicitly points out that it is doing so.

----------------------------------------------------------------------

Comment By: Steven F. Lott (slott56)
Date: 2002-04-22 20:50

Message:
Logged In: YES 
user_id=328067

My suggestion for handling this MS extension syntax is 
to (1) tolerate the extension without an error, (2) treat it 
as an SGML marked section, using the 
unknown_decl() call-back.  Since this is a separate 
function, subclasses can override to alter this behavior.  

The content hidden in these MS-specific marked 
sections appears to always be a &nbsp;.  While it might 
be expedient to completely skip over this junk, it makes it 
difficult to handle marked sections in a future version of 
markupbase.
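
The two-part suggestion above (tolerate the syntax, then route it to an
overridable unknown_decl() call-back) can be sketched as follows. This is
not the sgmllib or markupbase code: TolerantScanner, RecordingScanner and
the regular expression are hypothetical, and only the call-back name
unknown_decl() is taken from the comment.

```python
import re

# Hypothetical pattern for MS-style sections such as <![if ...]> and <![endif]>.
_MS_DECL = re.compile(r"<!\[(.*?)\]>")


class TolerantScanner:
    """Toy scanner: passes text through and reports MS declarations."""

    def __init__(self):
        self.text = []

    def unknown_decl(self, data):
        # Default behaviour: tolerate the declaration and ignore it.
        pass

    def handle_data(self, data):
        self.text.append(data)

    def feed(self, source):
        pos = 0
        for match in _MS_DECL.finditer(source):
            if match.start() > pos:
                self.handle_data(source[pos:match.start()])
            self.unknown_decl(match.group(1))
            pos = match.end()
        if pos < len(source):
            self.handle_data(source[pos:])


class RecordingScanner(TolerantScanner):
    """Subclass that overrides the call-back to keep the declarations."""

    def __init__(self):
        super().__init__()
        self.decls = []

    def unknown_decl(self, data):
        self.decls.append(data)


scanner = RecordingScanner()
scanner.feed("<p><![if !supportEmptyParas]>&nbsp;<![endif]></p>")
print(scanner.decls, scanner.text)
```

Keeping the declaration handling in a separate method is what lets a
subclass change the behaviour without touching the scanning loop.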

Attached is a revised patch against V1.39 of sgmllib.py 
and 1.4 of markupbase.py

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-04-21 17:11

Message:
Logged In: YES 
user_id=3066

This is the same as bug #505747.

These "tags" are not legal HTML in any form, but are some
Microsoft invention.  It's not entirely clear what the right
thing to do is, but it is clear that we need to deal with
these in some different way.

Changed group to indicate that such changes can only go into
the trunk; feature changes in maintenance versions are not
allowed.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-18 19:23

Message:
Logged In: YES 
user_id=21627

That patch looks wrong: You are changing what a tag is,
removing the underscore, however, underscores are allowed in
tag names.

Also, could you please generate the patch against the CVS
version of the code? Your patch doesn't apply for the
current code, which has changed significantly compared to
the version you appear to be using.

There is no way that this can go into 2.1 IMO.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=545300&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 14:59:18 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 06:59:18 -0800
Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass
Message-ID: <E18zeHG-0000W9-00@sc8-sf-web1.sourceforge.net>

Patches item #536883, was opened at 2002-03-29 20:52
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Martin v. Löwis (loewis)
Summary: SimpleXMLRPCServer auto-docing subclass

Initial Comment:
This SimpleXMLRPCServer subclass automatically serves 
HTML documentation, generated using pydoc, in response 
to an HTTP GET request (XML-RPC always uses POST).

Here are some examples:
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py
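
For readers on Python 3, the standard library's xmlrpc.server module has
documenting variants of the XML-RPC server classes. The sketch below uses
the CGI flavour, DocCGIXMLRPCRequestHandler, only because it needs no
socket; it is not the code attached to this patch, and the add function
and the title string are made up for illustration.

```python
from xmlrpc.server import DocCGIXMLRPCRequestHandler


def add(a, b):
    """Return the sum of a and b."""
    return a + b


handler = DocCGIXMLRPCRequestHandler()
handler.set_server_title("Example RPC service")
handler.register_function(add)

# The page that would be served in response to an HTTP GET request.
html_page = handler.generate_html_documentation()
print(len(html_page))
```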


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 16:59

Message:
Logged In: YES 
user_id=21627

I'm not sure how to place this. Is this an extension to
pydoc? Should it go into Tools, or into Lib, or into some
existing module?

If this goes into Lib somewhere, it lacks documentation.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-02-10 21:25

Message:
Logged In: YES 
user_id=108973

Patch 473586 has been accepted so this patch can be 
accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 21:26

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-04 19:31

Message:
Logged In: YES 
user_id=6380

Looks cute to me. Fredrik, any problem if I just check this
in?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 16:34:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 08:34:17 -0800
Subject: [Patches] [ python-Patches-701743 ] Reloading pseudo modules
Message-ID: <E18zflB-0002xg-00@sc8-sf-web2.sourceforge.net>

Patches item #701743, was opened at 2003-03-11 19:59
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: Reloading pseudo modules

Initial Comment:
Python allows putting something that is not a module in
sys.modules. Unfortunately reload() does not work with
such a pseudo module ("TypeError: reload() argument
must be module" is raised). This patch changes
Python/import.c::PyImport_ReloadModule() so that it
works with anything that has a __name__ attribute that
can be found in sys.modules.keys().

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 18:34

Message:
Logged In: YES 
user_id=21627

The patch looks fine now as far as it goes. I'm unsure what
the use case is, though: What object do you have in
sys.modules for which reload() would be meaningful? Can you
attach an example where reloading fails now but succeeds
with your patch applied?

As for reload modifying the module object: It needs to, or
else all clients would have to run reload; this would
include things like function default arguments. I guess it
returns a result for historical reasons.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-17 15:25

Message:
Logged In: YES 
user_id=89016

PyImport_ReloadModule() is only called by the implementation
of the reload builtin, so it seems that m==NULL can only
happen with broken extension modules. I've updated the patch
accordingly (raising a SystemError) and changed the error
case for a missing __name__ attribute to raise a TypeError
when an AttributeError is detected. Unfortunately this might
mask exceptions (e.g. when __name__ is implemented as a
property.)

Another problem is that reload() seems to repopulate the
existing module object when reloading real modules. Example:
Write a simple foo.py which contains "x = 1" and then:
>>> import foo
>>> foo.x
1
[ Now open your editor and change foo.py to "x = 2" ]
>>> foo2 = reload(foo)
>>> foo.x
2
>>> foo2.x
2
>>> print id(foo), id(foo2)
1077466884 1077466884
>>> 

Of course this can't work with pseudo modules. I wonder why
reload() has a return value at all, as it always modifies
its parameter for real modules.
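
The same experiment can be scripted. This is a sketch for Python 3, where
reload() lives in importlib (an assumption; the session above uses the 2.x
builtin). The module name reload_demo_foo and the temporary directory are
made up for illustration. Bytecode writing is switched off so that the
rewritten source is always recompiled.

```python
import importlib
import os
import sys
import tempfile

# Avoid a cached .pyc masking the edit when both writes land in the same second.
sys.dont_write_bytecode = True

workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "reload_demo_foo.py")
with open(module_path, "w") as fh:
    fh.write("x = 1\n")

sys.path.insert(0, workdir)
importlib.invalidate_caches()
import reload_demo_foo as foo

first_value = foo.x

# Simulate editing the file, then reload the module.
with open(module_path, "w") as fh:
    fh.write("x = 2\n")
foo2 = importlib.reload(foo)

# The existing module object is repopulated and also returned.
print(first_value, foo.x, foo2 is foo)
```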

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-15 14:51

Message:
Logged In: YES 
user_id=21627

I think the exceptions need to be reworked: "must be a
module" now only occurs if m is NULL. Under what
circumstances could that happen? Failure to provide __name__
is passed through; shouldn't this get diagnosed in a better way?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 16:42:40 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 08:42:40 -0800
Subject: [Patches] [ python-Patches-712124 ] Obsolete comment in urlparse.py
Message-ID: <E18zftI-0003QO-00@sc8-sf-web2.sourceforge.net>

Patches item #712124, was opened at 2003-03-30 12:40
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712124&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Steven Taschuk (staschuk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Obsolete comment in urlparse.py

Initial Comment:
urlparse.py contains a comment to the effect
that urljoin('http://foo/bar', '//g') returns 'http://g/',
contrary to the RFC 1808 example, which calls
for 'http://g' (with no trailing slash).

But this is false, and has been since at least 2.2.2;
urljoin correctly returns 'http://g' in this case, as the
test suite in fact verifies.

The patch simply removes this bogus comment.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 18:42

Message:
Logged In: YES 
user_id=21627

Thanks for the patch, committed as 1.40.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712124&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 16:43:57 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 08:43:57 -0800
Subject: [Patches] [ python-Patches-711838 ] urllib2 doesn't support non-anonymous ftp
Message-ID: <E18zfuX-0003U5-00@sc8-sf-web2.sourceforge.net>

Patches item #711838, was opened at 2003-03-29 17:25
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711838&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2 doesn't support non-anonymous ftp

Initial Comment:
urllib2 doesn't support non-anonymous ftp.  Added
support based on how urllib did it.
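
As background, a non-anonymous FTP URL carries the credentials in its
authority part. The sketch below only shows how such a URL splits with
Python 3's urllib.parse (an assumption); it is not the code from this
patch, and the host name and credentials are invented.

```python
from urllib.parse import urlsplit

parts = urlsplit("ftp://alice:secret@ftp.example.com/pub/file.txt")

# The user name and password an FTP handler would need for a login.
print(parts.username, parts.password, parts.hostname, parts.path)
```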

More details about this bug in Red Hat's bugzilla:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=78168
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=80676

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 18:43

Message:
Logged In: YES 
user_id=21627

There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file. In addition, even if you
*did* check this checkbox, a bug in SourceForge
prevents attaching a file when *creating* an issue.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711838&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 16:49:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 08:49:15 -0800
Subject: [Patches] [ python-Patches-711835 ] Removing unnecessary lock operations
Message-ID: <E18zfzf-0003hh-00@sc8-sf-web2.sourceforge.net>

Patches item #711835, was opened at 2003-03-29 17:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
>Assigned to: Tim Peters (tim_one)
Summary: Removing unnecessary lock operations

Initial Comment:
PyThread_acquire_lock can be further optimized to do
less locking on the global lock mutex.

Original patch location:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=86281

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 18:49

Message:
Logged In: YES 
user_id=21627

This looks reasonable to me, but I may be missing something.

Tim, can you see a problem with that code?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 16:51:57 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 08:51:57 -0800
Subject: [Patches] [ python-Patches-708604 ] unchecked return values - compile.c
Message-ID: <E18zg2H-0003mk-00@sc8-sf-web2.sourceforge.net>

Patches item #708604, was opened at 2003-03-24 04:01
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Harper (jasonharper)
>Assigned to: Martin v. Löwis (loewis)
Summary: unchecked return values - compile.c

Initial Comment:
Various cleanups in Python/compile.c - mainly 
unchecked return values.  Also an unchecked memory 
allocation in PyList_SetSlice that's called by compile.c.

----------------------------------------------------------------------

Comment By: Jason Harper (jasonharper)
Date: 2003-03-24 04:19

Message:
Logged In: YES 
user_id=392021

aaarrrrggghhh.... SF isn't letting me attach the files, clicking 
Submit simply clears the entered filename???  Will try later 
from another system.

----------------------------------------------------------------------

Comment By: Jason Harper (jasonharper)
Date: 2003-03-24 04:05

Message:
Logged In: YES 
user_id=392021

 
----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=708604&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 16:55:10 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 08:55:10 -0800
Subject: [Patches] [ python-Patches-701395 ] Wrong prototype for PyUnicode_Splitlines on documentation
Message-ID: <E18zg5O-0003wa-00@sc8-sf-web2.sourceforge.net>

Patches item #701395, was opened at 2003-03-11 09:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701395&group_id=5470

Category: Documentation
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Hye-Shik Chang (perky)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Wrong prototype for PyUnicode_Splitlines on documentation

Initial Comment:
A mismatch of prototype and description between 
documentation and implementation. 
 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 18:55

Message:
Logged In: YES 
user_id=21627

Thanks for the patch. Applied as concrete.tex 1.22.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-11 10:18

Message:
Logged In: YES 
user_id=38388

Looks good. Assigned to Fred.

Thanks.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701395&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 16:56:34 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 08:56:34 -0800
Subject: [Patches] [ python-Patches-684981 ] fix for bug 501716
Message-ID: <E18zg6k-00040A-00@sc8-sf-web2.sourceforge.net>

Patches item #684981, was opened at 2003-02-12 00:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=684981&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Michael Stone (mbrierst)
>Assigned to: Martin v. Löwis (loewis)
Summary: fix for bug 501716

Initial Comment:

Fixes bug described there:
"es#" parser marker leaks memory

Also fixes two other minor leaks involving
strings with encoded NULL's and when
a bad buffer_len pointer is passed to
PyArg_Parse...

This is a nicer version of the patch I pasted
into the comments on the 501716 bug report.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=684981&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 17:15:37 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 09:15:37 -0800
Subject: [Patches] [ python-Patches-695250 ] fix for bug 672614 :)
Message-ID: <E18zgPB-0004hu-00@sc8-sf-web2.sourceforge.net>

Patches item #695250, was opened at 2003-02-28 20:48
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695250&group_id=5470

Category: Core (C code)
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Michael Stone (mbrierst)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix for bug 672614 :)

Initial Comment:

python -S shouldn't show the COPYRIGHT
string, as the commands it lists are not available.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 19:15

Message:
Logged In: YES 
user_id=21627

Thanks for the patch. Applied (in modified form) as main.c
1.76 and 1.61.6.3.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=695250&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 17:24:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 09:24:17 -0800
Subject: [Patches] [ python-Patches-672053 ] Py_Main() removal of exit() calls. Return value instead
Message-ID: <E18zgXZ-00051v-00@sc8-sf-web2.sourceforge.net>

Patches item #672053, was opened at 2003-01-21 22:11
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=672053&group_id=5470

Category: Modules
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Douglas Napoleone (derivin)
Assigned to: Nobody/Anonymous (nobody)
Summary: Py_Main() removal of exit() calls. Return value instead

Initial Comment:
Py_Main() does not perform to spec.
The C/API documentation notes that the function will 
return a value of 2 for improper command-line values.
Instead it calls exit().

Calling exit() in general is bad. The caller should be the 
one to call exit() or return from main() with the supplied 
exit code.

This is particularly troublesome when there are final 
cleanup calls that need to be made before terminating 
the program and static destruction is not an option.

The patch just replaces the exit calls with a return.
Calls to usage() have their return value returned.

Very straightforward.
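
The same principle can be illustrated in Python rather than C. This is
only an analogy, not the patch to main.c: run() and main() are
hypothetical names, and the option check is deliberately simplistic. The
point is that the inner function returns a status, so the caller still
gets a chance to clean up before deciding how to exit.

```python
import sys


def run(argv):
    """Return an exit status instead of terminating the process."""
    if len(argv) > 1 and argv[1].startswith("-") and argv[1] != "-h":
        print("unknown option: " + argv[1], file=sys.stderr)
        return 2  # improper command-line value
    return 0


def main(argv):
    cleanup_log = []
    status = run(argv)
    # Cleanup still happens because run() did not exit on its own.
    cleanup_log.append("closed resources")
    return status, cleanup_log


print(main(["prog", "--bogus"]))
```

A real program would finish with sys.exit(status) in the outermost caller
only.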


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 19:24

Message:
Logged In: YES 
user_id=21627

Thanks for the patch. Applied as main.c 1.77.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=672053&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 17:25:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 09:25:55 -0800
Subject: [Patches] [ python-Patches-662464 ] 659188: no docs for HTMLParser
Message-ID: <E18zgZ9-00056e-00@sc8-sf-web2.sourceforge.net>

Patches item #662464, was opened at 2003-01-05 05:10
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662464&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christopher Blunck (blunck2)
Assigned to: Nobody/Anonymous (nobody)
Summary: 659188: no docs for HTMLParser

Initial Comment:
Added some high level docs to explain how to use the class.
Provided docstrings for the handle_* callback methods.
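
A short usage sketch of the handle_* callbacks, written against
html.parser in Python 3 (an assumption; in 2.3 the module is named
HTMLParser). EventCollector is a made-up subclass that records each
callback as it fires.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Record every callback the parser makes, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


collector = EventCollector()
collector.feed('<a href="x">hi</a>')
collector.close()
print(collector.events)
```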

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 19:25

Message:
Logged In: YES 
user_id=21627

Christopher, can you please indicate whether you are going
to provide a patch for the primary source of the
documentation, i.e. the TeX files?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-15 13:20

Message:
Logged In: YES 
user_id=21627

Can you please provide a patch for the TeX documentation
(Doc/lib/libhtmlparser.tex) as well? I think this is where
the submitter of bug 659188 was looking.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662464&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 17:38:16 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 09:38:16 -0800
Subject: [Patches] [ python-Patches-650412 ] posixfy some things
Message-ID: <E18zgl6-0005US-00@sc8-sf-web2.sourceforge.net>

Patches item #650412, was opened at 2002-12-08 13:48
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=650412&group_id=5470

Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Marc Recht (marc)
Assigned to: Nobody/Anonymous (nobody)
Summary: posixfy some things

Initial Comment:
Add special check for flock, since it isn't a POSIX function. This avoids an implicit declaration on FreeBSD 5. (It's present in the libc, but undefined because of POSIX_C_SOURCE.)
Add a new check for getpagesize. It isn't a POSIX function either and needs the same treatment as flock.
Changed resource.c so it uses getpagesize only if it's available. Otherwise it tries to use sysconf. If neither of the two is available, it returns 0.
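
The fallback order described for the page size can be sketched at the
Python level. This mirrors the idea only and is not the C code from the
patch; page_size() is a hypothetical helper built on
resource.getpagesize() and os.sysconf().

```python
import os


def page_size():
    """Prefer getpagesize, fall back to sysconf, else report 0."""
    try:
        import resource  # not available on every platform
        return resource.getpagesize()
    except (ImportError, AttributeError):
        pass
    try:
        return os.sysconf("SC_PAGESIZE")
    except (AttributeError, ValueError, OSError):
        # Neither interface is available.
        return 0


print(page_size())
```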

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 19:38

Message:
Logged In: YES 
user_id=21627

Thanks for the patch. Applied as

configure 1.388
configure.in 1.399
pyconfig.h.in 1.76
resource.c 2.30


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-12 14:27

Message:
Logged In: YES 
user_id=21627

Please check the checkbox.

----------------------------------------------------------------------

Comment By: Marc Recht (marc)
Date: 2002-12-12 14:07

Message:
Logged In: YES 
user_id=205

- changed to elif
- single patch (-p0)

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-08 18:23

Message:
Logged In: YES 
user_id=21627

Does the resulting resource.c actually compile? It seems to 
be missing an #endif. Please use #elif instead.

Please provide a single patch file, which can be applied with 
patch -p0.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=650412&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 17:41:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 09:41:55 -0800
Subject: [Patches] [ python-Patches-706590 ] Adds Mock Object support to unittest.TestCase
Message-ID: <E18zgod-00052T-00@sc8-sf-web3.sourceforge.net>

Patches item #706590, was opened at 2003-03-19 23:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Out of Date
Priority: 5
Submitted By: Matthew Russell (mattruss)
Assigned to: Nobody/Anonymous (nobody)
Summary: Adds Mock Object support to unittest.TestCase

Initial Comment:
Mock objects can greatly improve unit tests (if used in 
the correct context), especially for code that relies upon 
resource-hungry tests (connections to databases, socket 
servers etc).

The module/patch (to unittest) which I am submitting 
helps to introspect calls to code whilst maintaining 
transparency and functionality with your code.

I had previously written a similar module for my present 
employers, and my fellow XP partners and I agree 
that it has made the XP testing cycle considerably 
easier.  Having googled for alternatives on the web, I 
have not found a solution that provides the same level of 
flexibility. (hope that doesn't sound arrogant)

The tests for this module should highlight usage, but I 
will supply dummy code if this idea is accepted.

If unfamiliar with XP/MockObject ideas, please see:
http://www.xprogramming.com/xpmag/virtualMockObjects.htm#N78
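
To make the idea concrete, here is a sketch using unittest.mock from
Python 3. That module is a different, later implementation and not the
code submitted with this patch. fetch_user_name and the query string are
made up; the mock stands in for a database connection so that no real
resource is needed.

```python
from unittest import mock


def fetch_user_name(connection, user_id):
    # Code under test: normally talks to a real database.
    row = connection.query("SELECT name FROM users WHERE id = ?", user_id)
    return row["name"]


fake_connection = mock.Mock()
fake_connection.query.return_value = {"name": "alice"}

result = fetch_user_name(fake_connection, 42)

# Introspect the call that the code under test made.
fake_connection.query.assert_called_once_with(
    "SELECT name FROM users WHERE id = ?", 42
)
print(result)
```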


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 19:41

Message:
Logged In: YES 
user_id=21627

This is now in feature request #708125.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706590&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 18:04:24 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 10:04:24 -0800
Subject: [Patches] [ python-Patches-662464 ] 659188: no docs for HTMLParser
Message-ID: <E18zhAO-0006R1-00@sc8-sf-web2.sourceforge.net>

Patches item #662464, was opened at 2003-01-04 23:10
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662464&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christopher Blunck (blunck2)
Assigned to: Nobody/Anonymous (nobody)
Summary: 659188: no docs for HTMLParser

Initial Comment:
Added some high level docs to explain how to use the class.
Provided docstrings for the handle_* callback methods.

----------------------------------------------------------------------

>Comment By: Christopher Blunck (blunck2)
Date: 2003-03-30 13:04

Message:
Logged In: YES 
user_id=531881

Sure.  I'll patch and post it later on today.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 12:25

Message:
Logged In: YES 
user_id=21627

Christopher, can you please indicate whether you are going
to provide a patch for the primary source of the
documentation, i.e. the TeX files?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-15 07:20

Message:
Logged In: YES 
user_id=21627

Can you please provide a patch for the TeX documentation
(Doc/lib/libhtmlparser.tex) as well? I think this is where
the submitter of bug 659188 was looking.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662464&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 19:02:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 11:02:39 -0800
Subject: [Patches] [ python-Patches-711722 ] Cache lookup of __builtins__
Message-ID: <E18zi4l-0007hW-00@sc8-sf-web3.sourceforge.net>

Patches item #711722, was opened at 2003-03-29 01:47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cache lookup of __builtins__

Initial Comment:
Rather than performing a bytecode optimization of 
LOAD_GLOBAL, this patch takes the alternative approach of 
caching the lookup of builtins.

To be safe, it checks the cache only after trying a 
lookup in globals().  I can think of only one way to 
break this approach:  run the function accessing a 
builtin, then poke a new value into the builtins 
module, and then re-run the function:

def f(x):
    return oct(x)
print f(20)
__builtins__.oct = hex
print f(20)  # doesn't notice new def of oct()

This gives about a 2% speed-up to average programs, 
0% to programs that don't use builtins, and higher 
percentages to those with heavier use of builtins.  The 
speedup is limited by 1) having to still check globals 
and 2) the relative insignificance of builtin access time 
in most programs.  Still, it pretty much solves the 
problem of access time for builtins.
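
The lookup order and the one failure case described above can be
modelled in a few lines of Python.  This is only a sketch under
assumptions: CachingLookup, global_ns and builtin_ns are hypothetical
names, plain dicts stand in for the real namespaces, and the actual
patch does its work in C inside the interpreter.

```python
class CachingLookup:
    """Toy model: globals are always consulted; builtin hits are cached."""

    def __init__(self, global_ns, builtin_ns):
        self.global_ns = global_ns
        self.builtin_ns = builtin_ns
        self._cache = {}  # name -> builtin value seen on first lookup

    def load(self, name):
        # Globals are checked on every call, so normal shadowing still works.
        if name in self.global_ns:
            return self.global_ns[name]
        # Only the builtins lookup is cached.
        if name not in self._cache:
            self._cache[name] = self.builtin_ns[name]
        return self._cache[name]


builtin_ns = {"oct": oct}
global_ns = {}
lookup = CachingLookup(global_ns, builtin_ns)

first = lookup.load("oct")(20)
builtin_ns["oct"] = hex            # poke a new value into "builtins"
stale = lookup.load("oct")(20)     # the cache still hands back the old oct
global_ns["oct"] = hex             # ordinary shadowing through globals
shadowed = lookup.load("oct")(20)  # globals win, exactly as before
```

In this model the second call does not notice the replaced builtin,
while shadowing through globals keeps working.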


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-30 14:02

Message:
Logged In: YES 
user_id=6380

That prohibition isn't agreed yet, and would be new. Since
this *is* a change in existing semantics and rule, there
would have to be a period where the old semantics were
maintained but a warning was given about violating the new
rule. Your patch doesn't do any of that.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 12:11

Message:
Logged In: YES 
user_id=80475

Arghh, I don't see what the problem is.  The co_names 
cache variable is private and not part of the public 
interface for code objects.  The only way to see a change in 
behavior is for a program to violate the prohibition of 
sticking a name in another module's globals that affects a 
builtin (and, even then, it would have to occur between 
calls to the function).  Normal shadowing (using globals) 
would continue to work just fine.

While it gives only a minor timing gain, the big win would 
be removing the incentive to create python code like this:
  def f(x, y, int=int, True=True, chr=chr):
      . . .
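
For readers who have not met the idiom, here is a small illustration
of why people write it.  It is a sketch in present-day Python, not
part of the patch; slow and fast are hypothetical names.  Binding a
builtin as a default argument turns it into a local variable, so the
function body no longer looks the name up in globals and builtins.

```python
def slow(x):
    # 'int' is resolved through globals, then builtins, on every call.
    return int(x)


def fast(x, int=int):
    # 'int' is bound once at definition time and is a local inside the body.
    return int(x)


global_names_slow = slow.__code__.co_names     # names looked up globally
local_names_fast = fast.__code__.co_varnames   # names stored as locals
global_names_fast = fast.__code__.co_names
```

The price is a cluttered signature, and a caller can accidentally
override the extra arguments.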


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-29 06:37

Message:
Logged In: YES 
user_id=6380

-1. It changes semantics in an ad-hoc way.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 19:42:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 11:42:31 -0800
Subject: [Patches] [ python-Patches-659834 ] Check for readline 2.2 features
Message-ID: <E18zihL-00031f-00@sc8-sf-web4.sourceforge.net>

Patches item #659834, was opened at 2002-12-29 20:22
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=659834&group_id=5470

Category: Build
Group: Python 2.2.x
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Magnus Lie Hetland (mlh)
Assigned to: Neal Norwitz (nnorwitz)
Summary: Check for readline 2.2 features

Initial Comment:
This patch adds a snippet to configure.in to check whether
rl_completion_append_character (which is used in Python 2.3)
is available.

rl_prep_terminal is assumed to co-exist with
rl_completion_append_character.

It is assumed that HAVE_RL_COMPLETION_APPEND_CHARACTER will be
used in readline.c to make it compatible with older versions of
the readline library.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-30 14:42

Message:
Logged In: YES 
user_id=33168

Hmmm, I didn't realize I had to add
HAVE_RL_COMPLETION_APPEND_CHARACTER manually.

Checked in as: pyconfig.h.in 1.20.8.3


----------------------------------------------------------------------

Comment By: Magnus Lie Hetland (mlh)
Date: 2003-03-29 20:13

Message:
Logged In: YES 
user_id=20535

I've now tested it with 2.2.3 (using the 2.2 maintenance branch,
which had the revision numbers you cited) and it works nicely.
That is, my old readline (readline 2.2, I think, although I
couldn't find the version number this time around -- at least it
doesn't have the completer character functionality) works. There
is one thing I find a bit odd, though... With the 2.3 version of
this check, the following ends up in pyconfig.h:

/* Define if you have readline 2.2 */
/* #undef HAVE_RL_COMPLETION_APPEND_CHARACTER */

However, it isn't there when I use the 2.2 branch version. I guess
it shouldn't matter either way (it's commented out anyway), but it
seems that the two versions behave differently... But since it all
works, it's a bit hard to find out what's "wrong", if anything...

Anyway, the (tentative) verdict from me is that it works.

And just a final note: This check is really sort of a "band aid"
solution, since the behaviour of the completer will differ, based
on which readline version you have. Making the default the same
for readline 2.2 and readline 4.* and making it configurable from
Python for the newer versions might be better... Although
possibly not important enough to warrant the work.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-29 17:40

Message:
Logged In: YES 
user_id=33168

Magnus, it would be great if you could test 2.2.3 from CVS
too.  I have checked in a change that builds and works with
newer versions of readline.  I don't have readline v2.2.

Checked in as:
 * configure 1.279.6.19
 * configure.in 1.288.6.19
 * Modules/readline.c 2.41.6.7

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-12 17:16

Message:
Logged In: YES 
user_id=6380

I need a volunteer to backport this to 2.2 who can run an
older version of autoconf; the autoconf that I have
installed is too new to process the 2.2 configure.in file.
(The 2.3 version is already checked in.)

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-12-30 11:27

Message:
Logged In: YES 
user_id=6380

Checked in. Thanks!

Hm, this should be backported to 2.2.3 too! So I'll keep it
open.

----------------------------------------------------------------------

Comment By: Magnus Lie Hetland (mlh)
Date: 2002-12-29 21:11

Message:
Logged In: YES 
user_id=20535

New patch for configure.in (added a comment) and a patch for
readline.c that uses HAVE_RL_COMPLETION_APPEND_CHARACTER.

Tested on Gentoo Linux with new readline (the new completion
behaviour was preserved) and on Solaris with old readline (now
compiles, with old completion behaviour in place).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=659834&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 20:16:56 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 12:16:56 -0800
Subject: [Patches] [ python-Patches-712317 ] Bug fix 548176: urlparse('http://foo?blah') errs
Message-ID: <E18zjEe-0001SV-00@sc8-sf-web3.sourceforge.net>

Patches item #712317, was opened at 2003-03-30 13:16
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712317&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Steven Taschuk (staschuk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bug fix 548176: urlparse('http://foo?blah') errs

Initial Comment:
For detailed description of the problem, see
    http://www.python.org/sf/548176
In summary, URLs such as
    http://www.example.com?query=spam
are misparsed by urlparse.urlparse, which decides that 
everything after the '//' is the host name.  This is contrary to 
RFC 2396 and probably contrary to the intent of RFC 1738.

The patch corrects the problem, adds a test to expose it, 
and rearranges some of the tests to better exercise the 
code in question.
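
For reference, the split the report asks for looks like this.  The
sketch uses urllib.parse from present-day Python 3, the successor of
the urlparse module, and says nothing about how the patch itself is
written.

```python
from urllib.parse import urlsplit

# A URL with a query but no path: the '?' must end the host name.
parts = urlsplit("http://www.example.com?query=spam")

host = parts.netloc
path = parts.path
query = parts.query
```

The host name stops at the '?', the path is empty, and the query is
reported on its own.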

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712317&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 20:45:04 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 12:45:04 -0800
Subject: [Patches] [ python-Patches-710576 ] Backport to 2.2.2 of codec registry fix
Message-ID: <E18zjfs-0002US-00@sc8-sf-web3.sourceforge.net>

Patches item #710576, was opened at 2003-03-27 03:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Geert Jansen (geertj)
>Assigned to: Martin v. Löwis (loewis)
Summary: Backport to 2.2.2 of codec registry fix

Initial Comment:
Hi, 
 
attached is a backport to Python 2.2.2 of the patch that 
fixes bug: 
 
  #663074: codec registry and Python embedding problem 
 
which is discussed here: 
 
http://sourceforge.net/tracker/index.php?func=detail&aid=663074&group_id=5470&atid=105470 
 
If there is going to be a Python 2.2.3 release, I suggest this 
patch be applied. Currently, mod_python programs cannot use 
encodings, because mod_python is one of the (few?) 
programs that uses multiple subinterpreters. 
 
About the patch: it is a backport of Gustavo Niemeyer's 
patch for 2.3 CVS. I had to adapt it a little bit because in 
2.2 there is no codec error registry. 
 
Greetings, 
Geert Jansen 

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-30 15:45

Message:
Logged In: YES 
user_id=6380

(c) is okay with me. Since PyInterpreterState is always
allocated by the Python core, I can't see how this could
possibly break something.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 09:16

Message:
Logged In: YES 
user_id=21627

This patch breaks binary compatibility, as it changes the
layout of PyInterpreterState. We could reduce the risk of
breakage by moving the new members to the end of the struct.

Assigning to Guido for pronouncement: Should this
a) be rejected?
b) be accepted as is? (arguing that nobody uses the
interpreter state, anyway)
c) accepted with the proposed change (i.e.
sizeof(PyInterpreterState) still changes, but the offset of
the existing members doesn't).
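
The difference between options (b) and (c) can be illustrated with
ctypes.  This is a sketch only: the three structs and their field
names (next, modules, codec_data) are hypothetical stand-ins, not the
real PyInterpreterState layout.

```python
import ctypes


class OldState(ctypes.Structure):
    # Hypothetical stand-in for the layout that existing binaries expect.
    _fields_ = [("next", ctypes.c_void_p), ("modules", ctypes.c_void_p)]


class InsertedFirst(ctypes.Structure):
    # New member placed before the existing ones: their offsets shift.
    _fields_ = [("codec_data", ctypes.c_void_p),
                ("next", ctypes.c_void_p), ("modules", ctypes.c_void_p)]


class AppendedLast(ctypes.Structure):
    # New member placed at the end: existing offsets stay where they were.
    _fields_ = [("next", ctypes.c_void_p), ("modules", ctypes.c_void_p),
                ("codec_data", ctypes.c_void_p)]


def offsets(struct, names=("next", "modules")):
    """Map each existing member name to its byte offset in *struct*."""
    return {name: getattr(struct, name).offset for name in names}
```

With the member appended, code compiled against the old layout still
finds next and modules at the same offsets; only the size of the
struct changes.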

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-28 03:40

Message:
Logged In: YES 
user_id=38388

Looks ok.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 19:00

Message:
Logged In: YES 
user_id=21627

Marc-Andre, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

Comment By: Geert Jansen (geertj)
Date: 2003-03-27 03:25

Message:
Logged In: YES 
user_id=537938

Here is the patch. It is tested and verified to fix the problem by 
two people. I also verified that it passes the test suite. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 20:47:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 12:47:17 -0800
Subject: [Patches] [ python-Patches-706338 ] Fix a few broken links in pydoc
Message-ID: <E18zji1-0002aK-00@sc8-sf-web3.sourceforge.net>

Patches item #706338, was opened at 2003-03-19 10:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706338&group_id=5470

Category: Documentation
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Greg Chapman (glchapman)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: Fix a few broken links in pydoc

Initial Comment:
Patch to fix a few of the help file references in 
pydoc.Helper.  I'm not sure what was originally 
in 'ref/execframe' (which does not exist in the 2.3 
documentation set), but, since 'ref/naming' seems the 
best file for NAMESPACES, I changed both references 
from 'ref/execframe' to 'ref/naming'.


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-30 15:47

Message:
Logged In: YES 
user_id=33168

Thanks!

Checked in as:
  Lib/pydoc.py 1.81 and 1.56.8.9

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=706338&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:12:10 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:12:10 -0800
Subject: [Patches] [ python-Patches-710576 ] Backport to 2.2.2 of codec registry fix
Message-ID: <E18zk66-0004oc-00@sc8-sf-web2.sourceforge.net>

Patches item #710576, was opened at 2003-03-27 09:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470

Category: Core (C code)
Group: Python 2.2.x
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Geert Jansen (geertj)
Assigned to: Martin v. Löwis (loewis)
Summary: Backport to 2.2.2 of codec registry fix

Initial Comment:
Hi, 
 
attached is a backport to Python 2.2.2 of the patch that 
fixes bug: 
 
  #663074: codec registry and Python embedding problem 
 
which is discussed here: 
 
http://sourceforge.net/tracker/index.php?func=detail&aid=663074&group_id=5470&atid=105470 
 
If there is going to be a Python 2.2.3 release, I suggest this 
patch be applied. Currently, mod_python programs cannot use 
encodings, because mod_python is one of the (few?) 
programs that uses multiple subinterpreters. 
 
About the patch: it is a backport of Gustavo Niemeyer's 
patch for 2.3 CVS. I had to adapt it a little bit because in 
2.2 there is no codec error registry. 
 
Greetings, 
Geert Jansen 

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:12

Message:
Logged In: YES 
user_id=21627

Thanks for the patch. Committed (with changes) as

pystate.h 2.18.16.3
NEWS 1.337.2.4.2.69
codecs.c 2.13.26.3
pystate.c 2.20.16.3
pythonrun.c 2.153.6.5


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-30 22:45

Message:
Logged In: YES 
user_id=6380

(c) is okay with me. Since PyInterpreterState is always
allocated by the Python core, I can't see how this could
possibly break something.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 15:16

Message:
Logged In: YES 
user_id=21627

This patch breaks binary compatibility, as it changes the
layout of PyInterpreterState. We could reduce the risk of
breakage by moving the new members to the end of the struct.

Assigning to Guido for pronouncement: Should this
a) be rejected?
b) be accepted as is? (arguing that nobody uses the
interpreter state, anyway)
c) accepted with the proposed change (i.e.
sizeof(PyInterpreterState) still changes, but the offset of
the existing members doesn't).

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2003-03-28 09:40

Message:
Logged In: YES 
user_id=38388

Looks ok.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:00

Message:
Logged In: YES 
user_id=21627

Marc-Andre, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

Comment By: Geert Jansen (geertj)
Date: 2003-03-27 09:25

Message:
Logged In: YES 
user_id=537938

Here is the patch. It is tested and verified to fix the problem by 
two people. I also verified that it passes the test suite. 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710576&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:15:04 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:15:04 -0800
Subject: [Patches] [ python-Patches-658316 ] skips.txt for regrtest.py
Message-ID: <E18zk8u-0004wL-00@sc8-sf-web2.sourceforge.net>

Patches item #658316, was opened at 2002-12-24 21:03
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658316&group_id=5470

Category: Tests
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: skips.txt for regrtest.py

Initial Comment:
As I promised on python-dev, here is the functionality
to have a skips.txt file for regrtest.py.  If the file
is present in the current directory it is parsed (using
the exact same code as used for the -f option for
regrtest; good, old copy-n-paste) and all tests are
added to the expected skip set.

And as commented in the file, the file is
so named after Skip Montanaro because he is too shy.
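
The parsing step described above might look roughly like this.  It
is a sketch under assumptions, not the patch itself: read_skips is a
hypothetical name, and the format (one test name per line, '#' for
comments, only the first word of a line counts) is my reading of how
the -f option treats its file.

```python
import os
import tempfile


def read_skips(path="skips.txt"):
    """Return the set of test names listed in *path*, or an empty set."""
    skips = set()
    if not os.path.exists(path):
        return skips
    with open(path) as fp:
        for line in fp:
            words = line.split()
            # Ignore blank lines and comments; keep only the first word.
            if words and not words[0].startswith("#"):
                skips.add(words[0])
    return skips


# Small demonstration with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "skips.txt")
    with open(demo_path, "w") as fp:
        fp.write("# expected skips\ntest_socket\n\ntest_bsddb slow here\n")
    expected_skips = read_skips(demo_path)
    missing = read_skips(os.path.join(tmp, "absent.txt"))
```

A missing file simply yields an empty set, so the default behaviour
is unchanged when no skips.txt is present.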

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:15

Message:
Logged In: YES 
user_id=21627

Raymond, any further comments?

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2002-12-30 02:37

Message:
Logged In: YES 
user_id=357491

Oops.  =)

New diff includes a paragraph at the end of the module
documentation that mentions how to use the new functionality.

Please delete the old diff.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-12-29 06:09

Message:
Logged In: YES 
user_id=80475

The patch looks good.
Now, it needs documentation.

----------------------------------------------------------------------

Comment By: Brett Cannon (bcannon)
Date: 2002-12-26 22:04

Message:
Logged In: YES 
user_id=357491

Sorry about that!  I could have sworn I checked the box.  I
have uploaded enough files here you would think it would be
habitual by now.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2002-12-26 19:10

Message:
Logged In: YES 
user_id=33168

There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=658316&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:16:42 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:16:42 -0800
Subject: [Patches] [ python-Patches-649997 ] Complementary patch for OpenVMS
Message-ID: <E18zkAU-0004zg-00@sc8-sf-web2.sourceforge.net>

Patches item #649997, was opened at 2002-12-07 12:45
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=649997&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Piéronne Jean-François (pieronne)
>Assigned to: Martin v. Löwis (loewis)
Summary: Complementary patch for OpenVMS

Initial Comment:
Hi,

I have attached the complementary patch for OpenVMS.

As with the previous one, all the updates use conditional compilation for VMS, except in two places:
in socketmodule.c there are two updates which test ENABLE_IPV6 but not __VMS, because I think there 
was a bug in the initial code: it uses "sockaddr_storage", which is, if I remember correctly, an 
IPv6 structure.


Regards,


Jean-François Piéronne


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:16

Message:
Logged In: YES 
user_id=21627

Am I correct in assuming that this patch has been superseded
now by 708495?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-08 11:34

Message:
Logged In: YES 
user_id=21627

I have a number of questions:
- What is RMS? (probably not Richard M Stallman :-)
  RMSError is not used, so it should not be included in the 
patch.

- Why do you need to omit the argument for F_GETFD?

- Why do you cast the ioctl argument to void*? In POSIX, this 
argument is of type int.

- What is the third argument to getcwd?

- Don't use nested #ifs, use #elif instead where appropriate.

- 


----------------------------------------------------------------------

Comment By: Piéronne Jean-François (pieronne)
Date: 2002-12-08 11:00

Message:
Logged In: YES 
user_id=414701

Done

Thanks

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-12-08 09:49

Message:
Logged In: YES 
user_id=21627

There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=649997&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:18:38 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:18:38 -0800
Subject: [Patches] [ python-Patches-553171 ] optionally make shelve less surprising
Message-ID: <E18zkCM-00054r-00@sc8-sf-web2.sourceforge.net>

Patches item #553171, was opened at 2002-05-07 10:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Alex Martelli (aleax)
Assigned to: Nobody/Anonymous (nobody)
Summary: optionally make shelve less surprising

Initial Comment:
shelve has highly surprising behavior wrt modifiable
values:
    s = shelve.open('she.dat','c')
    s['ciao'] = range(3)
    s['ciao'].append(4)   # doesn't "TAKE"!

Explaining to beginners that s['ciao'] is returning a
temporary object and the modification is done on the
temporary thus "silently ignored" is hard indeed.  It
also makes shelve far less convenient than it could 
be (whenever modifiable values must be shelved).

Having s keep track of all values it has returned may
perhaps break some existing program (due to extra 
memory consumption and/or to lack of "implicit 
copy"/"snapshot" behavior) so I've made the 'caching' 
change optional and by default off.  However it's now 
at least possible to obtain nonsurprising behavior:
    s = shelve.open('she.dat','c',smart=1)
    s['ciao'] = range(3)
    s['ciao'].append(4)   # no surprises any more

I suspect the 'smart=1' should be made the default, 
but, if we at least put it in now, then perhaps we 
can migrate to having it as the default very slowly 
and gradually.
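
For comparison, the shelve module in present-day Python has a
writeback flag with the same effect as the smart option proposed
here.  The sketch below shows both behaviours; the file lives in a
throwaway temporary directory, and plain and cached are names made
up for the demonstration.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "she")

    # Default mode: every s['ciao'] unpickles a fresh temporary object.
    with shelve.open(path, "c") as s:
        s["ciao"] = list(range(3))
        s["ciao"].append(4)      # changes the temporary only
        plain = s["ciao"]

    # writeback=True keeps accessed entries in a cache and writes
    # them back when the shelf is closed.
    with shelve.open(path, "c", writeback=True) as s:
        s["ciao"].append(4)

    with shelve.open(path, "r") as s:
        cached = s["ciao"]
```

The caching has a cost: every entry that was accessed is held in
memory and rewritten on close, whether or not it changed.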


Alex


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:18

Message:
Logged In: YES 
user_id=21627

Alex, do you still think this should be implemented, in some
form or other?

----------------------------------------------------------------------

Comment By: Holger P. Krekel (dannu)
Date: 2002-05-10 02:47

Message:
Logged In: YES 
user_id=83092

I'd suggest not changing shelve at all but providing 
a "cache-commit" dictionary (ccdict) which can wrap a
shelf-instance (or any other simple dictish instance)
and provides the 'non-surprising' behaviour. 

Some proof of concept code for the following
properties is provided here

http://home.trillke.net/~hpk/ccdict.py

Current properties are:

- ccdict wraps a dictionary-like object which
  in turn only needs to provide
  __getitem__, __setitem__, __delitem__, has_key

- on first access of an element
  ccdict makes a lookup on the underlying
  dict and caches the item.

- the next accesses work with the cached thing.
  Unsurprising dict-semantics are provided.

- deleting an item is deferred and actually happens
  at commit() time. deleting an item and later on
  assigning to it works as expected (i.e. the assignment
  takes precedence).

- commit() transfers the items in the
  cache to the underlying dict and clears
  the cache. Prior to issuing commit
  no writeback to the underlying dict happens.

- deleting a ccdict-instance does *not* commit any
  changes. You have to explicitly call commit().
  If you want to work readonly, don't call commit.

- clear() only clears the cache and not the underlying
  dict

- you can explicitly prune the cache (via cache.keys()
  etc.) before calling commit(). This lets you
  avoid writing back unmodified objects if this
  is an issue.

It seems quite impossible to figure out automagically
which objects have been modified
and so the solution is to do it explicitly
(or don't commit for readonly).
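
The properties listed above can be sketched as follows.  This is a
hypothetical illustration written for present-day Python, not the
code at the URL above: CCDict and CopyingStore are made-up names, and
CopyingStore merely imitates a shelf by handing out copies on every
read.

```python
import copy


class CopyingStore(dict):
    """Stand-in for a shelf: reads return copies, as unpickling would."""

    def __getitem__(self, key):
        return copy.deepcopy(dict.__getitem__(self, key))


class CCDict:
    """Cache-commit wrapper around a dictionary-like object."""

    def __init__(self, underlying):
        self.underlying = underlying
        self.cache = {}       # items read or assigned since the last commit
        self.deleted = set()  # keys whose deletion is deferred

    def __getitem__(self, key):
        if key in self.deleted:
            raise KeyError(key)
        if key not in self.cache:
            # First access: look up the underlying dict and cache the item.
            self.cache[key] = self.underlying[key]
        return self.cache[key]

    def __setitem__(self, key, value):
        # A later assignment takes precedence over a pending delete.
        self.deleted.discard(key)
        self.cache[key] = value

    def __delitem__(self, key):
        self[key]  # raises KeyError if the key does not exist
        del self.cache[key]
        self.deleted.add(key)

    def clear(self):
        # Drops pending changes only; the underlying dict is untouched.
        self.cache.clear()
        self.deleted.clear()

    def commit(self):
        # Apply deferred deletes, then write back every cached item.
        for key in self.deleted:
            try:
                del self.underlying[key]
            except KeyError:
                pass  # the key only ever existed in the cache
        for key, value in self.cache.items():
            self.underlying[key] = value
        self.clear()


store = CopyingStore(ciao=[0, 1, 2])
cc = CCDict(store)
cc["ciao"].append(4)      # modifies the cached object
before = store["ciao"]    # nothing has been written back yet
cc.commit()
after = store["ciao"]     # the change arrives with commit()
```

As the comment says, nothing here detects which objects were actually
modified; commit() writes back everything still in the cache.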

holger

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 22:55

Message:
Logged In: YES 
user_id=80475

A few more thoughts:

Please change the "except:" lines to specify the exception 
being caught.

Also, if GvR shows interest in the patch, we should update 
the library reference and add unittests.

The docstring should also mention that the cache is kept in 
memory -- besides persistence, one of the forces for 
shelving is memory conservation.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 20:43

Message:
Logged In: YES 
user_id=80475

Nicely done!  The code is clean and runs in the smart mode 
without problems on my existing programs. I agree that the 
patch solves a real world problem.  The solution is clean, 
but a little expensive.

If there were a way to be able to tell if an entry had been 
altered, it would save the 100% writeback.  Unfortunately, 
I can't think of a way.

The docstring could read more smoothly and plainly.  Also, 
it should be clear that the cost of setting smart=1 is that 
100% of the entries get rewritten on close.

Two microscopically minor thoughts on the coding (feel free 
to disregard). Can some of the try/except blocks be 
replaced by something akin to 'if self.smart:'?  For the 
writeback loop, consider 'for k,v in cache.iteritems()' as 
it takes less memory and saves a lookup.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-05-07 18:38

Message:
Logged In: YES 
user_id=21627

Even more important than the backwards compatibility might
be the issue that it writes back all accessed objects on
close, which might be expensive if there have been many
read-only accesses.

So I think the option name could be also 'slow'; although
'writeback' might be more technical.

Also, I wonder whether write-back should be attempted if the
shelve was opened read-only.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:26:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:26:55 -0800
Subject: [Patches] [ python-Patches-553171 ] optionally make shelve less surprising
Message-ID: <E18zkKN-0005hp-00@sc8-sf-web1.sourceforge.net>

Patches item #553171, was opened at 2002-05-07 10:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Alex Martelli (aleax)
Assigned to: Nobody/Anonymous (nobody)
Summary: optionally make shelve less surprising

Initial Comment:
shelve has highly surprising behavior wrt modifiable
values:
    s = shelve.open('she.dat','c')
    s['ciao'] = range(3)
    s['ciao'].append(4)   # doesn't "TAKE"!

Explaining to beginners that s['ciao'] is returning a
temporary object and the modification is done on the
temporary thus "silently ignored" is hard indeed.  It
also makes shelve far less convenient than it could 
be (whenever modifiable values must be shelved).

Having s keep track of all values it has returned may
perhaps break some existing program (due to extra 
memory consumption and/or to lack of "implicit 
copy"/"snapshot" behavior) so I've made the 'caching' 
change optional and by default off.  However it's now 
at least possible to obtain nonsurprising behavior:
    s = shelve.open('she.dat','c',smart=1)
    s['ciao'] = range(3)
    s['ciao'].append(4)   # no surprises any more

I suspect the 'smart=1' should be made the default, 
but, if we at least put it in now, then perhaps we 
can migrate to having it as the default very slowly 
and gradually.


Alex


----------------------------------------------------------------------

>Comment By: Alex Martelli (aleax)
Date: 2003-03-30 23:26

Message:
Logged In: YES 
user_id=60314

Yes, Martin, I'm still quite convinced shelve's behavior is
generally surprising and often problematic, and even though
the fixes suggested by dannu and me are each imperfect
(given the impossibility of finding out, in general, whether an object
has been modified), I think one or the other would still be better
than the current situation.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:18

Message:
Logged In: YES 
user_id=21627

Alex, do you still think this should be implemented, in some
form or other?

----------------------------------------------------------------------

Comment By: Holger P. Krekel (dannu)
Date: 2002-05-10 02:47

Message:
Logged In: YES 
user_id=83092

I'd suggest not changing shelve at all but providing 
a "cache-commit" dictionary (ccdict) which can wrap a
shelf-instance (or any other simple dictish instance)
and provides the 'non-surprising' behaviour. 

Some proof of concept code for the following
properties is provided here

http://home.trillke.net/~hpk/ccdict.py

Current properties are:

- ccdict wraps a dictionary-like object which
  in turn only needs to provide
  __getitem__, __setitem__, __delitem__, has_key

- on first access of an element
  ccdict makes a lookup on the underlying
  dict and caches the item.

- the next accesses work with the cached thing.
  Unsurprising dict-semantics are provided.

- deleting an item is deferred and actually happens
  at commit() time. Deleting an item and later on
  assigning to it works as expected (i.e. the assignment
  takes precedence).

- commit() transfers the items in the
  cache to the underlying dict and clears
  the cache. Prior to issuing commit
  no writeback to the underlying dict happens.

- deleting a ccdict-instance does *not* commit any
  changes. You have to explicitly call commit().
  If you want to work readonly, don't call commit.

- clear() only clears the cache and not the underlying
  dict

- you can explicitly prune the cache (via cache.keys()
  etc.) before calling commit(). This lets you
  avoid writing back unmodified objects if this
  is an issue.

It seems quite impossible to figure out automagically
which objects have been modified
and so the solution is to do it explicitly
(or don't commit for readonly).
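
The properties listed above can be sketched roughly as follows. This is
not the code at the URL given earlier; the class name CCDict and the
attribute names backing, cache and deleted are made up for illustration,
and the sketch is written for current Python 3, so it never calls
has_key. It also assumes that clear() should drop pending deletions
along with the cached items.

```python
class CCDict:
    """Hypothetical cache-commit wrapper around a dict-like object."""

    def __init__(self, backing):
        self.backing = backing   # needs __getitem__, __setitem__, __delitem__
        self.cache = {}          # items read or assigned since the last commit
        self.deleted = set()     # keys whose deletion is deferred until commit

    def __getitem__(self, key):
        if key in self.deleted:
            raise KeyError(key)
        if key not in self.cache:
            # First access: look the item up in the underlying dict and cache it.
            self.cache[key] = self.backing[key]
        return self.cache[key]

    def __setitem__(self, key, value):
        # An assignment after a deletion wins over the pending deletion.
        self.deleted.discard(key)
        self.cache[key] = value

    def __delitem__(self, key):
        self[key]                # raises KeyError if the key does not exist
        del self.cache[key]
        self.deleted.add(key)

    def clear(self):
        # Forget uncommitted state only; the underlying dict is untouched.
        self.cache.clear()
        self.deleted.clear()

    def commit(self):
        for key in self.deleted:
            try:
                del self.backing[key]
            except KeyError:
                pass             # the key only ever lived in the cache
        for key, value in self.cache.items():
            self.backing[key] = value
        self.clear()
```

Wrapping an open shelf would then look like cc = CCDict(shelf), followed
by an explicit cc.commit() before the shelf is closed. Pruning entries
from cc.cache before the commit avoids rewriting objects known to be
unmodified.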

holger

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 22:55

Message:
Logged In: YES 
user_id=80475

A few more thoughts:

Please change the "except:" lines to specify the exception 
being caught.

Also, if GvR shows interest in the patch, we should update 
the library reference and add unittests.

The docstring should also mention that the cache is kept in 
memory -- besides persistence, one of the forces for 
shelving is memory conservation.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 20:43

Message:
Logged In: YES 
user_id=80475

Nicely done!  The code is clean and runs in the smart mode 
without problems on my existing programs. I agree that the 
patch solves a real world problem.  The solution is clean, 
but a little expensive.

If there were a way to be able to tell if an entry had been 
altered, it would save the 100% writeback.  Unfortunately, 
I can't think of a way.

The docstring could read more smoothly and plainly.  Also, 
it should be clear that the cost of setting smart=1 is that 
100% of the entries get rewritten on close.

Two microscopically minor thoughts on the coding (feel free 
to disregard). Can some of the try/except blocks be 
replaced by something akin to 'if self.smart:'?  For the 
writeback loop, consider 'for k,v in cache.iteritems()' as 
it takes less memory and saves a lookup.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-05-07 18:38

Message:
Logged In: YES 
user_id=21627

Even more important than the backwards compatibility might
be the issue that it writes back all accessed objects on
close, which might be expensive if there have been many
read-only accesses.

So I think the option name could be also 'slow'; although
'writeback' might be more technical.

Also, I wonder whether write-back should be attempted if the
shelve was opened read-only.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:43:27 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:43:27 -0800
Subject: [Patches] [ python-Patches-553171 ] optionally make shelve less surprising
Message-ID: <E18zkaN-0004XB-00@sc8-sf-web3.sourceforge.net>

Patches item #553171, was opened at 2002-05-07 10:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Alex Martelli (aleax)
Assigned to: Nobody/Anonymous (nobody)
Summary: optionally make shelve less surprising

Initial Comment:
shelve has highly surprising behavior wrt modifiable
values:
    s = shelve.open('she.dat','c')
    s['ciao'] = range(3)
    s['ciao'].append(4)   # doesn't "TAKE"!

Explaining to beginners that s['ciao'] is returning a
temporary object and the modification is done on the
temporary thus "silently ignored" is hard indeed.  It
also makes shelve far less convenient than it could 
be (whenever modifiable values must be shelved).

Having s keep track of all values it has returned may
perhaps break some existing program (due to extra 
memory consumption and/or to lack of "implicit 
copy"/"snapshot" behavior) so I've made the 'caching' 
change optional and by default off.  However it's now 
at least possible to obtain nonsurprising behavior:
    s = shelve.open('she.dat','c',smart=1)
    s['ciao'] = range(3)
    s['ciao'].append(4)   # no surprises any more

I suspect the 'smart=1' should be made the default, 
but, if we at least put it in now, then perhaps we 
can migrate to having it as the default very slowly 
and gradually.
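
The surprise is easy to reproduce. The sketch below is written for
current Python 3 (hence list(range(3)) and the with statement, neither
of which appears in this report) and uses the writeback argument that
shelve.open accepts in current Python, which gives the caching behavior
described above; it is not the smart=1 spelling of this patch. The
function name demo_shelve and the temporary file are made up for
illustration.

```python
import os
import shelve
import tempfile


def demo_shelve():
    """Return the stored list after three different update styles."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'she')

        # Default mode: every s['ciao'] unpickles a fresh temporary object.
        with shelve.open(path, 'c') as s:
            s['ciao'] = list(range(3))
            s['ciao'].append(4)      # mutates the temporary, not the shelf
            plain = s['ciao']

        # Explicit read-modify-write works without any caching.
        with shelve.open(path, 'c') as s:
            tmp = s['ciao']
            tmp.append(4)
            s['ciao'] = tmp
            explicit = s['ciao']

        # Caching mode: returned objects are kept in memory and all of
        # them are written back when the shelf is closed.
        with shelve.open(path, 'c', writeback=True) as s:
            s['ciao'].append(5)
        with shelve.open(path, 'r') as s:
            cached = s['ciao']

    return plain, explicit, cached
```

The cost is the one discussed in the comments: with caching on, every
entry that was accessed is rewritten on close, whether or not it
changed.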


Alex


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:43

Message:
Logged In: YES 
user_id=21627

Would you then be willing to provide a complete patch
(documentation, NEWS entry, test case)?

----------------------------------------------------------------------

Comment By: Alex Martelli (aleax)
Date: 2003-03-30 23:26

Message:
Logged In: YES 
user_id=60314

Yes, Martin, I'm still quite convinced shelve's behavior is
generally surprising and often problematic, and even though
the fixes suggested by both me and dannu are each imperfect
(given the impossibility to find out, in general, whether an object
has been modified), I think one or the other would still be better
than the current situation.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:18

Message:
Logged In: YES 
user_id=21627

Alex, do you still think this should be implemented, in some
form or  other?

----------------------------------------------------------------------

Comment By: Holger P. Krekel (dannu)
Date: 2002-05-10 02:47

Message:
Logged In: YES 
user_id=83092

I'd suggest not changing shelve at all but providing 
a "cache-commit" dictionary (ccdict) which can wrap a
shelf-instance (or any other simple dictish instance)
and provides the 'non-surprising' behaviour. 

Some proof of concept code for the following
properties is provided here

http://home.trillke.net/~hpk/ccdict.py

Current properties are:

- ccdict wraps a dictionary-like object which
  in turn only needs to provide
  __getitem__, __setitem__, __delitem__, has_key

- on first access of an element
  ccdict makes a lookup on the underlying
  dict and caches the item.

- the next accesses work with the cached thing.
  Unsurprising dict-semantics are provided.

- deleting an item is deferred and actually happens
  at commit() time. Deleting an item and later on
  assigning to it works as expected (i.e. the assignment
  takes precedence).

- commit() transfers the items in the
  cache to the underlying dict and clears
  the cache. Prior to issuing commit
  no writeback to the underlying dict happens.

- deleting a ccdict-instance does *not* commit any
  changes. You have to explicitly call commit().
  If you want to work readonly, don't call commit.

- clear() only clears the cache and not the underlying
  dict

- you can explicitly prune the cache (via cache.keys()
  etc.) before calling commit(). This lets you
  avoid writing back unmodified objects if this
  is an issue.

It seems quite impossible to figure out automagically
which objects have been modified
and so the solution is to do it explicitly
(or don't commit for readonly).

holger

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 22:55

Message:
Logged In: YES 
user_id=80475

A few more thoughts:

Please change the "except:" lines to specify the exception 
being caught.

Also, if GvR shows interest in the patch, we should update 
the library reference and add unittests.

The docstring should also mention that the cache is kept in 
memory -- besides persistence, one of the forces for 
shelving is memory conservation.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 20:43

Message:
Logged In: YES 
user_id=80475

Nicely done!  The code is clean and runs in the smart mode 
without problems on my existing programs. I agree that the 
patch solves a real world problem.  The solution is clean, 
but a little expensive.

If there were a way to be able to tell if an entry had been 
altered, it would save the 100% writeback.  Unfortunately, 
I can't think of a way.

The docstring could read more smoothly and plainly.  Also, 
it should be clear that the cost of setting smart=1 is that 
100% of the entries get rewritten on close.

Two microscopically minor thoughts on the coding (feel free 
to disregard). Can some of the try/except blocks be 
replaced by something akin to 'if self.smart:'?  For the 
writeback loop, consider 'for k,v in cache.iteritems()' as 
it takes less memory and saves a lookup.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-05-07 18:38

Message:
Logged In: YES 
user_id=21627

Even more important than the backwards compatibility might
be the issue that it writes back all accessed objects on
close, which might be expensive if there have been many
read-only accesses.

So I think the option name could be also 'slow'; although
'writeback' might be more technical.

Also, I wonder whether write-back should be attempted if the
shelve was opened read-only.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:56:07 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:56:07 -0800
Subject: [Patches] [ python-Patches-553171 ] optionally make shelve less surprising
Message-ID: <E18zkmd-0007Ik-00@sc8-sf-web4.sourceforge.net>

Patches item #553171, was opened at 2002-05-07 03:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Alex Martelli (aleax)
Assigned to: Nobody/Anonymous (nobody)
Summary: optionally make shelve less surprising

Initial Comment:
shelve has highly surprising behavior wrt modifiable
values:
    s = shelve.open('she.dat','c')
    s['ciao'] = range(3)
    s['ciao'].append(4)   # doesn't "TAKE"!

Explaining to beginners that s['ciao'] is returning a
temporary object and the modification is done on the
temporary thus "silently ignored" is hard indeed.  It
also makes shelve far less convenient than it could 
be (whenever modifiable values must be shelved).

Having s keep track of all values it has returned may
perhaps break some existing program (due to extra 
memory consumption and/or to lack of "implicit 
copy"/"snapshot" behavior) so I've made the 'caching' 
change optional and by default off.  However it's now 
at least possible to obtain nonsurprising behavior:
    s = shelve.open('she.dat','c',smart=1)
    s['ciao'] = range(3)
    s['ciao'].append(4)   # no surprises any more

I suspect the 'smart=1' should be made the default, 
but, if we at least put it in now, then perhaps we 
can migrate to having it as the default very slowly 
and gradually.


Alex


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-30 16:56

Message:
Logged In: YES 
user_id=80475

The issue has arisen a couple of times
on comp.lang.python.
I think this patch would be helpful.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 16:43

Message:
Logged In: YES 
user_id=21627

Would you then be willing to provide a complete patch
(documentation, NEWS entry, test case)?

----------------------------------------------------------------------

Comment By: Alex Martelli (aleax)
Date: 2003-03-30 16:26

Message:
Logged In: YES 
user_id=60314

Yes, Martin, I'm still quite convinced shelve's behavior is
generally surprising and often problematic, and even though
the fixes suggested by both me and dannu are each imperfect
(given the impossibility to find out, in general, whether an object
has been modified), I think one or the other would still be better
than the current situation.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 16:18

Message:
Logged In: YES 
user_id=21627

Alex, do you still think this should be implemented, in some
form or  other?

----------------------------------------------------------------------

Comment By: Holger P. Krekel (dannu)
Date: 2002-05-09 19:47

Message:
Logged In: YES 
user_id=83092

I'd suggest not changing shelve at all but providing 
a "cache-commit" dictionary (ccdict) which can wrap a
shelf-instance (or any other simple dictish instance)
and provides the 'non-surprising' behaviour. 

Some proof of concept code for the following
properties is provided here

http://home.trillke.net/~hpk/ccdict.py

Current properties are:

- ccdict wraps a dictionary-like object which
  in turn only needs to provide
  __getitem__, __setitem__, __delitem__, has_key

- on first access of an element
  ccdict makes a lookup on the underlying
  dict and caches the item.

- the next accesses work with the cached thing.
  Unsurprising dict-semantics are provided.

- deleting an item is deferred and actually happens
  at commit() time. Deleting an item and later on
  assigning to it works as expected (i.e. the assignment
  takes precedence).

- commit() transfers the items in the
  cache to the underlying dict and clears
  the cache. Prior to issuing commit
  no writeback to the underlying dict happens.

- deleting a ccdict-instance does *not* commit any
  changes. You have to explicitly call commit().
  If you want to work readonly, don't call commit.

- clear() only clears the cache and not the underlying
  dict

- you can explicitly prune the cache (via cache.keys()
  etc.) before calling commit(). This lets you
  avoid writing back unmodified objects if this
  is an issue.

It seems quite impossible to figure out automagically
which objects have been modified
and so the solution is to do it explicitly
(or don't commit for readonly).

holger

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 15:55

Message:
Logged In: YES 
user_id=80475

A few more thoughts:

Please change the "except:" lines to specify the exception 
being caught.

Also, if GvR shows interest in the patch, we should update 
the library reference and add unittests.

The docstring should also mention that the cache is kept in 
memory -- besides persistence, one of the forces for 
shelving is memory conservation.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 13:43

Message:
Logged In: YES 
user_id=80475

Nicely done!  The code is clean and runs in the smart mode 
without problems on my existing programs. I agree that the 
patch solves a real world problem.  The solution is clean, 
but a little expensive.

If there were a way to be able to tell if an entry had been 
altered, it would save the 100% writeback.  Unfortunately, 
I can't think of a way.

The docstring could read more smoothly and plainly.  Also, 
it should be clear that the cost of setting smart=1 is that 
100% of the entries get rewritten on close.

Two microscopically minor thoughts on the coding (feel free 
to disregard). Can some of the try/except blocks be 
replaced by something akin to 'if self.smart:'?  For the 
writeback loop, consider 'for k,v in cache.iteritems()' as 
it takes less memory and saves a lookup.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-05-07 11:38

Message:
Logged In: YES 
user_id=21627

Even more important than the backwards compatibility might
be the issue that it writes back all accessed objects on
close, which might be expensive if there have been many
read-only accesses.

So I think the option name could be also 'slow'; although
'writeback' might be more technical.

Also, I wonder whether write-back should be attempted if the
shelve was opened read-only.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:59:09 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:59:09 -0800
Subject: [Patches] [ python-Patches-711722 ] Cache lookup of __builtins__
Message-ID: <E18zkpZ-0007PF-00@sc8-sf-web4.sourceforge.net>

Patches item #711722, was opened at 2003-03-29 01:47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cache lookup of __builtins__

Initial Comment:
Rather than performing a bytecode optimization of 
LOAD_GLOBAL, this patch takes the alternative approach of 
caching the lookup of builtins.

To be safe, it checks the cache only after trying a 
lookup in globals().  I can think of only one way to 
break this approach:  run the function accessing a 
builtin, then poke a new value into the builtins 
module, and then re-run the function:

def f(x):
    return oct(x)
print f(20)
__builtins__.oct = hex
print f(20)  # doesn't notice new def of oct()

This gives about a 2% speed-up to average programs, 
0% to programs that don't use builtins, and higher 
percentages to those with heavier use of builtins.  The 
speedup is limited by 1) having to still check globals 
and 2) the relative insignificance of builtin access time 
in most programs.  Still, it pretty much solves the 
problem of access time for builtins.
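
For comparison, the uncached behavior that the example above relies on
can be checked with a short script. This is a sketch for current
Python 3, where the module is imported as builtins and oct() output
carries a 0o prefix; the helper name demo_rebinding is made up for
illustration and is not part of the patch.

```python
import builtins


def f(x):
    # 'oct' is looked up on every call: globals first, then builtins.
    return oct(x)


def demo_rebinding():
    """Call f before and after poking a new oct into builtins."""
    original = builtins.oct
    before = f(20)
    builtins.oct = hex               # poke a new value into builtins
    try:
        after = f(20)                # the uncached lookup notices the change
    finally:
        builtins.oct = original      # always restore the real builtin
    return before, after
```

Without a cache the second call returns the hex string. With the
proposed cache it would keep returning the octal one, which is the
semantic change under discussion.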


----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-30 16:59

Message:
Logged In: YES 
user_id=80475

I see.

Would this patch be acceptable as a -OO option or should 
I drop it?

Also, the same question applies to a tiny patch converting 
  LOAD_GLOBAL "None"   -->   LOAD_CONST Py_None


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-30 14:02

Message:
Logged In: YES 
user_id=6380

That prohibition isn't agreed yet, and would be new. Since
this *is* a change in existing semantics and rule, there
would have to be a period where the old semantics were
maintained but a warning was given about violating the new
rule. Your patch doesn't do any of that.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 12:11

Message:
Logged In: YES 
user_id=80475

Arghh, I don't see what the problem is.  The co_names 
cache variable is private and not part of the public 
interface for code objects.  The only way to see a change in 
behavior is for a program to violate the prohibition of 
sticking a name in another module's globals that affects a 
builtin (and, even then, it would have to occur between 
calls to the function).  Normal shadowing (using globals) 
would continue to work just fine.

While it gives only a minor timing gain, the big win would 
be removing the incentive to create python code like this:
  def f(x, y, int=int, True=True, chr=chr):
      . . .
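
A minimal sketch of what that idiom does, written for current Python 3
(True is a keyword there, so it cannot be rebound as a default and is
left out). The names fast and demo_default_binding are made up for
illustration.

```python
import builtins


def fast(x, oct=oct):
    # 'oct' is now a local name, bound once when the def statement ran.
    return oct(x)


def demo_default_binding():
    """Show that the default keeps the builtin captured at definition time."""
    original = builtins.oct
    builtins.oct = hex               # rebinding the builtin afterwards...
    try:
        return fast(20)              # ...is not seen by the bound default
    finally:
        builtins.oct = original      # always restore the real builtin
```

So the hand-written idiom already freezes the builtin, and does so even
earlier than a per-function cache would.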


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-29 06:37

Message:
Logged In: YES 
user_id=6380

-1. It changes semantics in an ad-hoc way.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 21:59:41 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 13:59:41 -0800
Subject: [Patches] [ python-Patches-553171 ] optionally make shelve less surprising
Message-ID: <E18zkq5-0007Qd-00@sc8-sf-web4.sourceforge.net>

Patches item #553171, was opened at 2002-05-07 10:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Alex Martelli (aleax)
Assigned to: Nobody/Anonymous (nobody)
Summary: optionally make shelve less surprising

Initial Comment:
shelve has highly surprising behavior wrt modifiable
values:
    s = shelve.open('she.dat','c')
    s['ciao'] = range(3)
    s['ciao'].append(4)   # doesn't "TAKE"!

Explaining to beginners that s['ciao'] is returning a
temporary object and the modification is done on the
temporary thus "silently ignored" is hard indeed.  It
also makes shelve far less convenient than it could 
be (whenever modifiable values must be shelved).

Having s keep track of all values it has returned may
perhaps break some existing program (due to extra 
memory consumption and/or to lack of "implicit 
copy"/"snapshot" behavior) so I've made the 'caching' 
change optional and by default off.  However it's now 
at least possible to obtain nonsurprising behavior:
    s = shelve.open('she.dat','c',smart=1)
    s['ciao'] = range(3)
    s['ciao'].append(4)   # no surprises any more

I suspect the 'smart=1' should be made the default, 
but, if we at least put it in now, then perhaps we 
can migrate to having it as the default very slowly 
and gradually.


Alex


----------------------------------------------------------------------

>Comment By: Alex Martelli (aleax)
Date: 2003-03-30 23:59

Message:
Logged In: YES 
user_id=60314

sure, but along what lines -- my previous patch's, or dannu's?  let 
me know, and I'll get to work on it as soon as I'm back from 
Python-UK & short following trip (i.e. around Apr 12)


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-30 23:56

Message:
Logged In: YES 
user_id=80475

The issue has arisen a couple of times
on comp.lang.python.
I think this patch would be helpful.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:43

Message:
Logged In: YES 
user_id=21627

Would you then be willing to provide a complete patch
(documentation, NEWS entry, test case)?

----------------------------------------------------------------------

Comment By: Alex Martelli (aleax)
Date: 2003-03-30 23:26

Message:
Logged In: YES 
user_id=60314

Yes, Martin, I'm still quite convinced shelve's behavior is
generally surprising and often problematic, and even though
the fixes suggested by both me and dannu are each imperfect
(given the impossibility to find out, in general, whether an object
has been modified), I think one or the other would still be better
than the current situation.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:18

Message:
Logged In: YES 
user_id=21627

Alex, do you still think this should be implemented, in some
form or  other?

----------------------------------------------------------------------

Comment By: Holger P. Krekel (dannu)
Date: 2002-05-10 02:47

Message:
Logged In: YES 
user_id=83092

I'd suggest not changing shelve at all but providing 
a "cache-commit" dictionary (ccdict) which can wrap a
shelf-instance (or any other simple dictish instance)
and provides the 'non-surprising' behaviour. 

Some proof of concept code for the following
properties is provided here

http://home.trillke.net/~hpk/ccdict.py

Current properties are:

- ccdict wraps a dictionary-like object which
  in turn only needs to provide
  __getitem__, __setitem__, __delitem__, has_key

- on first access of an element
  ccdict makes a lookup on the underlying
  dict and caches the item.

- the next accesses work with the cached thing.
  Unsurprising dict-semantics are provided.

- deleting an item is deferred and actually happens
  at commit() time. Deleting an item and later on
  assigning to it works as expected (i.e. the assignment
  takes precedence).

- commit() transfers the items in the
  cache to the underlying dict and clears
  the cache. Prior to issuing commit
  no writeback to the underlying dict happens.

- deleting a ccdict-instance does *not* commit any
  changes. You have to explicitly call commit().
  If you want to work readonly, don't call commit.

- clear() only clears the cache and not the underlying
  dict

- you can explicitly prune the cache (via cache.keys()
  etc.) before calling commit(). This lets you
  avoid writing back unmodified objects if this
  is an issue.

It seems quite impossible to figure out automagically
which objects have been modified
and so the solution is to do it explicitly
(or don't commit for readonly).

holger

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 22:55

Message:
Logged In: YES 
user_id=80475

A few more thoughts:

Please change the "except:" lines to specify the exception 
being caught.

Also, if GvR shows interest in the patch, we should update 
the library reference and add unittests.

The docstring should also mention that the cache is kept in 
memory -- besides persistence, one of the forces for 
shelving is memory conservation.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 20:43

Message:
Logged In: YES 
user_id=80475

Nicely done!  The code is clean and runs in the smart mode 
without problems on my existing programs. I agree that the 
patch solves a real world problem.  The solution is clean, 
but a little expensive.

If there were a way to be able to tell if an entry had been 
altered, it would save the 100% writeback.  Unfortunately, 
I can't think of a way.

The docstring could read more smoothly and plainly.  Also, 
it should be clear that the cost of setting smart=1 is that 
100% of the entries get rewritten on close.

Two microscopically minor thoughts on the coding (feel free 
to disregard). Can some of the try/except blocks be 
replaced by something akin to 'if self.smart:'?  For the 
writeback loop, consider 'for k,v in cache.iteritems()' as 
it takes less memory and saves a lookup.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-05-07 18:38

Message:
Logged In: YES 
user_id=21627

Even more important than the backwards compatibility might
be the issue that it writes back all accessed objects on
close, which might be expensive if there have been many
read-only accesses.

So I think the option name could be also 'slow'; although
'writeback' might be more technical.

Also, I wonder whether write-back should be attempted if the
shelve was opened read-only.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 22:04:04 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 14:04:04 -0800
Subject: [Patches] [ python-Patches-667548 ] Add missing constants for IRIX al module
Message-ID: <E18zkuK-0006bX-00@sc8-sf-web1.sourceforge.net>

Patches item #667548, was opened at 2003-01-13 22:30
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667548&group_id=5470

Category: Modules
>Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Michael Pruett (mpruett)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: Add missing constants for IRIX al module

Initial Comment:
The following Audio Library constants are not defined by 
the IRIX al module as of Python 2.2.2:

AL_LOCKED
AL_NULL_INTERFACE
AL_OPTICAL_IF_TYPE
AL_SMPTE272M_IF_TYPE

The attached patch adds these constants.


----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-30 17:04

Message:
Logged In: YES 
user_id=33168

Thanks!  Checked in as Modules/almodule.c 1.38

----------------------------------------------------------------------

Comment By: Michael Pruett (mpruett)
Date: 2003-01-13 22:33

Message:
Logged In: YES 
user_id=250621


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=667548&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 22:06:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 14:06:31 -0800
Subject: [Patches] [ python-Patches-711002 ] new test_urllib and patch for found urllib bug
Message-ID: <E18zkwh-0006gP-00@sc8-sf-web1.sourceforge.net>

Patches item #711002, was opened at 2003-03-27 13:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711002&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brett Cannon (bcannon)
Assigned to: Nobody/Anonymous (nobody)
Summary: new test_urllib and patch for found urllib bug

Initial Comment:
Free time at PyCon led to me writing a new test_urllib (happy, Raymond?  =).  Since I have no guarantee that there would be a net connection (and didn't want to use it without user permission since I view using the 'network' resource as using sockets and not the Net) I wrote all tests using temporary files.

And do this found a bug, sort of.  The docs and doc string for urlretrieve() say the second value from the returned tuple should be None when a local file is passed as an argument.  Well, it wasn't; it was returning an rfc2822.Message object like it does for remote files.  So I patched it to match the docs.

----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2003-03-30 14:06

Message:
Logged In: YES 
user_id=357491

I just noticed that Skip uploaded test_urllibnet.py to test timeouts by connecting to python.org.  Is it okay to write tests that connect to the Net when the `network' resource is enabled?  If so then I can add network tests to test_urllib.py.

Oh, and the beginning of the 2nd paragraph for my summary should have read "And I did find a bug, sort of" and not the mess of broken grammar rules as I initially typed it in.  =)

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711002&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 22:11:14 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 14:11:14 -0800
Subject: [Patches] [ python-Patches-711835 ] Removing unnecessary lock operations
Message-ID: <E18zl1G-0007qg-00@sc8-sf-web4.sourceforge.net>

Patches item #711835, was opened at 2003-03-29 11:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
>Assigned to: Nobody/Anonymous (nobody)
Summary: Removing unnecessary lock operations

Initial Comment:
PyThread_acquire_lock can be further optimized to do
less locking on the global lock mutex.

Original patch location:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=86281

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-30 17:11

Message:
Logged In: YES 
user_id=31435

Looks fine to me too.  Since Python switched to using 
semaphores on Linux for 2.3, it's unclear that there's a 
system that uses the condvar code anymore.  How will this 
get tested?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 11:49

Message:
Logged In: YES 
user_id=21627

This looks reasonable to me, but I may be missing something.

Tim, can you see a problem with that code?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 22:19:36 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 14:19:36 -0800
Subject: [Patches] [ python-Patches-711835 ] Removing unnecessary lock operations
Message-ID: <E18zl9M-0008Ft-00@sc8-sf-web4.sourceforge.net>

Patches item #711835, was opened at 2003-03-29 11:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
Assigned to: Nobody/Anonymous (nobody)
Summary: Removing unnecessary lock operations

Initial Comment:
PyThread_acquire_lock can be further optimized to do
less locking on the global lock mutex.

Original patch location:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=86281

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-30 17:19

Message:
Logged In: YES 
user_id=33168

_POSIX_SEMAPHORES aren't used if
HAVE_BROKEN_POSIX_SEMAPHORES is defined.  This currently
occurs on Solaris 8 (at least).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-30 17:11

Message:
Logged In: YES 
user_id=31435

Looks fine to me too.  Since Python switched to using 
semaphores on Linux for 2.3, it's unclear that there's a 
system that uses the condvar code anymore.  How will this 
get tested?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 11:49

Message:
Logged In: YES 
user_id=21627

This looks reasonable to me, but I may be missing something.

Tim, can you see a problem with that code?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470


From noreply@sourceforge.net  Sun Mar 30 22:36:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 14:36:15 -0800
Subject: [Patches] [ python-Patches-712367 ] get build working on AIX
Message-ID: <E18zlPT-0006Lj-00@sc8-sf-web3.sourceforge.net>

Patches item #712367, was opened at 2003-03-30 17:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712367&group_id=5470

Category: Build
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Nobody/Anonymous (nobody)
Summary: get build working on AIX

Initial Comment:
Tested on AIX 4.3 and 5.1.  I may have tested this on
4.2 a long time ago.  Changes to configure and
setup.py.  The setup.py changes are to build curses.

The configure changes create the export file
differently.  I was told by Gary Hooks at IBM that the
export file must have a period for AIX 4.2 and beyond
for dynamically imported modules to work properly (call
back into the interpreter).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712367&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 00:29:19 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 16:29:19 -0800
Subject: [Patches] [ python-Patches-711722 ] Cache lookup of __builtins__
Message-ID: <E18znAt-0004FT-00@sc8-sf-web4.sourceforge.net>

Patches item #711722, was opened at 2003-03-29 01:47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cache lookup of __builtins__

Initial Comment:
Rather than performing a bytecode optimization of 
LOAD_GLOBAL, this patch takes the alternative approach of 
caching the lookup of builtins.

To be safe, it checks the cache only after trying a 
lookup in globals().  I can think of only one way to 
break this approach:  run the function accessing a 
builtin, then poke a new value into the builtins 
module, and then re-run the function:

def f(x):
    return oct(x)
print f(20)
__builtins__.oct = hex
print f(20)  # doesn't notice new def of oct()

This gives about a 2% speed-up to average programs, 
0% to programs that don't use builtins, and higher 
percentages to those with heavier use of builtins.  The 
speedup is limited by 1) having to still check globals 
and 2) the relative insignificance of builtin access time 
in most programs.  Still, it pretty much solves the 
problem of access time for builtins.
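
For reference, the uncached behaviour that the caching would change can be sketched in current Python 3 syntax (an assumption; the snippet above is Python 2). The helper name demo() is hypothetical and only there to keep the rebinding contained. Without any cache, the second call notices the replaced builtin:

```python
import builtins


def f(x):
    # 'oct' is looked up at call time: module globals first, then builtins.
    return oct(x)


def demo():
    """Return f(20) before and after rebinding builtins.oct."""
    original = builtins.oct
    before = f(20)
    try:
        builtins.oct = hex       # poke a new value into the builtins module
        after = f(20)            # an uncached lookup sees the replacement
    finally:
        builtins.oct = original  # restore so later code is unaffected
    return before, after


print(demo())  # ('0o24', '0x14')
```

With the proposed cache, the second call would presumably still return the octal string, which is the change in semantics discussed in the comments below.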


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-30 19:29

Message:
Logged In: YES 
user_id=6380

Please drop it. OO also doesn't change semantics except for
__doc__ (which is a different kind of change). The
LOAD_GLOBAL->LOAD_CONST patch is acceptable for 2.4 (though
it will probably be done differently once None is a keyword).

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-30 16:59

Message:
Logged In: YES 
user_id=80475

I see.

Would this patch be acceptable as a -OO option or should 
I drop it?

Also, the same question applies to a tiny patch converting 
  LOAD_GLOBAL "None"   -->   LOAD_CONST Py_None


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-30 14:02

Message:
Logged In: YES 
user_id=6380

That prohibition isn't agreed yet, and would be new. Since
this *is* a change in existing semantics and rule, there
would have to be a period where the old semantics were
maintained but a warning was given about violating the new
rule. Your patch doesn't do any of that.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-29 12:11

Message:
Logged In: YES 
user_id=80475

Arghh, I don't see what the problem is.  The co_names 
cache variable is private and not part of the public 
interface for code objects.  The only way to see a change in 
behavior is for a program to violate the prohibition of 
sticking a name in another module's globals that affects a 
builtin (and, even then, it would have to occur between 
calls to the function).  Normal shadowing (using globals) 
would continue to work just fine.

While it gives only a minor timing gain, the big win would 
be removing the incentive to create python code like this:
  def f(x, y, int=int, True=True, chr=chr):
      . . .


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-03-29 06:37

Message:
Logged In: YES 
user_id=6380

-1. It changes semantics in an ad-hoc way.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711722&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 03:24:28 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 19:24:28 -0800
Subject: [Patches] [ python-Patches-662464 ] 659188: no docs for HTMLParser
Message-ID: <E18zpuO-0006BF-00@sc8-sf-web2.sourceforge.net>

Patches item #662464, was opened at 2003-01-04 23:10
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662464&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christopher Blunck (blunck2)
Assigned to: Nobody/Anonymous (nobody)
Summary: 659188: no docs for HTMLParser

Initial Comment:
Added some high level docs to explain how to use the class.
Provided docstrings for the handle_* callback methods.

----------------------------------------------------------------------

>Comment By: Christopher Blunck (blunck2)
Date: 2003-03-30 22:24

Message:
Logged In: YES 
user_id=531881

added documentation for handle_pi callback method in
libhtmlparser.tex

----------------------------------------------------------------------

Comment By: Christopher Blunck (blunck2)
Date: 2003-03-30 13:04

Message:
Logged In: YES 
user_id=531881

Sure.  I'll patch and post it later on today.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 12:25

Message:
Logged In: YES 
user_id=21627

Christopher, can you please indicate whether you are going
to provide a patch for the primary source of the
documentation, i.e. the TeX files?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-15 07:20

Message:
Logged In: YES 
user_id=21627

Can you please provide a patch for the Tex documentation
(Doc/lib/libhtmlparser.tex) as well? I think this is where
the submitter of bug 659188 was looking.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662464&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 03:59:55 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 19:59:55 -0800
Subject: [Patches] [ python-Patches-711835 ] Removing unnecessary lock operations
Message-ID: <E18zqSh-00077y-00@sc8-sf-web2.sourceforge.net>

Patches item #711835, was opened at 2003-03-29 11:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
Assigned to: Nobody/Anonymous (nobody)
Summary: Removing unnecessary lock operations

Initial Comment:
PyThread_acquire_lock can be further optimized to do
less locking on the global lock mutex.

Original patch location:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=86281

----------------------------------------------------------------------

>Comment By: Mihai Ibanescu (misa)
Date: 2003-03-30 22:59

Message:
Logged In: YES 
user_id=205865

Also, this happens in 2.2.2 as well (the patch in Red Hat's
bugzilla is against 2.2.2 actually). Is there a plan to
release a 2.2.3? Is there value in backporting the patch?
(should apply cleanly on 2.2.2).

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-30 17:19

Message:
Logged In: YES 
user_id=33168

_POSIX_SEMAPHORES aren't used if
HAVE_BROKEN_POSIX_SEMAPHORES is defined.  This currently
occurs on Solaris 8 (at least).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-30 17:11

Message:
Logged In: YES 
user_id=31435

Looks fine to me too.  Since Python switched to using 
semaphores on Linux for 2.3, it's unclear that there's a 
system that uses the condvar code anymore.  How will this 
get tested?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 11:49

Message:
Logged In: YES 
user_id=21627

This looks reasonable to me, but I may be missing something.

Tim, can you see a problem with that code?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 04:01:30 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 20:01:30 -0800
Subject: [Patches] [ python-Patches-711838 ] urllib2 doesn't support non-anonymous ftp
Message-ID: <E18zqUE-0007Bn-00@sc8-sf-web2.sourceforge.net>

Patches item #711838, was opened at 2003-03-29 11:25
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711838&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2 doesn't support non-anonymous ftp

Initial Comment:
urllib2 doesn't support non-anonymous ftp.  Added
support based on how urllib did it.

More details about this bug in Red Hat's bugzilla:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=78168
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=80676

----------------------------------------------------------------------

>Comment By: Mihai Ibanescu (misa)
Date: 2003-03-30 23:01

Message:
Logged In: YES 
user_id=205865

Argh. I forgot to check the checkbox. Here we go.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 11:43

Message:
Logged In: YES 
user_id=21627

There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file. In addition, even if you
*did* check this checkbox, a bug in SourceForge
prevents attaching a file when *creating* an issue.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711838&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 05:15:17 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 21:15:17 -0800
Subject: [Patches] [ python-Patches-553171 ] optionally make shelve less surprising
Message-ID: <E18zrdd-00011Y-00@sc8-sf-web1.sourceforge.net>

Patches item #553171, was opened at 2002-05-07 10:13
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
Status: Open
Resolution: None
Priority: 5
Submitted By: Alex Martelli (aleax)
Assigned to: Nobody/Anonymous (nobody)
Summary: optionally make shelve less surprising

Initial Comment:
shelve has highly surprising behavior wrt modifiable
values:
    s = shelve.open('she.dat','c')
    s['ciao'] = range(3)
    s['ciao'].append(4)   # doesn't "TAKE"!

Explaining to beginners that s['ciao'] is returning a
temporary object and the modification is done on the
temporary thus "silently ignored" is hard indeed.  It
also makes shelve far less convenient than it could 
be (whenever modifiable values must be shelved).

Having s keep track of all values it has returned may
perhaps break some existing program (due to extra 
memory consumption and/or to lack of "implicit 
copy"/"snapshot" behavior) so I've made the 'caching' 
change optional and by default off.  However it's now 
at least possible to obtain nonsurprising behavior:
    s = shelve.open('she.dat','c',smart=1)
    s['ciao'] = range(3)
    s['ciao'].append(4)   # no surprises any more

I suspect the 'smart=1' should be made the default, 
but, if we at least put it in now, then perhaps we 
can migrate to having it as the default very slowly 
and gradually.
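
The difference between the two modes can be sketched as follows. This assumes a current Python 3, where shelve.open() accepts a writeback flag; treating that flag as the equivalent of the smart=1 option proposed here is an assumption, and demo() is a hypothetical helper.

```python
import os
import shelve
import tempfile


def demo(writeback):
    """Mutate a shelved list in place and return what was actually stored."""
    # A throwaway directory keeps the sketch from leaving files behind.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "she")
        s = shelve.open(path, "c", writeback=writeback)
        try:
            s["ciao"] = list(range(3))
            s["ciao"].append(4)  # changes a temporary unless writeback is on
        finally:
            s.close()            # with writeback, cached entries are saved here
        s = shelve.open(path, "r")
        try:
            return s["ciao"]
        finally:
            s.close()


print(demo(False))  # [0, 1, 2]
print(demo(True))   # [0, 1, 2, 4]
```

The cost is that every entry touched while the shelf is open stays in memory and is written back on close.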


Alex


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 07:15

Message:
Logged In: YES 
user_id=21627

dannu's code is currently unavailable... I see no reason to
add yet another layer of indirection, and no other
application of such a wrapper within Python.

The trickiest aspect of this is educational: If the default
behaviour does not change (as it shouldn't), how can
unsuspecting users avoid running into the trap? So this is
much more a documentation problem than a code problem.

----------------------------------------------------------------------

Comment By: Alex Martelli (aleax)
Date: 2003-03-30 23:59

Message:
Logged In: YES 
user_id=60314

sure, but along what lines -- my previous patch's, or dannu's?  let 
me know, and I'll get to work on it as soon as I'm back from 
Python-UK & short following trip (i.e. around Apr 12)


----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-03-30 23:56

Message:
Logged In: YES 
user_id=80475

The issue has arisen a couple of times
on comp.lang.python.
I think this patch would be helpful.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:43

Message:
Logged In: YES 
user_id=21627

Would you then be willing to provide a complete patch
(documentation, NEWS entry, test case)?

----------------------------------------------------------------------

Comment By: Alex Martelli (aleax)
Date: 2003-03-30 23:26

Message:
Logged In: YES 
user_id=60314

Yes, Martin, I'm still quite convinced shelve's behavior is
generally surprising and often problematic, and even though
the fixes suggested by both me and dannu are each imperfect
(given the impossibility of finding out, in general, whether an object
has been modified), I think one or the other would still be better
than the current situation.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 23:18

Message:
Logged In: YES 
user_id=21627

Alex, do you still think this should be implemented, in some
form or  other?

----------------------------------------------------------------------

Comment By: Holger P. Krekel (dannu)
Date: 2002-05-10 02:47

Message:
Logged In: YES 
user_id=83092

I'd suggest not changing shelve at all but providing 
a "cache-commit" dictionary (ccdict) which can wrap a
shelf-instance (or any other simple dictish instance)
and provides the 'non-surprising' behaviour. 

Some proof of concept code for the following
properties is provided here

http://home.trillke.net/~hpk/ccdict.py

Current properties are:

- ccdict wraps a dictionary-like object which
  in turn only needs to provide
  __getitem__, __setitem__, __delitem__, has_key

- on first access of an element
  ccdict makes a lookup on the underlying
  dict and caches the item.
- the next accesses work with the cached thing.
  Unsurprising dict-semantics are provided.

- deleting an item is deferred and actually happens
  on commit() time. deleting an item and later on
  assigning to it works as expected (i.e. the assignment
  takes preference).

- commit() transfers the items in the
  cache to the underlying dict and clears
  the cache. Prior to issuing commit
  no writeback to the underlying dict happens.

- deleting a ccdict instance does *not* commit any
  changes. You have to explicitly call commit().
  If you want to work readonly, don't call commit.

- clear() only clears the cache and not the underlying
  dict 

- you can explicitly prune the cache (via cache.keys()
  etc.) before calling commit(). This lets you
  avoid writing back unmodified objects if this
  is an issue.

It seems quite impossible to figure out automagically
which objects have been modified 
and so the solution is to do it explicitly 
(or don't commit for readonly).
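
The properties listed above can be sketched roughly as below. This is not the code from ccdict.py; the class names CCDict and PickleStore, the attribute names, and the use of Python 3 are all assumptions made for illustration. PickleStore stands in for a shelf by returning a fresh copy on every read.

```python
import pickle


class PickleStore:
    """Hypothetical stand-in for a shelf: reads return unpickled copies."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return pickle.loads(self._data[key])

    def __setitem__(self, key, value):
        self._data[key] = pickle.dumps(value)

    def __delitem__(self, key):
        del self._data[key]


class CCDict:
    """Hypothetical cache-commit wrapper around a dict-like object."""

    def __init__(self, backing):
        self.backing = backing
        self.cache = {}       # items read or assigned since the last commit
        self.deleted = set()  # keys whose deletion is deferred to commit()

    def __getitem__(self, key):
        if key in self.deleted:
            raise KeyError(key)
        if key not in self.cache:
            # First access: look the item up in the backing store and cache it.
            self.cache[key] = self.backing[key]
        return self.cache[key]

    def __setitem__(self, key, value):
        self.deleted.discard(key)  # a later assignment overrides a deletion
        self.cache[key] = value

    def __delitem__(self, key):
        self[key]                  # raises KeyError if the key is unknown
        del self.cache[key]
        self.deleted.add(key)

    def commit(self):
        # Apply deferred deletions, then write back every cached item.
        for key in self.deleted:
            try:
                del self.backing[key]
            except KeyError:
                pass               # key never reached the backing store
        for key, value in self.cache.items():
            self.backing[key] = value
        self.clear()

    def clear(self):
        # Drops pending state only; the backing store is left untouched.
        self.cache.clear()
        self.deleted.clear()


def demo():
    store = PickleStore()
    store["ciao"] = [0, 1, 2]
    d = CCDict(store)
    d["ciao"].append(4)     # mutates the cached object
    before = store["ciao"]  # nothing is written back before commit()
    d.commit()
    return before, store["ciao"]


print(demo())  # ([0, 1, 2], [0, 1, 2, 4])
```

As in the description, commit() writes back every cached item because the wrapper cannot tell which objects were modified.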

holger

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 22:55

Message:
Logged In: YES 
user_id=80475

A few more thoughts:

Please change the "except:" lines to specify the exception 
being caught.

Also, if GvR shows interest in the patch, we should update 
the library reference and add unittests.

The docstring should also mention that the cache is kept in 
memory -- besides persistence, one of the forces for 
shelving is memory conservation.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-09 20:43

Message:
Logged In: YES 
user_id=80475

Nicely done!  The code is clean and runs in the smart mode 
without problems on my existing programs. I agree that the 
patch solves a real world problem.  The solution is clean, 
but a little expensive.

If there were a way to be able to tell if an entry had been 
altered, it would save the 100% writeback.  Unfortunately, 
I can't think of a way.

The docstring could read more smoothly and plainly.  Also, 
it should be clear that the cost of setting smart=1 is that 
100% of the entries get rewritten on close.

Two microscopically minor thoughts on the coding (feel free 
to disregard). Can some of the try/except blocks be 
replaced by something akin to 'if self.smart:'?  For the 
writeback loop, consider 'for k,v in cache.iteritems()' as 
it takes less memory and saves a lookup.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-05-07 18:38

Message:
Logged In: YES 
user_id=21627

Even more important than the backwards compatibility might
be the issue that it writes back all accessed objects on
close, which might be expensive if there have been many
read-only accesses.

So I think the option name could be also 'slow'; although
'writeback' might be more technical.

Also, I wonder whether write-back should be attempted if the
shelve was opened read-only.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=553171&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 05:22:15 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 21:22:15 -0800
Subject: [Patches] [ python-Patches-712367 ] get build working on AIX
Message-ID: <E18zrkN-0001EE-00@sc8-sf-web1.sourceforge.net>

Patches item #712367, was opened at 2003-03-31 00:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712367&group_id=5470

Category: Build
Group: Python 2.3
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
>Assigned to: Neal Norwitz (nnorwitz)
Summary: get build working on AIX

Initial Comment:
Tested on AIX 4.3 and 5.1.  I may have tested this on
4.2 a long time ago.  Changes to configure and
setup.py.  The setup.py changes are to build curses.

The configure changes create the export file
differently.  I was told by Gary Hooks at IBM that the
export file must have a period for AIX 4.2 and beyond
for dynamically imported modules to work properly (call
back into the interpreter).

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 07:22

Message:
Logged In: YES 
user_id=21627

The patch itself is fine. However, we should also formally
establish a minimum supported AIX version, in PEP 11
(perhaps with a vision of warning users in 2.4, and actively
removing code that belongs to older versions in 2.5).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712367&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 05:25:02 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 21:25:02 -0800
Subject: [Patches] [ python-Patches-711835 ] Removing unnecessary lock operations
Message-ID: <E18zrn4-0001If-00@sc8-sf-web1.sourceforge.net>

Patches item #711835, was opened at 2003-03-29 17:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Mihai Ibanescu (misa)
>Assigned to: Martin v. Löwis (loewis)
Summary: Removing unnecessary lock operations

Initial Comment:
PyThread_acquire_lock can be further optimized to do
less locking on the global lock mutex.

Original patch location:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=86281

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 07:25

Message:
Logged In: YES 
user_id=21627

There are plans to provide Python 2.2.3. I see no problem
applying it to 2.2.2, as there shouldn't be any change in
visible behaviour.

----------------------------------------------------------------------

Comment By: Mihai Ibanescu (misa)
Date: 2003-03-31 05:59

Message:
Logged In: YES 
user_id=205865

Also, this happens in 2.2.2 as well (the patch in Red Hat's
bugzilla is against 2.2.2 actually). Is there a plan to
release a 2.2.3? Is there value in backporting the patch?
(should apply cleanly on 2.2.2).

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-31 00:19

Message:
Logged In: YES 
user_id=33168

_POSIX_SEMAPHORES aren't used if
HAVE_BROKEN_POSIX_SEMAPHORES is defined.  This currently
occurs on Solaris 8 (at least).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-31 00:11

Message:
Logged In: YES 
user_id=31435

Looks fine to me too.  Since Python switched to using 
semaphores on Linux for 2.3, it's unclear that there's a 
system that uses the condvar code anymore.  How will this 
get tested?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 18:49

Message:
Logged In: YES 
user_id=21627

This looks reasonable to me, but I may be missing something.

Tim, can you see a problem with that code?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 05:33:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 21:33:48 -0800
Subject: [Patches] [ python-Patches-662464 ] 659188: no docs for HTMLParser
Message-ID: <E18zrvY-0000os-00@sc8-sf-web2.sourceforge.net>

Patches item #662464, was opened at 2003-01-05 05:10
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662464&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Christopher Blunck (blunck2)
>Assigned to: Martin v. Löwis (loewis)
Summary: 659188: no docs for HTMLParser

Initial Comment:
Added some high level docs to explain how to use the class.
Provided docstrings for the handle_* callback methods.

----------------------------------------------------------------------

Comment By: Christopher Blunck (blunck2)
Date: 2003-03-31 05:24

Message:
Logged In: YES 
user_id=531881

added documentation for handle_pi callback method in
libhtmlparser.tex

----------------------------------------------------------------------

Comment By: Christopher Blunck (blunck2)
Date: 2003-03-30 20:04

Message:
Logged In: YES 
user_id=531881

Sure.  I'll patch and post it later on today.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 19:25

Message:
Logged In: YES 
user_id=21627

Christopher, can you please indicate whether you are going
to provide a patch for the primary source of the
documentation, i.e. the TeX files?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-01-15 13:20

Message:
Logged In: YES 
user_id=21627

Can you please provide a patch for the Tex documentation
(Doc/lib/libhtmlparser.tex) as well? I think this is where
the submitter of bug 659188 was looking.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=662464&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 06:56:05 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Sun, 30 Mar 2003 22:56:05 -0800
Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass
Message-ID: <E18ztDB-000464-00@sc8-sf-web1.sourceforge.net>

Patches item #536883, was opened at 2002-03-29 11:52
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Martin v. Löwis (loewis)
Summary: SimpleXMLRPCServer auto-docing subclass

Initial Comment:
This SimpleXMLRPCServer subclass automatically serves 
HTML documentation, generated using pydoc, in response 
to an HTTP GET request (XML-RPC always uses POST).

Here are some examples:
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py
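
A minimal sketch of the behaviour described, assuming the standard-library class xmlrpc.server.DocXMLRPCServer of current Python 3; whether that class matches the file attached to this patch is an assumption. The function add() and the helper fetch_docs() are hypothetical. The server answers one HTTP GET on a local port with generated HTML documentation:

```python
import http.client
import threading
from xmlrpc.server import DocXMLRPCServer


def add(x, y):
    """Return the sum of x and y."""
    return x + y


def fetch_docs():
    """Serve a single GET request and return (status, html_body)."""
    # Port 0 asks the OS for any free port; logRequests=False keeps stderr quiet.
    server = DocXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.timeout = 5  # handle_request() gives up if no client connects
    server.set_server_title("Demo XML-RPC server")
    server.register_function(add)
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")  # XML-RPC calls would use POST instead
        response = conn.getresponse()
        status = response.status
        body = response.read().decode("utf-8", "replace")
        conn.close()
    finally:
        worker.join()
        server.server_close()
    return status, body


if __name__ == "__main__":
    status, body = fetch_docs()
    print(status, "add" in body)
```

The generated page lists each registered function with its signature and docstring.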


----------------------------------------------------------------------

>Comment By: Brian Quinlan (bquinlan)
Date: 2003-03-30 22:56

Message:
Logged In: YES 
user_id=108973

>I'm not sure how to place this. Is this an extension to
>pydoc? 

No. This module provides subclasses for 
SimpleXMLRPCServer and CGIXMLRPCServer. These 
subclasses serve pydoc-style documentation when you point 
your browser at them - see the examples in the patch 
summary.

> Should it go into Tools, or into Lib, or into some
> existing module?

The attached file should go into Lib.

> If this goes into Lib somewhere, it lacks documentation.

Fair enough. Conditional on me writing documentation, is this 
contribution acceptable as is?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 06:59

Message:
Logged In: YES 
user_id=21627

I'm not sure how to place this. Is this an extension to
pydoc? Should it go into Tools, or into Lib, or into some
existing module?

If this goes into Lib somewhere, it lacks documentation.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-02-10 12:25

Message:
Logged In: YES 
user_id=108973

Patch 473586 has been accepted so this patch can be 
accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 11:26

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 09:55

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-04 09:31

Message:
Logged In: YES 
user_id=6380

Looks cute to me. Fredrik, any problem if I just check this
in?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 09:07:03 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 01:07:03 -0800
Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass
Message-ID: <E18zvFv-0007wi-00@sc8-sf-web2.sourceforge.net>

Patches item #536883, was opened at 2002-03-29 20:52
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Martin v. Löwis (loewis)
Summary: SimpleXMLRPCServer auto-docing subclass

Initial Comment:
This SimpleXMLRPCServer subclass automatically serves 
HTML documentation, generated using pydoc, in response 
to an HTTP GET request (XML-RPC always uses POST).

Here are some examples:
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py
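
For readers on current Python, the idea described above can be sketched with the standard library's xmlrpc.server module, which ships documenting variants of both the socket server and the CGI handler. This is a minimal sketch, not the patch attached to this tracker item: the add() function and the title string are hypothetical, and the CGI-style handler is used only because it can render the page without opening a socket.

```python
from xmlrpc.server import DocCGIXMLRPCRequestHandler


def add(x, y):
    """Return x plus y."""
    return x + y


# The documenting handler registers functions like the plain XML-RPC handler.
handler = DocCGIXMLRPCRequestHandler()
handler.set_server_title("Demo XML-RPC service")  # hypothetical title
handler.register_function(add)

# This is the HTML a browser would receive for an HTTP GET;
# XML-RPC calls themselves still arrive as POST requests.
page = handler.generate_html_documentation()
```

Writing `page` to a file and opening it in a browser should show a pydoc-style listing of the registered functions and their docstrings.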


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 11:07

Message:
Logged In: YES 
user_id=21627

I see. The code is fine, but it needs to come with a test 
function, to operate the module as a program. I suggest that 
the test server provides the get_source_code() operation just 
as your demo client does; the docstring of the class may 
provide an xmlrpclib fragment that retrieves the source code 
(AFAICT, the source code is not directly accessible through 
a URL, is it?)

I also recommend that you consider renaming the classes: 
If the module is named, say, DocXMLRPCServer, there is no 
need to have the Doc prefix on the class names. Instead, 
they can be named just "XMLRPCServer" etc.


----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-03-31 08:56

Message:
Logged In: YES 
user_id=108973

>I'm not sure how to place this. Is this an extension to
>pydoc? 

No. This module provides subclasses for 
SimpleXMLRPCServer and CGIXMLRPCServer. These 
subclasses serve pydoc-style documentation when you point 
your browser at them - see the examples in the patch 
summary.

> Should it go into Tools, or into Lib, or into some
> existing module?

The attached file should go into Lib.

> If this goes into Lib somewhere, it lacks documentation.

Fair enough. Conditional on me writing documentation, is this 
contribution acceptable as is?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 16:59

Message:
Logged In: YES 
user_id=21627

I'm not sure how to place this. Is this an extension to
pydoc? Should it go into Tools, or into Lib, or into some
existing module?

If this goes into Lib somewhere, it lacks documentation.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-02-10 21:25

Message:
Logged In: YES 
user_id=108973

Patch 473586 has been accepted so this patch can be 
accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 21:26

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 19:55

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-04 19:31

Message:
Logged In: YES 
user_id=6380

Looks cute to me. Fredrik, any problem if I just check this
in?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 09:22:30 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 01:22:30 -0800
Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass
Message-ID: <E18zvUs-0006xJ-00@sc8-sf-web3.sourceforge.net>

Patches item #536883, was opened at 2002-03-29 11:52
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Martin v. Löwis (loewis)
Summary: SimpleXMLRPCServer auto-docing subclass

Initial Comment:
This SimpleXMLRPCServer subclass automatically serves 
HTML documentation, generated using pydoc, in response 
to an HTTP GET request (XML-RPC always uses POST).

Here are some examples:
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py


----------------------------------------------------------------------

>Comment By: Brian Quinlan (bquinlan)
Date: 2003-03-31 01:22

Message:
Logged In: YES 
user_id=108973

Write test function: ok
Write documentation: ok

>If the module is named, say, DocXMLRPCServer, there is 
>no need to have the Doc prefix on the class names.

Hmmm. If you look at the core BaseHTTPRequestHandler 
derived classes,  each one is prefixed to match the module 
that it is found in. The only two modules that I can think of 
with identical class names are cStringIO and StringIO, which 
theoretically provide identical semantics.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 01:07

Message:
Logged In: YES 
user_id=21627

I see. The code is fine, but it needs to come with a test 
function, to operate the module as a program. I suggest that 
the test server provides the get_source_code() operation just 
as your demo client does; the docstring of the class may 
provide an xmlrpclib fragment that retrieves the source code 
(AFAICT, the source code is not directly accessible through 
a URL, is it?)

I also recommend that you consider renaming the classes: 
If the module is named, say, DocXMLRPCServer, there is no 
need to have the Doc prefix on the class names. Instead, 
they can be named just "XMLRPCServer" etc.


----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-03-30 22:56

Message:
Logged In: YES 
user_id=108973

>I'm not sure how to place this. Is this an extension to
>pydoc? 

No. This module provides subclasses for 
SimpleXMLRPCServer and CGIXMLRPCServer. These 
subclasses serve pydoc-style documentation when you point 
your browser at them - see the examples in the patch 
summary.

> Should it go into Tools, or into Lib, or into some
> existing module?

The attached file should go into Lib.

> If this goes into Lib somewhere, it lacks documentation.

Fair enough. Conditional on me writing documentation, is this 
contribution acceptable as is?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 06:59

Message:
Logged In: YES 
user_id=21627

I'm not sure how to place this. Is this an extension to
pydoc? Should it go into Tools, or into Lib, or into some
existing module?

If this goes into Lib somewhere, it lacks documentation.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-02-10 12:25

Message:
Logged In: YES 
user_id=108973

Patch 473586 has been accepted so this patch can be 
accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 11:26

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 09:55

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-04 09:31

Message:
Logged In: YES 
user_id=6380

Looks cute to me. Fredrik, any problem if I just check this
in?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 16:10:39 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 08:10:39 -0800
Subject: [Patches] [ python-Patches-712367 ] get build working on AIX
Message-ID: <E1901rr-0002Xa-00@sc8-sf-web2.sourceforge.net>

Patches item #712367, was opened at 2003-03-30 17:36
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712367&group_id=5470

Category: Build
Group: Python 2.3
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Neal Norwitz (nnorwitz)
Summary: get build working on AIX

Initial Comment:
Tested on AIX 4.3 and 5.1.  I may have tested this on
4.2 a long time ago.  Changes to configure and
setup.py.  The setup.py changes are to build curses.

The configure changes create the export file
differently.  I was told by Gary Hooks at IBM that the
export file must have a period for AIX 4.2 and beyond
for dynamically imported modules to work properly (call
back into the interpreter).

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-31 11:10

Message:
Logged In: YES 
user_id=33168

Ok, I'll ask about what the best minimum version should be.
 Right now, I suspect AIX 4.2 which is the oldest version I
have access to in the snake-farm.

Checked in as:
 setup.py 1.158
 configure 1.389
 configure.in 1.400

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 00:22

Message:
Logged In: YES 
user_id=21627

The patch itself is fine. However, we should also formally
establish a minimum supported AIX version, in PEP 11
(perhaps with a vision of warning users in 2.4, and actively
removing code that belongs to older versions in 2.5).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712367&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 16:58:48 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 08:58:48 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E1902cS-000291-00@sc8-sf-web3.sourceforge.net>

Patches item #709178, was opened at 2003-03-24 17:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

>Comment By: Jason Tishler (jlt63)
Date: 2003-03-31 07:58

Message:
Logged In: YES 
user_id=86216

loewis> I'm in favour of applying this patch, and
loewis> also of patches that mandate recent Cygwin
loewis> releases;

I would like to apply an enhanced version of
this patch.  By enhanced, I mean using "gcc
-shared" (no more dllwrap and gcc -mdll) and
removing redundant gcc options, etc.

Additionally, I would like to fix
get_versions() so it can deal with versions
that only have two components (e.g., 3.2) as
opposed to requiring three (e.g. 2.95.3).
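
The two-versus-three component problem described above can be illustrated with a small parser. This is only a sketch under stated assumptions: parse_tool_version() and its regular expression are hypothetical names for illustration and are not the get_versions() code in distutils; the sample strings merely imitate the kind of banner a compiler prints.

```python
import re

# Hypothetical pattern: major.minor with an optional third component.
_VERSION_RE = re.compile(r"(\d+\.\d+(?:\.\d+)?)")


def parse_tool_version(output):
    """Return the first version found in `output` as a tuple of ints, or None."""
    match = _VERSION_RE.search(output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


two_part = parse_tool_version("gcc (GCC) 3.2 20020927 (prerelease)")
three_part = parse_tool_version("gcc version 2.95.3 (release)")
missing = parse_tool_version("no version information here")
```

Returning tuples of integers keeps comparisons numeric, so a two-component version still compares sensibly against a three-component minimum.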

Are these changes acceptable?

loewis> if such patches are implemented, the minimum
loewis> required Cygwin version should be stated
loewis> somewhere.

I propose that the currently available Cygwin
and Mingw tool chains be that above stated
minimum. Is this acceptable? Unfortunately, I
have no idea where the above stated
"somewhere" should be.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 14:31

Message:
Logged In: YES 
user_id=87160

I can help with testing. I have access to W2K and Win98
(ugh) boxen. I don't mind installing a few older toolchains
if you think that's necessary.

I think any C/C++ python extension using plain distutils (no
fancy hacks added on) and has one or more DLL dependencies
is a good test case.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 13:15

Message:
Logged In: YES 
user_id=21627

I'm in favour of applying this patch, and also of patches
that mandate recent Cygwin releases; if such patches are
implemented, the minimum required Cygwin version should be
stated somewhere.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 12:16

Message:
Logged In: YES 
user_id=86216

John, would you be willing to help test or supply me with
test cases? I have built exactly one Win32 extension.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 11:56

Message:
Logged In: YES 
user_id=87160

The -mdll --entry DllMain@12  option is guarded for an old
version of gcc that did not have the correct specs to accept
-shared.
I didn't touch it, even though it's crazy if anyone is using
such an old and buggy toolchain.

--shared and --dll are equivalent as far as ld is concerned. 

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 09:41

Message:
Logged In: YES 
user_id=86216

Note that I only have minimal experience building
Win32 extension modules...

This patch works "fine" with my *very* limited testing.
Specifically, I successfully rebuilt the Win32 readline
module with it applied.

BTW, this area of Distutils probably should be revisited
to bring it up to date. For example, the "-mdll --entry
_DllMain@12" options could be replaced by "-shared".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 15:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 17:02:23 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 09:02:23 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E1902fv-0002PL-00@sc8-sf-web3.sourceforge.net>

Patches item #709178, was opened at 2003-03-24 17:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.

----------------------------------------------------------------------

>Comment By: Jason Tishler (jlt63)
Date: 2003-03-31 08:02

Message:
Logged In: YES 
user_id=86216

jkluebs> I can help with testing. I have access to W2K
jkluebs> and Win98 (ugh) boxen. I don't mind
jkluebs> installing a few older toolchains if you
jkluebs> think that's necessary.

Thanks for the offer. I'm set up for the
current Cygwin and Mingw tool chains. Let's
wait to see if testing with older ones is
necessary.

jkluebs> I think any C/C++ python extension using
jkluebs> plain distutils (no fancy hacks added on) and
jkluebs> has one or more DLL dependencies is a good
jkluebs> test case.

Can you point me to one that builds OOTB
under Python 2.2.2?

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-31 07:58

Message:
Logged In: YES 
user_id=86216

loewis> I'm in favour of applying this patch, and
loewis> also of patches that mandate recent Cygwin
loewis> releases;

I would like to apply an enhanced version of
this patch.  By enhanced, I mean using "gcc
-shared" (no more dllwrap and gcc -mdll) and
removing redundant gcc options, etc.

Additionally, I would like to fix
get_versions() so it can deal with versions
that only have two components (e.g., 3.2) as
opposed to requiring three (e.g. 2.95.3).

Are these changes acceptable?

loewis> if such patches are implemented, the minimum
loewis> required Cygwin version should be stated
loewis> somewhere.

I propose that the currently available Cygwin
and Mingw tool chains be that above stated
minimum. Is this acceptable? Unfortunately, I
have no idea where the above stated
"somewhere" should be.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 14:31

Message:
Logged In: YES 
user_id=87160

I can help with testing. I have access to W2K and Win98
(ugh) boxen. I don't mind installing a few older toolchains
if you think that's necessary.

I think any C/C++ python extension using plain distutils (no
fancy hacks added on) and has one or more DLL dependencies
is a good test case.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 13:15

Message:
Logged In: YES 
user_id=21627

I'm in favour of applying this patch, and also of patches
that mandate recent Cygwin releases; if such patches are
implemented, the minimum required Cygwin version should be
stated somewhere.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 12:16

Message:
Logged In: YES 
user_id=86216

John, would you be willing to help test or supply me with
test cases? I have built exactly one Win32 extension.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 11:56

Message:
Logged In: YES 
user_id=87160

The -mdll --entry DllMain@12  option is guarded for an old
version of gcc that did not have the correct specs to accept
-shared.
I didn't touch it, even though it's crazy if anyone is using
such an old and buggy toolchain.

--shared and --dll are equivalent as far as ld is concerned. 

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 09:41

Message:
Logged In: YES 
user_id=86216

Note that I only have minimal experience building
Win32 extension modules...

This patch works "fine" with my *very* limited testing.
Specifically, I successfully rebuilt the Win32 readline
module with it applied.

BTW, this area of Distutils probably should be revisited
to bring it up to date. For example, the "-mdll --entry
_DllMain@12" options could be replaced by "-shared".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-27 15:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 17:23:29 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 09:23:29 -0800
Subject: [Patches] [ python-Patches-711835 ] Removing unnecessary lock operations
Message-ID: <E19030L-00060f-00@sc8-sf-web2.sourceforge.net>

Patches item #711835, was opened at 2003-03-29 11:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
>Resolution: Accepted
Priority: 5
Submitted By: Mihai Ibanescu (misa)
Assigned to: Martin v. Löwis (loewis)
Summary: Removing unnecessary lock operations

Initial Comment:
PyThread_acquire_lock can be further optimized to do
less locking on the global lock mutex.

Original patch location:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=86281

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2003-03-31 12:23

Message:
Logged In: YES 
user_id=31435

I marked this as Accepted, and also don't see any problem 
with backporting to the 2.2 line.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 00:25

Message:
Logged In: YES 
user_id=21627

There are plans to provide Python 2.2.3. I see no problem
applying it to 2.2.2, as there shouldn't be any change in
visible behaviour.

----------------------------------------------------------------------

Comment By: Mihai Ibanescu (misa)
Date: 2003-03-30 22:59

Message:
Logged In: YES 
user_id=205865

Also, this happens in 2.2.2 as well (the patch in Red Hat's
bugzilla is against 2.2.2 actually). Is there a plan to
release a 2.2.3? Is there value in backporting the patch?
(should apply cleanly on 2.2.2).

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-30 17:19

Message:
Logged In: YES 
user_id=33168

_POSIX_SEMAPHORES aren't used if
HAVE_BROKEN_POSIX_SEMAPHORES is defined.  This currently
occurs on Solaris 8 (at least).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-30 17:11

Message:
Logged In: YES 
user_id=31435

Looks fine to me too.  Since Python switched to using 
semaphores on Linux for 2.3, it's unclear that there's a 
system that uses the condvar code anymore.  How will this 
get tested?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 11:49

Message:
Logged In: YES 
user_id=21627

This looks reasonable to me, but I may be missing something.

Tim, can you see a problem with that code?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 18:21:04 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 10:21:04 -0800
Subject: [Patches] [ python-Patches-711835 ] Removing unnecessary lock operations
Message-ID: <E1903u4-000871-00@sc8-sf-web4.sourceforge.net>

Patches item #711835, was opened at 2003-03-29 11:12
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Mihai Ibanescu (misa)
Assigned to: Martin v. Löwis (loewis)
Summary: Removing unnecessary lock operations

Initial Comment:
PyThread_acquire_lock can be further optimized to do
less locking on the global lock mutex.

Original patch location:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=86281

----------------------------------------------------------------------

>Comment By: Mihai Ibanescu (misa)
Date: 2003-03-31 13:21

Message:
Logged In: YES 
user_id=205865

One of the glibc developers expressed some concern on the
2.3 implementation of the global lock using semaphores. I'd
be glad to funnel any communication with the glibc community.

<quote>
(you) should do some timings on the current lock
implementation vs the one using semaphores.

POSIX semaphores have special requirements (e.g., sem_post
must be callable in signal handlers) which make semaphores
pretty expensive.  In NPTL, for instance, sem_post always
makes a syscall, there is no userlevel-only path.  This
makes using semaphores pretty expensive.  The same
restriction applies in one form or another to all POSIX
compliant semaphore implementations.
</quote>

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-31 12:23

Message:
Logged In: YES 
user_id=31435

I marked this as Accepted, and also don't see any problem 
with backporting to the 2.2 line.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 00:25

Message:
Logged In: YES 
user_id=21627

There are plans to provide Python 2.2.3. I see no problem
applying it to 2.2.2, as there shouldn't be any change in
visible behaviour.

----------------------------------------------------------------------

Comment By: Mihai Ibanescu (misa)
Date: 2003-03-30 22:59

Message:
Logged In: YES 
user_id=205865

Also, this happens in 2.2.2 as well (the patch in Red Hat's
bugzilla is against 2.2.2 actually). Is there a plan to
release a 2.2.3? Is there value in backporting the patch?
(should apply cleanly on 2.2.2).

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-03-30 17:19

Message:
Logged In: YES 
user_id=33168

_POSIX_SEMAPHORES aren't used if
HAVE_BROKEN_POSIX_SEMAPHORES is defined.  This currently
occurs on Solaris 8 (at least).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-03-30 17:11

Message:
Logged In: YES 
user_id=31435

Looks fine to me too.  Since Python switched to using 
semaphores on Linux for 2.3, it's unclear that there's a 
system that uses the condvar code anymore.  How will this 
get tested?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 11:49

Message:
Logged In: YES 
user_id=21627

This looks reasonable to me, but I may be missing something.

Tim, can you see a problem with that code?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=711835&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 18:36:46 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 10:36:46 -0800
Subject: [Patches] [ python-Patches-710127 ] Make "%c" % u"a" work
Message-ID: <E19049G-0003IY-00@sc8-sf-web1.sourceforge.net>

Patches item #710127, was opened at 2003-03-26 17:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
>Assigned to: Martin v. Löwis (loewis)
>Summary: Make "%c" % u"a" work

Initial Comment:
Currently "%c" % u"a" fails, while "%s" % u"a" works. 
This patch fixes this problem.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-31 20:36

Message:
Logged In: YES 
user_id=89016

Checked in as:
Misc/NEWS 1.708
Objects/stringobject.c 2.206
Lib/test/test_unicode.py 1.80
Lib/test/test_str.py 1.2
Lib/test/string_tests.py 1.30

BTW "%c" % 256 still fails. Should this be fixed too?
"%c" % 256 raises an OverflowError now, u"%c" %
(sys.maxunicode+1) raises a ValueError. At least they should
be changed to raise the same exception. If we fix "%c" % 256
what about chr()?
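
The exception types in question can be compared with a short script. This is a sketch for Python 3 rather than the 2.3 interpreter discussed here, so it rests on an assumption: bytes formatting stands in for the old 8-bit str, and str stands in for unicode. The helper exception_name() is hypothetical and exists only to record which exception type each expression raises.

```python
import sys


def exception_name(func):
    """Call func() and return the name of the exception it raises, or None."""
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None


too_big = sys.maxunicode + 1  # one past the largest valid code point

results = {
    # Formatting a one-character string with %c succeeds.
    "str %c with a character": "%c" % "a",
    # The 8-bit case: 256 does not fit in a single byte.
    "bytes %c with 256": exception_name(lambda: b"%c" % 256),
    # The text case: note the parentheses around the argument.
    "str %c past maxunicode": exception_name(lambda: "%c" % too_big),
    "chr() past maxunicode": exception_name(lambda: chr(too_big)),
}

for label, outcome in results.items():
    print(label, "->", outcome)
```

The parentheses matter: without them, % binds more tightly than +, so the addition would be applied to the already formatted string.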

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 15:18

Message:
Logged In: YES 
user_id=21627

Looks fine, please apply it. Also add a test case that fails
now but passes with the change, and add a NEWS entry.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 19:38:38 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 11:38:38 -0800
Subject: [Patches] [ python-Patches-701743 ] Reloading pseudo modules
Message-ID: <E190578-00030i-00@sc8-sf-web4.sourceforge.net>

Patches item #701743, was opened at 2003-03-11 19:59
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470

Category: Core (C code)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Nobody/Anonymous (nobody)
Summary: Reloading pseudo modules

Initial Comment:
Python allows putting something that is not a module into
sys.modules. Unfortunately reload() does not work with
such a pseudo module ("TypeError: reload() argument
must be module" is raised). This patch changes
Python/import.c::PyImport_ReloadModule() so that it
works with anything that has a __name__ attribute that
can be found in sys.modules.keys().

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-31 21:38

Message:
Logged In: YES 
user_id=89016

A use case can be found at
http://www.livinglogic.de/viewcvs/index.cgi/LivingLogic/xist/_xist/xsc.py?rev=2.235
(Look for the classmethod makemod() in the class Namespace).
This puts a class object into sys.modules instead of the
module that defines this class. This makes it possible to
derive from "modules". 

Of course the patch does not fully fix the problem, because
reload() does not repopulate the class object. Unfortunately
that's impossible to fix with Python code, as it's
impossible for Python code to distinguish the first import
from subsequent ones. If this was possible (and Python code
had access to the old "module"), a real reload could be
coded in pure Python for this specific case.

But with the patch at least it's possible to use the return
value of reload() afterwards to use the new "module".
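
A minimal sketch of the pseudo-module idea, written for Python 3 (an assumption; the report targets 2.3). The names Namespace, Extended and pseudo_ns are made up for illustration and are not taken from xsc.py; reload() itself is deliberately left out.

```python
import sys


class Namespace:
    """Hypothetical stand-in for a class that takes the place of a module."""
    greeting = "hello"


# Register the class object under a module name (the name is made up).
sys.modules["pseudo_ns"] = Namespace

import pseudo_ns  # the import system hands back whatever sys.modules holds

print(pseudo_ns is Namespace)   # the "module" is really the class


class Extended(pseudo_ns):      # so it can be subclassed like any other class
    greeting = "hello again"


print(Extended.greeting)
```

Because the object in sys.modules is an ordinary class, nothing in it is repopulated by a later import, which is the limitation described above.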

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 18:34

Message:
Logged In: YES 
user_id=21627

The patch looks fine now as far as it goes. I'm unsure what
the use case is, though: What object do you have in
sys.modules for which reload() would be meaningful? Can you
attach an example where reloading fails now but succeeds
with your patch applied?

As for reload modifying the module object: It needs to, or
else all clients would have to run reload; this would
include things like function default arguments. I guess it
returns a result for historical reasons.


----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-17 15:25

Message:
Logged In: YES 
user_id=89016

PyImport_ReloadModule() is only called by the implementation
of the reload builtin, so it seems that m==NULL can only
happen with broken extension modules. I've updated the patch
accordingly (raising a SystemError) and changed the error
case for a missing __name__ attribute to raise a TypeError
when an AttributeError is detected. Unfortunately this might
mask exceptions (e.g. when __name__ is implemented as a
property.)

Another problem is that reload() seems to repopulate the
existing module object when reloading real modules. Example:
Write a simple foo.py which contains "x = 1" and then:
>>> import foo
>>> foo.x
1
[ Now open your editor and change foo.py to "x = 2" ]
>>> foo2 = reload(foo)
>>> foo.x
2
>>> foo2.x
2
>>> print id(foo), id(foo2)
1077466884 1077466884
>>> 

Of course this can't work with pseudo modules. I wonder why
reload() has a return value at all, as it always modifies
its parameter for real modules.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-15 14:51

Message:
Logged In: YES 
user_id=21627

I think the exceptions need to be reworked: "must be a
module" now only occurs if m is NULL. Under what
circumstances could that happen? Failure to provide __name__
is passed through; shouldn't this get diagnosed in a better way?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=701743&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 19:54:33 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 11:54:33 -0800
Subject: [Patches] [ python-Patches-712900 ] sre fixes for lastindex and minimizing repeats+assertions
Message-ID: <E1905MX-0006pY-00@sc8-sf-web1.sourceforge.net>

Patches item #712900, was opened at 2003-03-31 10:54
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712900&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Greg Chapman (glchapman)
Assigned to: Nobody/Anonymous (nobody)
Summary: sre fixes for lastindex and minimizing repeats+assertions

Initial Comment:
The attached patch fixes two bugs in _sre.c; it also 
does a bit of reorganization.  

First the bugs.  672491 points out that lastindex is 
calculated differently in 2.3 than in previous versions.  
This patch restores the previous behavior.  Since 
lastindex cannot be restored (when backtracking) from 
lastmark alone, it is now saved and restored 
independently (by the LASTMARK_SAVE and 
RESTORE macros).
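
As a rough illustration of why lastindex has to survive backtracking, here is a small check using the re module of a current Python 3 (an assumption; the patch itself is against 2.3). The patterns are chosen for illustration and are not quoted from bug 672491.

```python
import re

# Group 2 is tried, then abandoned when the trailing "b" fails to match,
# so only group 1 takes part in the final match.
m = re.match(r"(a)(b)?b", "ab")
print(m.groups(), m.lastindex)

# Group 1 is tried and then given up entirely; no group matched at all.
m2 = re.match(r"(a)?a", "a")
print(m2.groups(), m2.lastindex)
```

In both cases a group is entered and later undone, so lastindex has to fall back to its earlier value rather than keep the index of the abandoned group.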

The second bug appears when minimizing repeats are 
combined with assertions:

>>> re.match('([ab]*?)(?=(b)?)c', 'abc').groups()
('ab', 'b')

The second group should be None, since the 'b' is 
consumed by the first group.  To fix this, it is necessary 
to save lastmark before attempting to match the tail in 
OP_MIN_UNTIL and to restore it if the tail fails to match.
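
The expected behaviour can be checked with a short script. This is a sketch for a current Python 3 (an assumption; the session above is from 2.3), where the stale second group should no longer appear.

```python
import re

# Trace of the minimizing repeat:
#   group 1 = ""   -> lookahead sees "a", then "c" fails at position 0
#   group 1 = "a"  -> lookahead captures "b", then "c" fails at position 1
#   group 1 = "ab" -> lookahead sees "c", so group 2 must be unset; "c" matches
m = re.match(r"([ab]*?)(?=(b)?)c", "abc")
print(m.groups())
```

The capture made by the lookahead on the second attempt is exactly the mark that must be discarded when that attempt fails.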

The reorganization has to do with the handling of the 
SRE_STATE's lastmark and mark array.  The mark 
array tracks the start and end of capturing groups; 
lastmark is the highest index in the array so far 
encountered.  Previously, whenever lastmark was 
restored back to a lower value (in 2.3a2 this is done in 
the lastmark_restore function), the tail of the mark array 
was NULLed out (using memset).  This patch adopts the 
rule that all indexes greater than lastmark are invalid, so 
restoring lastmark does not also require clearing the 
tail.  To ensure that indexes <= lastmark have valid 
pointers, OP_MARK checks if lastmark is being 
increased by more than one; if so, it NULLs out the 
intervening pointers.  This rule also required changes to 
the GROUPREF opcodes and the state_getslice 
function to ensure that they do not access indexes 
greater than lastmark.  For consistency, lastmark is 
now initialized to -1, to indicate that no entries in the 
mark array are valid.
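
The rule can be modelled in a few lines of Python. This is only a toy sketch of the bookkeeping described above, not the _sre.c code: the class MarkState and its method names are hypothetical, and lastindex is left out.

```python
class MarkState:
    """Toy model of the mark array; not the real SRE_STATE structure."""

    def __init__(self, size):
        self.mark = [None] * size   # start/end positions of capturing groups
        self.lastmark = -1          # -1: no entry in the array is valid yet

    def set_mark(self, i, pos):
        # Like OP_MARK: if lastmark jumps by more than one, clear the
        # skipped slots so every index <= lastmark holds a defined value.
        if i > self.lastmark:
            for j in range(self.lastmark + 1, i):
                self.mark[j] = None
            self.lastmark = i
        self.mark[i] = pos

    def restore(self, saved_lastmark):
        # No clearing here: slots above lastmark are simply treated as invalid.
        self.lastmark = saved_lastmark

    def get(self, i):
        # Readers must not look past lastmark.
        return self.mark[i] if i <= self.lastmark else None


state = MarkState(4)
state.set_mark(0, 0)
state.set_mark(1, 1)
saved = state.lastmark
state.set_mark(2, 1)
state.set_mark(3, 2)
state.restore(saved)              # backtrack: slots 2 and 3 become invalid
print(state.get(2), state.mark[2])
state.set_mark(3, 5)              # skips slot 2, which is cleared now
print(state.get(2), state.get(3))
```

After restore() the stale value is still stored in slot 2 but can no longer be read through get(); it is only cleared when a later mark skips over it.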

Needless to say, the reorganization is not necessary to 
fix the bugs; it may be a bad idea.  It seems to be 
marginally faster than a version that fixes the bugs but is 
similar to the current code (including a memset inside 
the LASTMARK_RESTORE macro).

One other thing.  I have removed a test for string == 
Py_None from state_getslice, since I can't find any way 
for string to be Py_None at that point (string is always 
the object providing the text to be searched; if it were 
Py_None, an exception should be raised by the 
getstring function called by state_init).  Perhaps I 
missed something?
 

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=712900&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 21:30:47 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 13:30:47 -0800
Subject: [Patches] [ python-Patches-536883 ] SimpleXMLRPCServer auto-docing subclass
Message-ID: <E1906rf-0002xR-00@sc8-sf-web1.sourceforge.net>

Patches item #536883, was opened at 2002-03-29 20:52
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470

Category: Library (Lib)
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Martin v. Löwis (loewis)
Summary: SimpleXMLRPCServer auto-docing subclass

Initial Comment:
This SimpleXMLRPCServer subclass automatically serves 
HTML documentation, generated using pydoc, in response 
to an HTTP GET request (XML-RPC always uses POST).

Here are some examples:
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc1.py
http://www.sweetapp.com/cgi-bin/xmlrpc-test/rpc2.py


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 23:30

Message:
Logged In: YES 
user_id=21627

Ok, leave the naming as-is, unless other reviewers comment
in one direction or the other.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-03-31 11:22

Message:
Logged In: YES 
user_id=108973

Write test function: ok
Write documentation: ok

>If the module is named, say, DocXMLRPCServer, there is 
>no need to have the Doc prefix on the class names.

Hmmm. If you look at the core BaseHTTPRequestHandler 
derived classes,  each one is prefixed to match the module 
that it is found in. The only two modules that I can think of 
with identical class names are cStringIO and StringIO, which 
theoretically provide identical semantics.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 11:07

Message:
Logged In: YES 
user_id=21627

I see. The code is fine, but it needs to come with a test 
function, to operate the module as a program. I suggest that 
the test server provides the get_source_code() operation just 
as your demo client does; the docstring of the class may 
provide an xmlrpclib fragment that retrieves the source code 
(AFAICT, the source code is not directly accessible through 
a URL, is it?)

I also recommend that you consider renaming the classes: 
If the module is named, say, DocXMLRPCServer, there is no 
need to have the Doc prefix on the class names. Instead, 
they can be named just "XMLRPCServer" etc.


----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-03-31 08:56

Message:
Logged In: YES 
user_id=108973

>I'm not sure how to place this. Is this an extension to
>pydoc? 

No. This module provides subclasses for 
SimpleXMLRPCServer and CGIXMLRPCServer. These 
subclasses serve pydoc-style documentation when you point 
your browser at them - see the examples in the patch 
summary.
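
For readers on a current Python: the Python 3 standard library has classes with this behaviour in xmlrpc.server (DocXMLRPCServer and DocCGIXMLRPCRequestHandler). Whether their interface matches the file attached to this patch is not something this thread shows, so treat the following as a sketch against the Python 3 API only. The function add and the title strings are made up.

```python
from xmlrpc.server import DocCGIXMLRPCRequestHandler


def add(a, b):
    """Return the sum of a and b."""
    return a + b


handler = DocCGIXMLRPCRequestHandler()
handler.set_server_title("Demo service")          # made-up title
handler.set_server_name("Demo XML-RPC methods")   # made-up name
handler.register_function(add)

# Build the pydoc-style HTML page that describes the registered methods.
page = handler.generate_html_documentation()
print(page[:200])
```

The CGI variant is used here because it needs no listening socket; the server variant is set up the same way and additionally takes an address.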

> Should it go into Tools, or into Lib, or into some
> existing module?

The attached file should go into Lib.

> If this goes into Lib somewhere, it lacks documentation.

Fair enough. Conditional on me writing documentation, is this 
contribution acceptable as is?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-30 16:59

Message:
Logged In: YES 
user_id=21627

I'm not sure how to place this. Is this an extension to
pydoc? Should it go into Tools, or into Lib, or into some
existing module?

If this goes into Lib somewhere, it lacks documentation.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2003-02-10 21:25

Message:
Logged In: YES 
user_id=108973

Patch 473586 has been accepted so this patch can be 
accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 21:26

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-04-04 19:55

Message:
Logged In: YES 
user_id=108973

Sorry, I was sloppy about the description:

This patch is dependent on patch 473586:
[473586] SimpleXMLRPCServer - fixes and CGI

So please don't check this in until that patch is accepted.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-04-04 19:31

Message:
Logged In: YES 
user_id=6380

Looks cute to me. Fredrik, any problem if I just check this
in?

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=536883&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 21:33:31 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 13:33:31 -0800
Subject: [Patches] [ python-Patches-709178 ] remove -static option from cygwinccompiler
Message-ID: <E1906uJ-00037Q-00@sc8-sf-web1.sourceforge.net>

Patches item #709178, was opened at 2003-03-25 03:55
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470

Category: Distutils and setup.py
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: John Kabir Luebs (jkluebs)
Assigned to: Jason Tishler (jlt63)
Summary: remove -static option from cygwinccompiler

Initial Comment:
Currently, the cygwinccompiler.py compiler handling in
distutils is invoking the cygwin and mingw compilers
with the -static option. 

Logically, this means that the linker should choose to
link to static libraries instead of shared/dynamically
linked libraries.

Current win32 binutils expect import libraries to have
a .dll.a suffix and static libraries to have .a suffix.
If -static is passed, it will skip the .dll.a
libraries. This is a pain if one has a tree with both
static and dynamic libraries using this naming
convention and wishes to use the dynamic libraries.

The -static option being passed in distutils is to get
around a bug in old versions of binutils where it would
get confused when it found the DLLs themselves.

The decision to use static or shared libraries is site
or package specific, and should be left to the setup
script or to command line options.
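
To make the effect of the naming convention concrete, here is a deliberately simplified model of the lookup. It is NOT the real ld search algorithm; the function pick_library and the preference order it encodes are assumptions based only on the description above.

```python
def pick_library(name, available, static=False):
    """Toy model: choose the file a link against `name` would pick up."""
    if static:
        # With -static the import library (.dll.a) is skipped.
        candidates = ["lib%s.a" % name]
    else:
        # Assumed order: import library first, then the static archive.
        candidates = ["lib%s.dll.a" % name, "lib%s.a" % name]
    for filename in candidates:
        if filename in available:
            return filename
    return None


tree = {"libfoo.dll.a", "libfoo.a"}   # a tree that ships both flavours
print(pick_library("foo", tree))
print(pick_library("foo", tree, static=True))
```

With both flavours installed, a hard-wired -static leaves the setup script no way to ask for the dynamic one.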

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-31 23:33

Message:
Logged In: YES 
user_id=21627

jlt63: Your proposed changes all sound fine.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-31 19:02

Message:
Logged In: YES 
user_id=86216

jkluebs> I can help with testing. I have access to W2K
jkluebs> and Win98 (ugh) boxen. I don't mind
jkluebs> installing a few older toolchains if you
jkluebs> think that's necessary.

Thanks for the offer. I'm set up for the
current Cygwin and Mingw tool chains. Let's
wait to see if testing with older ones is
necessary.

jkluebs> I think any C/C++ python extension using
jkluebs> plain distutils (no fancy hacks added on) and
jkluebs> has one or more DLL dependencies is a good
jkluebs> test case.

Can you point me to one that builds OOTB
under Python 2.2.2?

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-31 18:58

Message:
Logged In: YES 
user_id=86216

loewis> I'm in favour of applying this patch, and
loewis> also of patches that mandate recent Cygwin
loewis> releases;

I would like to apply an enhanced version of
this patch.  By enhanced, I mean using "gcc
-shared" (no more dllwrap and gcc -mdll) and
removing redundant gcc options, etc.

Additionally, I would like to fix
get_versions() so it can deal with versions
that only have two components (e.g., 3.2) as
opposed to requiring three (e.g. 2.95.3).
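
One way the version parsing could accept both forms is sketched below. This is a hypothetical helper, not the actual get_versions() code in cygwinccompiler.py, and the sample strings are made up.

```python
import re

# Two mandatory numeric components, with an optional third one.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?")


def parse_version(text):
    """Return the first version number in text as a tuple of ints, or None."""
    match = _VERSION_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


print(parse_version("gcc 3.2"))       # two components
print(parse_version("gcc 2.95.3"))    # three components
```

Returning tuples of ints keeps comparisons numeric, so (2, 95, 3) sorts below (3, 2).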

Are these changes acceptable?

loewis> if such patches are implemented, the minimum
loewis> required Cygwin version should be stated
loewis> somewhere.

I propose that the currently available Cygwin
and Mingw tool chains be that above stated
minimum. Is this acceptable? Unfortunately, I
have no idea where the above stated
"somewhere" should be.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-29 00:31

Message:
Logged In: YES 
user_id=87160

I can help with testing. I have access to W2K and Win98
(ugh) boxen. I don't mind installing a few older toolchains
if you think that's necessary.

I think any C/C++ python extension using plain distutils (no
fancy hacks added on) and has one or more DLL dependencies
is a good test case.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 23:15

Message:
Logged In: YES 
user_id=21627

I'm in favour of applying this patch, and also of patches
that mandate recent Cygwin releases; if such patches are
implemented, the minimum required Cygwin version should be
stated somewhere.

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 22:16

Message:
Logged In: YES 
user_id=86216

John, would you be willing to help test or supply me with
test cases? I have built exactly one Win32 extension.

----------------------------------------------------------------------

Comment By: John Kabir Luebs (jkluebs)
Date: 2003-03-28 21:56

Message:
Logged In: YES 
user_id=87160

The -mdll --entry DllMain@12  option is guarded for an old
version of gcc that did not have the correct specs to accept
-shared.
I didn't touch it, even though it's crazy if anyone is using
such an old and buggy toolchain.

--shared and --dll are equivalent as far as ld is concerned. 

----------------------------------------------------------------------

Comment By: Jason Tishler (jlt63)
Date: 2003-03-28 19:41

Message:
Logged In: YES 
user_id=86216

Note that I only have minimal experience building
Win32 extension modules...

This patch works "fine" with my *very* limited testing.
Specifically, I successfully rebuilt the Win32 readline
module with it applied.

BTW, this area of Distutils probably should be revisited
to bring it up to date. For example, the "-mdll --entry
_DllMain@12" options could be replaced by "-shared".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-28 01:03

Message:
Logged In: YES 
user_id=21627

Jason, can you take a look? If not, please unassign it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=709178&group_id=5470


From noreply@sourceforge.net  Mon Mar 31 22:16:52 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 31 Mar 2003 14:16:52 -0800
Subject: [Patches] [ python-Patches-710127 ] Make "%c" % u"a" work
Message-ID: <E1907aG-0008Tc-00@sc8-sf-web3.sourceforge.net>

Patches item #710127, was opened at 2003-03-26 17:08
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
Assigned to: Martin v. Löwis (loewis)
>Summary: Make "%c" % u"a" work

Initial Comment:
Currently "%c" % u"a" fails, while "%s" % u"a" works. 
This patch fixes this problem.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-04-01 00:16

Message:
Logged In: YES 
user_id=21627

I can't see why "%c" % 256 should pass; interpreting the 256
as a Unicode ordinal is stretching things too much (if 256
was a Unicode ordinal, then 255 should be a Unicode ordinal
too, and you would have to take into account the system
encoding).

I would think it would be consistent if both gave
OverflowError (Result too large to be represented); this
deserves another NEWS entry.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-03-31 20:36

Message:
Logged In: YES 
user_id=89016

Checked in as:
Misc/NEWS 1.708
Objects/stringobject.c 2.206
Lib/test/test_unicode.py 1.80
Lib/test/test_str.py 1.2
Lib/test/string_tests.py 1.30

BTW "%c" % 256 still fails. Should this be fixed too?
"%c" % 256 raises an OverflowError now, u"%c" %
(sys.maxunicode+1) raises a ValueError. At least they should
be changed to raise the same exception. If we fix "%c" % 256
what about chr()?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 15:18

Message:
Logged In: YES 
user_id=21627

Looks fine, please apply it. Also add a test case that fails
now but passes with the change, and add a NEWS entry.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=710127&group_id=5470