From gnn at neville-neil.com  Mon Aug  1 02:33:37 2005
From: gnn at neville-neil.com (George V. Neville-Neil)
Date: Mon, 01 Aug 2005 09:33:37 +0900
Subject: [Python-Dev] Extension of struct to handle non byte aligned values?
Message-ID: <m21x5eh4wu.wl%gnn@neville-neil.com>

Hi,

I'm attempting to write a Packet class, and a few other classes for
use in writing protocol conformance tests.  For the most part this is
going well except that I'd like to be able to pack and unpack byte
strings with values that are not 8 bit based quantities.  As an
example, I'd like to be able to grab just a single bit from a byte
string, and I'd also like to modify, for example, 13 bits.  These are
all reasonable quantities in an IPv4 packet.  I have looked at doing
this all in Python within my own classes but I believe this is a
general extension that would be good for the struct module.  I could
also write a new module, bitstruct, to do this but that seems silly.
I did not find anything out there that handles this case, so if I
missed that then please let me know.

My proposal would be for a new format character, 'z', which is
followed by a position in bits from 0 to 31 so that we get either a
byte, halfword, or longword based byte string back and then an
optional 'r' (for run length, and because 'l' and 's' are already
used) followed by a number of bits.  The default length is 1 bit.  I
believe this is sufficient for most packet protocols I know of
because, for the most part, protocols try to be 32- or 64-bit aligned.
This would ALWAYS unpack into an int type.  So, you would see this:

bytestring = pack("z0r3z3r13", flags, fragment)

this would pack the flags and fragment offset in a packet at bits 0-2
and 3-15 respectively and return a 2 byte byte-string.

header_length = unpack("z4r4", packet.bytes)

would retrieve the header length from the packet, which is from bits 4
through 7.
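
For comparison, the same fields can be read and written in plain
Python with shifts and masks. The sketch below is only an illustration
and not the proposed struct extension: the helper names pack_bits and
unpack_bits are hypothetical, bits are assumed to be numbered from the
most significant bit of the first byte (network order), and
int.from_bytes/int.to_bytes come from Python 3 rather than the Python
of this thread.

```python
def unpack_bits(data: bytes, pos: int, length: int = 1) -> int:
    """Return `length` bits starting at bit `pos` (bit 0 = MSB of data[0])."""
    total_bits = len(data) * 8
    if pos < 0 or length < 1 or pos + length > total_bits:
        raise ValueError("bit field out of range")
    value = int.from_bytes(data, "big")
    shift = total_bits - pos - length  # bits to the right of the field
    return (value >> shift) & ((1 << length) - 1)


def pack_bits(fields) -> bytes:
    """Pack (value, length) pairs MSB-first; total length must fill whole bytes."""
    acc = 0
    nbits = 0
    for value, length in fields:
        if not 0 <= value < (1 << length):
            raise ValueError("value does not fit in %d bits" % length)
        acc = (acc << length) | value
        nbits += length
    if nbits % 8:
        raise ValueError("fields do not add up to a whole number of bytes")
    return acc.to_bytes(nbits // 8, "big")


# IPv4 flags (3 bits) followed by the fragment offset (13 bits).
word = pack_bits([(0b010, 3), (185, 13)])
flags = unpack_bits(word, 0, 3)
fragment = unpack_bits(word, 3, 13)
```

A format-string interface like the one proposed above could be layered
on top of helpers of this kind.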

Thoughts?

Thanks,
George


From greg.ewing at canterbury.ac.nz  Mon Aug  1 04:57:18 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 01 Aug 2005 14:57:18 +1200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <42EB8402.10902@gmail.com>
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com> <42EB8402.10902@gmail.com>
Message-ID: <42ED8F8E.4070404@canterbury.ac.nz>

Nick Coghlan wrote:
> New Hierarchy
> =============
> 
> Raisable (formerly Exception)
> +-- CriticalException (new)
>      +-- KeyboardInterrupt
>      +-- MemoryError
>      +-- SystemError
> +-- ControlFlowException (new)
>      +-- GeneratorExit
>      +-- StopIteration
>      +-- SystemExit
> +-- Exception (formerly StandardError)

If CriticalException and ControlFlowException are to be
siblings of Exception rather than subclasses of it, they
should be renamed so that they don't end with "Exception".
Otherwise there will be a confusing mismatch between the
actual inheritance hierarchy and the one suggested by
the naming.
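
The mismatch can be made concrete with a small sketch. The class names
below mirror the quoted proposal but are stand-ins only: the built-in
Exception cannot be redefined, so the proposed "Exception" is spelled
Exception_ here, and Raisable is assumed to sit on top of today's
BaseException.

```python
class Raisable(BaseException):
    """Stand-in for the proposed root of everything that can be raised."""

class CriticalException(Raisable):
    """Sibling of Exception_ in the proposal, despite its name."""

class ControlFlowException(Raisable):
    """Also a sibling of Exception_."""

class Exception_(Raisable):
    """Stand-in for the proposed 'Exception (formerly StandardError)'."""


def caught_by_exception_handler(exc_class) -> bool:
    """Report whether 'except Exception_:' would catch an instance."""
    try:
        raise exc_class()
    except Exception_:
        return True
    except Raisable:
        return False


name_suggests = CriticalException.__name__.endswith("Exception")
actually_is = issubclass(CriticalException, Exception_)
```

Here name_suggests is true while actually_is is false, which is the
confusion described above.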

Also, I'm not entirely happy about Exception no longer
being at the top, because so far the word "exception"
in relation to Python has invariably meant "anything
that can be raised". This terminology is even embedded
in the syntax with the try-except statement. Changing
this could lead to some awkward circumlocutions
in the documentation and confusion in discussions.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From stephen at xemacs.org  Mon Aug  1 08:54:17 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 01 Aug 2005 15:54:17 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1122607673.9665.38.camel@geddy.wooz.org> (Barry Warsaw's
	message of "Thu, 28 Jul 2005 23:27:53 -0400")
References: <42E93940.6080708@v.loewis.de>
	<1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
Message-ID: <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "BAW" == Barry Warsaw <barry at python.org> writes:

    BAW> So are you saying that moving to svn will let us do more long
    BAW> lived branches?  Yay!

Yes, but you still have to be disciplined about it.  svn is not much
better than cvs about detecting and ignoring spurious conflicts due to
code that gets merged from branch A to branch B, then back to branch
A.  Unrestricted cherry-picking is still out.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From metawilm at gmail.com  Mon Aug  1 10:49:46 2005
From: metawilm at gmail.com (Willem Broekema)
Date: Mon, 1 Aug 2005 10:49:46 +0200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <bbaeab1005073109217f3a33f1@mail.gmail.com>
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
Message-ID: <f6bc9b49050801014968cac94f@mail.gmail.com>

On 7/31/05, Brett Cannon <bcannon at gmail.com> wrote:
> On 7/31/05, Willem Broekema <metawilm at gmail.com> wrote:
> > It does not seem right to me to think of KeyboardInterrupt as a means
> > to cause program halting. An interpreter could in principle recover
> > from it and resume execution of the program.
> >
> 
> Same goes for MemoryError as well, but you probably don't want to
> catch that exception either.

Well, a possible scenario is that if allocation of memory fails, then
the interpreter (not the Python program in it) can detect that it is
not caught explicitly and print possible ways of execution, like "try
the allocation again" or "abort the program", letting the user
determine how to proceed. Although in this case immediately retrying
the allocation will fail again, so the user has to have a way to free
some objects in the meantime.

I realize it's major work to add recovery features to the CPython
interpreter, so I don't think CPython will have anything like it soon
and therefore also Python-the-language will not. Instead, my reason
for mentioning this is to get the _concept_ of recoveries across. I
think including (hypothetical, for now) recovery features in a
discussion about exceptions is valuable, because that influences
whether one thinks a label like "critical" for an exception is
appropriate.

I'm working on an implementation of Python in Common Lisp. The CL
condition system offers recovery features, so this implementation
could, too. Instead of the interpreter handling the interrupt in an
application-specific way, as Fred said, the interpreter could handle
the interrupt by leaving the choice to the user.

Concretely, this is how KeyboardInterrupt is handled by a CL
interpreter, and thus also how a Python interpreter could handle it:

(defun foo () (loop for i from 0 do (format t "~A " i)))
(foo)
=> 0 1 2 3 <CTRL-C>
Error: Received signal number 2 (Keyboard interrupt)  [condition type:
INTERRUPT-SIGNAL]
Restart actions (select using :continue):
 0: continue computation
 1: Return to Top Level (an "abort" restart).
 2: Abort entirely from this process.
 
:continue 0
=> 4 5 6 ...
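
A rough Python approximation of the same idea is sketched below. It is
not equivalent to CL restarts: Python unwinds the stack when the
exception propagates, so the sketch can only retry or skip a whole
step rather than resume in the middle of one. The function name
run_with_restarts, the choose_restart callback and the three choice
strings are all hypothetical.

```python
def run_with_restarts(steps, choose_restart):
    """Run callables in order; on KeyboardInterrupt let the caller pick a restart.

    choose_restart(index) returns "continue" (retry the interrupted step),
    "skip" (move on to the next step) or anything else (re-raise, i.e. abort).
    """
    results = []
    i = 0
    while i < len(steps):
        try:
            results.append(steps[i]())
            i += 1
        except KeyboardInterrupt:
            choice = choose_restart(i)
            if choice == "continue":
                continue          # retry the same step
            elif choice == "skip":
                i += 1            # give up on this step only
            else:
                raise             # abort: propagate the interrupt
    return results
```

In an interactive interpreter choose_restart would prompt the user,
much like the ":continue 0" line in the transcript above.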

> But it doesn't sound like you are arguing against putting
> KeyboardInterrupt under CriticalException, but just the explanation I
> gave, right?

I hope the above makes the way I'm thinking more clear.  Like Phillip
J. Eby, I think that labeling KeyboardInterrupt a CriticalException
seems wrong; it is not an error and not critical.


- Willem

From mwh at python.net  Mon Aug  1 12:26:15 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 01 Aug 2005 11:26:15 +0100
Subject: [Python-Dev] Extension of struct to handle non byte aligned
 values?
In-Reply-To: <m21x5eh4wu.wl%gnn@neville-neil.com> (George V. Neville-Neil's
	message of "Mon, 01 Aug 2005 09:33:37 +0900")
References: <m21x5eh4wu.wl%gnn@neville-neil.com>
Message-ID: <2m4qaa0x88.fsf@starship.python.net>

"George V. Neville-Neil" <gnn at neville-neil.com> writes:

> Hi,
>
> I'm attempting to write a Packet class, and a few other classes for
> use in writing protocol conformance tests.  For the most part this is
> going well except that I'd like to be able to pack and unpack byte
> strings with values that are not 8 bit based quantities.

[...]

> Thoughts?

Well, the main thing that comes to mind is that I wouldn't regard the
struct interface as being something totally wonderful and perfect.

I am aware of a few attempts to make up a better interface, such as
ctypes and Bob's rather similar looking ptypes from macholib:

http://svn.red-bean.com/bob/py2app/trunk/src/macholib/ptypes.py

and various silly unreleased things I've done.  They all work on the
basic idea of a class schema that describes the binary structure, eg:

class Sound(Message):
    code = 0x06
    layout = [('mask', BYTE()),
              ('vol', CDI(1, SDI(BYTE(), 1/255.0), 1.0)),
              ('attenuation', CDI(2, SDI(BYTE(), 1/64.0), 1.0)),
              ('entitychan', SHORT()),
              ('soundnum', BYTE()),
              ('origin', COORD()*3)]

You may want to do something similar (presumably the struct module or
some other c stuff would be under the hood somewhere).
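
A minimal sketch of that approach with the struct module under the
hood might look as follows. It is not the API of ctypes, ptypes or the
unreleased code mentioned above: the Message base class, its pack and
unpack methods and the use of plain struct format characters in the
layout are assumptions made for illustration.

```python
import struct


class Message:
    """Base class: subclasses describe their wire format in `layout`."""

    layout = []  # list of (field name, struct format character)

    @classmethod
    def _format(cls) -> str:
        # "!" selects network byte order with no padding between fields.
        return "!" + "".join(code for _, code in cls.layout)

    @classmethod
    def pack(cls, **fields) -> bytes:
        return struct.pack(cls._format(), *(fields[name] for name, _ in cls.layout))

    @classmethod
    def unpack(cls, data: bytes) -> dict:
        values = struct.unpack(cls._format(), data)
        return dict(zip((name for name, _ in cls.layout), values))


class Sound(Message):
    # A cut-down version of the schema above: byte, short, byte.
    layout = [("mask", "B"), ("entitychan", "H"), ("soundnum", "B")]


raw = Sound.pack(mask=1, entitychan=0x0203, soundnum=4)
decoded = Sound.unpack(raw)
```

Bit-level fields could then be added as another kind of layout entry
without touching struct itself.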

I don't really see a need to change CPython here, unless some general
binary parsing scheme becomes best-of-breed and a candidate for stdlib
inclusion.

Cheers,
mwh

PS: This is probably more comp.lang.python material.

-- 
  The use of COBOL cripples the mind; its teaching should, therefore,
  be regarded as a criminal offence.
           -- Edsger W. Dijkstra, SIGPLAN Notices, Volume 17, Number 5

From mwh at python.net  Mon Aug  1 12:33:26 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 01 Aug 2005 11:33:26 +0100
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <f6bc9b49050801014968cac94f@mail.gmail.com> (Willem Broekema's
	message of "Mon, 1 Aug 2005 10:49:46 +0200")
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
Message-ID: <2mzms2ymix.fsf@starship.python.net>

Willem Broekema <metawilm at gmail.com> writes:

> I realize it's major work to add recovery features to the CPython
> interpreter, so I don't think CPython will have anything like it soon
> and therefore also Python-the-language will not. Instead, my reason
> for mentioning this is to get the _concept_ of recoveries across. I
> think including (hypothetical, for now) recovery features in a
> discussion about exceptions is valuable, because that influences
> whether one thinks a label like "critical" for an exception is
> appropriate.

Heh, I talked about this at EuroPython... 

http://starship.python.net/crew/mwh/recexc.pdf

The technical barriers are insignificant, really.

Cheers,
mwh

-- 
  Our Constitution never promised us a good or efficient government,
  just a representative one. And that's what we got.
      -- http://www.advogato.org/person/mrorganic/diary.html?start=109

From stephen at xemacs.org  Mon Aug  1 15:52:06 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 01 Aug 2005 22:52:06 +0900
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <f6bc9b49050801014968cac94f@mail.gmail.com> (Willem Broekema's
	message of "Mon, 1 Aug 2005 10:49:46 +0200")
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
Message-ID: <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Willem" == Willem Broekema <metawilm at gmail.com> writes:

    Willem> I hope the above makes the way I'm thinking more clear.
    Willem> Like Phillip J. Eby, I think that labeling
    Willem> KeyboardInterrupt a CriticalException seems wrong; it is
    Willem> not an error and not critical.

Uh, according to your example in Common LISP it is indeed an error,
and if an unhandled signal whose intended interpretation is "drop the
gun and put your hands on your head!" isn't critical, what is?<wink>
I didn't miss your point, but I don't see a good reason to oppose that
label based on the usual definitions of the words or Common LISP
usage, either.

It seems to me the relevant question is "is it likely that catching
KeyboardInterrupt with 'except Exception:' will get sane behavior from
a generic user-defined handler?"  I think not; usually you'd like
generic error recovery to _not_ bother the user, but KeyboardInterrupt
sort of demands interaction with the user, no?  So you're going to
need a separate routine for KeyboardInterrupt, anyway.  I expect
that's going to be the normal case.

So I would say KeyboardInterrupt should derive from CriticalException,
not from Exception.
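
The handler shape this implies can be sketched as below. The sketch
relies on later Python behaviour, where KeyboardInterrupt derives from
BaseException rather than Exception (true from Python 2.5 on, not in
the Python of this thread); the function name run and the returned
tuples are hypothetical.

```python
def run(task):
    """Run task() with generic error recovery plus a separate interrupt routine."""
    try:
        return ("ok", task())
    except KeyboardInterrupt:
        # Separate routine: this is where interaction with the user would go.
        return ("interrupted", None)
    except Exception as exc:
        # Generic recovery that should not bother the user.
        return ("error", type(exc).__name__)


def divide_by_zero():
    return 1 / 0


def interrupted():
    raise KeyboardInterrupt
```

With this layout the generic 'except Exception' clause never sees the
interrupt, even if the KeyboardInterrupt clause were removed.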

I definitely agree that implementing recovery features is a good idea,
and in interactive operation (or with an option to the interpreter),
to allow for such recovery in the interpreter itself.  For example,
the interpreter could keep a small nest egg of memory for the purpose
of interacting with the user; this would be harder for a program to
do.  And in many quickie scripts it would be convenient if the
interpreter would drop into interactive mode, not die, if the program
encounters a critical exception.

But it's still a critical exception to the program written in Python,
even if it's easy for the user to handle and the interpreter provides
the capability to pass the buck to the user.  The program has
completely lost control!

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN

From gabriel.becedillas at corest.com  Mon Aug  1 19:36:42 2005
From: gabriel.becedillas at corest.com (Gabriel Becedillas)
Date: Mon, 01 Aug 2005 14:36:42 -0300
Subject: [Python-Dev] Syscall Proxying in Python
Message-ID: <42EE5DAA.8040200@corest.com>

Hi,
We embedded Python 2.0.1 in our product a few years ago and we'd like to
upgrade to Python 2.4.1. This was not a simple task, because we needed
to execute syscalls on a remote host. We modified Python's source code
in several places to call our own versions of some functions. For
example, instead of calling fopen(...), the source code was modified to
call remote_fopen(...), and the same was done with other libc functions.
Socket functions were hooked too (we modified socket.c), Windows
Registry functions, etc.
There are some syscalls that we don't want to execute remotely. For
example when importing a module. That has to be local, and we didn't
modify that.
Python scripts are executed locally, but syscalls are executed on a
remote host, thus giving the illusion that the script is executing on
the remote host.
As I said before, we're in the process of upgrading and we don't want to
make such unmaintainable changes to Python's code. We'd like to make as
few changes as possible. The approach we're trying this time is far less
intrusive: We'd like to link Python with special libraries that override
those functions that we want to execute remotely. This way the only code
that has to be changed is the one that has to be executed locally.
I wrote this mail to ask you guys for any useful advice in making these
changes to Python's core. The only places I can think of right now that
have to execute locally all the time are import.c and pythonrun.c, but
I'm not sure at all.
Maybe you guys can figure out another way to achieve what we need.
Thanks in advance.

-- 


Gabriel Becedillas
Developer
CORE SECURITY TECHNOLOGIES

Florida 141 - 2º cuerpo - 7º piso
C1005AAC Buenos Aires - Argentina
Tel/Fax: (54 11) 5032-CORE (2673)
http://www.corest.com


From abo at minkirri.apana.org.au  Mon Aug  1 19:52:03 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 01 Aug 2005 10:52:03 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <42E93940.6080708@v.loewis.de>
	<1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <1122918723.9680.33.camel@warna.corp.google.com>

On Sun, 2005-07-31 at 23:54, Stephen J. Turnbull wrote:
> >>>>> "BAW" == Barry Warsaw <barry at python.org> writes:
> 
>     BAW> So are you saying that moving to svn will let us do more long
>     BAW> lived branches?  Yay!
> 
> Yes, but you still have to be disciplined about it.  svn is not much
> better than cvs about detecting and ignoring spurious conflicts due to
> code that gets merged from branch A to branch B, then back to branch
> A.  Unrestricted cherry-picking is still out.

Yeah. IMHO the saddest thing about SVN is it doesn't do branch/merge
properly. All the other cool stuff like renames etc is kinda undone by
that. For a definition of properly, see;

http://prcs.sourceforge.net/merge.html

This is why I don't bother migrating any existing CVS projects to SVN;
the benefits don't yet outweigh the pain of migrating. For new projects
sure, SVN is a better choice than CVS.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From abo at minkirri.apana.org.au  Mon Aug  1 20:08:51 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 01 Aug 2005 11:08:51 -0700
Subject: [Python-Dev] Syscall Proxying in Python
In-Reply-To: <42EE5DAA.8040200@corest.com>
References: <42EE5DAA.8040200@corest.com>
Message-ID: <1122919731.9688.43.camel@warna.corp.google.com>

On Mon, 2005-08-01 at 10:36, Gabriel Becedillas wrote:
> Hi,
> We embedded Python 2.0.1 in our product a few years ago and we'd like to
> upgrade to Python 2.4.1. This was not a simple task, because we needed
> to execute syscalls on a remote host. We modified Python's source code
> in several places to call our own versions of some functions. For
> example, instead of calling fopen(...), the source code was modified to
> call remote_fopen(...), and the same was done with other libc functions.
> Socket functions were hooked too (we modified socket.c), Windows
> Registry functions, etc.

Wow... you guys sure did it the hard way. If you had done it at the
Python level, you would have had a much easier time of both implementing
and updating it.

As an example, have a look at my osVFS stuff. This is a replacement for
the os module and open() that tricks Python into using a virtual file
system;

http://minkirri.apana.org.au/~abo/projects/osVFS
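
A minimal sketch of the Python-level approach is shown below. It is
not the osVFS code: the name proxied_open is hypothetical, and an
in-memory dict stands in for whatever transport would fetch file
contents from the remote host. Only reads of known paths are
redirected; everything else falls through to the real open().

```python
import builtins
import io
from contextlib import contextmanager


@contextmanager
def proxied_open(remote_files):
    """Temporarily route reads of known paths to `remote_files` (path -> bytes)."""
    real_open = builtins.open

    def fake_open(path, mode="r", *args, **kwargs):
        read_only = "r" in mode and "+" not in mode
        if path in remote_files and read_only:
            data = remote_files[path]
            # Binary mode gets bytes, text mode gets decoded text.
            return io.BytesIO(data) if "b" in mode else io.StringIO(data.decode())
        return real_open(path, mode, *args, **kwargs)

    builtins.open = fake_open
    try:
        yield
    finally:
        builtins.open = real_open  # always restore the real open()


with proxied_open({"/remote/etc/motd": b"hello\n"}):
    text = open("/remote/etc/motd").read()
```

Only code that looks up the built-in open() while the context is
active is affected; the os and socket modules would need the same kind
of wrapper.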


-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From metawilm at gmail.com  Mon Aug  1 22:53:19 2005
From: metawilm at gmail.com (Willem Broekema)
Date: Mon, 1 Aug 2005 22:53:19 +0200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <f6bc9b49050801135345872915@mail.gmail.com>

On 8/1/05, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Uh, according to your example in Common LISP it is indeed an error,

I think you are referring to the first word of this line:

Error: Received signal number 2 (Keyboard interrupt)  [condition type:
INTERRUPT-SIGNAL]

Well, that refers to the fact that it was raised with (error ...). It
says nothing about the type of a Keyboard interrupt condition. (The
function 'error' vs 'signal' mark the distinction between raising
conditions that must be handled otherwise you'll end up in the
debugger, and conditions that when not handled are silently ignored.)

The CL ANSI standard does not define what kind of condition a Keyboard
interrupt is, so the implementations have to make that decision.

Although this implementation (Allegro CL) has currently defined it as
a subclass of 'error', I'm told it should have been a 
'serious-condition' instead ('error' is a subclass of
'serious-condition', which is a subclass of 'condition'), precisely
because forms like ignore-errors, like a bare except in Python, will
catch it right now when they shouldn't. I assume most of the other
Lisp implementations have already defined it as serious-condition.
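
The class relationships described here can be mirrored in Python to
show why the distinction matters. This is an analogy only, not how any
Lisp implements it: the class names and the ignore_errors helper below
are hypothetical Python stand-ins for the CL condition types and the
ignore-errors form.

```python
class Condition(BaseException):
    """Stand-in for CL 'condition'."""

class SeriousCondition(Condition):
    """Stand-in for CL 'serious-condition'."""

class Error(SeriousCondition):
    """Stand-in for CL 'error'."""

class InterruptSignal(SeriousCondition):
    """A keyboard interrupt modelled as serious, but not as an error."""


def ignore_errors(thunk):
    """Rough analogue of ignore-errors: swallow Error, let the rest propagate."""
    try:
        return thunk()
    except Error:
        return None


def fails():
    raise Error("ordinary failure")


def interrupted():
    raise InterruptSignal()
```

ignore_errors(fails) returns None, while ignore_errors(interrupted)
lets the interrupt escape, which is the behaviour argued for above.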

So, in short, Keyboard interrupts in Lisp are a serious-condition, not an error.

(And what is labeled CriticalException in this discussion has its Lisp
counterpart in serious-condition.)

> and if an unhandled signal whose intended interpretation is "drop the
> gun and put your hands on your head!" isn't critical, what is?<wink>

Eh, are you serious? <wink>

> I didn't miss your point, but I don't see a good reason to oppose that
> label based on the usual definitions of the words or Common LISP
> usage, either.

Well, I'm not opposed to KeyboardInterrupt being in a class that's not
a subclass of 'Exception', when the latter is the class used in a bare
'except'. But when CriticalException, despite its name, is not a
subclass of Exception, that is a bit strange. I'd prefer the
'condition' and 'error' terminology, and to label a keyboard interrupt
a condition, not any kind of exception or error.

> It seems to me the relevant question is "is it likely that catching
> KeyboardInterrupt with 'except Exception:' will get sane behavior from
> a generic user-defined handler?" 

I agree with you that it should not be caught in a bare 'except' (or
an 'except Exception', when that is equivalent).


- Willem

From tdelaney at avaya.com  Tue Aug  2 01:12:07 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 2 Aug 2005 09:12:07 +1000
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F0742A5@au3010avexu1.global.avaya.com>

Nick Coghlan wrote:

> +-- Exception (formerly StandardError)
>      +-- AttributeError
>      +-- NameError
>          +-- UnboundLocalError
>      +-- RuntimeError
>          +-- NotImplementedError

Time to wade in ...

I've actually been wondering if NotImplementedError should actually be a
subclass of AttributeError.

Everywhere I can think of where I would want to catch
NotImplementedError, I would also want to catch AttributeError. My main
question is whether I would want the reverse to also be true - anywhere
I want to catch AttributeError, I would want to catch
NotImplementedError.

Perhaps instead it should be the other way around - AttributeError
inherits from NotImplementedError. This does make some kind of sense -
the attribute hasn't been implemented.

Both seem to have some advantages, but neither really feels right to me.
Thoughts?

Anyway, I came to this via another thing - NotImplementedError doesn't
play very well with super(). In many ways it's worse to call
super().method() that raises NotImplementedError than super().method()
where the attribute doesn't exist. In both cases, the class calling
super() needs to know whether or not it's at the end of the MRO for that
method - possible to find out in most cases that would raise
AttributeError, but impossible for a method that raises
NotImplementedError.

The only way I can think of to deal with this is to do a try: except
(AttributeError, NotImplementedError) around every super() attribute
call. This seems bad.
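
The workaround looks roughly like the sketch below, written with the
zero-argument super() of later Python versions. The classes Base,
Concrete, Mixin and the describe method are hypothetical. It also
shows one reason the pattern is unattractive: the except clause hides
any unrelated AttributeError raised inside the parent method.

```python
class Base:
    def describe(self):
        raise NotImplementedError  # abstract: subclasses should override


class Concrete:
    def describe(self):
        return ["Concrete"]


class Mixin:
    def describe(self):
        try:
            rest = super().describe()
        except (AttributeError, NotImplementedError):
            # Either there is no next method in the MRO, or it is abstract.
            rest = []
        return ["Mixin"] + rest


class OverAbstract(Mixin, Base):
    pass


class OverConcrete(Mixin, Concrete):
    pass


class Alone(Mixin):
    pass
```

All three subclasses work, but every cooperative method needs the same
try/except wrapper.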

Tim Delaney

From bcannon at gmail.com  Tue Aug  2 02:03:54 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 1 Aug 2005 17:03:54 -0700
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742A5@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742A5@au3010avexu1.global.avaya.com>
Message-ID: <bbaeab10050801170361616e22@mail.gmail.com>

On 8/1/05, Delaney, Timothy (Tim) <tdelaney at avaya.com> wrote:
> Nick Coghlan wrote:
> 
> > +-- Exception (formerly StandardError)
> >      +-- AttributeError
> >      +-- NameError
> >          +-- UnboundLocalError
> >      +-- RuntimeError
> >          +-- NotImplementedError
> 
> Time to wade in ...
> 
> I've actually been wondering if NotImplementedError should actually be a
> subclass of AttributeError.
> 
> Everywhere I can think of where I would want to catch
> NotImplementedError, I would also want to catch AttributeError. My main
> question is whether I would want the reverse to also be true - anywhere
> I want to catch AttributeError, I would want to catch
> NotImplementedError.
> 
> Perhaps instead it should be the other way around - AttributeError
> inherits from NotImplementedError. This does make some kind of sense -
> the attribute hasn't been implemented.
> 
> Both seem to have some advantages, but neither really feels right to me.
> Thoughts?

The problem with subclassing NotImplementedError is you need to
remember it is used to signal that a magic method does not work for a
specific type and thus should try the __r*__ version.  That is not a
case, I feel, that has anything to do with attributes but
implementation support.

I am not going to subclass NotImplementedError unless there is a huge
push for it in a very specific direction.

-Brett (who is waiting on a PEP number...)

From anthony at interlink.com.au  Tue Aug  2 02:21:16 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon, 1 Aug 2005 17:21:16 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42E93940.6080708@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>
Message-ID: <200508011721.18567.anthony@interlink.com.au>

On Thursday 28 July 2005 13:00, Martin v. Löwis wrote:
> I'd like to see the Python source be stored in Subversion instead
> of CVS, 

I'm +1 on this, assuming we use the fsfs backend, and not the berkeley
DB one. I'm -1 if we're using the bdb backend (I've had nothing but
pain from it). 

> CVS has a number of limitations that have been eliminated by
> Subversion. For the development of Python, the most notable improvements
> are:
> - ability to rename files and directories, and to remove directories,
>   while keeping the history of these files.
> - support for change sets (sets of correlated changes to multiple
>   files) through global revision numbers.
> - support for offline diffs, which is useful when creating patches.

- tagging for releases will no longer cause the release manager to 
experience fits of burning rage (personal record was something like 
1h45m for 'cvs tag' to finish, from memory). 

My only concern is that we have sufficient volunteers to manage the
system. I'm happy to be one of these, but that's assuming we have other
people also volunteering. . . 

Anthony 


-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From stephen at xemacs.org  Tue Aug  2 04:07:50 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 02 Aug 2005 11:07:50 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1122918723.9680.33.camel@warna.corp.google.com> (Donovan
	Baarda's message of "Mon, 01 Aug 2005 10:52:03 -0700")
References: <42E93940.6080708@v.loewis.de>
	<1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
Message-ID: <8764up147d.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Donovan" == Donovan Baarda <abo at minkirri.apana.org.au> writes:

    Donovan> Yeah. IMHO the saddest thing about SVN is it doesn't do
    Donovan> branch/merge properly. All the other cool stuff like
    Donovan> renames etc is kinda undone by that.  [...]  This is why
    Donovan> I don't bother migrating any existing CVS projects to
    Donovan> SVN; the benefits don't yet outweigh the pain of
    Donovan> migrating.

FWIW, XEmacs just had this discussion, and we basically came to the
conclusion that for a multi-developer project it's _definitely_ worth
the effort if it can be done by cvs2svn (which for us it probably
can't, due to some black magic we did on the CVS repository a few
years ago :-( ).  For the record, I was opposed for exactly the reason
you give, but changed my mind.

The point is that with several developers there's almost surely
someone enthusiastic enough about svn to bear the burden of fooling
with the script for a couple of hours to see if it works, a fascist
policy about migrating account names makes that almost trivial, and
after that it's all gravy: the administration does not look any worse,
the security issues are similar, and the change is likely to incite
only a few people to press for account name changes after the move.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From greg.ewing at canterbury.ac.nz  Tue Aug  2 04:57:40 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 02 Aug 2005 14:57:40 +1200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <bbaeab10050801170361616e22@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742A5@au3010avexu1.global.avaya.com>
	<bbaeab10050801170361616e22@mail.gmail.com>
Message-ID: <42EEE124.7070406@canterbury.ac.nz>

Brett Cannon wrote:

> The problem with subclassing NotImplementedError is you need to
> remember it is used to signal that a magic method does not work for a
> specific type and thus should try the __r*__ version.

No, that's done by *returning* NotImplemented, not by
raising an exception at all.
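
A short sketch of the mechanism: when __add__ returns the NotImplemented
singleton, the interpreter tries the right operand's __radd__, and a
TypeError is raised only if that also returns NotImplemented. The
Meters and Feet classes are hypothetical examples.

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented  # returned, not raised


class Feet:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Called for `Meters + Feet` after Meters.__add__ declines.
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


total = Meters(1.0) + Feet(10)
```

NotImplementedError plays no part in this protocol; it is an ordinary
exception class intended for abstract methods.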

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From stephen at xemacs.org  Tue Aug  2 05:25:48 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 02 Aug 2005 12:25:48 +0900
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <f6bc9b49050801135345872915@mail.gmail.com> (Willem Broekema's
	message of "Mon, 1 Aug 2005 22:53:19 +0200")
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
Message-ID: <871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Willem" == Willem Broekema <metawilm at gmail.com> writes:

    Willem> So, in short, Keyboard interrupts in Lisp are a
    Willem> serious-condition, not an error.

    Willem> (And what is labeled CriticalException in this discussion,
    Willem> has its Lisp counterpart in serious-condition.)

I don't see it that way.  Rather, "Raisable" is the closest equivalent
to "serious-condition", and "CriticalException" is an intermediate
class that has no counterpart in Lisp usage.

    >> and if an unhandled signal whose intended interpretation is
    >> "drop the gun and put your hands on your head!" isn't critical,
    >> what is?<wink>

    Willem> Eh, are you serious? <wink>

Yes.  Unhandled, KeyboardInterrupt means that the user has forcibly
taken control away from the program without giving it a chance to
preserve state, finish responding to (realtime) external conditions,
or even activate vacation(1), and the program is entirely at the mercy
of the user.  Usually, the program then proceeds to die without
dignity.  If it's a realtime application, killing it is probably the
only merciful thing to do.

If you were the program, wouldn't you consider that critical?

    Willem> But when CriticalException, despite its name, is not a
    Willem> subclass of Exception, that is a bit strange.

Granted.  It doesn't bother me, but since it bothers both you and
Phillip Eby, I concede the point; we should find a better name (or not
bother with such a class, see below).

    Willem> I'd prefer the 'condition' and 'error' terminology, and to
    Willem> label a keyboard interrupt a condition, not any kind of
    Willem> exception or error.

Now, that does bother me.<wink>  Anything we will not permit a program
to ignore with a bare "except: pass" if it so chooses had better be
more serious than merely a "condition".  Also, to me a "condition" is
something that I poll for, it does not interrupt me.  To me, a
condition (even a serious one) is precisely the kind of thing that I
should be able to ignore with a bare except!

Your description of the CL hierarchy makes me wonder if there's any
benefit to having a class between Raisable and KeyboardInterrupt.
Unlike SystemShutdown or PowerFailure, KeyboardInterrupt does imply
presence of a user demanding attention; I suppose that warrants
special treatment.  On the other hand, I don't see a need for a class
whose members share only the property that they are not catchable with
a bare except, leading to

Raisable -+- Exception
          +- KeyboardInterrupt
          +- SystemShutdown
          +- PowerFailure
          +- (etc)

or even

Exception -+- CatchableException
           +- KeyboardInterrupt
           +- SystemShutdown
           +- PowerFailure
           +- (etc)

The latter is my mental model, and would work well with bare excepts.
It also would encourage the programmer to think about whether
an Exception should be catchable or is a special case, but I don't
think that's really helpful except for Python developers, who
presumably would be aware of the issues.

The former would be a compromise to allow "except Exception" to be a
natural idiom, which I prefer to bare excepts on stylistic grounds.
On balance, that's what I advocate.
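
As a rough illustration of the "except Exception" idiom under the first
layout: in present-day Python 3 the root is spelled BaseException, and
KeyboardInterrupt and SystemExit sit beside Exception rather than under
it.  The classify() helper below is made up for this sketch.

```python
def classify(exc_type):
    """Raise exc_type and report which handler ends up catching it."""
    try:
        raise exc_type()
    except Exception:
        # Ordinary errors land here.
        return "caught by 'except Exception'"
    except BaseException:
        # KeyboardInterrupt, SystemExit and friends skip the clause above.
        return "passed 'except Exception'"


results = {t.__name__: classify(t)
           for t in (ValueError, KeyboardInterrupt, SystemExit)}
```

Only ValueError is caught by the generic handler here; the other two
need an explicit clause naming them or their root class.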

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From gnn at neville-neil.com  Tue Aug  2 04:08:11 2005
From: gnn at neville-neil.com (George V. Neville-Neil)
Date: Tue, 02 Aug 2005 11:08:11 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1122918723.9680.33.camel@warna.corp.google.com>
References: <42E93940.6080708@v.loewis.de>
	<1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
Message-ID: <m24qa9f5v8.wl%gnn@neville-neil.com>

At Mon, 01 Aug 2005 10:52:03 -0700,
Donovan Baarda wrote:
> 
> On Sun, 2005-07-31 at 23:54, Stephen J. Turnbull wrote:
> > >>>>> "BAW" == Barry Warsaw <barry at python.org> writes:
> > 
> >     BAW> So are you saying that moving to svn will let us do more long
> >     BAW> lived branches?  Yay!
> > 
> > Yes, but you still have to be disciplined about it.  svn is not much
> > better than cvs about detecting and ignoring spurious conflicts due to
> > code that gets merged from branch A to branch B, then back to branch
> > A.  Unrestricted cherry-picking is still out.
> 
> Yeah. IMHO the saddest thing about SVN is it doesn't do branch/merge
> properly. All the other cool stuff like renames etc is kinda undone by
> that. For a definition of properly, see;
> 
> http://prcs.sourceforge.net/merge.html
> 
> This is why I don't bother migrating any existing CVS projects to SVN;
> the benefits don't yet outweigh the pain of migrating. For new projects
> sure, SVN is a better choice than CVS.

Since Python is Open Source, are you looking at Perforce, which you can
use for free and which seems to be a happy medium between something
like CVS and something horrific like ClearCase?

Later,
George

From pje at telecommunity.com  Tue Aug  2 06:31:42 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Aug 2005 00:31:42 -0400
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python  3.0
In-Reply-To: <871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <f6bc9b49050801135345872915@mail.gmail.com>
	<bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>

At 12:25 PM 8/2/2005 +0900, Stephen J. Turnbull wrote:
> >>>>> "Willem" == Willem Broekema <metawilm at gmail.com> writes:
>
>     Willem> So, in short, Keyboard interrupts in Lisp are a
>     Willem> serious-condition, not an error.
>
>     Willem> (And what is labeled CriticalException in this discussion,
>     Willem> has its Lisp counterpart in serious-condition.)
>
>I don't see it that way.  Rather, "Raisable" is the closest equivalent
>to "serious-condition",

I don't think that Lisp's idea of an exception hierarchy has much bearing here.


>     >> and if an unhandled signal whose intended interpretation is
>     >> "drop the gun and put your hands on your head!" isn't critical,
>     >> what is?<wink>
>
>     Willem> Eh, are you serious? <wink>
>
>Yes.  Unhandled, KeyboardInterrupt means that the user has forcibly
>taken control away from the program without giving it a chance to
>preserve state, finish responding to (realtime) external conditions,
>or even activate vacation(1), and the program is entirely at the mercy
>of the user.  Usually, the program then proceeds to die without
>dignity.  If it's a realtime application, killing it is probably the
>only merciful thing to do.
>
>If you were the program, wouldn't you consider that critical?

You just said, "Unhandled, KeyboardInterrupt means..."  If the program 
doesn't *want* to handle KeyboardInterrupt, then it obviously *isn't* 
critical, because it doesn't care.  Conversely, if it *does* handle 
KeyboardInterrupt, then once again, it's not critical by your definition.

So, clearly, KeyboardInterrupt is thus *not* critical, and doesn't belong 
in the CriticalException hierarchy.

Note, by the way, that Python programs can disable a KeyboardInterrupt from 
ever occurring in the first place, whereas none of the other 
CriticalException classes can be "disabled" because they're actually 
*error* conditions, while KeyboardInterrupt is just an asynchronous 
notification - for control flow purposes.  Ergo, it's a control flow 
exception.  (Similarly, a Python program can avoid raising any of the other 
control flow errors; they are by and large optional features.)
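
One way a program can do this, sketched in present-day Python 3 with the
standard signal module.  The run_uninterruptible() helper is a
hypothetical name for this illustration, not something from the thread.

```python
import signal


def run_uninterruptible(work):
    """Call work() with SIGINT ignored, then restore the old handler.

    signal.signal() may only be called from the main thread.
    """
    # While SIG_IGN is installed, control-C never becomes KeyboardInterrupt.
    previous = signal.signal(signal.SIGINT, signal.SIG_IGN)
    try:
        return work()
    finally:
        # previous is None if the old handler was not installed from Python;
        # in that case there is nothing we can hand back to signal.signal().
        if previous is not None:
            signal.signal(signal.SIGINT, previous)


answer = run_uninterruptible(lambda: 6 * 7)
```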


>     Willem> I'd prefer the 'condition' and 'error' terminology, and to
>     Willem> label a keyboard interrupt a condition, not any kind of
>     Willem> exception or error.
>
>Now, that does bother me.<wink>  Anything we will not permit a program
>to ignore with a bare "except: pass" if it so chooses had better be
>more serious than merely a "condition".  Also, to me a "condition" is
>something that I poll for, it does not interrupt me.  To me, a
>condition (even a serious one) is precisely the kind of thing that I
>should be able to ignore with a bare except!

On the contrary, it is control-flow exceptions that bare except clauses are 
most harmful to: StopIteration, SystemExit, and...  you guessed 
it...  KeyboardInterrupt.

An exception that's being used for control flow is precisely the kind of 
thing you don't want anything but an explicit except clause to 
catch.  Whether critical errors should also pass bare except clauses is a 
distinct issue, one which KeyboardInterrupt really doesn't enter into.
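
A small sketch of the hazard, written for present-day Python 3; both
helper functions are invented for the example.  The bare except traps
SystemExit, so the request to exit silently disappears, while the
explicit clause lets it through.

```python
import sys


def swallow_everything(action):
    """Bare except: also traps SystemExit and KeyboardInterrupt."""
    try:
        action()
    except:  # deliberately bare, to show the problem
        return "swallowed"
    return "completed"


def explicit_only(action):
    """Explicit clause: anything other than ValueError propagates."""
    try:
        action()
    except ValueError:
        return "handled ValueError"
    return "completed"


outcome = swallow_everything(sys.exit)   # the exit request is lost
```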

If you think that a KeyboardInterrupt is an error, then it's an indication 
that Python's documentation and the current exception class hierarchy have 
failed to educate you sufficiently, and that we *really* need to add a 
class like ControlFlowException into the hierarchy to help make sure that 
other people don't end up sharing your misunderstanding.  ;-)


From stephen at xemacs.org  Tue Aug  2 09:13:10 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 02 Aug 2005 16:13:10 +0900
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python  3.0
In-Reply-To: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
	(Phillip J. Eby's message of "Tue, 02 Aug 2005 00:31:42 -0400")
References: <f6bc9b49050801135345872915@mail.gmail.com>
	<bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
Message-ID: <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Phillip" == Phillip J Eby <pje at telecommunity.com> writes:

    Phillip> You just said, "Unhandled, KeyboardInterrupt means..."
    Phillip> If the program doesn't *want* to handle
    Phillip> KeyboardInterrupt, then it obviously *isn't* critical,
    Phillip> because it doesn't care.  Conversely, if it *does* handle
    Phillip> KeyboardInterrupt, then once again, it's not critical by
    Phillip> your definition.

That's not my definition.  By that argument, no condition that can be
handled can be critical.

By my definition, the condition only needs to prevent the program from
continuing normally when it arises.  KeyboardInterrupt is a convention
that is used to tell a program that continuing normally is not
acceptable behavior, and therefore "critical" by my definition.

Under either definition, we'll still need to do something special with
MemoryError, KeyboardInterrupt, et amicae, and they still shouldn't be
caught by a generic "except Exception".  We agree on that, don't we?

    Phillip> Note, by the way, that Python programs can disable a
    Phillip> KeyboardInterrupt [...].  Ergo, it's a control flow
    Phillip> exception.

Sure, in some sense---but not in the Python language AFAIK.  Which
control constructs in the Python language define semantics for
continuation after KeyboardInterrupt occurs?  Anything that can stop a
program but the language doesn't define semantics for continuation is
critical and exceptional by my definition.

    Willem> I'd prefer the 'condition' and 'error' terminology, and to
    Willem> label a keyboard interrupt a condition, not any kind of
    Willem> exception or error.

    >> Now, that does bother me.<wink> [...]

    Phillip> On the contrary, it is control-flow exceptions that bare
    Phillip> except clauses are most harmful to: StopIteration,
    Phillip> SystemExit, and...  you guessed it...  KeyboardInterrupt.

That is a Python semantics issue, but as far as I can see there's
unanimity on it.  I and (AFAICS) Willem were discussing the
connotations of the _names_ at this point, and whether they were
suggestive of the semantics we (all!) seem to agree on.  I do not find
the word "condition" suggestive of the "things 'bare except' should
not catch" semantics.  I believe enough others will agree with me that
the word "condition", even "serious condition", should be avoided.

    Phillip> An exception that's being used for control flow is
    Phillip> precisely the kind of thing you don't want anything but
    Phillip> an explicit except clause to catch.

Which is exactly the conclusion I reached:

    [It] makes me wonder if there's any benefit to having a class [ie,
    CriticalException] between Raisable and KeyboardInterrupt.  ...I
    don't see a need for a class whose members share only the property
    that they are not catchable with a bare except....

Now, somebody proposed:

Raisable -+- Exception
          +- ...
          +- ControlFlowException -+- StopIteration
                                   +- KeyboardInterrupt

As I wrote above, I see no use for that; I think that's what you're
saying too, right?  AIUI, you want

Raisable -+- Exception
          +- ...
          +- StopIteration
          +- KeyboardInterrupt

so that only the appropriate control construct or an explicit except
can catch a control flow exception.  At least, you've convinced me
that "critical exception" is not a concept that should be implemented
in the Python language specification.  Rather, (for those who think as
I do, if there are others<wink>) "critical exception" would be an
intuitive guide to a subclass of exceptions that shouldn't be caught
by a bare except (or a handler for any superclass except Raisable, for
that matter).

By the same token, "control flow exception" is a pedagogical concept,
not something that should be reified in a ControlFlowException class,
right?

    Phillip> If you think that a KeyboardInterrupt is an error,

I have used the word "error" only in quoting Willem, and that's quite
deliberate.  I don't think that a condition need be an error to be
"critical".

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From martin at v.loewis.de  Tue Aug  2 09:58:12 2005
From: martin at v.loewis.de (Martin v. Löwis)
Date: Tue, 02 Aug 2005 09:58:12 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <m24qa9f5v8.wl%gnn@neville-neil.com>
References: <42E93940.6080708@v.loewis.de>	<1f7befae050728172161d4a9e8@mail.gmail.com>	<200507281956.03788.jeff@taupro.com>	<1f7befae05072819142c36e610@mail.gmail.com>	<1122605323.9670.11.camel@geddy.wooz.org>	<1f7befae0507281959abc2a7c@mail.gmail.com>	<1122607673.9665.38.camel@geddy.wooz.org>	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com>
Message-ID: <42EF2794.1000209@v.loewis.de>

George V. Neville-Neil wrote:
> Since Python is Open Source, are you looking at Perforce, which you can
> use for free and which seems to be a happy medium between something
> like CVS and something horrific like ClearCase?

No. The PEP is only about Subversion. Why should we be looking at
Perforce? Only because Python is Open Source?

I think anything but Subversion is ruled out because:
- there is no offer to host that anywhere (for subversion, there is
  already svn.python.org)
- there is no support for converting a CVS repository (for subversion,
  there is cvs2svn)

Regards,
Martin

From mal at egenix.com  Tue Aug  2 11:56:59 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 02 Aug 2005 11:56:59 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42EB5AD1.60703@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>
	<42EA061A.9040609@egenix.com>		<42EA98CC.4060003@v.loewis.de>	<1122676547.10752.61.camel@geddy.wooz.org>	<42EB5891.6020008@egenix.com>
	<42EB5AD1.60703@v.loewis.de>
Message-ID: <42EF436B.3050308@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>  > The PSF does have a reasonable budget, so why not use it to
>  > maintain the infrastructure needed for Python development and
>  > let a company do the administration of the needed servers and
>  > the importing of the CVS and tracker items into their
>  > systems ?
> 
> In principle, this might be a good idea. In practice, it falls
> short of details: which company, what precisely are their procedures,
> etc. It's not always the case that giving money to somebody really
> gives you back the value you expect.

True, but if we never ask, we'll never know :-)

My question was: Would asking a professional hosting company
be a reasonable approach ?

From the answers, I take it that there's not much trust in these
offers, so I guess there's not much desire to put PSF money into this.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 02 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From ncoghlan at gmail.com  Tue Aug  2 12:00:42 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 02 Aug 2005 20:00:42 +1000
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python  3.0
In-Reply-To: <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <f6bc9b49050801135345872915@mail.gmail.com>	<bbaeab100507291734337930a2@mail.gmail.com>	<42EB576F.3060309@egenix.com>	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>	<42EC21F8.3040704@gmail.com>	<bbaeab100507301923742b7b60@mail.gmail.com>	<f6bc9b4905073103367c19832@mail.gmail.com>	<bbaeab1005073109217f3a33f1@mail.gmail.com>	<f6bc9b49050801014968cac94f@mail.gmail.com>	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>	<f6bc9b49050801135345872915@mail.gmail.com>	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
	<87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <42EF444A.4040108@gmail.com>

Stephen J. Turnbull wrote:
> Now, somebody proposed:
> 
> Raisable -+- Exception
>           +- ...
>           +- ControlFlowException -+- StopIteration
>                                    +- KeyboardInterrupt
> 
> As I wrote above, I see no use for that

The use for it is :

   try:
     # do stuff
   except ControlFlowException:
     raise
   except Raisable:
     # handle anything else

Sure, you could write it as:

   try:
     # do stuff
   except (CriticalException, Exception, Warning):
     # handle anything else

But the former structure better reflects the programmer's intent (handle 
everything except control flow exceptions).

It's a fact that Python uses exceptions for control flow - KeyboardInterrupt 
[1], StopIteration, SystemExit (and soon to be GeneratorExit as well). 
Grouping them under a common parent allows them to be dealt with as a group, 
rather than their names being spelt out explicitly.
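
A runnable sketch of that grouping in present-day Python 3.  Raisable
and ControlFlowException do not exist in any released Python, so the
classes below are stand-ins defined purely for illustration, as are
StopLoop, AppFault and guarded().

```python
class Raisable(Exception):
    """Stand-in for the proposed root of the hierarchy."""


class ControlFlowException(Raisable):
    """Stand-in for the proposed common parent of control flow exceptions."""


class StopLoop(ControlFlowException):
    """Hypothetical control flow exception."""


class AppFault(Raisable):
    """Hypothetical ordinary application error."""


def guarded(action):
    """Handle everything except control flow exceptions."""
    try:
        return action()
    except ControlFlowException:
        raise  # let the whole group pass without naming each member
    except Raisable as exc:
        return "handled " + type(exc).__name__


def fail():
    raise AppFault("disk full")


report = guarded(fail)
```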

Actually having this in the exception hierarchy is beneficial from a 
pedagogical point of view as well - the hierarchy is practically the first 
thing you encounter when you run "help ('exceptions')" at the interactive prompt.

I have a Python 2.5 candidate hierarchy below, which uses dual inheritance to 
avoid breaking backward compatibility - any existing except clauses will catch 
all of the exceptions they used to catch. The only new inheritance introduced 
is to new exceptions, also avoiding backward compatibility problems, as any 
existing except clauses will let by all of the exceptions they used to let by. 
There are no removals, but the deprecation process is started in order to 
change the names of ReferenceError and RuntimeWarning to WeakReferenceError 
and SemanticsWarning.
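
The dual-inheritance idea can be sketched as follows in present-day
Python 3.  All three classes are stand-ins written for this example
(StandardError no longer exists in Python 3, and ControlFlowException
was never added); Interrupt plays the role of KeyboardInterrupt.

```python
class StandardError(Exception):
    """Stand-in for the Python 2 class of the same name."""


class ControlFlowException(Exception):
    """Stand-in for the proposed new class."""


class Interrupt(ControlFlowException, StandardError):
    """Stand-in for KeyboardInterrupt with the proposed two parents."""


def old_style_handler(exc):
    """Existing code: keeps catching what it always caught."""
    try:
        raise exc
    except StandardError:
        return "caught as StandardError"


def new_style_handler(exc):
    """New code: can single out the control flow group."""
    try:
        raise exc
    except ControlFlowException:
        return "caught as ControlFlowException"


old = old_style_handler(Interrupt())
new = new_style_handler(Interrupt())
```

Because Interrupt is a subclass of both parents, neither handler has to
change for the other to keep working.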

With this hierarchy, the recommended parent class for application errors 
becomes Error, and "except Error:" is preferred to any of "except:", "except 
Exception:" and "except StandardError:" (although these three continue to 
catch everything they used to catch).

The recommended workaround for libraries raising errors which still inherit 
directly from Exception is:
   try:
     # Use library
   except (ControlFlowException, CriticalError):
     raise
   except Exception:
     # Do stuff

(Remove the 'Exception' part if the library is so outdated that it still 
raises string exceptions)

Applications which use exceptions to control the flow of execution rather than 
to indicate an error (e.g. breaking out of multiple nested loops) are free to 
use ControlFlowException directly, or else define their own subclasses of 
ControlFlowException.
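
For instance, breaking out of nested loops might look like the sketch
below (present-day Python 3).  ControlFlowException is defined locally
as a stand-in because no such builtin exists; Found and find_first()
are likewise invented for the example.

```python
class ControlFlowException(Exception):
    """Stand-in for the proposed builtin."""


class Found(ControlFlowException):
    """Application-defined signal that carries the search result."""

    def __init__(self, position):
        super().__init__(position)
        self.position = position


def find_first(grid, target):
    """Return (row, column) of the first match in a 2-D list, or None."""
    try:
        for r, row in enumerate(grid):
            for c, cell in enumerate(row):
                if cell == target:
                    raise Found((r, c))  # leaves both loops at once
    except Found as found:
        return found.position
    return None


position = find_first([[1, 2], [3, 4]], 3)
```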

This hierarchy achieves my main goal for the exception reorganisation, which 
is to make it easy for scripts and applications to avoid inadvertently 
swallowing the control flow exceptions and critical errors, while still being 
able to provide generic error handlers for application faults. (Hmm, the 
pre-PEP doesn't include that as a goal in the 'Philosophy' section. . .)

Python 2.4 Compatible Improved Exception Hierarchy v 0.1
========================================================

Exception
+-- ControlFlowException (new)
      +-- GeneratorExit (new)
      +-- StopIteration
      +-- SystemExit
      +-- KeyboardInterrupt (dual-inheritance new)
+-- StandardError
      +-- KeyboardInterrupt (dual-inheritance new)
      +-- CriticalError (new)
          +-- MemoryError
          +-- SystemError
      +-- Error (new)
          +-- AssertionError
          +-- AttributeError
          +-- EOFError
          +-- ImportError
          +-- TypeError
          +-- ReferenceError (deprecated), WeakReferenceError (new alias)
          +-- ArithmeticError
              +-- FloatingPointError
              +-- ZeroDivisionError
              +-- OverflowError
          +-- EnvironmentError
              +-- OSError
                  +-- WindowsError
              +-- IOError
          +-- LookupError
              +-- IndexError
              +-- KeyError
          +-- NameError
              +-- UnboundLocalError
          +-- RuntimeError
              +-- NotImplementedError
          +-- SyntaxError
              +-- IndentationError
                  +-- TabError
          +-- ValueError
              +-- UnicodeError
                  +-- UnicodeDecodeError
                  +-- UnicodeEncodeError
                  +-- UnicodeTranslateError
+-- Warning
      +-- DeprecationWarning
      +-- FutureWarning
      +-- PendingDeprecationWarning
      +-- RuntimeWarning (deprecated), SemanticsWarning (new alias)
      +-- SyntaxWarning
      +-- UserWarning

Cheers,
Nick.

[1] PJE has convinced me that I was right in thinking that KeyboardInterrupt 
was a better fit under ControlFlowException than it was under CriticalError.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From mal at egenix.com  Tue Aug  2 12:07:57 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 02 Aug 2005 12:07:57 +0200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python  3.0
In-Reply-To: <42EF444A.4040108@gmail.com>
References: <f6bc9b49050801135345872915@mail.gmail.com>	<bbaeab100507291734337930a2@mail.gmail.com>	<42EB576F.3060309@egenix.com>	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>	<42EC21F8.3040704@gmail.com>	<bbaeab100507301923742b7b60@mail.gmail.com>	<f6bc9b4905073103367c19832@mail.gmail.com>	<bbaeab1005073109217f3a33f1@mail.gmail.com>	<f6bc9b49050801014968cac94f@mail.gmail.com>	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>	<f6bc9b49050801135345872915@mail.gmail.com>	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>	<87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<42EF444A.4040108@gmail.com>
Message-ID: <42EF45FD.5090800@egenix.com>

Nick Coghlan wrote:
> I have a Python 2.5 candidate hierarchy below, which uses dual inheritance to 
> avoid breaking backward compatibility - any existing except clauses will catch 
> all of the exceptions they used to catch. The only new inheritance introduced 
> is to new exceptions, also avoiding backward compatibility problems, as any 
> existing except clauses will let by all of the exceptions they used to let by. 
> There are no removals, but the deprecation process is started in order to 
> change the names of ReferenceError and RuntimeWarning to WeakReferenceError 
> and SemanticsWarning.

+1.

I like this approach of using multiple inheritance to solve the
b/w compatibility problem.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 02 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From mwh at python.net  Tue Aug  2 12:07:59 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 02 Aug 2005 11:07:59 +0100
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1122918723.9680.33.camel@warna.corp.google.com> (Donovan
	Baarda's message of "Mon, 01 Aug 2005 10:52:03 -0700")
References: <42E93940.6080708@v.loewis.de>
	<1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
Message-ID: <2mslxszm68.fsf@starship.python.net>

Donovan Baarda <abo at minkirri.apana.org.au> writes:

> This is why I don't bother migrating any existing CVS projects to SVN;
> the benefits don't yet outweigh the pain of migrating.

I think they do.  I was on dialup for a while, and would have _loved_
Python to be using SVN then -- and given how long diffs can take even
over my broadband connection...

Cheers,
mwh

PS: Wot, no one's suggested git yet? :)

-- 
  C++ is a siren song.  It *looks* like a HLL in which you ought to
  be able to write an application, but it really isn't.
                                       -- Alain Picard, comp.lang.lisp

From mark.russell at redmoon.me.uk  Tue Aug  2 14:24:07 2005
From: mark.russell at redmoon.me.uk (Mark Russell)
Date: Tue, 02 Aug 2005 13:24:07 +0100
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python  3.0
In-Reply-To: <42EF444A.4040108@gmail.com>
References: <f6bc9b49050801135345872915@mail.gmail.com>
	<bbaeab100507291734337930a2@mail.gmail.com>	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
	<87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<42EF444A.4040108@gmail.com>
Message-ID: <1122985447.6108.9.camel@localhost>

On Tue, 2005-08-02 at 11:00, Nick Coghlan wrote:
> With this hierarchy, the recommended parent class for application errors 
> becomes Error, ...

And presumably Error could also be the recommended exception for
quick'n'dirty scripts.

Mark Russell


From pinard at iro.umontreal.ca  Tue Aug  2 16:49:08 2005
From: pinard at iro.umontreal.ca (François Pinard)
Date: Tue, 2 Aug 2005 10:49:08 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42EF2794.1000209@v.loewis.de>
References: <1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
Message-ID: <20050802144908.GA7898@alcyon.progiciels-bpi.ca>

[Martin von Löwis]

> The PEP is only about Subversion.  I think anything but Subversion is
> ruled out because:

> - there is no offer to host that anywhere (for subversion, there is
> already svn.python.org)

> - there is no support for converting a CVS repository (for subversion,
> there is cvs2svn)

I quickly discussed Subversion with a few friends.

While some say Subversion is the most reasonable avenue nowadays, others
told me they found something more appealing than Subversion:

   http://www.venge.net/monotone/

The hosting paradigm is fairly different, and for a few weeks now, they
have a CVS repository converter.

In my very naive eyes, the centralised aspects of Python development
are better represented with Subversion.  It is also notable that
Subversion is more Python-friendly than Monotone, with its Lua-based
scripting.  I did not dig into why, but at first glance, Monotone does
not seduce me.  On the other hand, the two guys speaking well of
Monotone are well informed (and also well known), so I would not dismiss
their opinion so lightly.  So, it might be worth at least a quick look? :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From foom at fuhm.net  Tue Aug  2 16:53:33 2005
From: foom at fuhm.net (James Y Knight)
Date: Tue, 2 Aug 2005 10:53:33 -0400
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python  3.0
In-Reply-To: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
References: <f6bc9b49050801135345872915@mail.gmail.com>
	<bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
Message-ID: <9EDA49FB-1E9B-4558-9441-90A65ECC5A52@fuhm.net>

On Aug 2, 2005, at 12:31 AM, Phillip J. Eby wrote:
> If you think that a KeyboardInterrupt is an error, then it's an indication
> that Python's documentation and the current exception class hierarchy has
> failed to educate you sufficiently, and that we *really* need to add a
> class like ControlFlowException into the hierarchy to help make sure that
> other people don't end up sharing your misunderstanding.  ;-)

No... KeyboardInterrupt (just like other asynchronous exceptions)  
really should be treated as a critical error. Doing anything other  
than killing your process off after receiving it is just inviting  
disaster. Because the exception can have occurred absolutely  
anywhere, it is unsuitable for normal use. Aborting a function  
between two arbitrary bytecodes and trying to continue operation is  
simply a recipe for disaster. For example, in threading.py between  
line 200 "saved_state = self._release_save()" and 201 "try:    #  
restore state no matter what (e.g., KeyboardInterrupt)" would be a  
bad place to hit control-c if you ever need to use that Condition  
again. This kind of problem is pervasive and unavoidable.

If you want to do a clean shutdown on control-c, the only sane way is  
to install a custom signal handler that doesn't throw an asynchronous  
exception at you.
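
A minimal sketch of that approach, assuming a single-threaded program
whose main loop can poll a flag; the names `shutdown_requested`,
`request_shutdown`, `install_handler` and `run` are made up for
illustration:

```python
import signal

shutdown_requested = False  # hypothetical flag, polled by the main loop


def request_shutdown(signum, frame):
    """Signal handler: record the request instead of raising an exception."""
    global shutdown_requested
    shutdown_requested = True


def install_handler():
    """Replace the default SIGINT handler, which raises KeyboardInterrupt.

    Must be called from the main thread; returns the previous handler.
    """
    return signal.signal(signal.SIGINT, request_shutdown)


def run(work_items):
    """Process items, stopping only at a safe point between two items."""
    done = []
    for item in work_items:
        if shutdown_requested:
            break  # nothing is half-finished here, so stopping is safe
        done.append(item * 2)
    return done
```

With the handler installed, control-c only flips the flag, so no code
is aborted between two arbitrary bytecodes.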

There's a reason asynchronously killing off threads was deprecated in  
java.

James

From raymond.hettinger at verizon.net  Tue Aug  2 16:55:43 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 02 Aug 2005 10:55:43 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050802144908.GA7898@alcyon.progiciels-bpi.ca>
Message-ID: <000001c59772$4224c8a0$92b2958d@oemcomputer>

[François Pinard]
> While some say Subversion is the most reasonable avenue nowadays,
> others told me they found something more appealing than Subversion:
> 
>    http://www.venge.net/monotone/

The current release is 0.21 which suggests that it is not ready for
primetime.


Raymond


From foom at fuhm.net  Tue Aug  2 17:08:11 2005
From: foom at fuhm.net (James Y Knight)
Date: Tue, 2 Aug 2005 11:08:11 -0400
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python  3.0
In-Reply-To: <5.1.1.6.0.20050731124043.027e3768@mail.telecommunity.com>
References: <42EC21F8.3040704@gmail.com> <42EB576F.3060309@egenix.com>
	<bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<5.1.1.6.0.20050731124043.027e3768@mail.telecommunity.com>
Message-ID: <42688208-3A8E-492F-86D8-E4FE76FB294D@fuhm.net>

On Jul 31, 2005, at 12:49 PM, Phillip J. Eby wrote:

> I think you're ignoring the part where most exception handlers are  
> already broken.  At least adding CriticalException and  
> ControlFlowException makes it possible to add this:
>
>     try:
>         ...
>     except (CriticalException,ControlFlowException):
>         raise
>     except:
>         ...
>
> This isn't great, I admit, but at least it would actually *work*.
>
> I also don't see how changing the recommended base class from  
> Exception to Error causes *problems* for every library.  Sure, it  
> forces them to move (eventually!), but it's a trivial change, and  
> makes it *possible* to do the right thing with exceptions (e.g.  
> except Error:) as soon as all the libraries you depend on have  
> moved to using Error.

Exactly. That is the problem. Adding a new class above Exception in  
the hierarchy allows everything to work nicely *now*. Recommended  
practice has been to have exceptions derive from Exception for a  
looong time. Changing everybody now will take approximately forever,  
which means the Error class is pretty much useless. By keeping the  
definition of Exception as "the standard thing you should derive from  
and catch", and adding a superclass with things you shouldn't catch,  
you make conversion a lot simpler. If you're not worried about  
compatibility with ye olde string exceptions, you can start using  
"except Exception" immediately. If you are, you can do as your  
example above. And when Python v.Future comes around, "except  
Exception" will be the only reasonable thing to do.
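
A small sketch of the layout argued for here, assuming Python 3, where
the built-in BaseException happens to play the role of the superclass
above Exception; the helper `classify` is made up for illustration:

```python
def classify(exc):
    """Raise exc and report whether 'except Exception' would catch it."""
    try:
        raise exc
    except Exception:
        return "caught by except Exception"
    except BaseException:
        # Only reached by exceptions that sit above Exception in the tree.
        return "not caught by except Exception"


# Ordinary errors derive from Exception and are caught.
print(classify(ValueError("bad value")))
# KeyboardInterrupt and SystemExit derive from the superclass only.
print(classify(KeyboardInterrupt()))
print(classify(SystemExit()))
```

Existing exception classes that derive from Exception need no change
under this layout, which is the conversion argument made above.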

If, on the other hand, we use Exception as the base class and Error  
as the thing you should use, I predict that even by the time Python  
v.Future comes out, many libraries/programs will still have exceptions  
deriving from Exception, thus making the Exception/Error distinction  
somewhat broken.

James

From tjreedy at udel.edu  Tue Aug  2 17:09:30 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 2 Aug 2005 11:09:30 -0400
Subject: [Python-Dev] __autoinit__ (Was: Proposal: reducing self.x=x;
	self.y=y; self.z=z boilerplate code)
References: <139701967.20050731214526@intercable.ru>
Message-ID: <dco2ba$drr$1@sea.gmane.org>


"falcon" <falcon at intercable.ru> wrote in message 
news:139701967.20050731214526 at intercable.ru...
> Hello python-list,
>
> As I Understood, semantic may be next:
[snip]

This was properly posted to the general Python discussion group/list.
Reposted here, to the Python development list/group, it is offtopic.

If you did not get a satisfactory answer to your first post to the general 
group/list, it may be because your question is confusing.  So you might 
want to try again there with different words.

Terry J. Reedy


From pje at telecommunity.com  Tue Aug  2 17:39:19 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Aug 2005 11:39:19 -0400
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050802113712.025aa098@mail.telecommunity.com>

At 04:13 PM 8/2/2005 +0900, Stephen J. Turnbull wrote:
>Now, somebody proposed:
>
>Raisable -+- Exception
>           +- ...
>           +- ControlFlowException -+- StopIteration
>                                    +- KeyboardInterrupt
>
>As I wrote above, I see no use for that; I think that's what you're
>saying too, right?  AIUI, you want
>
>Raisable -+- Exception
>           +- ...
>           +- StopIteration
>           +- KeyboardInterrupt
>
>so that only the appropriate control construct or an explicit except
>can catch a control flow exception.

No, I want ControlFlowException to exist as a parent so that code today can 
work around the fact that bare "except:" and "except Exception:" catch 
everything.  In Python 3.0, we should have "except Error:" and be able to 
have it catch everything but control flow exceptions and possibly critical 
errors.


From pje at telecommunity.com  Tue Aug  2 17:48:03 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Aug 2005 11:48:03 -0400
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <42EF444A.4040108@gmail.com>
References: <87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
	<87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <5.1.1.6.0.20050802113926.02895d08@mail.telecommunity.com>

At 08:00 PM 8/2/2005 +1000, Nick Coghlan wrote:
>Python 2.4 Compatible Improved Exception Hierarchy v 0.1
>========================================================
>
>Exception
>+-- ControlFlowException (new)
>       +-- GeneratorExit (new)
>       +-- StopIteration
>       +-- SystemExit
>       +-- KeyboardInterrupt (dual-inheritance new)
>+-- StandardError
>       +-- KeyboardInterrupt (dual-inheritance new)
>       +-- CriticalError (new)
>           +-- MemoryError
>           +-- SystemError
>       +-- Error (new)

Couldn't we make Error a parent of StandardError, here, and then make the 
CriticalError subclasses dual-inherit StandardError, i.e.:

     Error
         CriticalError
             MemoryError (also subclass StandardError)
             SystemError (also subclass StandardError)
         StandardError
             ...

In this way, we can encourage people to inherit from Error.  Or maybe we 
should just make the primary hierarchy the way we want it to be, and only 
cross-link exceptions to StandardError that were previously under 
StandardError, i.e.:

     Raisable
         ControlFlowException
             ...  (cross-inherit to StandardError as needed)
         CriticalError
             ...  (cross-inherit to StandardError as needed)
         Exception
             ...

This wouldn't avoid "except Exception" and bare except being problems, but 
at least you can catch the uncatchables and reraise them.

Hm.  Maybe we should include a Reraisable base for ControlFlowException and 
CriticalError?  Then you could do "except Reraisable: raise" as a nice way 
to do the right thing until Python 3.0.

It seems to me that multiple inheritance is definitely the right idea, 
though.  That way, we can get the hierarchy we really want with only a 
minimum of boilerplate in pre-3.0 to make it actually work.
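
A rough sketch of the cross-inheritance idea, written for a current
Python.  All class names below are stand-ins defined locally for
illustration (Python 3 has no StandardError, and Reraisable,
ControlFlowException and CriticalError are only proposals in this
thread):

```python
class StandardError(Exception):
    """Stand-in for the Python 2 StandardError."""


class Reraisable(Exception):
    """Hypothetical common base for exceptions handlers should re-raise."""


class ControlFlowException(Reraisable):
    """Hypothetical parent of the control-flow exceptions."""


class CriticalError(Reraisable):
    """Hypothetical parent of the critical errors."""


class LegacyMemoryError(CriticalError, StandardError):
    """Cross-inherits so old 'except StandardError' code still matches it."""


def fail_critical():
    raise LegacyMemoryError("simulated out-of-memory condition")


def handle(func):
    """Run func, letting Reraisable pass and handling everything else."""
    try:
        func()
    except Reraisable:
        raise  # the "do the right thing" clause
    except Exception as exc:
        return "handled %s" % type(exc).__name__
    return "ok"
```

Here `handle(fail_critical)` propagates LegacyMemoryError, while an
ordinary error such as ZeroDivisionError is handled.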


From pje at telecommunity.com  Tue Aug  2 17:57:15 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Aug 2005 11:57:15 -0400
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <9EDA49FB-1E9B-4558-9441-90A65ECC5A52@fuhm.net>
References: <5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<bbaeab100507291734337930a2@mail.gmail.com>
	<42EB576F.3060309@egenix.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050802115018.027f4360@mail.telecommunity.com>

At 10:53 AM 8/2/2005 -0400, James Y Knight wrote:
>No... KeyboardInterrupt (just like other asynchronous exceptions)
>really should be treated as a critical error. Doing anything other
>than killing your process off after receiving it is just inviting
>disaster. Because the exception can have occurred absolutely
>anywhere, it is unsuitable for normal use. Aborting a function
>between two arbitrary bytecodes and trying to continue operation is
>simply a recipe for disaster. For example, in threading.py between
>line 200 "saved_state = self._release_save()" and 201 "try:    #
>restore state no matter what (e.g., KeyboardInterrupt)" would be a
>bad place to hit control-c if you ever need to use that Condition
>again. This kind of problem is pervasive and unavoidable.

In my personal experience with using KeyboardInterrupt I've only ever 
needed to do some minor cleanup of external state, such as removing 
lockfiles, abandoning connections, etc., so I haven't encountered this 
issue before.  I can see, however, why it would be a problem if you were 
trying to keep the program *running* - but I've been assuming that 
KeyboardInterrupt is something that always means "attempt to shutdown 
gracefully".  I suppose considering it a critical error might put it more 
clearly in that category.

I'm not 100% convinced, but you've definitely given me something to think 
about.  On the other hand, any exception can happen "between two arbitrary 
bytecodes", so there are always circumstances that need special attention, 
or require a "with block_signals" statement or something.  I suppose this 
issue may have to come down to BDFL pronouncement.


From pinard at iro.umontreal.ca  Tue Aug  2 18:06:20 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Tue, 2 Aug 2005 12:06:20 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <000001c59772$4224c8a0$92b2958d@oemcomputer>
References: <20050802144908.GA7898@alcyon.progiciels-bpi.ca>
	<000001c59772$4224c8a0$92b2958d@oemcomputer>
Message-ID: <20050802160620.GA9652@alcyon.progiciels-bpi.ca>

[Raymond Hettinger]

> >    http://www.venge.net/monotone/

> The current release is 0.21 which suggests that it is not ready for
> primetime.

It suggests it, yes, and to me as well.  On the other hand, there is
a common prejudice that something requires many releases, or frequent
releases, to be qualified as good.  While it might be true on average,
this is not necessarily true: some packages need not so many steps for
becoming very usable, mature or stable.  (Note that I'm not asserting
anything about Monotone, here.)  We should merely keep an open mind.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From bcannon at gmail.com  Tue Aug  2 18:56:05 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 2 Aug 2005 09:56:05 -0700
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <5.1.1.6.0.20050802113926.02895d08@mail.telecommunity.com>
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>
	<87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<42EF444A.4040108@gmail.com>
	<5.1.1.6.0.20050802113926.02895d08@mail.telecommunity.com>
Message-ID: <bbaeab1005080209565974cc95@mail.gmail.com>

On 8/2/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 08:00 PM 8/2/2005 +1000, Nick Coghlan wrote:
[SNIP]
> Or maybe we
> should just make the primary hierarchy the way we want it to be, and only
> cross-link exceptions to StandardError that were previously under
> StandardError, i.e.:
> 
>      Raisable
>          ControlFlowException
>              ...  (cross-inherit to StandardError as needed)
>          CriticalError
>              ...  (cross-inherit to StandardError as needed)
>          Exception
>              ...
> 
> This wouldn't avoid "except Exception" and bare except being problems, but
> at least you can catch the uncatchables and reraise them.
> 

I think that is acceptable.  Using multiple inheritance to make sure
that the exceptions that have been moved out of the main exception
branch can still be caught as before seems like it will be the best
solution for giving some form of backwards-compatibility for now while
allowing things to still move forward and not cripple the changes we
want to make.

> Hm.  Maybe we should include a Reraisable base for ControlFlowException and
> CriticalError?  Then you could do "except Reraisable: raise" as a nice way
> to do the right thing until Python 3.0.
> 

As in exceptions that don't inherit from
Error/StandardError/whatever_the_main_exception_is can easily be caught
separately?

> It seems to me that multiple inheritance is definitely the right idea,
> though.  That way, we can get the hierarchy we really want with only a
> minimum of boilerplate in pre-3.0 to make it actually work.
> 

Yeah.  I think name aliasing and multiple inheritance will take us a
long way.  Warnings should be able to take us the rest of the way.

-Brett (who is still waiting for a number; Barry, David, you out there?)

From gabriel.becedillas at corest.com  Tue Aug  2 20:59:58 2005
From: gabriel.becedillas at corest.com (Gabriel Becedillas)
Date: Tue, 02 Aug 2005 15:59:58 -0300
Subject: [Python-Dev] Syscall Proxying in Python
In-Reply-To: <1122919731.9688.43.camel@warna.corp.google.com>
References: <42EE5DAA.8040200@corest.com>
	<1122919731.9688.43.camel@warna.corp.google.com>
Message-ID: <42EFC2AE.6020205@corest.com>

Donovan Baarda wrote:
> On Mon, 2005-08-01 at 10:36, Gabriel Becedillas wrote:
> 
>>Hi,
>>We embedded Python 2.0.1 in our product a few years ago and we'd like to
>>upgrade to Python 2.4.1. This was not a simple task, because we needed 
>>to execute syscalls on a remote host. We modified Python's source code 
>>in several places to call our own versions of some functions. For 
>>example, instead of calling fopen(...), the source code was modified to 
>>call remote_fopen(...), and the same was done with other libc functions. 
>>Socket functions were hooked too (we modified socket.c), as were Windows 
>>Registry functions, etc.
> 
> 
> Wow... you guys sure did it the hard way. If you had done it at the
> Python level, you would have had a much easier time of both implementing
> and updating it.
> 
> As an example, have a look at my osVFS stuff. This is a replacement for
> the os module and open() that tricks Python into using a virtual file
> system;
> 
> http://minkirri.apana.org.au/~abo/projects/osVFS
> 
> 

Hi, thanks for your reply.
The problem I see with the approach you're suggesting is that I would have to 
rewrite a lot of code to make it work the way I want. We already have 
the syscall proxying stuff with a stdio layer on top of it. I would 
have to rewrite parts of some modules and use my own versions of 
stdio functions, and that is pretty much the same as what we have done before.
There are also native objects that use stdio functions, and I would have to 
replace those too, as well as modules that have some native code that uses 
stdio or sockets. I would have to duplicate those files and do the same 
kind of search/replace work that we have done previously and that we'd 
like to avoid.
Please let me know if I misunderstood you.
Thanks again.

-- 


Gabriel Becedillas
Developer
CORE SECURITY TECHNOLOGIES

Florida 141 - 2º cuerpo - 7º piso
C1005AAC Buenos Aires - Argentina
Tel/Fax: (54 11) 5032-CORE (2673)
http://www.corest.com

From metawilm at gmail.com  Tue Aug  2 21:39:49 2005
From: metawilm at gmail.com (Willem Broekema)
Date: Tue, 2 Aug 2005 21:39:49 +0200
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <f6bc9b4905080212395deeb000@mail.gmail.com>

On 8/2/05, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> I don't see it that way.  Rather, "Raisable" is the closest equivalent
> to "serious-condition", and "CriticalException" is an intermediate
> class that has no counterpart in Lisp usage.

That would imply that all raisables are 'serious' in the Lisp sense,
which is defined as "all conditions serious enough to require
interactive intervention if not handled". Yet Python warnings are
raisable (as raisable is the root), but are certainly not serious in
the Lisp sense.

(This is complicated by the fact that warnings are raised using 'signal'. More below.)

Willem:
> I'd prefer the 'condition' and 'error' terminology, and to
> label a keyboard interrupt a condition, not any kind of
> exception or error.

To clarify myself: a 'serious-condition' in CL stands for "all
conditions serious enough to require interactive intervention if not
handled"; I meant to label KI a 'serious-condition'.

Stephen: 
> Now, that does bother me.<wink>  Anything we will not permit a program
> to ignore with a bare "except: pass" if it so chooses had better be
> more serious than merely a "condition".  Also, to me a "condition" is
> something that I poll for, it does not interrupt me.  To me, a
> condition (even a serious one) is precisely the kind of thing that I
> should be able to ignore with a bare except!

If I understand your position correctly, it is probably not changed
yet by the above clarification. <wink>

Maybe it will surprise you that in Lisp a bare except (ignore-errors)
does not catch non-serious things like warnings. And if left
uncaught, a warning leaks out to the top level, gets printed and
subsequently ignored. That's because non-serious conditions are
(usually) raised using 'signal', not 'error'. The default top-level
warnings handler just prints it, but does not influence the program
control flow, so the execution resumes just after the (warn ..) form.

This probably marks a very important difference between Python and CL.
I think one could say that where in Python one would use a bare except
to catch both non-serious and serious exceptions, in CL one normally
doesn't bother with catching the non-serious ones because they will
not create havoc at an outer level anyway. So in Python a warning must
be caught by a bare except, while in Lisp it would not. And from this
follow different constraints on the hierarchy.
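
For comparison, a short sketch using Python's standard `warnings`
module.  It assumes a current Python; the function names are made up
for illustration.  A warning only travels through the exception
machinery when a filter turns it into an error; otherwise execution
resumes after the `warnings.warn` call, much like after a Lisp
`(warn ..)` form:

```python
import warnings


def compute():
    warnings.warn("value looks suspicious", UserWarning)
    return 42  # reached only if the warning was not raised as an exception


def run_recording():
    """Record warnings instead of printing them; control flow is unchanged."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = compute()
    return result, [str(w.message) for w in caught]


def run_strict():
    """Under the "error" filter the warning is raised and must be caught."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            return compute()
        except UserWarning as exc:
            return "interrupted: %s" % exc
```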

By the way, this is the condition hierarchy in Allegro CL (most of
which is prescribed by the ANSI standard):
<http://www.franz.com/support/documentation/7.0/doc/errors.htm>


- Willem

From martin at v.loewis.de  Tue Aug  2 23:16:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 02 Aug 2005 23:16:05 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42EF436B.3050308@egenix.com>
References: <42E93940.6080708@v.loewis.de>
	<42EA061A.9040609@egenix.com>
	<42EA98CC.4060003@v.loewis.de>
	<1122676547.10752.61.camel@geddy.wooz.org>
	<42EB5891.6020008@egenix.com>
	<42EB5AD1.60703@v.loewis.de> <42EF436B.3050308@egenix.com>
Message-ID: <42EFE295.6040906@v.loewis.de>

M.-A. Lemburg wrote:
> True, but if we never ask, we'll never know :-)
>
> My question was: Would asking a professional hosting company
> be a reasonable approach ?

It would be an option, yes, of course. It's not an approach that
*I* would be willing to implement, though.

> From the answers, I take it that there's not much trust in these
> offers, so I guess there's not much desire to put PSF money into this.

I haven't received any offers to make a qualified statement. I only
know that I would oppose an approach to ask somebody but our
volunteers to do it for free, and I also know that I don't want to
spend my time researching commercial alternatives (although I
wouldn't mind if you spent your time).

Regards,
Martin


From martin at v.loewis.de  Tue Aug  2 23:25:56 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 02 Aug 2005 23:25:56 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050802144908.GA7898@alcyon.progiciels-bpi.ca>
References: <1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050802144908.GA7898@alcyon.progiciels-bpi.ca>
Message-ID: <42EFE4E4.4020507@v.loewis.de>

François Pinard wrote:
> So, it might be worth at least a quick look? :-)

Certainly not my look - although I'm willing to integrate
anything that people contribute into the PEP.

Regards,
Martin

From bcannon at gmail.com  Wed Aug  3 02:34:01 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 2 Aug 2005 17:34:01 -0700
Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0
Message-ID: <bbaeab1005080217346b2af653@mail.gmail.com>

OK, having taken in all of the suggestions, here is another revision
round.  I think there are still a place or two where I partially ignored
people, just because there was not a severe uproar and I still think the
original idea is good (renaming RuntimeError, for instance).  I also
added notes on handling the transition and on rejected ideas.

There is now only one open issue, which is whether
ControlFlowException should be removed.

And I am still waiting on a PEP number to be able to check this into
CVS and push me to flesh out the references.  =)


--------------------------------------------------------------

PEP: XXX
Title: Exception Reorganization for Python 3.0
Version: $Revision: 1.5 $
Last-Modified: $Date: 2005/06/07 13:17:37 $
Author: Brett Cannon <brett at python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Jul-2005
Post-History: XX-XXX-XXX

.. contents::


Abstract
========

Python, as of version 2.4, has 38 exceptions (including warnings) in
the built-in namespace in a rather shallow hierarchy.
This list of classes has grown over the years without a chance to
learn from mistakes and clean up the hierarchy.
This PEP proposes doing a reorganization for Python 3.0 when
backwards-compatibility is not an issue.
Along with this reorganization, adding a requirement that all objects passed to
a ``raise`` statement must inherit from a specific superclass is proposed.
Lastly, the removal of bare ``except`` clauses is suggested.


Rationale
=========

Exceptions are a critical part of Python.
While exceptions are traditionally used to signal errors in a program,
they have also grown to be used for flow control for things such as
iterators.
Their importance is great.

But the organization of the exception hierarchy is suboptimal.
Mostly for backwards-compatibility reasons, the hierarchy has stayed
very flat and old exceptions whose usefulness has not been proven have
been left in.
Making exceptions more hierarchical would help facilitate exception
handling by making catching exceptions using inheritance much more
logical.
This should also help lead to fewer errors from being too broad in what
exceptions are caught in an ``except`` clause.

A required superclass for all exceptions is also being proposed
[Summary2004-08-01]_.
By requiring any object that is used in a ``raise`` statement to
inherit from a specific superclass, certain attributes (such as those
laid out in PEP 344 [PEP344]_) can be guaranteed to exist.
This also will lead to the planned removal of string exceptions.

Lastly, bare ``except`` clauses are to be removed [XXX Guido's reply to my
initial draft]_.
Often people use a bare ``except`` when what they really wanted was for
non-critical exceptions to be caught while more system-specific ones,
such as MemoryError, pass through and halt the interpreter.
This leads to errors that can be hard to debug thanks to exceptions' sometimes
unpredictable execution flow.
Removing bare ``except`` clauses also causes ``except`` statements to follow
the "explicit is better than implicit" tenet of Python [XXX]_.
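
As an illustration of the problem, here is a small sketch (not part of
the proposal itself).  The helper names and the ``Exhausted`` class are
made up, and the MemoryError is simulated:

```python
class Exhausted(dict):
    """Hypothetical mapping whose lookups simulate running out of memory."""

    def __getitem__(self, key):
        raise MemoryError("simulated")


def bare_lookup(table, key):
    try:
        return table[key]
    except:  # bare: swallows MemoryError along with the intended KeyError
        return None


def explicit_lookup(table, key):
    try:
        return table[key]
    except KeyError:  # explicit: only the expected failure is handled
        return None
```

``bare_lookup(Exhausted(), "a")`` silently returns ``None``, hiding the
memory problem, while ``explicit_lookup(Exhausted(), "a")`` lets the
MemoryError propagate.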


Philosophy of Reorganization
============================

There are several goals in this reorganization that defined the
philosophy used to guide the work.
One goal was to prune out unneeded exceptions.
Extraneous exceptions should not be left in since they just serve to
clutter the built-in namespace.
Unneeded exceptions also dilute the importance of other exceptions by
splitting uses between several exceptions when all uses should have
been under a single exception.

Another goal was to introduce any exceptions that were deemed needed
to fill any holes in the hierarchy.
Most new exceptions were added to flesh out the inheritance hierarchy
to make it easier to catch a category of exceptions with a simpler
``except`` clause.

Changing inheritance to make it more reasonable was a goal.
As stated above, having proper inheritance allows for more accurate
``except`` statements when catching exceptions based on the
inheritance tree.

Lastly, any renaming to make an exception's use more obvious from its
name was done.
Having to look up what an exception is meant to be used for because
the name does not properly reflect its usage is annoying and slows down
debugging.
Having a proper name also makes debugging easier for new programmers.
But for the sake of existing users and of the transition to Python
3.0, only exceptions whose names were fairly out of alignment with
their stated purpose have been renamed.

New Hierarchy
=============

Exception
+-- CriticalException (new)
    +-- KeyboardInterrupt
    +-- MemoryError
    +-- SystemError
+-- ControlFlowException (new)
    +-- StopIteration
    +-- GeneratorExit
    +-- SystemExit
+-- StandardError
    +-- AssertionError
    +-- SyntaxError
        +-- IndentationError
            +-- TabError
    +-- UserException (rename of RuntimeError)
    +-- ArithmeticError
        +-- FloatingPointError
        +-- DivideByZeroError
        +-- OverflowError
    +-- UnicodeError
        +-- UnicodeDecodeError
        +-- UnicodeEncodeError
        +-- UnicodeTranslateError
    +-- LookupError
        +-- IndexError
        +-- KeyError
    +-- TypeError
    +-- AttributeError
    +-- EnvironmentError
        +-- OSError
        +-- IOError
            +-- EOFError (new inheritance)
    +-- ImportError
    +-- NotImplementedError (new inheritance)
    +-- NamespaceError (rename of NameError)
        +-- UnboundGlobalError (new)
        +-- UnboundLocalError
        +-- UnboundFreeError (new)
    +-- WeakReferenceError (rename of ReferenceError)
    +-- ValueError
+-- Warning
    +-- UserWarning
    +-- AnyDeprecationWarning (new)
        +-- PendingDeprecationWarning
        +-- DeprecationWarning
    +-- SyntaxWarning
    +-- SemanticsWarning (rename of RuntimeWarning)
    +-- FutureWarning


Differences Compared to Python 2.4
==================================

Changes to exceptions from Python 2.4 can take shape in three forms:
removal, renaming, or change in their superclass.
There are also new exceptions introduced in the proposed hierarchy.


New Exceptions
--------------

CriticalException
'''''''''''''''''

The superclass for exceptions for which a severe error has occurred that one
would not want to recover from.
The name is meant to reflect the point that these exceptions are
usually raised only when the interpreter should most likely be
terminated.
All classes that inherit from this class are raised when the virtual machine
has an asynchronous exception to raise about its state.


ControlFlowException
''''''''''''''''''''

This exception exists as a superclass for all exceptions that directly deal
with control flow.
Inheriting from Exception instead of StandardError
prevents them from being caught accidentally when one wants to catch errors.
The name, by not mentioning "Error", does not lead one to confuse
the subclasses with errors.


UnboundGlobalError
''''''''''''''''''

Raised when a global variable was not found.


UnboundFreeError
''''''''''''''''

Raised when a free variable is not found.
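
For reference, a sketch of the three situations these names are meant
to distinguish, and of what a current Python 3 raises for each today.
The function names are made up for illustration:

```python
def missing_global():
    return undefined_global_name  # no such global or builtin exists


def unbound_local():
    copy = value  # 'value' is local here but has not been assigned yet
    value = 1
    return copy


def unbound_free():
    def inner():
        return captured  # free variable, looked up in the enclosing scope

    result = inner()  # called before 'captured' is assigned below
    captured = 1
    return result


def raised_type(func):
    """Return the name of the exception class that calling func raises."""
    try:
        func()
    except NameError as exc:  # UnboundLocalError is a NameError subclass
        return type(exc).__name__
    return None
```

Today both the global and the free-variable case report a plain
NameError, while the local case reports UnboundLocalError.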


AnyDeprecationWarning
'''''''''''''''''''''

A common superclass for all deprecation-related exceptions.
While having DeprecationWarning inherit from PendingDeprecationWarning was
suggested because a DeprecationWarning can be viewed as a
PendingDeprecationWarning that is happening now, the logic was not agreed upon
by a majority.
But since the exceptions are related, creating a common superclass is
warranted.
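
As a rough illustration of why a common superclass is convenient, the sketch
below counts both kinds of deprecation warning with a single ``issubclass``
check.  The built-in warning classes cannot be re-parented, so the three
classes here are hypothetical stand-ins, not the real built-ins:

```python
import warnings


# Hypothetical stand-ins mirroring the proposed warning layout.
class AnyDeprecationWarning(Warning):
    pass


class PendingDeprecationWarning_(AnyDeprecationWarning):
    pass


class DeprecationWarning_(AnyDeprecationWarning):
    pass


def count_deprecations():
    """Emit three warnings and count those under the common superclass."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warnings.warn("going away eventually", PendingDeprecationWarning_)
        warnings.warn("going away now", DeprecationWarning_)
        warnings.warn("unrelated", UserWarning)
    return sum(issubclass(w.category, AnyDeprecationWarning) for w in caught)


print(count_deprecations())
```

The same superclass could be handed to ``warnings.simplefilter`` to silence
or escalate both kinds of deprecation warning at once.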


Removed Exceptions
------------------

WindowsError
''''''''''''

Too OS-specific to be kept in the built-in exception hierarchy.


Renamed Exceptions
------------------

RuntimeError
''''''''''''

Renamed UserException.

Meant as a generic exception for when one does not want to create a new
exception class but also does not want to raise an exception that might be
caught based on inheritance, RuntimeError is poorly named.
Its name in Python 2.4 seems to suggest an error that occurred at runtime,
possibly an error in the VM.
Renaming the exception to UserException more clearly states the purpose of
the exception as a quick-and-dirty exception for the user to use.
The name also keeps it in line with UserWarning.


ReferenceError
''''''''''''''

Renamed WeakReferenceError.

ReferenceError was added to the built-in exception hierarchy in Python
2.2 [exceptionsmodule]_.
Taken directly from the ``weakref`` module, its name comes directly
from its original name when it resided in the module.
Unfortunately its name does not suggest its connection to weak
references and thus deserves a renaming.


NameError
'''''''''

Renamed NamespaceError.

While NameError suggests its common use, it is not entirely apparent.
Making it more of a superclass for namespace-related exceptions warrants a
renaming to make its use abundantly clear.
Plus the documentation of the ``exceptions`` module [XXX]_ states that it is
actually meant for global names and not for just any exception.


RuntimeWarning
''''''''''''''

Renamed SemanticsWarning.

RuntimeWarning is meant to represent semantic changes coming in the future.
But while saying that such a change affects "runtime" is true, flat-out stating it
is a semantic change is much clearer, eliminating any possible
association of "runtime" with the virtual machine specifically.


Changed Inheritance
-------------------

AttributeError
''''''''''''''

Inherits from StandardError.

Originally inheriting from NotImplementedError, AttributeError is typically
raised because of the lack of an attribute, which does not necessarily mean
the attribute was not implemented but perhaps just that it is not set yet.
Thus it has been decoupled from NotImplementedError.


EOFError
''''''''

Subclasses IOError.

Since an EOF comes from I/O it only makes sense that it be considered
an I/O error.


Required Superclass for ``raise``
=================================

By requiring all objects passed to a ``raise`` statement inherit from
a specific superclass, one is guaranteed that all exceptions will have
certain attributes.
If PEP 344 [PEP344]_ is accepted, the attributes outlined there will
be guaranteed to be on all exceptions raised.
This should help facilitate debugging by making the querying of
information from exceptions much easier.

The proposed hierarchy has Exception as the required class that one
must inherit from.


Implementation
--------------

Enforcement is straightforward.
Modifying ``RAISE_VARARGS`` to do an inheritance check first before raising
an exception should be enough.  For the C API, all functions that set an
exception will have the same inheritance check.
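
The check itself can be approximated in pure Python.  This is only a sketch
of the idea, not the interpreter change described above: ``checked_raise``
and its ``required_base`` parameter are hypothetical names introduced for
illustration.

```python
def checked_raise(obj, required_base=Exception):
    """Raise ``obj`` only if it derives from ``required_base``.

    Hypothetical helper approximating the inheritance check that the
    ``raise`` machinery would perform before raising.
    """
    # Accept either an exception class or an exception instance.
    cls = obj if isinstance(obj, type) else type(obj)
    if not issubclass(cls, required_base):
        raise TypeError("cannot raise %r: must inherit from %s"
                        % (obj, required_base.__name__))
    raise obj


for candidate in (ValueError("bad value"), "just a string"):
    try:
        checked_raise(candidate)
    except ValueError as exc:
        print("raised normally:", exc)
    except TypeError as exc:
        print("rejected:", exc)
```

Rejecting the object up front is what guarantees that every handler receives
something with the attributes of the required superclass.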


Removal of Bare ``except`` Clauses
==================================

One of Python's basic tenets is "explicit is better than implicit".
Unfortunately a bare ``except`` clause implicitly states it should
catch all exceptions.
While useful as a way to catch all exceptions when any object can be
raised, requiring a specific superclass be inherited in order to raise
an object gives a single class to catch to cover all exceptions.
With this in mind, the removal of bare ``except`` clauses is justified.
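
A small sketch of the explicit spelling, in present-day Python syntax and
assuming, as proposed above, that Exception is the required superclass for
anything raised.  The ``run`` and ``risky`` names are hypothetical:

```python
def risky():
    raise ValueError("bad input")


def run(func):
    """Catch everything raisable by naming the required superclass."""
    try:
        func()
    except Exception as exc:  # explicit replacement for a bare ``except:``
        return "caught %s" % type(exc).__name__
    return "ok"


print(run(risky))
print(run(lambda: None))
```

The handler states what it catches, and the name bound by ``as`` gives access
to the exception, which a bare clause does not.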


Implementation
--------------

A simple change to the grammar is all that is needed for implementation.


Transition Plan
===============

Exception Hierarchy Changes
---------------------------

New Exceptions
''''''''''''''

New exceptions can simply be added to the built-in namespace.
Any pre-existing objects with the same name will mask the new exceptions,
preserving backwards-compatibility.


Renamed Exceptions
''''''''''''''''''

Renamed exceptions will directly subclass the new names.
When the old exceptions are instantiated (which occurs when an exception is
caught, either by a ``try`` statement or by propagating to the top of the
execution stack), a PendingDeprecationWarning will be raised.

This should properly preserve backwards-compatibility as old usage won't change
and the new names can be used to also catch exceptions using the old name.
The warning of the deprecation is also kept simple.
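
One possible shape for this, sketched with stand-in classes (both are
defined locally for illustration; ``OldReferenceError`` plays the role of
the old name ReferenceError):

```python
import warnings


class WeakReferenceError(Exception):
    """The new name (stand-in defined locally for illustration)."""


class OldReferenceError(WeakReferenceError):
    """The old name, kept as a subclass that warns when instantiated."""

    def __init__(self, *args):
        warnings.warn("ReferenceError has been renamed WeakReferenceError",
                      PendingDeprecationWarning, stacklevel=2)
        super().__init__(*args)


def demo():
    """Raise under the old name, catch under the new one."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            raise OldReferenceError("proxy target is gone")
        except WeakReferenceError as exc:  # new name catches the old class
            return type(exc).__name__, [w.category.__name__ for w in caught]


print(demo())
```

Because the old class is a subclass of the new one, handlers written against
either name keep working during the transition.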


New Inheritance for Old Exceptions
''''''''''''''''''''''''''''''''''

Using multiple inheritance to our advantage, exceptions whose inheritance has
changed in such a way as for them to not necessarily be caught by pre-existing
``except`` clauses can be made backwards-compatible.
By inheriting from both the new superclasses as well as the original
superclasses existing ``except`` clauses will continue to work as before while
allowing the new inheritance to be used for new clauses.

A PendingDeprecationWarning will be raised based on whether the bytecode
``COMPARE_OP(10)`` results in an exception being caught that would not have
under the new hierarchy.  This will require hard-coding in the implementation
of the bytecode.
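
The multiple-inheritance part can be sketched as follows, using
KeyboardInterrupt (a StandardError subclass in Python 2.4, proposed to move
under CriticalException) as the example.  All three classes are hypothetical
stand-ins, and the warning machinery described above is left out:

```python
# Hypothetical stand-ins for the old and new parents.
class StandardError_(Exception):
    pass


class CriticalException_(Exception):
    pass


class TransitionalKeyboardInterrupt(CriticalException_, StandardError_):
    """Inherits from the new parent first, then from the original one."""


def caught_by(exc_type, handler_type):
    """Return True if ``except handler_type`` catches ``exc_type``."""
    try:
        raise exc_type()
    except handler_type:
        return True
    except Exception:
        return False


print(caught_by(TransitionalKeyboardInterrupt, StandardError_))      # old clause
print(caught_by(TransitionalKeyboardInterrupt, CriticalException_))  # new clause
```

Dropping the original parent from the class statement later on would
complete the move to the new hierarchy.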


Removed Exceptions
''''''''''''''''''

Exceptions scheduled for removal will be transitioned much like the old names
of renamed exceptions.
Upon instantiation a PendingDeprecationWarning will be raised stating that the
exception is due to be removed by Python 3.0.


Required Superclass for ``raise``
---------------------------------

A SemanticsWarning will be raised when an object is passed to ``raise`` that
does not have the proper inheritance.


Removal of Bare ``except`` Clauses
----------------------------------

A PendingDeprecationWarning will be raised when a bare ``except`` clause is
used.


Rejected Ideas
==============

Threads on python-dev discussing this PEP can be found at [XXX]_.


KeyboardInterrupt inheriting from ControlFlowException
------------------------------------------------------

KeyboardInterrupt has been a contentious point within this hierarchy.
Some view the exception as control flow caused by the user.
But because the user can trigger the exception at any point in the code, its
cause is asynchronous, and so it has a more proper place inheriting from
CriticalException.  It also keeps the name of the exception from being
"CriticalError".


Renaming Exception to Raisable, StandardError to Exception
----------------------------------------------------------

While the naming makes sense and emphasizes the required superclass as what
must be inherited from for raising an object, the naming is not required.
Keeping the existing names avoids the code changes that adopting new names
would require.


DeprecationWarning Inheriting From PendingDeprecationWarning
------------------------------------------------------------

Originally proposed because a DeprecationWarning can be viewed as a
PendingDeprecationWarning that is being removed in the next version.
But enough people thought the inheritance could logically work the other way
that the idea was dropped.


AttributeError Inheriting From TypeError or NameError
-----------------------------------------------------

Viewing attributes as part of the interface of a type led to the idea of
inheriting from TypeError.
But that partially defeats the thinking of duck typing and thus was dropped.

Inheriting from NameError was suggested because objects can be viewed as having
their own namespace that the attributes lived in and when they are not found it
is a namespace failure.  This was also dropped as a possibility since not
everyone shared this view.


Removal of EnvironmentError
---------------------------

Originally proposed based on the idea that EnvironmentError was an unneeded
distinction, the BDFL overruled this idea [XXX]_.


Introduction of MacError and UnixError
--------------------------------------

Proposed to add symmetry to WindowsError, the BDFL said they won't be used
enough [XXX]_.  The idea of then removing WindowsError was proposed and
accepted as reasonable, thus completely negating the idea of adding these
exceptions.


SystemError Subclassing SystemExit
----------------------------------

Proposed because a SystemError is meant to lead to a system exit, the idea was
removed since CriticalException signifies this better.


ControlFlowException Under StandardError
----------------------------------------

It has been suggested that ControlFlowException inherit from StandardError.
This idea has been rejected based on the thinking that control flow exceptions
are typically not meant to be caught in the generic fashion in which
StandardError will usually be used.


Open Issues
===========

Remove ControlFlowException?
----------------------------

It has been suggested that ControlFlowException is not needed.
Since the desire to catch any control flow exception will be atypical, the
suggestion is to just remove the exception and let the exceptions that
inherited from it inherit directly from Exception.  This still preserves the
separation from StandardError which is one of the driving factors behind the
introduction of the exception.

Acknowledgements
================

Thanks to Robert Brewer, Josiah Carlson, Nick Coghlan, Timothy
Delaney, Jack Diederich, Fred L. Drake, Jr., Philip J. Eby, Greg Ewing,
James Y. Knight, MA Lemburg, Guido van Rossum, Stephen J. Turnbull and
everyone else I missed for participating in the discussion.


References
==========

.. [PEP342] PEP 342 (Coroutines via Enhanced Generators)
            (http://www.python.org/peps/pep-0342.html)

.. [PEP344] PEP 344 (Exception Chaining and Embedded Tracebacks)
            (http://www.python.org/peps/pep-0344.html)

.. [exceptionsmodule] 'exceptions' module
            (http://docs.python.org/lib/module-exceptions.html)

.. [Summary2004-08-01] python-dev Summary (An exception is an
            exception, unless it doesn't inherit from Exception)
            (http://www.python.org/dev/summary/2004-08-01_2004-08-15.html#an-exception-is-an-exception-unless-it-doesn-t-inherit-from-exception)

.. [Summary2004-09-01] python-dev Summary (Cleaning the Exception House)
            (http://www.python.org/dev/summary/2004-09-01_2004-09-15.html#cleaning-the-exception-house)

.. [python-dev1] python-dev email (Exception hierarchy)
            (http://mail.python.org/pipermail/python-dev/2004-August/047908.html)

.. [python-dev2] python-dev email (Dangerous exceptions)
            (http://mail.python.org/pipermail/python-dev/2004-September/048681.html)


Copyright
=========

This document has been placed in the public domain.

From stephen at xemacs.org  Wed Aug  3 02:49:15 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 03 Aug 2005 09:49:15 +0900
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <f6bc9b4905080212395deeb000@mail.gmail.com> (Willem Broekema's
	message of "Tue, 2 Aug 2005 21:39:49 +0200")
References: <bbaeab100507291734337930a2@mail.gmail.com>
	<5.1.1.6.0.20050730145112.027146e8@mail.telecommunity.com>
	<42EC21F8.3040704@gmail.com>
	<bbaeab100507301923742b7b60@mail.gmail.com>
	<f6bc9b4905073103367c19832@mail.gmail.com>
	<bbaeab1005073109217f3a33f1@mail.gmail.com>
	<f6bc9b49050801014968cac94f@mail.gmail.com>
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b49050801135345872915@mail.gmail.com>
	<871x5d10lf.fsf@tleepslib.sk.tsukuba.ac.jp>
	<f6bc9b4905080212395deeb000@mail.gmail.com>
Message-ID: <87wtn3hmk4.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Willem" == Willem Broekema <metawilm at gmail.com> writes:

    Willem> On 8/2/05, Stephen J. Turnbull <stephen at xemacs.org> wrote:

    >> I don't see it that way.  Rather, "Raisable" is the closest
    >> equivalent to "serious-condition", and "CriticalException" is
    >> an intermediate class that has no counterpart in Lisp usage.

    Willem> That would imply that all raisables are 'serious' in the
    Willem> Lisp sense,

No, it implies that Phillip was right when he wrote that the Lisp
hierarchy of signals is not relevant (as a whole) to the discussion of
Python Raisables.  Of course partial analogies are useful.

In any case, Nick's idiom of "except ControlFlowException: raise"
clarified everything for me.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From abo at minkirri.apana.org.au  Wed Aug  3 03:03:04 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Tue, 02 Aug 2005 18:03:04 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050802160620.GA9652@alcyon.progiciels-bpi.ca>
References: <20050802144908.GA7898@alcyon.progiciels-bpi.ca>
	<000001c59772$4224c8a0$92b2958d@oemcomputer>
	<20050802160620.GA9652@alcyon.progiciels-bpi.ca>
Message-ID: <1123030984.1821.124.camel@warna.corp.google.com>

On Tue, 2005-08-02 at 09:06, François Pinard wrote:
> [Raymond Hettinger]
> 
> > >    http://www.venge.net/monotone/
> 
> > The current release is 0.21 which suggests that it is not ready for
> > primetime.
> 
> It suggests it, yes, and to me as well.  On the other hand, there is
> a common prejudice that something requires many releases, or frequent
> releases, to be qualified as good.  While it might be true on average,
> this is not necessarily true: some packages need not so many steps for
> becoming very usable, mature or stable.  (Note that I'm not asserting
> anything about Monotone, here.)  We should merely keep an open mind.

It is true that some well designed/developed software becomes reliable
very quickly. However, it still takes heavy use over time to prove that.
You don't want to be the guy who finds out that this is not one of those
bits of software.

IMHO you need maturity for revision control software... you are relying
on it for history. The only open source options worth considering for
Python are CVS and SVN, and even SVN is questionable (see bdb backend
issues).

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From python at rcn.com  Wed Aug  3 03:02:55 2005
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 2 Aug 2005 21:02:55 -0400
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <bbaeab1005080217346b2af653@mail.gmail.com>
Message-ID: <000301c597c7$1551dde0$92b2958d@oemcomputer>

The Py3.0 PEPs are a bit disconcerting.  Without 3.0 actively in
development, it is difficult to get the participation, interest, and
seriousness of thought that we apply to the current release.  The PEPs
may have the effect of prematurely finalizing discussions on something
that still has an ethereal if not pie-in-the-sky quality to it.  I would
hate for 3.0 development to start with constraints that got set in stone
before the project became a reality.

With respect to exception re-organization, the conversation has been
thought-provoking but has a little too much of a design-from-scratch
quality.  Each proposed change needs to be rooted in a specific problem
with the current hierarchy (i.e. what use cases cannot currently be
dealt with under the existing tree).  Setting a high threshold for
change will increase the likelihood that old code can be easily ported
and decrease the likelihood of either throwing away previous good
decisions or adopting new ideas that later prove unworkable.  IOW,
unless the current tree is thought to be really bad, then the new tree
ought to be very close to what we have now.


Raymond


> -----Original Message-----
> From: python-dev-bounces+python=rcn.com at python.org [mailto:python-dev-
> bounces+python=rcn.com at python.org] On Behalf Of Brett Cannon
> Sent: Tuesday, August 02, 2005 8:34 PM
> To: Python Dev
> Subject: [Python-Dev] PEP, take 2: Exception Reorganization for Python 3.0
> 
> OK, having taken in all of the suggestions, here is another revision
> round.  I think I still have a place or two I partially ignored people
> just because there was not a severe uproar and I still think the
> original idea is good (renaming RuntimeError, for instance).  I also
> added notes on handling the transition and rejected idea.
> 
> There is now only one open issue, which is whether
> ControlFlowException should be removed.
> 
> And I am still waiting on a PEP number to be able to check this into
> CVS and push me to flesh out the references.  =)

From pje at telecommunity.com  Wed Aug  3 03:17:56 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Aug 2005 21:17:56 -0400
Subject: [Python-Dev] PEP,
 take 2: Exception Reorganization for  Python 3.0
In-Reply-To: <000301c597c7$1551dde0$92b2958d@oemcomputer>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com>

At 09:02 PM 8/2/2005 -0400, Raymond Hettinger wrote:
>The Py3.0 PEPs are a bit disconcerting.  Without 3.0 actively in
>development, it is difficult to get the participation, interest, and
>seriousness of thought that we apply to the current release.  The PEPs
>may have the effect of prematurely finalizing discussions on something
>that still has an ethereal if not pie-in-the-sky quality to it.  I would
>hate for 3.0 development to start with constraints that got set in stone
>before the project became a reality.
>
>With respect to exception re-organization, the conversation has been
>thought-provoking but has a little too much of a design-from-scratch
>quality.  Each proposed change needs to be rooted in a specific problem
>with the current hierarchy (i.e. what use cases cannot currently be
>dealt with under the existing tree).  Setting a high threshold for
>change will increase the likelihood that old code can be easily ported
>and decrease the likelihood of either throwing away previous good
>decisions or adopting new ideas that later prove unworkable.  IOW,
>unless the current tree is thought to be really bad, then the new tree
>ought to be very close to what we have now.

+1.  The main things that need fixing, IMO, are the need for critical and 
control flow exceptions to be distinguished from "normal" errors.  The rest 
is mostly too abstract for me to care about in 2.x.


From abo at minkirri.apana.org.au  Wed Aug  3 03:22:28 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Tue, 02 Aug 2005 18:22:28 -0700
Subject: [Python-Dev] Syscall Proxying in Python
In-Reply-To: <42EFC2AE.6020205@corest.com>
References: <42EE5DAA.8040200@corest.com>
	<1122919731.9688.43.camel@warna.corp.google.com>
	<42EFC2AE.6020205@corest.com>
Message-ID: <1123032148.1859.131.camel@warna.corp.google.com>

On Tue, 2005-08-02 at 11:59, Gabriel Becedillas wrote:
> Donovan Baarda wrote:
[...]
> > Wow... you guys sure did it the hard way. If you had done it at the
> > Python level, you would have had a much easier time of both implementing
> > and updating it.
[...]
> Hi, thanks for your reply.
> The problem I see with the approach you're suggesting is that I have to 
> rewrite a lot of code to make it work the way I want. We already have 
> the syscall proxying stuff with an stdio layer on top of it. I should 
> have to rewrite some parts of some modules and use my own versions of 
> stdio functions, and that is pretty much the same as we have done before.
> There are also native objects that use stdio functions, and I should 
> replace those ones too, or modules that have some native code that uses 
> stdio, or sockets. I should duplicate those files, and make the same 
> kind of search/replace work that we have done previously and that we'd 
> like to avoid.
> Please let me know if I misunderstood you.

Nope... you got it all figured out. I guess it depends on what degree of
"proxying" you want... I thought there was some stuff you wanted
re-directed, and some you didn't. The point is, you _can_ do this at the
Python level, and you only have to modify Python code, not C Python
source. 

However, if you want to proxy everything, then the glib wrapper is
probably the best approach, provided you really want to code in C and
have your own Python binary.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From pinard at iro.umontreal.ca  Wed Aug  3 03:27:54 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Tue, 2 Aug 2005 21:27:54 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1123030984.1821.124.camel@warna.corp.google.com>
References: <20050802144908.GA7898@alcyon.progiciels-bpi.ca>
	<000001c59772$4224c8a0$92b2958d@oemcomputer>
	<20050802160620.GA9652@alcyon.progiciels-bpi.ca>
	<1123030984.1821.124.camel@warna.corp.google.com>
Message-ID: <20050803012754.GA21052@alcyon.progiciels-bpi.ca>

[Donovan Baarda]

> It is true that some well designed/developed software becomes reliable
> very quickly. However, it still takes heavy use over time to prove that.

There is wisdom in your say! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From kbk at shore.net  Wed Aug  3 04:28:58 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Tue, 2 Aug 2005 22:28:58 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200508030228.j732SwHG022094@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  354 open ( -3) /  2888 closed ( +3) /  3242 total ( +0)
Bugs    :  909 open (+11) /  5152 closed ( +8) /  6061 total (+19)
RFE     :  191 open ( +0) /   178 closed ( +0) /   369 total ( +0)

Patches Closed
______________

PEP 342 Generator enhancements  (2005-06-18)
       http://python.org/sf/1223381  closed by  pje

Provide tuple of "special" exceptions  (2004-10-01)
       http://python.org/sf/1038256  closed by  ncoghlan

Patch for (Doc) #1243553  (2005-07-24)
       http://python.org/sf/1243910  closed by  montanaro

New / Reopened Bugs
___________________

"new" not marked as deprecated in the docs  (2005-07-30)
       http://python.org/sf/1247765  opened by  Jürgen Hermann

error in popen2() reference  (2005-07-30)
CLOSED http://python.org/sf/1248036  opened by  Lorenzo Luengo

pdb 'next' does not skip list comprehension  (2005-07-31)
       http://python.org/sf/1248119  opened by  Joseph Heled

set of pdb breakpoint fails  (2005-07-31)
       http://python.org/sf/1248127  opened by  Joseph Heled

shelve .sync operation not documented  (2005-07-31)
       http://python.org/sf/1248199  opened by  paul rubin

dir should accept dirproxies for __dict__  (2005-07-31)
       http://python.org/sf/1248658  opened by  Ronald Oussoren

2.3.5 SRPM fails to build without tkinter installed  (2005-07-31)
       http://python.org/sf/1248997  opened by  Laurie Harper

rfc822 module, bug in parsedate_tz  (2005-08-01)
       http://python.org/sf/1249573  opened by  nemesis

isinstance() fails depending on how modules imported  (2005-08-01)
       http://python.org/sf/1249615  opened by  Hugh Gibson

Encodings and aliases do not match runtime  (2005-08-01)
       http://python.org/sf/1249749  opened by  liturgist

container methods raise KeyError not IndexError  (2005-08-01)
       http://python.org/sf/1249837  opened by  Wilfredo Sanchez

numarray in debian python 2.4.1  (2005-08-01)
CLOSED http://python.org/sf/1249867  opened by  LovePanda

numarray in debian python 2.4.1  (2005-08-01)
CLOSED http://python.org/sf/1249873  opened by  LovePanda

numarray in debian python 2.4.1  (2005-08-01)
       http://python.org/sf/1249903  opened by  LovePanda

IDLE does not start. 2.4.1  (2005-08-01)
CLOSED http://python.org/sf/1249965  opened by  codepyro

gethostbyname(gethostname()) fails on misconfigured system  (2005-08-02)
       http://python.org/sf/1250170  opened by  Tadeusz Andrzej Kadlubowski

incorrect description of range function  (2005-08-02)
       http://python.org/sf/1250306  opened by  John Gleeson

The -m option to python does not search zip files  (2005-08-02)
       http://python.org/sf/1250389  opened by  Paul Moore

Tix: PanedWindow.panes nonfunctional  (2005-08-02)
       http://python.org/sf/1250469  opened by  Majromax

Bugs Closed
___________

Segfault in Python interpreter 2.3.5  (2005-07-26)
       http://python.org/sf/1244864  closed by  birkenfeld

logging module doc needs to note that it changed in 2.4  (2005-07-25)
       http://python.org/sf/1244683  closed by  vsajip

manual.cls contains an invalid pdf-inquiry  (2005-07-14)
       http://python.org/sf/1238210  closed by  fdrake

error in popen2() reference  (2005-07-30)
       http://python.org/sf/1248036  closed by  birkenfeld

numarray in debian python 2.4.1  (2005-08-01)
       http://python.org/sf/1249867  closed by  birkenfeld

numarray in debian python 2.4.1  (2005-08-01)
       http://python.org/sf/1249873  closed by  birkenfeld

IDLE does not start. 2.4.1  (2005-08-01)
       http://python.org/sf/1249965  closed by  codepyro

Incorrect documentation of re.UNICODE  (2005-07-22)
       http://python.org/sf/1243192  closed by  birkenfeld


From ncoghlan at gmail.com  Wed Aug  3 11:05:54 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 03 Aug 2005 19:05:54 +1000
Subject: [Python-Dev] Pre-PEP: Exception Reorganization for Python 3.0
In-Reply-To: <bbaeab1005080209565974cc95@mail.gmail.com>
References: <bbaeab100507291734337930a2@mail.gmail.com>	
	<f6bc9b4905073103367c19832@mail.gmail.com>	
	<bbaeab1005073109217f3a33f1@mail.gmail.com>	
	<f6bc9b49050801014968cac94f@mail.gmail.com>	
	<87hde9229l.fsf@tleepslib.sk.tsukuba.ac.jp>	
	<f6bc9b49050801135345872915@mail.gmail.com>	
	<5.1.1.6.0.20050802001600.0258dbe0@mail.telecommunity.com>	
	<87d5owzu9l.fsf@tleepslib.sk.tsukuba.ac.jp>	
	<42EF444A.4040108@gmail.com>	
	<5.1.1.6.0.20050802113926.02895d08@mail.telecommunity.com>
	<bbaeab1005080209565974cc95@mail.gmail.com>
Message-ID: <42F088F2.3020908@gmail.com>

Brett Cannon wrote:
> On 8/2/05, Phillip J. Eby <pje at telecommunity.com> wrote:
>>It seems to me that multiple inheritance is definitely the right idea,
>>though.  That way, we can get the hierarchy we really want with only a
>>minimum of boilerplate in pre-3.0 to make it actually work.
> 
> Yeah.  I think name aliasing and multiple inheritance will take us a
> long way.  Warnings should be able to take us the rest of the way.
> 
> -Brett (who is still waiting for a number; Barry, David, you out there?)

And it will let us get rid of some of the ugliness in my v 0.1 proposal, too 
(like Error being a child of StandardError, instead of the other way around).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Wed Aug  3 15:10:33 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 03 Aug 2005 23:10:33 +1000
Subject: [Python-Dev] PEP,
 take 2: Exception Reorganization for  Python 3.0
In-Reply-To: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com>
Message-ID: <42F0C249.20007@gmail.com>

Phillip J. Eby wrote:
> +1.  The main things that need fixing, IMO, are the need for critical and 
> control flow exceptions to be distinguished from "normal" errors.  The rest 
> is mostly too abstract for me to care about in 2.x.

I guess, before we figure out "where would we like to go?", we really need to 
know "what's wrong with where we are right now?"

Like you, the only real problem I have with the current hierarchy is that 
"except Exception:" is just as bad as a bare except in terms of catching 
exceptions it shouldn't (like SystemExit). I find everything else about the 
hierarchy is pretty workable (mainly because it *is* fairly flat - if I want 
to catch a couple of different exception types, which is fairly rare, I can 
just list them).

I like James's suggestion that instead of trying to switch people to using 
something other than "except Exception:", we just aim to adjust the hierarchy 
so that "except Exception" becomes the right thing to do. Changing the 
inheritance structure a bit is far easier than trying to shift several years 
of accumulated user experience. . .

Anyway, with the hierarchy below, "except Exception:" still overreaches, but 
can be corrected by preceding it with "except (ControlFlow, CriticalError): 
raise".
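
A sketch of that idiom in present-day Python syntax.  Every class below is a
hypothetical stand-in (``Exception_`` and ``StopIteration_`` carry trailing
underscores so they do not clash with the real built-ins), arranged like the
transitional tree, where control flow exceptions still inherit from
Exception as well:

```python
# Hypothetical stand-ins arranged like the transitional hierarchy.
class Raisable(BaseException):
    pass


class ControlFlow(Raisable):
    pass


class CriticalError(Raisable):
    pass


class Exception_(Raisable):
    pass


class StopIteration_(ControlFlow, Exception_):
    """Still an Exception_ during the transition, hence the overreach."""


class AppError(Exception_):
    pass


def guarded(func):
    """Handle ordinary errors but let the special exceptions propagate."""
    try:
        return func()
    except (ControlFlow, CriticalError):
        raise  # re-raise untouched
    except Exception_ as exc:
        return "handled %s" % type(exc).__name__


def fail():
    raise AppError("oops")


def stop():
    raise StopIteration_()


print(guarded(fail))
try:
    guarded(stop)
except ControlFlow:
    print("control flow exception propagated")
```

Without the first clause, the generic handler would swallow
``StopIteration_`` as well, since it is still a subclass of ``Exception_``.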

"except Exception:" stops overreaching once the links from Exception to 
StopIteration and SystemExit, and the links from StandardError to 
KeyboardInterrupt, SystemError and MemoryError are removed (probably difficult 
to do before Py3k but not impossible).

This hierarchy also means that inheriting application and library errors from 
Exception can continue to be recommended practice. Adapting the language to 
fit the users rather than the other way around seems to be a pretty good call 
on this point. . .

The only changes from the Python 2.4 hierarchy are:
    New exceptions:
      - Raisable (new base)
      - ControlFlow (inherits from Raisable)
      - CriticalError (inherits from Raisable)
      - GeneratorExit (inherits from ControlFlow)
    Added inheritance:
      - Exception from Raisable
      - StopIteration, SystemExit, KeyboardInterrupt from ControlFlow
      - SystemError, MemoryError from CriticalError

Python 2.4 Compatible Improved Exception Hierarchy v 0.2 [1]
============================================================

   Raisable (new)
   +-- ControlFlow (new)
       +-- GeneratorExit (new)
       +-- StopIteration (inheritance new)
       +-- SystemExit (inheritance new)
       +-- KeyboardInterrupt (inheritance new)
   +-- CriticalError (new)
       +-- MemoryError (inheritance new)
       +-- SystemError (inheritance new)
   +-- Exception (inheritance new)
       +-- StopIteration
       +-- SystemExit
       +-- StandardError
           +-- KeyboardInterrupt
           +-- MemoryError
           +-- SystemError
           +-- AssertionError
           +-- AttributeError
           +-- EOFError
           +-- ImportError
           +-- TypeError
           +-- ReferenceError
           +-- ArithmeticError
               +-- FloatingPointError
               +-- ZeroDivisionError
               +-- OverflowError
           +-- EnvironmentError
               +-- OSError
                   +-- WindowsError
               +-- IOError
           +-- LookupError
               +-- IndexError
               +-- KeyError
           +-- NameError
               +-- UnboundLocalError
           +-- RuntimeError
               +-- NotImplementedError
           +-- SyntaxError
               +-- IndentationError
                   +-- TabError
           +-- ValueError
               +-- UnicodeError
                   +-- UnicodeDecodeError
                   +-- UnicodeEncodeError
                   +-- UnicodeTranslateError
       +-- Warning
           +-- DeprecationWarning
           +-- FutureWarning
           +-- PendingDeprecationWarning
           +-- RuntimeWarning
           +-- SyntaxWarning
           +-- UserWarning

Cheers,
Nick.

[1] I've started putting version numbers on these suggestions, since someone 
referred to "Nick's exception hierarchy" in one of the threads, and I had no 
idea which of my suggestions they meant. I think I'm up to three or four 
different variants by now. . .

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From pje at telecommunity.com  Wed Aug  3 15:24:27 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 03 Aug 2005 09:24:27 -0400
Subject: [Python-Dev] PEP,
 take 2: Exception Reorganization for   Python 3.0
In-Reply-To: <42F0C249.20007@gmail.com>
References: <5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com>
	<bbaeab1005080217346b2af653@mail.gmail.com>
	<5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050803092046.025af648@mail.telecommunity.com>

At 11:10 PM 8/3/2005 +1000, Nick Coghlan wrote:
>     New exceptions:
>       - Raisable (new base)
>       - ControlFlow (inherits from Raisable)
>       - CriticalError (inherits from Raisable)
>       - GeneratorExit (inherits from ControlFlow)
>     Added inheritance:
>       - Exception from Raisable
>       - StopIteration, SystemExit, KeyboardInterrupt from ControlFlow
>       - SystemError, MemoryError from CriticalError

+1

I'd also like to see a "Reraisable" or something like that to cover both 
CriticalError and ControlFlow, but it could be a tuple of those two bases 
rather than a class.  But that's just a "would be nice" feature.
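
A rough sketch of the tuple variant, assuming stand-in classes: none of
ControlFlow, CriticalError or AppError exist as builtins, and they
derive from the ordinary Exception here only so the example runs on a
current interpreter.

```python
# Hypothetical stand-ins for the proposed classes (assumptions, not builtins).
class ControlFlow(Exception):
    pass

class CriticalError(Exception):
    pass

class AppError(Exception):
    pass

# A tuple works anywhere a single class does in an except clause.
Reraisable = (ControlFlow, CriticalError)

def run(task):
    try:
        task()
    except Reraisable:
        raise                     # never swallow these two families
    except Exception as exc:
        return "handled %s" % type(exc).__name__
    return "ok"
```

With this in place an AppError raised by task() is handled, while a
ControlFlow or CriticalError passes straight through to the caller.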


From rrr at ronadam.com  Wed Aug  3 18:33:31 2005
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 03 Aug 2005 12:33:31 -0400
Subject: [Python-Dev] PEP,
 take 2: Exception Reorganization for  Python 3.0
In-Reply-To: <42F0C249.20007@gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>	<5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com>
	<42F0C249.20007@gmail.com>
Message-ID: <42F0F1DB.7050704@ronadam.com>

Nick Coghlan wrote:
> Phillip J. Eby wrote:
> 
>>+1.  The main things that need fixing, IMO, are the need for critical and 
>>control flow exceptions to be distinguished from "normal" errors.  The rest 
>>is mostly too abstract for me to care about in 2.x.
> 
> 
> I guess, before we figure out "where would we like to go?", we really need to 
> know "what's wrong with where we are right now?"
> 
> Like you, the only real problem I have with the current hierarchy is that 
> "except Exception:" is just as bad as a bare except in terms of catching 
> exceptions it shouldn't (like SystemExit). I find everything else about the 
> hierarchy is pretty workable (mainly because it *is* fairly flat - if I want 
> to catch a couple of different exception types, which is fairly rare, I can 
> just list them).

More often than not, 9 out of 10 times, whenever I use "except Exception:" 
or a bare except:, what I am doing is the equivalent of:

    try:
        <statement that may fail>     # will either fail or not
    except:
        pass
    else:
        <dependent statements>

Usually I end up using "if hasattr():" or some other way to pre-test the 
statement if possible, as I find "except: pass" to be ugly. And putting 
both the statement that may fail and the dependent statements together 
in the try: catches too much.  Finding subtle errors hidden by a try 
block can be rather difficult at times.
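
A small sketch of the difference, using a made-up describe() helper and
a named exception rather than a bare except.  Only the attribute lookup
sits inside the try; the dependent statement lives in the else clause,
so a failure there is not silently swallowed.

```python
def describe(obj):
    try:
        name = obj.name            # the one statement that may fail
    except AttributeError:
        return "<anonymous>"
    else:
        # Dependent statement: an AttributeError raised here (say, name
        # is None) is NOT caught by the except clause above.
        return name.upper()
```

Had name.upper() been placed inside the try block, an object whose name
is None would quietly come back as "<anonymous>" and hide the real bug.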

Could inverse exceptions be an option?  Exceptions don't work this way 
so it would probably need to be sugar for "except <exception>:pass; else:".

Possibly?

    try:
        <statement that may fail>
    except not <a_exception>:          # "except None:" as an option?
        <dependent statements>

Ok, this isn't exactly clear, and probably a -2 for several reasons.

The exception tree organization should also take into account inverse 
relationships as well.  Exceptions used for control flow are often of 
the type "if not exception: do something".

Cheers,
Ron


From bcannon at gmail.com  Wed Aug  3 18:55:56 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 09:55:56 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <000301c597c7$1551dde0$92b2958d@oemcomputer>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
Message-ID: <bbaeab10050803095538a125e7@mail.gmail.com>

On 8/2/05, Raymond Hettinger <python at rcn.com> wrote:
> The Py3.0 PEPs are a bit disconcerting.  Without 3.0 actively in
> development, it is difficult to get the participation, interest, and
> seriousness of thought that we apply to the current release.  The PEPs
> may have the effect of prematurely finalizing discussions on something
> that still has an ethereal if not pie-in-the-sky quality to it.  I would
> hate for 3.0 development to start with constraints that got set in stone
> before the project became a reality.
> 

I don't view this PEP (nor any other Python 3000 PEP) as set in stone
until we are one or two versions away from Python 3.0.  I view these
PEPs as just provoking discussion and getting the ball rolling now
instead of rushing to get it done when it does come time to start
thinking about these things.  Even if we get everyone to agree on this
PEP I still won't consider it finalized until there is one more round
of discussion just before we start implementing for Python 3.0.

> With respect to exception re-organization, the conversation has been
> thought provoking but a little too much of a design-from-scratch
> quality.  Each proposed change needs to be rooted in a specific problem
> with the current hierarchy (i.e. what use cases cannot currently be
> dealt with under the existing tree).  Setting a high threshold for
> change will increase the likelihood that old code can be easily ported
> and decrease the likelihood of either throwing away previous good
> decisions or adopting new ideas that later prove unworkable.  IOW,
> unless the current tree is thought to be really bad, then the new tree
> ought to be very close to what we have now.
> 

So are you saying that the renaming is bad, or the whole reorg?  It
seems everyone agrees with the moving of the control flow exceptions
and CriticalException, although the renamings might just be me wishing
for it.

-Brett

From gvanrossum at gmail.com  Wed Aug  3 19:18:42 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 3 Aug 2005 10:18:42 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <bbaeab10050803095538a125e7@mail.gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
Message-ID: <ca471dc20508031018328f1b18@mail.gmail.com>

So here's a radical proposal (hear the scratching of the fingernail
on the blackboard? :-).

Start with Brett's latest proposal. Goal: keep bare "except:" but
change it to catch only the part of the hierarchy rooted at
StandardError.

- Call the root of the hierarchy Raisable.
- Rename CriticalException to CriticalError
  (this should happen anyway).
- Rename ControlFlowException to ControlFlowRaisable
  (anything except Error or Exception).
- Rename StandardError to Exception.
- Make Warning a subclass of Exception.

I'd want the latter point even if the rest of this idea is rejected;
when a Warning is raised (as opposed to just printing a message or
being suppressed altogether) it should be treated just like any other
normal exception, i.e. StandardError.
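
A sketch of what "raised as opposed to printed" looks like in practice,
assuming an interpreter where Warning derives from Exception and the
warnings module offers catch_warnings().  old_api and call_strictly are
made-up names for illustration.

```python
import warnings

def old_api():
    warnings.warn("old_api() is deprecated", DeprecationWarning)
    return "done"

def call_strictly(func):
    with warnings.catch_warnings():
        # The "error" action turns every warning into a raised exception.
        warnings.simplefilter("error")
        try:
            return func()
        except Exception as exc:
            # The raised warning arrives here like any ordinary error.
            return "caught " + type(exc).__name__
```

Outside call_strictly() the same old_api() call just issues (or
suppresses) the warning and returns normally.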

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Wed Aug  3 19:28:32 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 10:28:32 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <42F0C249.20007@gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com>
	<42F0C249.20007@gmail.com>
Message-ID: <bbaeab1005080310284fc1e6d7@mail.gmail.com>

On 8/3/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Phillip J. Eby wrote:
> > +1.  The main things that need fixing, IMO, are the need for critical and
> > control flow exceptions to be distinguished from "normal" errors.  The rest
> > is mostly too abstract for me to care about in 2.x.
> 
> I guess, before we figure out "where would we like to go?", we really need to
> know "what's wrong with where we are right now?"
> 
> Like you, the only real problem I have with the current hierarchy is that
> "except Exception:" is just as bad as a bare except in terms of catching
> exceptions it shouldn't (like SystemExit). I find everything else about the
> hierarchy is pretty workable (mainly because it *is* fairly flat - if I want
> to catch a couple of different exception types, which is fairly rare, I can
> just list them).
> 

Does no one else feel some of the names could be improved upon?  While
we might all have gotten used to them I still don't believe newbies
necessarily grasp what they are all for based on their names.

Then again, if this PEP is viewed more as handling macro issues with
the current hierarchy, and name changes can be done when we get closer
to Python 3.0, I am happy to drop renaming until we are closer to
actual implementation, with a section just listing suggested name
changes and stating that they are only possible renamings under
consideration which will not be finalized until we are closer to
Python 3.0.

> I like James's suggestion that instead of trying to switch people to using
> something other than "except Exception:", we just aim to adjust the hierarchy
> so that "except Exception" becomes the right thing to do. Changing the
> inheritance structure a bit is far easier than trying to shift several years
> of accumulated user experience. . .
> 
> Anyway, with the hierarchy below, "except Exception:" still overreaches, but
> can be corrected by preceding it with "except (ControlFlow, CriticalError):
> raise".
> 
> "except Exception:" stops overreaching once the links from Exception to
> StopIteration and SystemExit, and the links from StandardError to
> KeyboardInterrupt, SystemError and MemoryError are removed (probably difficult
> to do before Py3k but not impossible).
> 
> This hierarchy also means that inheriting application and library errors from
> Exception can continue to be recommended practice. Adapting the language to
> fit the users rather than the other way around seems to be a pretty good call
> on this point. . .
> 

Well, then StandardError becomes kind of stupid.  The only use it
would serve is a superclass for all non-critical, non-control-flow
built-in exceptions.  I really don't know how often that is going to
be needed.

I do realize it keeps inheritance working for existing code, though. 
I guess that would just have to be a trade-off for
backwards-compatibility.

OK, I am convinced; I will revert back to Raisable/Exception instead
of Exception/StandardError.

-Brett

From bcannon at gmail.com  Wed Aug  3 19:29:45 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 10:29:45 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <5.1.1.6.0.20050803092046.025af648@mail.telecommunity.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<5.1.1.6.0.20050802211559.025bb640@mail.telecommunity.com>
	<42F0C249.20007@gmail.com>
	<5.1.1.6.0.20050803092046.025af648@mail.telecommunity.com>
Message-ID: <bbaeab100508031029407cac85@mail.gmail.com>

On 8/3/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 11:10 PM 8/3/2005 +1000, Nick Coghlan wrote:
> >     New exceptions:
> >       - Raisable (new base)
> >       - ControlFlow (inherits from Raisable)
> >       - CriticalError (inherits from Raisable)
> >       - GeneratorExit (inherits from ControlFlow)
> >     Added inheritance:
> >       - Exception from Raisable
> >       - StopIteration, SystemExit, KeyboardInterrupt from ControlFlow
> >       - SystemError, MemoryError from CriticalError
> 
> +1
> 
> I'd also like to see a "Reraisable" or something like that to cover both
> CriticalError and ControlFlow, but it could be a tuple of those two bases
> rather than a class.  But that's just a "would be nice" feature.

Eh, I am not so hot on this idea.  I see your argument, Phillip, but I
just don't think it will be useful enough to warrant its introduction.
Could add it to the exceptions module, though.

-Brett

From bcannon at gmail.com  Wed Aug  3 19:44:30 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 10:44:30 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc20508031018328f1b18@mail.gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
Message-ID: <bbaeab1005080310441c24032f@mail.gmail.com>

On 8/3/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> So here's a radical proposal (hear the scratching of the fingernail
> on the blackboard? :-).
> 
> Start with Brett's latest proposal.

Including renaming (I want to know if you support the renamings at
all, if I should make them more of an idea to be considered when we
get closer to Python 3.0, or just drop them) and the new exceptions?

> Goal: keep bare "except:" but
> change it to catch only the part of the hierarchy rooted at
> StandardError.
> 

Why the change of heart?  Backwards-compatibility?  Way to keep
newbies from choosing Raisable or such as what to catch?

> - Call the root of the hierarchy Raisable.

Fine by me.  Will change it before I check in the PEP tonight.

> - Rename CriticalException to CriticalError
>   (this should happen anyway).

I thought I changed that in the latest version.  I will change it.

> - Rename ControlFlowException to ControlFlowRaisable
>   (anything except Error or Exception).

No objection from me.

> - Rename StandardError to Exception.

So just ditch StandardError, which is fine by me, or go with Nick's v2
proposal and have all pre-existing exceptions inherit from it?  I
assume the latter since you said you wanted bare 'except' clauses to
catch StandardError.

> - Make Warning a subclass of Exception.
> 
> I'd want the latter point even if the rest of this idea is rejected;
> when a Warning is raised (as opposed to just printing a message or
> being suppressed altogether) it should be treated just like any other
> normal exception, i.e. StandardError.
> 

Since warnings only become raised if the warnings filter lists it as
an error I can see how this is a reasonable suggestion.  And if bare
'except' clauses catch StandardError and not Exception they will still
propagate to the top unless people explicitly catch Exception or lower
which seems fair.

-Brett

From gvanrossum at gmail.com  Wed Aug  3 21:00:58 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 3 Aug 2005 12:00:58 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <bbaeab1005080310441c24032f@mail.gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
	<bbaeab1005080310441c24032f@mail.gmail.com>
Message-ID: <ca471dc2050803120036668662@mail.gmail.com>

On 8/3/05, Brett Cannon <bcannon at gmail.com> wrote:
> On 8/3/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> > So here's a radical proposal (hear the scratching of the fingernail
> > on the blackboard? :-).
> >
> > Start with Brett's latest proposal.
> 
> Including renaming (I want to know if you support the renamings at
> all, if I should make them more of an idea to be considered when we
> get closer to Python 3.0, or just drop them) and the new exceptions?

Most of the renamings sound fine to me.

> > Goal: keep bare "except:" but
> > change it to catch only the part of the hierarchy rooted at
> > StandardError.
> 
> Why the change of heart?  Backwards-compatibility?  Way to keep
> newbies from choosing Raisable or such as what to catch?

The proposal accepts that there's a need to catch "all errors that are
reasonable to catch": that's why it separates StandardError  from the
root exception class.

So now we're going to recommend that everyone who was using bare
'except:' write 'except StandardError:' instead.

So why not have a default?

Because of EIBTI?

Seems a weak argument; we have defaults for lots of things.

> > - Call the root of the hierarchy Raisable.
> 
> Fine by me.  Will change it before I check in the PEP tonight.
> 
> > - Rename CriticalException to CriticalError
> >   (this should happen anyway).
> 
> I thought I changed that in the latest version.  I will change it.

I may have missed the change.

> > - Rename ControlFlowException to ControlFlowRaisable
> >   (anything except Error or Exception).
> 
> No objection from me.

I actually find it ugly; but it's not an error and it would be weird
if there was an xxxException that didn't derive from Exception.

> > - Rename StandardError to Exception.
> 
> So just ditch StandardError, which is fine by me, or go with Nick's v2
> proposal and have all pre-existing exceptions inherit from it?  I
> assume the latter since you said you wanted bare 'except' clauses to
> catch StandardError.

What do you think? Of course the critical and control flow ones should
*not* inherit from it.

[...brain hums...]

OK, I'm changing my mind again about the names again.

Exception as the root and StandardError can stay; the only new
proposal would then be to make bare 'except:' catch StandardError.

> > - Make Warning a subclass of Exception.
> >
> > I'd want the latter point even if the rest of this idea is rejected;
> > when a Warning is raised (as opposed to just printing a message or
> > being suppressed altogether) it should be treated just like any other
> > normal exception, i.e. StandardError.
> 
> Since warnings only become raised if the warnings filter lists it as
> an error I can see how this is a reasonable suggestion.  And if bare
> 'except' clauses catch StandardError and not Exception they will still
> propagate to the top unless people explicitly catch Exception or lower
> which seems fair.

Unclear what you mean; I want bare except: to catch Warnings! IOW I
want Warning to inherit from whatever the thing is that bare except:
catches (if we keep it) and that is the start of all the "normal"
exceptions excluding critical and control flow exceptions.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mal at egenix.com  Wed Aug  3 21:01:10 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 03 Aug 2005 21:01:10 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42EFE295.6040906@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>	<1122676547.10752.61.camel@geddy.wooz.org>	<42EB5891.6020008@egenix.com>	<42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de>
Message-ID: <42F11476.9000507@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>True, but if we never ask, we'll never know :-)
>>
>>My question was: Would asking a professional hosting company
>>be a reasonable approach ?
> 
> It would be an option, yes, of course. It's not an approach that
> *I* would be willing to implement, though.

Fair enough.

>>From the answers, I take it that there's not much trust in these
>>offers, so I guess there's not much desire to put PSF money into this.
> 
> I haven't received any offers to make a qualified statement. I only
> know that I would oppose an approach to ask somebody but our
> volunteers to do it for free, and I also know that I don't want to
> spend my time researching commercial alternatives (although I
> wouldn't mind if you spent your time).

I don't quite understand what you meant here: are you opposing
spending PSF money on a hosting company if and only if volunteers
who take on the job don't get paid ?

I've done a bit of research on the subject and so far only found
CollabNet and VA offering commercial services in this area. VA hosts
SourceForge so that's a non-option, I guess :-)

I know that Greg Stein worked for CollabNet, so thought it might be a
good idea to ask him about the idea to move things to CollabNet.
Of course, before taking this route, I wanted to get a feeling
for the general attitude towards a commercial approach, which
is why I tossed in the idea.

Other non-commercial alternatives are Berlios and Savannah, but
I'm not sure whether they'd offer Subversion support.

BTW, have you considered using Trac as issue tracker on
svn.python.org ? They have a very good subversion
integration, it's easy to use, comes with a wiki and
looks great. Oh, and it's written in Python :-)

-- 
Marc-Andre Lemburg
eGenix.com


From nick.bastin at gmail.com  Wed Aug  3 21:18:35 2005
From: nick.bastin at gmail.com (Nicholas Bastin)
Date: Wed, 3 Aug 2005 15:18:35 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42EF2794.1000209@v.loewis.de>
References: <42E93940.6080708@v.loewis.de> <200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
Message-ID: <66d0a6e105080312181e25fa08@mail.gmail.com>

On 8/2/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> George V. Neville-Neil wrote:
> > Since Python is Open Source are you looking at Perforce which you can
> > use for free and seems to be a happy medium between something like CVS
> > and something horrific like ClearCase?
> 
> No. The PEP is only about Subversion. Why should we be looking at
> Perforce? Only because Python is Open Source?

Perforce is a commercial product, but it can be had for free for
verified Open Source projects, which Python shouldn't have any problem
with.  There are other problems, like you have to renew the agreement
every year, but it might be worth considering, given the fact that
it's an excellent system.

> I think anything but Subversion is ruled out because:
> - there is no offer to host that anywhere (for subversion, there is
>   already svn.python.org)

We could host a Perforce repository just as easily, I would think.

> - there is no support for converting a CVS repository (for subversion,
>   there is cvs2svn)

I'd put $20 on the fact that cvs2svn will *not* work out of the box
for converting the python repository.  Just call it a hunch.  In any
case, the Perforce-supplied cvs2p4 should work at least as well.

--
Nick

From martin at v.loewis.de  Wed Aug  3 21:22:10 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 03 Aug 2005 21:22:10 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F11476.9000507@egenix.com>
References: <42E93940.6080708@v.loewis.de>	<1122676547.10752.61.camel@geddy.wooz.org>	<42EB5891.6020008@egenix.com>	<42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com>
Message-ID: <42F11962.2070107@v.loewis.de>

M.-A. Lemburg wrote:
>>I haven't received any offers to make a qualified statement. I only
>>know that I would oppose an approach to ask somebody but our
>>volunteers to do it for free, and I also know that I don't want to
>>spend my time researching commercial alternatives (although I
>>wouldn't mind if you spent your time).
> 
> 
> I don't quite understand what you meant here: are you opposing
> spending PSF money on a hosting company if and only if volunteers
> who take on the job don't get paid ?

No. I'm opposed to approaching somebody to do it for free, unless
that somebody is the pydotorg volunteers (IOW, I won't take gifts
from anybody else in this matter).

> I've done a bit of research on the subject and so far only found
> CollabNet and VA offering commercial services in this area. VA hosts
> SourceForge so that's a non-option, I guess :-)

It's not that I dislike VA - I personally think they are doing a
great job with SourceForge, and I like SourceForge a lot. There
are just some issues with it (like that they offer no Subversion).

The question would be: what precisely is the commercial offering from
VA: does it provide subversion? how is the user management done?
etc.

> I know that Greg Stein worked for CollabNet, so thought it might be a
> good idea to ask him about the idea to move things to CollabNet.
> Of course, before taking this route, I wanted to get a feeling
> for the general attitude towards a commercial approach, which
> is why I tossed in the idea.

Ok - I expect that the project might be *done* before we even have
a single commercial offer, with a precise service description,
and a precise price tag. That is what makes commercial offers so
difficult: evaluating them takes so much time that you might spend
less time doing it yourself.

> Other non-commercial alternatives are Berlios and Savannah, but
> I'm not sure whether they'd offer Subversion support.

For me, they fall into the "I won't take gifts" category.

> BTW, have you considered using Trac as issue tracker on
> svn.python.org ?

You mean, me personally? I quite like the Subversion tracker,
and don't want to trade it for anything else. I know Guido
wants to use Roundup (which is also written in Python),
and obviously so does Richard Jones.

The main questions are the same as with this PEP: how to do
the migration from SF (without losing data), and how to
do the ongoing maintenance. It's just that finding answers
to these questions is so much harder, therefore, this PEP
is *only* about CVS.

Regards,
Martin


From fdrake at acm.org  Wed Aug  3 21:28:25 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 3 Aug 2005 15:28:25 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F11476.9000507@egenix.com>
References: <42E93940.6080708@v.loewis.de> <42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com>
Message-ID: <200508031528.25776.fdrake@acm.org>

On Wednesday 03 August 2005 15:01, M.-A. Lemburg wrote:
 > Other non-commercial alternatives are Berlios and Savannah, but
 > I'm not sure whether they'd offer Subversion support.

Berlios does offer Subversion; the docutils project is using the Berlios 
Subversion and SourceForge for everything else.

I don't know whether Savannah is offering Subversion right now, but the last 
time I looked at it, it appeared nearly un-maintained.  But that may just be 
the understated nature of that community.  :-)


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From bcannon at gmail.com  Wed Aug  3 22:10:58 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 13:10:58 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc2050803120036668662@mail.gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
	<bbaeab1005080310441c24032f@mail.gmail.com>
	<ca471dc2050803120036668662@mail.gmail.com>
Message-ID: <bbaeab100508031310733c9889@mail.gmail.com>

On 8/3/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/3/05, Brett Cannon <bcannon at gmail.com> wrote:
> > On 8/3/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> > > So here's a radical proposal (hear the scratching of the fingernail
> > > on the blackboard? :-).
> > >
> > > Start with Brett's latest proposal.
> >
> > Including renaming (I want to know if you support the renamings at
> > all, if I should make them more of an idea to be considered when we
> > get closer to Python 3.0, or just drop them) and the new exceptions?
> 
> Most of the renamings sound fine to me.
> 

OK, great.  I will leave in the new names and the new exceptions.

> > > Goal: keep bare "except:" but
> > > change it to catch only the part of the hierarchy rooted at
> > > StandardError.
> >
> > Why the change of heart?  Backwards-compatibility?  Way to keep
> > newbies from choosing Raisable or such as what to catch?
> 
> The proposal accepts that there's a need to catch "all errors that are
> reasonable to catch": that's why it separates StandardError  from the
> root exception class.
> 
> So now we're going to recommend that everyone who was using bare
> 'except:' write 'except StandardError:' instead.
> 
> So why not have a default?
> 

Because you can easily write it without a default.

> Because of EIBTI?
> 

Don't know the acronym (and neither does acronymfinder.com).

> Seems a weak argument; we have defaults for lots of things.
> 

OK.  I was fine with bare 'except' clauses to begin with so this is
not a huge point of contention for me personally.

[SNIP]

> > So just ditch StandardError, which is fine by me, or go with Nick's v2
> > proposal and have all pre-existing exceptions inherit from it?  I
> > assume the latter since you said you wanted bare 'except' clauses to
> > catch StandardError.
> 
> What do you think? Of course the critical and control flow ones should
> *not* inherit from it.
> 

Well, Nick and James's point of tweaking the names so that they
reflect what people expect rather than what they are actually meant to
be is interesting.

But, in terms of backwards-compatibility, Exception/StandardError
matches what already exists most exactly.  But with the renamings I
don't know how critical this kind of low-level
backwards-compatibility is.

Personally I just prefer the names Exception/StandardError for
unexplained aesthetic reasons.

> [...brain hums...]
> 
> OK, I'm changing my mind again about the names again.
> 
> Exception as the root and StandardError can stay; the only new
> proposal would then be to make bare 'except:' catch StandardError.
> 

OK.  I will then also leave ControlFlowException as-is.

> > > - Make Warning a subclass of Exception.
> > >
> > > I'd want the latter point even if the rest of this idea is rejected;
> > > when a Warning is raised (as opposed to just printing a message or
> > > being suppressed altogether) it should be treated just like any other
> > > normal exception, i.e. StandardError.
> >
> > Since warnings only become raised if the warnings filter lists it as
> > an error I can see how this is a reasonable suggestion.  And if bare
> > 'except' clauses catch StandardError and not Exception they will still
> > propagate to the top unless people explicitly catch Exception or lower
> > which seems fair.
> 
> Unclear what you mean; I want bare except: to catch Warnings! IOW I
> want Warning to inherit from whatever the thing is that bare except:
> catches (if we keep it) and that is the start of all the "normal"
> exceptions excluding critical and control flow exceptions.
> 

OK, that squares that one away.  And it makes sense since you can view
Warnings as even less critical exceptions than the non-control and
non-critical exceptions and thus should be caught by a default
`except' clause.

-Brett

From mwh at python.net  Wed Aug  3 22:13:43 2005
From: mwh at python.net (Michael Hudson)
Date: Wed, 03 Aug 2005 21:13:43 +0100
Subject: [Python-Dev] PEP,
 take 2: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc20508031018328f1b18@mail.gmail.com> (Guido van Rossum's
	message of "Wed, 3 Aug 2005 10:18:42 -0700")
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
Message-ID: <2mfytqzslk.fsf@starship.python.net>

Guido van Rossum <gvanrossum at gmail.com> writes:

> So here's a radical proposal (hear the scratching of the fingernail
> on the blackboard? :-).
>
> Start with Brett's latest proposal. Goal: keep bare "except:" but
> change it to catch only the part of the hierarchy rooted at
> StandardError.
>
> - Call the root of the hierarchy Raisable.
> - Rename CriticalException to CriticalError
>   (this should happen anyway).
> - Rename ControlFlowException to ControlFlowRaisable
>   (anything except Error or Exception).
> - Rename StandardError to Exception.
> - Make Warning a subclass of Exception.
>
> I'd want the latter point even if the rest of this idea is rejected;
> when a Warning is raised (as opposed to just printing a message or
> being suppressed altogether) it should be treated just like any other
> normal exception, i.e. StandardError.

In the above you need to ensure that all raised exceptions inherit
from Raisable, because sometimes you really do want to catch almost
anything (e.g. code.py).
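
A rough sketch of the code.py-style case, assuming the root class is
spelled BaseException (the name later Pythons use; in the proposal above
it would be Raisable).  run_snippet() is a made-up helper, not anything
taken from code.py itself:

```python
def run_snippet(source):
    """Execute source and report any raised exception instead of propagating it."""
    try:
        exec(source, {})
    except BaseException as exc:
        # Deliberately broad: an interactive shell wants to survive
        # SystemExit and KeyboardInterrupt raised by user code as well.
        return type(exc).__name__
    return None

print(run_snippet("1/0"))
print(run_snippet("raise SystemExit(1)"))
print(run_snippet("x = 1"))
```

This only works as a catch-almost-anything handler if every raised
object really does inherit from the root class.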

Has anyone thought about the C side of this?  There are a few
slightly-careless calls to PyErr_Clear() in the codebase, and they can
cause just as much (more!) heartache as bare except: clauses.

I'll note in passing that I'm not sure that any reorganization of the
exception hierarchy will make this kind of catching-too-much bug go
away.  The issue is just thorny, and each case is different.

I'm also still not convinced that the backwards compatibility breaking
Python 3.0 will ever actually happen, but I guess that's a different
consideration...

Cheers,
mwh

-- 
  Haha! You had a *really* weak argument! <wink>
                                      -- Moshe Zadka, comp.lang.python

From bcannon at gmail.com  Wed Aug  3 22:23:21 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 13:23:21 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <2mfytqzslk.fsf@starship.python.net>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
	<2mfytqzslk.fsf@starship.python.net>
Message-ID: <bbaeab1005080313237eadba19@mail.gmail.com>

On 8/3/05, Michael Hudson <mwh at python.net> wrote:
> Guido van Rossum <gvanrossum at gmail.com> writes:
> 
> > So here's a radical proposal (hear the scratching of the fingernail
> > on the blackboard? :-).
> >
> > Start with Brett's latest proposal. Goal: keep bare "except:" but
> > change it to catch only the part of the hierarchy rooted at
> > StandardError.
> >
> > - Call the root of the hierarchy Raisable.
> > - Rename CriticalException to CriticalError
> >   (this should happen anyway).
> > - Rename ControlFlowException to ControlFlowRaisable
> >   (anything except Error or Exception).
> > - Rename StandardError to Exception.
> > - Make Warning a subclass of Exception.
> >
> > I'd want the latter point even if the rest of this idea is rejected;
> > when a Warning is raised (as opposed to just printing a message or
> > being suppressed altogether) it should be treated just like any other
> > normal exception, i.e. StandardError.
> 
> In the above you need to ensure that all raised exceptions inherit
> from Raisable, because sometimes you really do want to catch almost
> anything (e.g. code.py).
> 

That's part of the PEP.

> Has anyone thought about the C side of this?

I have thought about it somewhat, but I have not dived in to try to
write a patch.

>  There are a few
> slightly-careless calls to PyErr_Clear() in the codebase, and they can
> cause just as much (more!) heartache as bare except: clauses.
> 

I fail to see how clearing the exception state has any effect on the
implementation for the PEP.

> I'll note in passing that I'm not sure that any reorganization of the
> exception hierarchy will make this kind of catching-too-much bug go
> away.  The issue is just thorny, and each case is different.
> 

It will never go away as long as catching exceptions based on
inheritance exists.  Don't know if any language has ever solved it. 
Best we can do is try to minimize it.
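
To make the catching-too-much problem concrete, here is a small sketch.
The lookup_port() function and the config layout are invented for
illustration; the only real fact relied on is that LookupError is the
parent of both KeyError and IndexError:

```python
def lookup_port(config, name):
    try:
        return config["ports"][name]
    except LookupError:
        # Meant to handle an unknown service name, but the same handler
        # also swallows the KeyError raised when "ports" itself is missing.
        return 0

good = {"ports": {"http": 80}}
typo = {"prots": {"http": 80}}   # misspelled key: a real bug

print(lookup_port(good, "http"))
print(lookup_port(typo, "http"))  # silently falls back instead of failing
```

No reshuffling of the hierarchy changes this; the handler catches every
subclass of whatever it names.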

> I'm also still not convinced that the backwards compatibility breaking
> Python 3.0 will ever actually happen, but I guess that's a different
> consideration...

Perhaps not.  We might end up doing such a slow transition that 3.0
will just be a bigger-than-usual codebase change from 2.9 (or whatever
the end of the 2.x branch is).

> --
>   Haha! You had a *really* weak argument! <wink>
>                                       -- Moshe Zadka, comp.lang.python

Hopefully I don't.  =)

-Brett

From rowen at cesmail.net  Wed Aug  3 22:35:24 2005
From: rowen at cesmail.net (Russell E. Owen)
Date: Wed, 03 Aug 2005 13:35:24 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
References: <bbaeab1005080217346b2af653@mail.gmail.com>
Message-ID: <rowen-64DE4A.13352403082005@sea.gmane.org>

In article <bbaeab1005080217346b2af653 at mail.gmail.com>,
 Brett Cannon <bcannon at gmail.com> wrote:

> New Hierarchy
> =============
> 
> Exception
> +-- CriticalException (new)
>     +-- KeyboardInterrupt
>     +-- MemoryError
>     +-- SystemError
> +-- ControlFlowException (new)
>     +-- StopIteration
>     +-- GeneratorExit
>     +-- SystemExit
> +-- StandardError
>     +-- AssertionError
>     +-- SyntaxError
>         +-- IndentationError
>             +-- TabError
>     +-- UserException (rename of RuntimeError)
>     +-- ArithmeticError
>         +-- FloatingPointError
>         +-- DivideByZeroError
>         +-- OverflowError
>     +-- UnicodeError
>         +-- UnicodeDecodeError
>         +-- UnicodeEncodeError
>         +-- UnicodeTranslateError
>     +-- LookupError
>         +-- IndexError
>         +-- KeyError
>     +-- TypeError
>     +-- AttributeError
>     +-- EnvironmentError
>         +-- OSError
>         +-- IOError
>             +-- EOFError (new inheritance)
>     +-- ImportError
>     +-- NotImplementedError (new inheritance)
>     +-- NamespaceError (rename of NameError)
>         +-- UnboundGlobalError (new)
>         +-- UnboundLocalError
>         +-- UnboundFreeError (new)
>     +-- WeakReferenceError (rename of ReferenceError)
>     +-- ValueError
> +-- Warning
>     +-- UserWarning
>     +-- AnyDeprecationWarning (new)
>         +-- PendingDeprecationWarning
>         +-- DeprecationWarning
>     +-- SyntaxWarning
>     +-- SemanticsWarning (rename of RuntimeWarning)
>     +-- FutureWarning

I am wondering why OSError and IOError are not under StandardError? This 
seems a serious misfeature to me (perhaps the posting was just 
misformatted?).

Having one class for "normal" errors (not exceptions whose sole purpose 
is to halt the program and not so critical that any continuation is 
hopeless) sure would make it easier to write code that outputs a 
traceback and tries to continue. I'd love it.

-- Russell


From gvanrossum at gmail.com  Wed Aug  3 22:47:27 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 3 Aug 2005 13:47:27 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <bbaeab100508031310733c9889@mail.gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
	<bbaeab1005080310441c24032f@mail.gmail.com>
	<ca471dc2050803120036668662@mail.gmail.com>
	<bbaeab100508031310733c9889@mail.gmail.com>
Message-ID: <ca471dc205080313474d969ed8@mail.gmail.com>

> > Because of EIBTI?
> 
> Don't know the acronym (and neither does acronymfinder.com).

Sorry. Explicit is Better than Implicit.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Wed Aug  3 23:12:09 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 14:12:09 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <rowen-64DE4A.13352403082005@sea.gmane.org>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<rowen-64DE4A.13352403082005@sea.gmane.org>
Message-ID: <bbaeab100508031412553661d7@mail.gmail.com>

On 8/3/05, Russell E. Owen <rowen at cesmail.net> wrote:
> In article <bbaeab1005080217346b2af653 at mail.gmail.com>,
>  Brett Cannon <bcannon at gmail.com> wrote:
> 
> > New Hierarchy
> > =============
> >
> > Exception
[SNIP]
> > +-- StandardError
[SNIP]
> >     +-- EnvironmentError
> >       +-- OSError
> >       +-- IOError
> >           +-- EOFError (new inheritance)
[SNIP]
> 
> I am wondering why OSError and IOError are not under StandardError? This
> seems a serious misfeature to me (perhaps the posting was just
> misformatted?).
> 

Look again; they are.  The inheritance chain for both is (OS|IO)Error <-
EnvironmentError <- StandardError <- Exception.
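
One way to check a chain like this from the interpreter is to walk the
class's __mro__.  This is only a sketch: chain() is a made-up helper,
and the names it prints depend on the interpreter version (Python 3.3
and later, for instance, merge IOError and EnvironmentError into OSError
and have no StandardError at all):

```python
def chain(exc_class):
    """Return the names of exc_class and its ancestors, most specific first."""
    return [cls.__name__ for cls in exc_class.__mro__]

# Print each chain in the same "child <- parent" notation used above.
for cls in (OSError, KeyError):
    print(" <- ".join(chain(cls)))
```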

> Having one class for "normal" errors (not exceptions whose sole purpose
> is to halt the program and not so critical that any continuation is
> hopeless) sure would make it easier to write code that output a
> traceback and tried to continue. I'd love it.
> 

That is what StandardError is for.

-Brett

From foom at fuhm.net  Thu Aug  4 00:26:12 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed, 3 Aug 2005 18:26:12 -0400
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc2050803120036668662@mail.gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
	<bbaeab1005080310441c24032f@mail.gmail.com>
	<ca471dc2050803120036668662@mail.gmail.com>
Message-ID: <C3747BD3-D41B-49DC-AD43-FC24A4036E1C@fuhm.net>

On Aug 3, 2005, at 3:00 PM, Guido van Rossum wrote:
> [...brain hums...]
>
> OK, I'm changing my mind again about the names again.
>
> Exception as the root and StandardError can stay; the only new
> proposal would then be to make bare 'except:' call StandardError.

I don't see how that can work. Any solution that is expected to  
result in a usable hierarchy this century must preserve "Exception"  
as the object that user exceptions should derive from (and therefore  
that users should generally catch, as well). There is way too much  
momentum behind that to change it.

James

From bcannon at gmail.com  Thu Aug  4 00:47:54 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 15:47:54 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <C3747BD3-D41B-49DC-AD43-FC24A4036E1C@fuhm.net>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
	<bbaeab1005080310441c24032f@mail.gmail.com>
	<ca471dc2050803120036668662@mail.gmail.com>
	<C3747BD3-D41B-49DC-AD43-FC24A4036E1C@fuhm.net>
Message-ID: <bbaeab10050803154754b9206a@mail.gmail.com>

On 8/3/05, James Y Knight <foom at fuhm.net> wrote:
> On Aug 3, 2005, at 3:00 PM, Guido van Rossum wrote:
> > [...brain hums...]
> >
> > OK, I'm changing my mind again about the names again.
> >
> > Exception as the root and StandardError can stay; the only new
> > proposal would then be to make bare 'except:' call StandardError.
> 
> I don't see how that can work. Any solution that is expected to
> result in a usable hierarchy this century must preserve "Exception"
> as the object that user exceptions should derive from (and therefore
> that users should generally catch, as well). There is way too much
> momentum behind that to change it.
> 

Oh, I bet Guido can make them change.  =)

Look at it this way; going with the Raisable/Exception change and
having bare 'except's catch Exception will still lead to a semantic
change since CriticalError and ControlFlowException will not be
caught.  Breakage is going to happen, so why not just do a more
thorough change that leads to more breakage?

Obviously you are saying to minimize it while Guido is saying to go
for a more thorough change.  So how much more code is going to crap
out with this change?  Everything under our control will be fine since
we can change it.  User-defined exceptions might need to be changed if
they inherit directly from Exception instead of StandardError, which
is probably the common case, but changing a superclass is not hard. 
That kind of breakage is not bad since you can easily systematically
change superclasses of exceptions from Exception to StandardError
without much effort thanks to regexes.
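
A minimal sketch of the kind of regex rewrite I have in mind.  The
pattern and the rebase_exceptions() helper are assumptions of mine, not
an existing tool, and the pattern only handles the simple case where
Exception is the sole base class:

```python
import re

# Matches "class Name(Exception):" where Exception is the only base class.
DIRECT_EXCEPTION_BASE = re.compile(
    r"^(\s*class\s+\w+\s*\(\s*)Exception(\s*\)\s*:)", re.MULTILINE
)

def rebase_exceptions(source, new_base="StandardError"):
    """Rewrite direct subclasses of Exception to derive from new_base."""
    return DIRECT_EXCEPTION_BASE.sub(
        lambda m: m.group(1) + new_base + m.group(2), source
    )

old = "class ParseFailure(Exception):\n    pass\n"
print(rebase_exceptions(old))
```

Multiple inheritance, qualified names such as exceptions.Exception, and
class statements split across lines would all need more care than this.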

I honestly think the requirement that every exception inherit from a
specific superclass will lead to more breakage, since there is no
universal way to grep for exceptions that don't already inherit from
*some* exception.

-Brett

From gvanrossum at gmail.com  Thu Aug  4 01:19:18 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 3 Aug 2005 16:19:18 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <C3747BD3-D41B-49DC-AD43-FC24A4036E1C@fuhm.net>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
	<bbaeab1005080310441c24032f@mail.gmail.com>
	<ca471dc2050803120036668662@mail.gmail.com>
	<C3747BD3-D41B-49DC-AD43-FC24A4036E1C@fuhm.net>
Message-ID: <ca471dc2050803161953a41eee@mail.gmail.com>

[Guido van Rossum]
> > OK, I'm changing my mind again about the names again.
> >
> > Exception as the root and StandardError can stay; the only new
> > proposal would then be to make bare 'except:' call StandardError.

[James Y Knight]
> I don't see how that can work. Any solution that is expected to
> result in a usable hierarchy this century must preserve "Exception"
> as the object that user exceptions should derive from (and therefore
> that users should generally catch, as well). There is way too much
> momentum behind that to change it.

This is actually a good point, and what I was thinking when I first
responded to Brett.

Sorry for the wavering -- being at OSCON always is a serious attack
on my system.

I'm still searching for a solution that lets us call everything in the
hierarchy "exception" and *yet* has Exception at the mid-point in that
hierarchy where Brett has StandardError. The problem with Raisable
is that it doesn't contain the word exception; perhaps we can call it
BaseException? We've got a few more things called base-such-and-such,
e.g. basestring (not that I like that one all that much).

BTW I just noticed UserException -- shouldn't this be UserError?

Also... We should have a guideline for when to use "...Exception" and
when to use "...Error". Maybe we can use ...Exception for the first
two levels of the hierarchy, ...Error for errors, and other endings
for things that aren't errors (like SystemExit)? Then the top of the
tree would look like this:

BaseException (or RootException?)
+-- CriticalException
+-- ControlFlowException
+-- Exception
    +-- (all regular exceptions start here)
    +-- Warning

All common errors and warnings derive from Exception; bare 'except:' 
would be the same as 'except Exception:'. (I like that particularly
because I've been writing that in lots of code already. :-)
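
A small sketch of what that split means in practice, assuming the
layout that released Pythons ended up with (BaseException at the root,
with KeyboardInterrupt and SystemExit outside Exception).  It spells out
'except Exception:' explicitly rather than relying on a changed bare
'except:', and classify() is just an illustrative helper:

```python
def classify(exc_class):
    """Report which handler sees an instance of exc_class."""
    try:
        raise exc_class()
    except Exception:
        return "except Exception"
    except BaseException:
        return "only the root"

for cls in (ValueError, UserWarning, KeyboardInterrupt, SystemExit):
    print(cls.__name__, "->", classify(cls))
```

Regular errors and warnings land in the first handler; the critical and
control-flow cases fall through to the root.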

A refinement might be to introduce something called Error, which would
change the last part of the above hierarchy as follows:

(first three lines same as above)
+-- Exception
    +-- Error
        +-- (all regular ...Error exceptions start here)
    +-- Warning
        +-- (all warnings start here)

This has a nice symmetry between Error and Warning.

Downside is that this "breaks" all user code that currently tries to
be correct by declaring exceptions as deriving from Exception, which
is pretty common; they would have to derive from Error to be
politically correct.

I don't immediately see what's best -- maybe Exception and Error
should be two names for the same object??? But that's ugly too as a
long-term solution.
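
For what it's worth, the two-names-for-one-object idea would amount to
something like the sketch below.  Error is a hypothetical alias here
(no such builtin exists), and ConfigProblem is a made-up user exception:

```python
# Hypothetical: bind a second name to the existing class instead of
# adding a new level to the hierarchy.
Error = Exception

class ConfigProblem(Error):  # user code written against the new name
    pass

def caught_as_exception(exc_class):
    """Return True if 'except Exception' handles an instance of exc_class."""
    try:
        raise exc_class("boom")
    except Exception:
        return True

print(Error is Exception)
print(caught_as_exception(ConfigProblem))
print(Error.__name__)
```

The class still has only one __name__, so tracebacks would say
"Exception" no matter which spelling the code used, which is part of
why this looks ugly.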

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Thu Aug  4 01:34:52 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 03 Aug 2005 19:34:52 -0400
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc2050803161953a41eee@mail.gmail.com>
Message-ID: <000601c59883$f35fb280$8421cb97@oemcomputer>

> The problem with Raisable
> is that it doesn't contain the word exception; perhaps we can call it
> BaseException? 

+1


> A refinement might be to introduce something called Error, which would
> change the last part of the above hierarchy as follows:
 . . .
> This has a nice symmetry between Error and Warning.
> 
> Downside is that this "breaks" all user code that currently tries to
> be correct by declaring exceptions as deriving from Exception, which
> is pretty common; they would have to derive from Error to be
> politically correct.
> 
> I don't immediately see what's best -- maybe Exception and Error
> should be two names for the same object??? But that's ugly too as a
> long-term solution.

-1 

Who really cares about the distinction?  Besides, the correct choice may
depend on your point of view or specific application (i.e. a case could
be made that NotImplementedError is sometimes just a regular exception
that can be expected to arise and be handled in the normal course of
business).  Unless we can point to real problems that people are having
today, then these kind of changes are likely unwarranted.


Raymond


From bcannon at gmail.com  Thu Aug  4 01:40:36 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 16:40:36 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc2050803161953a41eee@mail.gmail.com>
References: <bbaeab1005080217346b2af653@mail.gmail.com>
	<000301c597c7$1551dde0$92b2958d@oemcomputer>
	<bbaeab10050803095538a125e7@mail.gmail.com>
	<ca471dc20508031018328f1b18@mail.gmail.com>
	<bbaeab1005080310441c24032f@mail.gmail.com>
	<ca471dc2050803120036668662@mail.gmail.com>
	<C3747BD3-D41B-49DC-AD43-FC24A4036E1C@fuhm.net>
	<ca471dc2050803161953a41eee@mail.gmail.com>
Message-ID: <bbaeab100508031640a744c93@mail.gmail.com>

On 8/3/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> [Guido van Rossum]
> > > OK, I'm changing my mind again about the names again.
> > >
> > > Exception as the root and StandardError can stay; the only new
> > > proposal would then be to make bare 'except:' call StandardError.
> 
> [James Y Knight]
> > I don't see how that can work. Any solution that is expected to
> > result in a usable hierarchy this century must preserve "Exception"
> > as the object that user exceptions should derive from (and therefore
> > that users should generally catch, as well). There is way too much
> > momentum behind that to change it.
> 
> This is actually a good point, and what I was thinking when I first
> responded to Brett.
> 
> Sorry for the wavering -- being at OSCON always is a serious attack
> on my system.
> 

As long as you don't change your mind again on bare 'except's I won't
feel like strangling you.  =)

> I'm still searching for a solution that lets us call everything in the
> hierarchy "exception" and *yet* has Exception at the mid-point in that
> hierarchy where Brett has StandardError. The problem with Raisable
> is that it doesn't contain the word exception; perhaps we can call it
> BaseException? We've got a few more things called base-such-and-such,
> e.g. basestring (not that I like that one all that much).
> 

BaseException is what comes to mind initially.  You also mention
RootException below.  PureException seems too cutesy. 
SuperclassException might work.  SuperException doesn't sound right. 
Co-worker suggested UhOh, but I don't think that will work either.  =)

> BTW I just noticed UserException -- shouldn't this be UserError?
> 

Yep, and I already changed it in my personal copy.

> Also... We should have a guideline for when to use "...Exception" and
> when to use "...Error". Maybe we can use ...Exception for the first
> two levels of the hierarchy, ...Error for errors, and other endings
> for things that aren't errors (like SystemExit)? Then the top of the
> tree would look like this:
> 

That makes the most sense.  Error for actual errors, exception when
another suffix (e.g., Exit, Iteration) does not fit.

> BaseException (or RootException?)
> +-- CriticalException
> +-- ControlFlowException
> +-- Exception
>     +-- (all regular exceptions start here)
>     +-- Warning
> 
> All common errors and warnings derive from Exception; bare 'except:'
> would be the same as 'except Exception:'. (I like that particularly
> because I've been writing that in lots of code already. :-)
> 
> A refinement might be to introduce something called Error, which would
> change the last part of the above hierarchy as follows:
> 
> (first three lines same as above)
> +-- Exception
>     +-- Error
>         +-- (all regular ...Error exceptions start here)
>     +-- Warning
>         +-- (all warnings start here)
> 
> This has a nice symmetry between Error and Warning.
> 
> Downside is that this "breaks" all user code that currently tries to
> be correct by declaring exceptions as deriving from Exception, which
> is pretty common; they would have to derive from Error to be
> politically correct.
> 
> I don't immediately see what's best -- maybe Exception and Error
> should be two names for the same object??? But that's ugly too as a
> long-term solution.

Yuck.

I say introduce Error (or StandardError or BaseError) and just live
with the fact that older code will not necessarily follow the proper
naming scheme.  We can provide a script that will change source
directly for any class that inherits from Exception to some other
class, namely Error.
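
As a first step for such a script, something along these lines could
report the classes that would need touching.  This is a sketch only:
direct_exception_subclasses() is a name I made up, it uses the ast
module from the standard library, and it only recognises the plain
name Exception in the list of bases:

```python
import ast

def direct_exception_subclasses(source):
    """Return (line, name) pairs for classes that list Exception as a base."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            for base in node.bases:
                # Only the bare name is matched; aliases are not followed.
                if isinstance(base, ast.Name) and base.id == "Exception":
                    found.append((node.lineno, node.name))
    return sorted(found)

sample = (
    "class A(Exception):\n"
    "    pass\n"
    "\n"
    "class B(ValueError):\n"
    "    pass\n"
    "\n"
    "class C(Mixin, Exception):\n"
    "    pass\n"
)
print(direct_exception_subclasses(sample))
```

Working from the parsed tree instead of raw text also picks up classes
with several bases, which a simple textual match can miss.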

-Brett

From aahz at pythoncraft.com  Thu Aug  4 02:26:18 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 3 Aug 2005 17:26:18 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <66d0a6e105080312181e25fa08@mail.gmail.com>
References: <200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
Message-ID: <20050804002618.GA2779@panix.com>

On Wed, Aug 03, 2005, Nicholas Bastin wrote:
>
> I'd put $20 on the fact that cvs2svn will *not* work out of the box
> for converting the python repository.  Just call it a hunch.  In any
> case, the Perforce-supplied cvs2p4 should work at least as well.

Maybe.  OTOH, I went to a CVS->SVN talk today at OSCON, and I'd be
suspicious of claims that Python's repository is more difficult to
convert than others that have successfully made the switch (such as KDE).
I'd rather not rely on licensing of a closed-source system; one of the
points made during the talk was that the Linux project had to scramble
when they lost their Bitkeeper license (but they didn't switch to SVN
because they wanted a distributed model -- one of the things I
appreciated about this talk was the lack of One True Way-ism).
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From stephen at xemacs.org  Thu Aug  4 05:36:55 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 04 Aug 2005 12:36:55 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050804002618.GA2779@panix.com> (aahz@pythoncraft.com's
	message of "Wed, 3 Aug 2005 17:26:18 -0700")
References: <200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<20050804002618.GA2779@panix.com>
Message-ID: <87ek9afk4o.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "aahz" == aahz  <aahz at pythoncraft.com> writes:

    aahz> I'd rather not rely on licensing of a closed-source system;
    aahz> one of the points made during the talk was that the Linux
    aahz> project had to scramble when they lost their Bitkeeper
    aahz> license

Python is unlikely to throw away its license in the same way, I should
think.  For additional security, you could try to negotiate a
perpetual license on a particular version, or a license that required
substantial notice (say, six months) for termination.  I would imagine
you could get them; the only reason for the vendor not to give them
would be spite.

The problem with both of those options is the one that Martin already
pointed out: negotiation takes effort.  There are several good open
source alternatives, one of which (svn) is well-established and gets
excellent reviews for those goals it sets itself, which happen to be
solving the problems (as opposed to missing features) of CVS.  Why
spend effort on negotiating licenses and preparing for potential
vendor relationship problems, unless there's acknowledged need for
features svn doesn't provide?

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From bcannon at gmail.com  Thu Aug  4 05:43:41 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 3 Aug 2005 20:43:41 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
Message-ID: <bbaeab1005080320431cfca77@mail.gmail.com>

OK, once the cron job comes around and is run,
http://www.python.org/peps/pep-0348.html will not be a 404 but be the
latest version of the PEP.

The differences since my last public version are that it has
BaseException/Exception as the naming hierarchy, Warning inherits from
Exception, UserException is now UserError, and StandardError inherits
from Exception.  I also added better annotations on the tree for
noticing where inheritance changed and whether it became broader (and
thus gained a new exception in its MRO) or more restrictive (and thus
lost an exception).  Basically, it covers everything that Guido has
brought up today (08-03).

I may have made some mistakes while changing over to
BaseException/Exception, since the names are so similar, and while
tossing StandardError back in.  If you catch what seem like odd
sentences, that is why (and obviously let me know about the mistake).

-Brett

From stephen at xemacs.org  Thu Aug  4 06:17:50 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 04 Aug 2005 13:17:50 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F11476.9000507@egenix.com> (M.'s message of "Wed, 03 Aug
	2005 21:01:10 +0200")
References: <42E93940.6080708@v.loewis.de>
	<1122676547.10752.61.camel@geddy.wooz.org>
	<42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com>
Message-ID: <87acjyfi8h.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "M" == "M.-A. Lemburg" <mal at egenix.com> writes:

    M> Other non-commercial alternatives are Berlios and Savannah, but
    M> I'm not sure whether they'd offer Subversion support.

Savannah doesn't offer great reliability or support, at least to judge
by the frequency with which the GNU Emacs and GNU Arch projects have
been unable to access various services on Savannah, including mailing
lists and CVS.

I also wonder if Savannah poses security risks.  They've been
successfully cracked (ISTR more than once) in the last couple of
years, and took 6-10 weeks to get back to normal.  This makes them
reluctant to make minor variations in their established procedures for
the convenience of projects.  For example, it took a couple of months
for GNU Arch to arrange sftp access so that they could host the Arch
project in an Arch repository (Arch can use sftp but not plain ssh as
a transport).

SunSITE.dk does provide reliable service and timely support.  XEmacs
has been very happy with it.  But Martin v. Loewis apparently hasn't
had the same good experience with negotiating with them, and at least
some negotiation and relationship maintenance is necessary---it's a
closer, more personal relationship than with SF or Savannah.  In
particular for Subversion support (I was told they allow it on a case
by case basis, and once success is demonstrated they plan to offer it
in general).  As I say, we've been happy with SunSITE, but the amount
of effort is basically the same as if we ran our own repository, just
directed more toward "vendor relations" and away from "sys admin"
(which suits us).

FWIW, XEmacs has moved or reorganized CVS repositories five times
since 1999.  Although it's not all in the PEP, if you add the
discussion on this list Martin has covered the important issues we
encountered or worried about.


-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From martin at v.loewis.de  Thu Aug  4 07:42:54 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 04 Aug 2005 07:42:54 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <66d0a6e105080312181e25fa08@mail.gmail.com>
References: <42E93940.6080708@v.loewis.de>
	<200507281956.03788.jeff@taupro.com>	<1f7befae05072819142c36e610@mail.gmail.com>	<1122605323.9670.11.camel@geddy.wooz.org>	<1f7befae0507281959abc2a7c@mail.gmail.com>	<1122607673.9665.38.camel@geddy.wooz.org>	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>	<1122918723.9680.33.camel@warna.corp.google.com>	<m24qa9f5v8.wl%gnn@neville-neil.com>
	<42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
Message-ID: <42F1AADE.50908@v.loewis.de>

Nicholas Bastin wrote:
>>No. The PEP is only about Subversion. Why should we be looking at
>>Perforce? Only because Python is Open Source?
> 
> 
> Perforce is a commercial product, but it can be had for free for
> verified Open Source projects, which Python shouldn't have any problem
> with.  There are other problems, like you have to renew the agreement
> every year, but it might be worth considering, given the fact that
> it's an excellent system.

So we should consider it because it is an excellent system... I don't
know what that means, in precise, day-to-day usage terms (i.e. what
precisely would it do for us that, say, Subversion can't do).

>>I think anything but Subversion is ruled out because:
>>- there is no offer to host that anywhere (for subversion, there is
>>  already svn.python.org)
> 
> 
> We could host a Perforce repository just as easily, I would think.

Interesting offer. I'll add this to the PEP - who is "we" in this
context?

>>- there is no support for converting a CVS repository (for subversion,
>>  there is cvs2svn)
> 
> 
> I'd put $20 on the fact that cvs2svn will *not* work out of the box
> for converting the python repository.  Just call it a hunch. 

You could have read the PEP before losing that money :-) It did work
out of the box.

Regards,
Martin

From mal at egenix.com  Thu Aug  4 10:51:56 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 04 Aug 2005 10:51:56 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F11962.2070107@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>	<1122676547.10752.61.camel@geddy.wooz.org>	<42EB5891.6020008@egenix.com>	<42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com>	<42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de>
Message-ID: <42F1D72C.8070202@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>>I haven't received any offers to make a qualified statement. I only
>>>know that I would oppose an approach to ask somebody but our
>>>volunteers to do it for free, and I also know that I don't want to
>>>spend my time researching commercial alternatives (although I
>>>wouldn't mind if you spent your time).
>>
>>
>>I don't quite understand what you meant here: are you opposing
>>spending PSF money on a hosting company if and only if volunteers
>>who take on the job don't get paid ?
> 
> No. I'm opposed to approaching somebody to do it for free, except
> the somebody are the pydotorg volunteers (IOW, I won't take gifts
> from anybody else in this matter).

Ok.

>>I've done a bit of research on the subject and so far only found
>>CollabNet and VA offering commercial services in this area. VA hosts
>>SourceForge so that's a non-option, I guess :-)
> 
> 
> It's not that I dislike VA - I personally think they are doing a
> great job with SourceForge, and I like SourceForge a lot. There
> are just some issues with it (like that they offer no Subversion).
> 
> The question would be: what precisely is the commercial offering from
> VA: does it provide subversion? how is the user management done?
> etc.

I guess this was a misunderstanding on my part: VA doesn't offer
their commercial solution in an ASP-like way. Their product,
called SourceForge Enterprise, is a J2EE application which we'd
have to install and run. They do mention Subversion as being
supported by the Enterprise edition.

>>I know that Greg Stein worked for CollabNet, so thought it might be a
>>good idea to ask him about the idea to move things to CollabNet.
>>Of course, before taking this route, I wanted to get a feeling
>>for the general attitude towards a commercial approach, which
>>is why I tossed in the idea.
> 
> Ok - I expect that the project might be *done* before we even have
> a single commercial offer, with a precise service description,
> and a precise price tag. That makes commercial offers so difficult:
> that it is so time expensive to use them, that you might spend
> less time doing it yourself.

For (more or less) simple things like setting up SVN, I'd agree,
but for hosting a complete development system, I have my doubts -
things start to get rather complicated and integration of various
different tools tends to be very time consuming.

Sysadmin tasks like doing backups, emergency recovery, etc. also
get more complicated once you have to deal with many different ways
of data storage deployed by such tools, e.g. many of them
require use of special tools to do hot backups.

>>Other non-commercial alternatives are Berlios and Savannah, but
>>I'm not sure whether they'd offer Subversion support.
> 
> 
> For me, they fall into the "I won't take gifts" category.

Ok, I'll drop the idea.

>>BTW, have you considered using Trac as issue tracker on
>>svn.python.org ?
> 
> 
> You mean, me personally? I quite like the Subversion tracker,
> and don't want to trade it for anything else. I know Guido
> wants to use Roundup (which is also written in Python),
> and obviously so does Richard Jones.
> 
> The main questions are the same as with this PEP: how to do
> the migration from SF (without losing data), and how to
> do the ongoing maintenance. It's just that finding answers
> to these questions is so much harder, therefore, this PEP
> is *only* about CVS.

Ok.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 04 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From ncoghlan at gmail.com  Thu Aug  4 12:07:40 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 04 Aug 2005 20:07:40 +1000
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080320431cfca77@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
Message-ID: <42F1E8EC.9080506@gmail.com>

Brett Cannon wrote:
> OK, once the cron job comes around and is run,
> http://www.python.org/peps/pep-0348.html will not be a 404 but be the
> latest version of the PEP.
> 
> Differences since my last public version are that it has
> BaseException/Exception as the naming hierarchy, Warning inherits from
> Exception, UserException is UserError, and StandardError inherits from
> Exception.  I also added better annotations on the tree for noticing
> where inheritance changed and whether it became broader (and thus had
> a new exception in its MRO) or more restrictive (and thus lost an
> exception).  Basically everything that Guido has brought up today
> (08-03).
> 

If/when you add a "Getting there from here" section, it would be worth noting 
that there are a few basic strategies to be applied:

  - for new exceptions:
     - just add them in release 2.x

  - for name changes:
     - add the new name as an alias in release 2.x
     - deprecate the old name in release 2.x
     - delete the old name in release 2.(x+1)

  - to switch inheritance to a new exception type:
     - add the inheritance to the new parent in release 2.x
     - delete the inheritance from the old parent in release 3.0

  - to switch inheritance to an existing exception type:
     - add the inheritance to the new parent in release 3.0
     - delete the inheritance from the old parent in release 3.0
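
A minimal sketch of the third strategy above (switching inheritance to a
new exception type), assuming hypothetical names OldParent, NewParent and
MovedError that are not taken from the PEP. During the 2.x transition the
exception inherits from both parents, so handlers written against either
one keep working; the old parent would then be dropped in 3.0.

```python
class OldParent(Exception):
    """Hypothetical parent the exception inherits from today."""


class NewParent(Exception):
    """Hypothetical parent the exception should end up under."""


# Transition release (2.x): inherit from both, new parent listed first.
class MovedError(NewParent, OldParent):
    """Hypothetical exception in the middle of being re-parented."""


def caught_by(handler_type):
    """Return True if an `except handler_type` clause catches MovedError."""
    try:
        raise MovedError("moved")
    except handler_type:
        return True
    except Exception:
        return False
```

Both caught_by(OldParent) and caught_by(NewParent) succeed during the
transition; removing OldParent from the bases is the 3.0 step.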

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Aug  4 12:47:12 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 04 Aug 2005 20:47:12 +1000
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <42F1E8EC.9080506@gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<42F1E8EC.9080506@gmail.com>
Message-ID: <42F1F230.5000505@gmail.com>

Nick Coghlan wrote:
> If/when you add a "Getting there from here" section, it would be worth noting 
> that there are a few basic strategies to be applied:

Eh, never mind. It's already there ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Aug  4 13:03:05 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 04 Aug 2005 21:03:05 +1000
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080320431cfca77@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
Message-ID: <42F1F5E9.8050904@gmail.com>

Brett Cannon wrote (in the PEP):
> KeyboardInterrupt inheriting from ControlFlowException
> 
> KeyboardInterrupt has been a contentious point within this hierarchy. Some
> view the exception as more control flow being caused by the user. But with
> its asynchronous cause thanks to the user being able to trigger the
> exception at any point in code it has a more proper place inheriting from
> CriticalException. It also keeps the name of the exception from being
> "CriticalError".

I think this argues against your own hierarchy, since you _did_ call the 
parent exception CriticalError. By your argument above, that suggests 
KeyboardInterrupt doesn't belong there ;)

In practice, whether KeyboardInterrupt inherits from ControlFlowException or 
CriticalError shouldn't be a big deal - the important thing is to get it out 
from under Exception and StandardError.

At which point, the naming issue is enough to incline me towards christening 
it a ControlFlowException. It gets all the 'oddly named' exceptions into one 
place.

Additionally, consider that a hypothetical ThreadExit exception (used to 
terminate a thread semi-gracefully) would also clearly belong under 
ControlFlowException. That is, just because something is asynchronous with 
respect to the currently executing code doesn't necessarily make it an error 
(yes, I know I argued the opposite point the other day. . .).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Aug  4 13:27:40 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 04 Aug 2005 21:27:40 +1000
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080320431cfca77@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
Message-ID: <42F1FBAC.6020404@gmail.com>

Since I forgot to mention it in the last couple of messages - this version 
looks very good. The transition strategy section makes it a lot more meaningful.

Brett Cannon wrote (in the PEP):
> Renamed Exceptions
> 
> Renamed exceptions will directly subclass the new names. When the old
> exceptions are instantiated (which occurs when an exception is caught,
> either by a try statement or by propagating to the top of the execution
> stack), a PendingDeprecationWarning will be raised.

Nice trick with figuring out how to raise the deprecation warning :)
(That line was going to read 'Why not just create an alias?', but then I 
worked out what you were doing, and why you were doing it)

One case that this doesn't completely address is NameError, as it is the only 
renamed exception which currently has a subclass. In this case, I think that 
during the transition phase, all three of the 'Unbound*Error' exceptions 
should inherit from NameError, with NameError inheriting from NamespaceError.

I believe it should still be possible to get the deprecation warning to work 
correctly in this case (by not raising the warning when a subclass is 
instantiated).
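
A rough sketch of that idea, assuming stand-in classes rather than the
real builtins: OldNameError plays the role of the old name (NameError),
NamespaceError is the new name from the PEP draft, and UnboundLocalSketch
stands in for a subclass such as UnboundLocalError. The warning is only
issued when the old name itself is instantiated, not one of its subclasses.

```python
import warnings


class NamespaceError(Exception):
    """Stand-in for the new name proposed in the PEP draft."""


class OldNameError(NamespaceError):
    """Stand-in for the old name; warns only when instantiated directly."""

    def __init__(self, *args):
        # Skip the warning when a subclass is being instantiated.
        if type(self) is OldNameError:
            warnings.warn(
                "OldNameError is deprecated; use NamespaceError",
                PendingDeprecationWarning,
                stacklevel=2,
            )
        super().__init__(*args)


class UnboundLocalSketch(OldNameError):
    """Stand-in for a subclass that keeps inheriting from the old name."""


def warnings_from(exc_type):
    """Instantiate exc_type and return the categories of warnings issued."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        exc_type("x")
    return [w.category for w in caught]
```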

In the 'just a typo' category, WeakReferenceError should still be under 
StandardError in the hierarchy.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From foom at fuhm.net  Thu Aug  4 15:01:47 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu, 4 Aug 2005 09:01:47 -0400
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <42F1F5E9.8050904@gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<42F1F5E9.8050904@gmail.com>
Message-ID: <5148626E-877E-4B6E-881C-0A56824FDE66@fuhm.net>

On Aug 4, 2005, at 7:03 AM, Nick Coghlan wrote:
> Additionally, consider that a hypothetical ThreadExit exception  
> (used to
> terminate a thread semi-gracefully) would also clearly belong under
> ControlFlowException. That is, just because something is  
> asynchronous with
> respect to the currently executing code doesn't necessarily make it  
> an error
> (yes, I know I argued the opposite point the other day. . .).

No. Just because something gets asynchronously raised out from under  
you *does* make it critical (or maybe "critically fatal"). See my  
reply to Philip Eby on Aug 2, msgid  
<9EDA49FB-1E9B-4558-9441-90A65ECC5A52 at fuhm.net>.

James

From metawilm at gmail.com  Thu Aug  4 15:37:28 2005
From: metawilm at gmail.com (Willem Broekema)
Date: Thu, 4 Aug 2005 15:37:28 +0200
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080320431cfca77@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
Message-ID: <f6bc9b49050804063739eab48@mail.gmail.com>

On 8/4/05, Brett Cannon <bcannon at gmail.com> wrote:
> OK, once the cron job comes around and is run,
> http://www.python.org/peps/pep-0348.html will not be a 404 but be the
> latest version of the PEP.

Currently, when the "recursion limit" is reached, a RuntimeError is
raised. RuntimeError is in the PEP renamed to UserError. UserError is
in the new hierarchy located below StandardError, below Exception.

I think that in the new hierarchy this error should be in the same
"critical" category as MemoryError. (MemoryError includes general
stack overflow.)


- Willem

From foom at fuhm.net  Thu Aug  4 17:06:00 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu, 4 Aug 2005 11:06:00 -0400
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080320431cfca77@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
Message-ID: <D5ACD356-29D0-4F05-B073-0AAF8B26D10C@fuhm.net>

>        +-- NamespaceError (rename of NameError)
>            +-- UnboundFreeError (new)
>            +-- UnboundGlobalError (new)
>            +-- UnboundLocalError
>

What are these new exceptions for? Under what circumstances are they  
raised? Why is this necessary or an improvement?

> Renamed Exceptions
>
> Renamed exceptions will directly subclass the new names. When the  
> old exceptions are instantiated (which occurs when an exception is  
> caught, either by a try statement or by propagating to the top of  
> the execution stack), a PendingDeprecationWarning will be raised.
>
> This should properly preserve backwards-compatibility as old usage  
> won't change and the new names can be used to also catch exceptions  
> using the old name. The warning of the deprecation is also kept  
> simple.

This will cause problems when a library raises the exception under  
the new name and an app tries to catch the old name. So the standard  
lib (or any other lib) cannot raise the new names. Because the stdlib  
must raise the old names, people will see the old names, continue  
catching the old names, and the new names will never catch on.

Perhaps it'd work out better to have the new names subclass the old  
names. Then you have to continue catching the old name as long as  
anyone is raising it, but at least you can raise the new name with  
impunity. I expect not much code actually raises ReferenceError or  
NameError besides that internal to python. Thus it would be  
relatively safe to change all code to catch the new names for those  
immediately. Lots of code raises RuntimeError, but I bet not very  
much code explicitly catches it.

Oh, but if the stdlib starts raising under the new names, that'll  
break any code that checks the exact type of the exception against  
the old name. Boo.
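
To make the two failure modes concrete, here is a small sketch using
hypothetical class names (none of them are real builtins). OldName/NewName
follow the PEP draft (old name subclasses new name); OldName2/NewName2
follow the alternative (new name subclasses old name).

```python
class NewName(Exception):
    """Hypothetical new name."""


class OldName(NewName):
    """Old name subclassing the new one, as in the PEP draft."""


class OldName2(Exception):
    """Old name in the alternative scheme."""


class NewName2(OldName2):
    """New name subclassing the old one, as suggested above."""


def caught_by(handler_type, raised_type):
    """Return True if `except handler_type` catches raised_type."""
    try:
        raise raised_type("boom")
    except handler_type:
        return True
    except Exception:
        return False


def exact_type_matches(expected_type, raised_type):
    """Return True if code comparing the exact type would still match."""
    try:
        raise raised_type("boom")
    except Exception as exc:
        return type(exc) is expected_type
```

With the PEP draft's scheme, caught_by(OldName, NewName) is False: a
library raising the new name slips past an app catching the old one. With
the alternative, caught_by(OldName2, NewName2) is True, but
exact_type_matches(OldName2, NewName2) is False.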

It'd be better to somehow raise a DeprecationWarning upon access, yet  
still result in the same object. Unfortunately I don't think there's  
any way to do that in python. This lack of ability to deprecate  
module attributes has bitten me several times in other projects as well.  
Matt Goodall wrote the hack attached at the end in order to move some  
whole modules around in Nevow. Amazingly it actually seemed to  
work. :) Something like that won't work for __builtins__, of course,  
since that's accessed directly with PyDict_GetItem.

All in all I don't really see a real need for these renamings and I  
don't see a way to do them compatibly so I'm -1 to the whole idea of  
renaming exceptions.

> Removal of Bare except Clauses
>
> A SemanticsWarning will be raised for all bare except clauses.

Does this mean that bare except clauses change meaning to "except  
Exception" immediately? Or (I hope) did you mean that in Py2.5 they  
continue doing as they do now, but print a warning to tell you they  
will be changing in the future?

James


> import sys
> import types
> import warnings
>
> from twisted.python import reflect
>
> class ModuleWithDeprecations(types.ModuleType):
>
>     def __init__(self, original, deprecatedNames):
>         self.original = original
>         self.deprecatedNames = deprecatedNames
>
>     def __getattr__(self, name):
>         newName = self.deprecatedNames.get(name, None)
>         if newName is not None:
>             warnings.warn("nevow.%s is deprecated, please import %s instead!"
>                           % (name, newName), DeprecationWarning, 2)
>             return reflect.namedAny(newName)
>         return getattr(self.original, name)
>
> # Evil hack? What evil hack!
> sys.modules['nevow'] = ModuleWithDeprecations(
>     sys.modules['nevow'],
>     {'formless': 'formless',
>      'freeform': 'formless.webform'
>      }
>     )
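
For anyone who wants to try the trick without Twisted or Nevow installed,
here is a self-contained variant. It is only a sketch under a few
assumptions: importlib.import_module replaces reflect.namedAny (so the new
name must be a module, not an arbitrary dotted attribute), and the package
name "oldpkg" and the mapping "jsonlib" -> "json" are made up for the demo.

```python
import importlib
import sys
import types
import warnings


class ModuleWithDeprecations(types.ModuleType):
    """Module proxy that warns when a deprecated attribute name is used."""

    def __init__(self, original, deprecated_names):
        super().__init__(original.__name__)
        self._original = original
        self._deprecated_names = deprecated_names

    def __getattr__(self, name):
        # Only called when normal attribute lookup on the proxy fails.
        new_name = self._deprecated_names.get(name)
        if new_name is not None:
            warnings.warn(
                "%s.%s is deprecated, please import %s instead!"
                % (self.__name__, name, new_name),
                DeprecationWarning,
                2,
            )
            return importlib.import_module(new_name)
        return getattr(self._original, name)


def demo():
    """Install a proxy for a made-up module and touch a deprecated name."""
    original = types.ModuleType("oldpkg")  # hypothetical package name
    original.answer = 42
    sys.modules["oldpkg"] = ModuleWithDeprecations(original, {"jsonlib": "json"})
    try:
        proxy = importlib.import_module("oldpkg")
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            moved = proxy.jsonlib
        return proxy.answer, moved.__name__, [w.category for w in caught]
    finally:
        del sys.modules["oldpkg"]
```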


From gvanrossum at gmail.com  Thu Aug  4 17:23:32 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 4 Aug 2005 08:23:32 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <f6bc9b49050804063739eab48@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<f6bc9b49050804063739eab48@mail.gmail.com>
Message-ID: <ca471dc2050804082335553bfd@mail.gmail.com>

In general the PEP looks really good now!

On 8/4/05, Willem Broekema <metawilm at gmail.com> wrote:
> On 8/4/05, Brett Cannon <bcannon at gmail.com> wrote:
> > OK, once the cron job comes around and is run,
> > http://www.python.org/peps/pep-0348.html will not be a 404 but be the
> > latest version of the PEP.
> 
> Currently, when the "recursion limit" is reached, a RuntimeError is
> raised. RuntimeError is in the PEP renamed to UserError. UserError is
> in the new hierarchy located below StandardError, below Exception.
> 
> I think that in the new hierarchy this error should be in the same
> "critical" category as MemoryError. (MemoryError includes general
> stack overflow.)

No. Usually, a recursion error is a simple bug in the code, no
different from a TypeError or NameError etc.

This does contradict my earlier claim that Python itself doesn't use
RuntimeError; I think I'd be happier if it remained RuntimeError. (I
think there are a few more uses of it inside Python itself; I don't
think it's worth inventing new exceptions for all these.)
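
As a small illustration of the "simple bug" case, a function with a
missing base case can be caught like any other error. This is only a
sketch with made-up function names; on current Python versions the
interpreter raises RecursionError, which is a subclass of RuntimeError,
so the handler below works either way.

```python
def recurse_forever(depth=0):
    """Deliberately buggy: there is no base case."""
    return recurse_forever(depth + 1)


def hits_recursion_limit():
    """Return True if the runaway recursion is reported as a RuntimeError."""
    try:
        recurse_forever()
    except RuntimeError:
        return True
    return False
```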

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Thu Aug  4 19:57:16 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 04 Aug 2005 19:57:16 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F1D72C.8070202@egenix.com>
References: <42E93940.6080708@v.loewis.de>	<1122676547.10752.61.camel@geddy.wooz.org>	<42EB5891.6020008@egenix.com>	<42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com>	<42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de>
	<42F1D72C.8070202@egenix.com>
Message-ID: <42F256FC.7050606@v.loewis.de>

M.-A. Lemburg wrote:
> I guess this was a misunderstanding on my part: VA doesn't offer
> their commercial solution in an ASP-like way. Their product,
> called SourceForge Enterprise, is a J2EE application which we'd
> have to install and run. They do mention Subversion as being
> supported by the Enterprise edition.

Ah, ok. I don't think I want to operate such a software (and,
strictly speaking, this is out of the scope of the PEP). I had the
"pleasure" once of having to maintain a SourceForge installation
(before SourceForge became closed source), and it was a nightmare
to operate.

> For (more or less) simple things like setting up SVN, I'd agree,
> but for hosting a complete development system, I have my doubts -
> things start to get rather complicated and integration of various
> different tools tends to be very time consuming.

I guess Python's development process is very simple then. We use
mailing lists, CVS, newsgroups, web servers, and bug trackers,
but these don't have to integrate. Many of these services are
already on pydotorg, and I propose to add an additional one
(revision control).

> Sysadmin tasks like doing backups, emergency recovery, etc. also
> get more complicated once you have to deal with many different ways
> of data storage deployed by such tools, e.g. many of them
> require use of special tools to do hot backups.

We are doing quite well here. XS4ALL kindly does disk backup for
us, and, in the specific case of Subversion's fsfs, this is all
that is needed. For Postgres, we backup to disk, which then
gets picked up by the disk backup.

Regards,
Martin

From raymond.hettinger at verizon.net  Thu Aug  4 19:56:50 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 04 Aug 2005 13:56:50 -0400
Subject: [Python-Dev] PEP 342 Implementation
Message-ID: <000001c5991d$e40bb140$12b62c81@oemcomputer>

Could someone please make an independent check to verify an issue with
the 342 checkin.  The test suite passes but when I run IDLE and open a
new window (using Control-N), it crashes and burns.

The problem does not occur just before the checkin:
    cvs up -D "2005-08-01 18:00"
But emerges immediately after:
    cvs up -D "2005-08-01 21:00"


Raymond


From mal at egenix.com  Thu Aug  4 20:28:06 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 04 Aug 2005 20:28:06 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F256FC.7050606@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>	<1122676547.10752.61.camel@geddy.wooz.org>	<42EB5891.6020008@egenix.com>	<42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com>	<42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com>	<42F11962.2070107@v.loewis.de>
	<42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de>
Message-ID: <42F25E36.5060103@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>I guess this was a misunderstanding on my part: VA doesn't offer
>>their commercial solution in an ASP-like way. Their product,
>>called SourceForge Enterprise, is a J2EE application which we'd
>>have to install and run. They do mention Subversion as being
>>supported by the Enterprise edition.
> 
> 
> Ah, ok. I don't think I want to operate such a software (and,
> strictly speaking, this is out of the scope of the PEP). I had the
> "pleasure" once of having to maintain a SourceForge installation
> (before SourceForge became closed source), and it was a nightmare
> to operate.

With J2EE I doubt that things got any easier to maintain...
(assuming that you had to run the version of the software which
is used on SF.net).

>>For (more or less) simple things like setting up SVN, I'd agree,
>>but for hosting a complete development system, I have my doubts -
>>things start to get rather complicated and integration of various
>>different tools tends to be very time consuming.
> 
> 
> I guess Python's development process is very simple then. We use
> mailing lists, CVS, newsgroups, web servers, and bug trackers,
> but these don't have to integrate. Many of these services are
> already on pydotorg, and I propose to add an additional one
> (revision control).
> 
> 
>>Sysadmin tasks like doing backups, emergency recovery, etc. also
>>get more complicated once you have to deal with many different ways
>>of data storage deployed by such tools, e.g. many of them
>>require use of special tools to do hot backups.
> 
> 
> We are doing quite well here. XS4ALL kindly does disk backup for
> us, and, in the specific case of Subversion's fsfs, this is all
> that is needed. For Postgres, we backup to disk, which then
> gets picked up by the disk backup.

Sounds like you have everything under control, which is good :-)

BTW, in one of your replies I read that you had a problem with
how cvs2svn handles trunk, branches and tags. In reality, this
is no problem at all, since Subversion is very good at handling
moves within the repository: you can easily change the repository
layout after the import to whatever layout you see fit - without
losing any of the version history.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


From pje at telecommunity.com  Thu Aug  4 20:37:04 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Aug 2005 14:37:04 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F25E36.5060103@egenix.com>
References: <42F256FC.7050606@v.loewis.de> <42E93940.6080708@v.loewis.de>
	<1122676547.10752.61.camel@geddy.wooz.org>
	<42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de>
	<42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de>
Message-ID: <5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com>

At 08:28 PM 8/4/2005 +0200, M.-A. Lemburg wrote:
>BTW, in one of your replies I read that you had a problem with
>how cvs2svn handles trunk, branches and tags. In reality, this
>is no problem at all, since Subversion is very good at handling
>moves within the repository: you can easily change the repository
>layout after the import to whatever layout you see fit - without
>losing any of the version history.

Yeah, in my use of SVN I find that this is more theoretical than actual for 
certain use cases.  You can see the history of a file including the history 
of any file it was copied from.  However, if you want to try to look at the 
whole layout, you can't easily get to the old locations.  This can be a 
royal pain, whereas at least in CVS you can use viewcvs to show you the 
"attic".  Subversion doesn't have an attic, which makes looking at 
structural history very difficult.

That having been said, I generally like Subversion, I just know that when I 
moved my projects to it I felt it was worth taking extra care to convert 
them in a way that didn't require me to reorganize the repository 
immediately thereafter, because I didn't want a sudden discontinuity, 
beyond which history would be difficult to follow.

Therefore, I'm saying that taking some care with the conversion process to 
get things the way we like them would be a good idea.


From mal at egenix.com  Thu Aug  4 21:29:41 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 04 Aug 2005 21:29:41 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com>
References: <42F256FC.7050606@v.loewis.de> <42E93940.6080708@v.loewis.de>
	<1122676547.10752.61.camel@geddy.wooz.org>
	<42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de>
	<42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de>
	<5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com>
Message-ID: <42F26CA5.6010009@egenix.com>

Phillip J. Eby wrote:
> At 08:28 PM 8/4/2005 +0200, M.-A. Lemburg wrote:
> 
>> BTW, in one of your replies I read that you had a problem with
>> how cvs2svn handles trunk, branches and tags. In reality, this
>> is no problem at all, since Subversion is very good at handling
>> moves within the repository: you can easily change the repository
>> layout after the import to whatever layout you see fit - without
>> losing any of the version history.
> 
> 
> Yeah, in my use of SVN I find that this is more theoretical than actual 
> for certain use cases.  You can see the history of a file including the 
> history of any file it was copied from.  However, if you want to try to 
> look at the whole layout, you can't easily get to the old locations.  
> This can be a royal pain, whereas at least in CVS you can use viewcvs to 
> show you the "attic".  Subversion doesn't have an attic, which makes 
> looking at structural history very difficult.

Hmm, I usually create a tag before doing such changes in our Subversion
repo. This makes it very easy to look at layouts before a restructuring.

And because Subversion doesn't really care whether you do a tag, branch,
or some other form of diverting versions into different namespaces (it's
all just copying data), you can easily create a directory called "attic"
for just this purpose and copy your structural change tags in there :-)

> That having been said, I generally like Subversion, I just know that 
> when I moved my projects to it I felt it was worth taking extra care to 
> convert them in a way that didn't require me to reorganize the 
> repository immediately thereafter, because I didn't want a sudden 
> discontinuity, beyond which history would be difficult to follow.
> 
> Therefore, I'm saying that taking some care with the conversion process 
> to get things the way we like them would be a good idea.

Still very true indeed.

The fact that cvs2svn is written in Python should make this even easier.

-- 
Marc-Andre Lemburg
eGenix.com


From edcjones at comcast.net  Thu Aug  4 21:32:25 2005
From: edcjones at comcast.net (Edward C. Jones)
Date: Thu, 04 Aug 2005 15:32:25 -0400
Subject: [Python-Dev] String exceptions in Python source
Message-ID: <42F26D49.6010408@comcast.net>

/usr/local/src/Python-2.4.1/Lib/SimpleXMLRPCServer.py:
     raise 'bad method'
/usr/local/src/Python-2.4.1/Demo/classes/bitvec.py:
     raise 'FATAL', '(param, l) = %r' % ((param, l),)
/usr/local/src/Python-2.4.1/Lib/plat-mac/FrameWork.py:
     raise 'Unsupported in MachoPython'
/usr/local/src/Python-2.4.1/Lib/plat-mac/FrameWork.py:
     raise 'Can only delete last item of a menu'
/usr/local/src/Python-2.4.1/Lib/plat-mac/MiniAEFrame.py:
     raise 'Cannot happen: AE callback without handler', (_class, _type)
/usr/local/src/Python-2.4.1/Lib/plat-mac/PixMapWrapper.py:
     raise 'UseErr', "don't assign to .baseAddr -- assign to .data instead"
/usr/local/src/Python-2.4.1/Lib/plat-mac/argvemulator.py:
     raise 'Cannot happen: AE callback without handler', (_class, _type)
/usr/local/src/Python-2.4.1/Mac/Modules/waste/wastescan.py:
     raise 'Error: not found: %s', WASTEDIR
/usr/local/src/Python-2.4.1/Mac/Tools/IDE/PyDebugger.py:
     raise 'spam'  (3 times)
/usr/local/src/Python-2.4.1/Mac/Tools/macfreeze/macfreeze.py:
     raise 'unknown gentype', gentype
/usr/local/src/Python-2.4.1/Mac/Tools/macfreeze/macfreezegui.py:
     raise 'Error in gentype', gentype

From tanzer at swing.co.at  Fri Aug  5 10:12:25 2005
From: tanzer at swing.co.at (tanzer@swing.co.at)
Date: Fri, 05 Aug 2005 10:12:25 +0200
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: Your message of "Wed, 03 Aug 2005 18:26:12 EDT."
	<C3747BD3-D41B-49DC-AD43-FC24A4036E1C@fuhm.net> 
Message-ID: <E1E0xJZ-00067B-Pf@swing.co.at>

James Y Knight <foom at fuhm.net> wrote: 

> > OK, I'm changing my mind again about the names again.
> >
> > Exception as the root and StandardError can stay; the only new
> > proposal would then be to make bare 'except:' call StandardError.
> 
> I don't see how that can work. Any solution that is expected to  
> result in a usable hierarchy this century must preserve "Exception"  
> as the object that user exceptions should derive from (and therefore  
> that users should generally catch, as well). There is way too much  
> momentum behind that to change it.

Well, in the last few years I always derived my own exceptions from
StandardError and used `except StandardError` instead of `except
Exception`.

And I'd love to get rid of the

    except KeyboardInterrupt:
        raise

clause I currently have to write before any `except
StandardError`. 
-- 
Christian Tanzer                                    http://www.c-tanzer.at/


From dooms at info.ucl.ac.be  Fri Aug  5 11:18:47 2005
From: dooms at info.ucl.ac.be (=?ISO-8859-1?Q?Gr=E9goire_Dooms?=)
Date: Fri, 05 Aug 2005 11:18:47 +0200
Subject: [Python-Dev] [ python-Patches-790710 ] breakpoint command lists in
	pdb
Message-ID: <42F32EF7.6050208@info.ucl.ac.be>

Hello,

This patch is about to celebrate its second birthday  :-)

https://sourceforge.net/tracker/?func=detail&atid=305470&aid=790710&group_id=5470

It seems from the comments that the feature is nice but the 
implementation was not OK.
I redid the implementation according to the comments.

What should I do to get it reviewed further? (Perhaps just this:
posting to python-dev :-)

Best,
--
Grégoire


From tjreedy at udel.edu  Fri Aug  5 15:50:52 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 5 Aug 2005 09:50:52 -0400
Subject: [Python-Dev] [ python-Patches-790710 ] breakpoint command lists
	in pdb
References: <42F32EF7.6050208@info.ucl.ac.be>
Message-ID: <dcvqrt$9qn$1@sea.gmane.org>


"Grégoire Dooms" <dooms at info.ucl.ac.be> wrote in message 
news:42F32EF7.6050208 at info.ucl.ac.be...
>This patch is about to celebrate its second birthday  :-)
>What should I do to get it reviewed further ?

The guaranteed-by-a-couple-of-developers way is to review 5 other patches, 
post a summary here, and name this as the one you want reviewed in 
exchange.

TJR


From gvanrossum at gmail.com  Fri Aug  5 17:53:00 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 5 Aug 2005 08:53:00 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <E1E0xJZ-00067B-Pf@swing.co.at>
References: <C3747BD3-D41B-49DC-AD43-FC24A4036E1C@fuhm.net>
	<E1E0xJZ-00067B-Pf@swing.co.at>
Message-ID: <ca471dc2050805085344867d9c@mail.gmail.com>

One more thing. Is renaming NameError to NamespaceError really worth
it? I'd say that NameError is just as clear.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Fri Aug  5 18:34:43 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 05 Aug 2005 12:34:43 -0400
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc2050805085344867d9c@mail.gmail.com>
Message-ID: <001301c599db$9681be60$cbae2c81@oemcomputer>

[ Guido]
> One more thing. Is renaming NameError to NamespaceError really worth
> it? I'd say that NameError is just as clear.

+1 on NameError -- it's clear, easy to type, isn't a gratuitous change,
and doesn't make you think twice about NamespaceError vs NameSpaceError.


Raymond


From raymond.hettinger at verizon.net  Fri Aug  5 20:01:26 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 05 Aug 2005 14:01:26 -0400
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc2050805085344867d9c@mail.gmail.com>
Message-ID: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>

Also strong -1 on renaming RuntimeWarning to SemanticsWarning.

Besides being another unnecessary change (trying to solve a non-existent
problem), this isn't an improvement.  The phrase RuntimeWarning is
sufficiently generic to allow it to be used for a number of purposes.
In contrast, SemanticsWarning is less flexible.  Worse, it is not at all
clear what a Semantics Warning would mean -- it suggests something much
more ominous and complicated than it should.

Another risk from gratuitous changes is the risk of unexpectedly
introducing new problems.  In this case, I find myself remembering the
name as SemanticWarning instead of SemanticsWarning.  These kinds of
changes suck -- they fail to take advantage of 15 years of field testing
and risk introducing hard-to-change usability problems.

Likewise, I am a strong -1 on renaming RuntimeError to UserError.  The
latter name has some virtues but it is also misread as the User doing
something wrong -- that is definitely not the intended meaning.  While
RuntimeError is a less than perfect name, it should not be changed
unless we have both 1) demonstrated that real world problems have
occurred with the current name and 2) that we have a clearly superior
alternative name (a test which UserError fails).  The only virtue to the
name, UserError, is its symmetry with UserWarning.

-0 on renaming ReferenceError to WeakReferenceError.  The new name does
better suggest the cause.  OTOH, the context of the traceback would also
make that perfectly clear.  I'm not aware of a single user having had a
problem with the current name.  In general, we've avoided long names in
favor of the short and pithy -- the theory was that only a mnemonic
is needed.  Before adopting this one, there should be some discussion of
1) whether the current name is really that unclear, 2) whether shorter
alternatives would serve (i.e. WeakrefError), and 3) whether the name
suffers from capitalization ambiguity (WeakreferenceError vs
WeakReferenceError).

Summary:  Most of the proposed name changes are unnecessary, the new
names are not necessarily better, and there is a high risk of
introducing new usability problems.


Raymond


From raymond.hettinger at verizon.net  Fri Aug  5 20:46:41 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 05 Aug 2005 14:46:41 -0400
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
Message-ID: <002301c599ee$09dd6e60$cbae2c81@oemcomputer>

The PEP moves StopIteration out from under Exception so that it cannot
be caught by a bare except or an explicit "except Exception".

IMO, this is a mistake.  In either form, a programmer is stating that
they want to catch and handle just about anything.  There is a
reasonable argument that SystemExit is special and should float to the top,
but that is not the case with StopIteration.

When a user creates their own exception for exiting multiple levels of
loops or frames, should they inherit from ControlFlowException on the
theory that it is no different in intent from StopIteration or should they
inherit from UserError on the theory that it is a custom exception?  Do
you really want routine control-flow exceptions to bypass "except
Exception"?  I suspect that will lead to coding errors that are very
difficult to spot (it sure looks like it should catch a StopIteration).

Be careful with these proposals.  While well intentioned, they have
ramifications that aren't instantly apparent.  Each one needs some deep
thought, user discussion, usability testing, and a darned good reason
for changing what we already have in the field.


Raymond


From bcannon at gmail.com  Fri Aug  5 21:02:46 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 12:02:46 -0700
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <002301c599ee$09dd6e60$cbae2c81@oemcomputer>
References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
	<002301c599ee$09dd6e60$cbae2c81@oemcomputer>
Message-ID: <bbaeab1005080512025ca5a993@mail.gmail.com>

On 8/5/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> The PEP moves StopIteration out from under Exception so that it cannot
> be caught by a bare except or an explicit "except Exception".
> 
> IMO, this is a mistake.  In either form, a programmer is stating that
> they want to catch and handle just about anything.  There is a
> reasonable argument that SystemExit is special and should float to the top,
> but that is not the case with StopIteration.
> 
> When a user creates their own exception for exiting multiple levels of
> loops or frames, should they inherit from ControlFlowException on the
> theory that it is no different in intent from StopIteration or should they
> inherit from UserError on the theory that it is a custom exception?

I say ControlFlowException.  UserError is meant for quick-and-dirty
exception usage and not as a base for user error exceptions.  If the
name is confusing it can be changed to SimpleError.

>  Do
> you really want routine control-flow exceptions to bypass "except
> Exception".

Yes.

>  I suspect that will lead to coding errors that are very
> difficult to spot (it sure looks like it should catch a StopIteration).
> 

I honestly don't think it will.  People who are going to care about
catching StopIteration are writing custom iterators, not something a
newbie will probably be doing and thus should know to be specific
about what exceptions they are catching when they have a specific
thing in mind.

> Be careful with these proposals.  While well intentioned, they have
> ramifications that aren't instantly apparent.  Each one needs some deep
> thought, user discussion, usability testing, and a darned good reason
> for changing what we already have in the field.
> 

Right, which is why this is all in a PEP, so the discussion can happen
and the kinks can be worked out.  As for the testing, that can happen
with __future__ statements, people trying out a patch, or maybe even
some testing branch of Python for possible 3000 features.

-Brett

From bcannon at gmail.com  Fri Aug  5 21:07:08 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 12:07:08 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <42F1F5E9.8050904@gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<42F1F5E9.8050904@gmail.com>
Message-ID: <bbaeab10050805120758b53202@mail.gmail.com>

On 8/4/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote (in the PEP):
> > KeyboardInterrupt inheriting from ControlFlowException
> >
> > KeyboardInterrupt has been a contentious point within this hierarchy. Some
> > view the exception as more control flow being caused by the user. But with
> > its asynchronous cause thanks to the user being able to trigger the
> > exception at any point in code it has a more proper place inheriting from
> > CriticalException. It also keeps the name of the exception from being
> > "CriticalError".
> 
> I think this argues against your own hierarchy, since you _did_ call the
> parent exception CriticalError. By your argument above, that suggests
> KeyboardInterrupt doesn't belong there ;)
> 

=)  Drawback of having names swapped in and out so many times.

> In practice, whether KeyboardInterrupt inherits from ControlFlowException or
> CriticalError shouldn't be a big deal - the important thing is to get it out
> from under Exception and StandardError.
> 

In general, probably.

> At which point, the naming issue is enough to incline me towards christening
> it a ControlFlowException. It gets all the 'oddly named' exceptions into one
> place.
> 

Good point.  I think I would like to see Guido's preference for this
since it feels like it should be under CriticalError.

> Additionally, consider that a hypothetical ThreadExit exception (used to
> terminate a thread semi-gracefully) would also clearly belong under
> ControlFlowException. That is, just because something is asynchronous with
> respect to the currently executing code doesn't necessarily make it an error
> (yes, I know I argued the opposite point the other day. . .).
> 

Another good point.  I am leaning towards moving it now, but I still
would like to hear Guido's preference, if he has one.

-Brett

From pje at telecommunity.com  Fri Aug  5 21:14:48 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 05 Aug 2005 15:14:48 -0400
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python  3.0
In-Reply-To: <002301c599ee$09dd6e60$cbae2c81@oemcomputer>
References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
Message-ID: <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com>

At 02:46 PM 8/5/2005 -0400, Raymond Hettinger wrote:
>The PEP moves StopIteration out from under Exception so that it cannot
>be caught by a bare except or an explicit "except Exception".
>
>IMO, this is a mistake.  In either form, a programmer is stating that
>they want to catch and handle just about anything.  There is a
>reasonable argument that SystemExit is special and should float to the top,
>but that is not the case with StopIteration.

While I agree with most of your -1's on gratuitous changes, this particular 
problem isn't gratuitous.  A StopIteration that reaches a regular exception 
handler is a programming error; allowing StopIteration and other 
control-flow exceptions to be caught other than explicitly *masks* 
programming errors.

Under normal circumstances, StopIteration is caught by for loops or by 
explicit catches of StopIteration.  If it doesn't get caught, *that's* an 
error, and it would be hidden if caught by a generic "except" clause.

So, any code that is "broken" by the move was in fact *already* broken, 
it's just that one bug (a too-general except: clause) is masking the other 
bug (the escaping control-flow exception).
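
A minimal sketch of the masking effect described above, written in
present-day Python 3 syntax rather than 2.4.  The names first_item and
summarize are hypothetical and exist only for illustration:

```python
def first_item(iterable):
    """Return the first item; the unguarded next() is the latent bug."""
    return next(iter(iterable))  # raises StopIteration when empty


def summarize(batches):
    """Hypothetical caller with an over-broad handler."""
    results = []
    for batch in batches:
        try:
            results.append(first_item(batch))
        except Exception:
            # StopIteration is a subclass of Exception today, so the
            # escaped control-flow exception is reported as a generic
            # failure and the bug in first_item() stays hidden.
            results.append("error")
    return results
```

With the current hierarchy, summarize([[1, 2], [], [3]]) quietly yields
"error" for the empty batch.  If StopIteration were outside Exception,
the same call would presumably propagate the StopIteration and expose
the missing guard in first_item().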


>When a user creates their own exception for exiting multiple levels of
>loops or frames, should they inherit from ControlFlowException on the
>theory that it is no different in intent from StopIteration or should they
>inherit from UserError on the theory that it is a custom exception?  Do
>you really want routine control-flow exceptions to bypass "except
>Exception".

Yes, definitely.  A control flow exception that isn't explicitly caught 
somewhere is itself an error, but it's not detectable if it's swallowed by 
an over-eager except: clause.


>   I suspect that will lead to coding errors that are very
>difficult to spot (it sure looks like it should catch a StopIteration).

Actually, no, it makes them *easy* to spot because nothing will catch them, 
and therefore you will be able to see that there's no handler in place.  If 
they *are* caught, that is what leads to difficult-to-spot errors -- i.e. 
the situation we have now.


>Be careful with these proposals.  While well intentioned, they have
>ramifications that aren't instantly apparent.  Each one needs some deep
>thought, user discussion, usability testing, and a darned good reason
>for changing what we already have in the field.

There is a darned good reason for this one; critical exceptions and control 
flow exceptions are pretty much the motivating reason for doing any changes 
to the exception hierarchy at all.


From raymond.hettinger at verizon.net  Fri Aug  5 21:23:13 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 05 Aug 2005 15:23:13 -0400
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <bbaeab1005080512025ca5a993@mail.gmail.com>
Message-ID: <002901c599f3$24bd1000$cbae2c81@oemcomputer>

> > When a user creates their own exception for exiting multiple levels
> > of loops or frames, should they inherit from ControlFlowException on
> > the theory that it is no different in intent from StopIteration or
> > should they inherit from UserError on the theory that it is a custom
> > exception?
> 
> I say ControlFlowException.  UserError is meant for quick-and-dirty
> exception usage and not as a base for user error exceptions.  If the
> name is confusing it can be changed to SimpleError.

Gads.  It sounds like you're just making this up on the fly.  The
process should be disciplined, grounded in use cases, and aimed at
known, real problems with the current hierarchy.

The above question was rhetorical.  It didn't have a right answer.
"Quick-and-dirty" is not a useful category and cannot be reliably placed
in one part of the tree versus another.  A common and basic use case for
quick and dirty exceptions is to break out of nested loops and
functions.  That is control flow as well as quick-and-dirty.
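
The use case mentioned above, leaving nested loops through a throwaway
exception, can be sketched as follows.  Found and find_pair are made-up
names, and deriving from Exception is just one of the choices under
debate, not a recommendation:

```python
class Found(Exception):
    """Hypothetical quick-and-dirty exception used only for control flow."""


def find_pair(grid, target):
    """Return (row, column) of the first match, or None if absent."""
    try:
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                if value == target:
                    # Leave both loops at once.
                    raise Found(r, c)
    except Found as hit:
        return hit.args
    return None
```

The exception is both "quick and dirty" and pure control flow, which is
why it does not fit cleanly under either proposed branch of the tree.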


Raymond


From foom at fuhm.net  Fri Aug  5 21:42:16 2005
From: foom at fuhm.net (James Y Knight)
Date: Fri, 5 Aug 2005 15:42:16 -0400
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <002301c599ee$09dd6e60$cbae2c81@oemcomputer>
References: <002301c599ee$09dd6e60$cbae2c81@oemcomputer>
Message-ID: <214E1AB0-A01E-4C56-A9AB-356A367AAE25@fuhm.net>


On Aug 5, 2005, at 2:46 PM, Raymond Hettinger wrote:

> The PEP moves StopIteration out from under Exception so that it cannot
> be caught by a bare except or an explicit "except Exception".
>
> IMO, this is a mistake.  In either form, a programmer is stating that
> they want to catch and handle just about anything.  There is a
> reasonable argument that SystemExit is special and should float to the
> top, but that is not the case with StopIteration.

I'm glad you brought that up. I had wondered from the beginning why  
ControlFlowException was moved out, but thought there must be a good  
reason I was just not seeing, and promptly forgot about it. So now  
that I've been reminded, can someone explain to me why StopIteration  
and GeneratorExit should not be caught by an "except:" or "except  
Exception:" clause?

James

From raymond.hettinger at verizon.net  Fri Aug  5 21:57:44 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 05 Aug 2005 15:57:44 -0400
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python  3.0
In-Reply-To: <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com>
Message-ID: <002a01c599f7$f29d63e0$cbae2c81@oemcomputer>

[Raymond Hettinger wrote]
> >The PEP moves StopIteration out from under Exception so that it
> >cannot be caught by a bare except or an explicit "except Exception".
> >
> >IMO, this is a mistake.  In either form, a programmer is stating that
> >they want to catch and handle just about anything.  There is a
> >reasonable argument that SystemExit is special and should float to the
> >top, but that is not the case with StopIteration.

[Phillip J. Eby]
> While I agree with most of your -1's on gratuitous changes, this
> particular problem isn't gratuitous.  A StopIteration that reaches a
> regular exception handler is a programming error; allowing
> StopIteration and other control-flow exceptions to be caught other
> than explicitly *masks* programming errors.

Thanks for clearly articulating the rationale behind moving control-flow
exceptions out from under Exception.  The idea is not entirely without
merit but I believe it is both misguided and has serious negative
consequences.

Two things are both true.  Writers of bare excepts sometimes catch more
than they intended and mask errors in their programs.  It is also true
that there are valid use cases for wanting to trap and log all
recoverable errors in long running programs (i.e. not crashing your
whole air traffic control system if a submodule fails to trap a control
flow exception).

I favor the current setup for several reasons:

1.  Writing a bare except is its own warning to a programmer.  It is a
Python basic to be careful with it and to focus attention on whether it
is really intended.  PyChecker flags it and it stands out during code
review.  IOW, it is a documented, well-understood hazard that should
surprise no one.

2.  There is a lesson to be taken from a story in the ACM risks forum
where a massive phone outage was traced to a single line of C code that
ran a "break" to get out of a nested if-statement.  The interesting part
is that this was known to be mission critical code yet the error
survived multiple, independent code reviews.  The problem was that the
code created an optical illusion.  We risk the same thing when an
"except Exception" doesn't catch ControlFlowExceptions.  The
recovery/logging handler will look like it ought to catch everything,
but it won't.  That is a disaster for fault-tolerant coding and for
keeping your sales demo from exploding in front of customers.

3.  As noted above, there ARE valid use cases for bare excepts.  Also
consider that Python rarely documents or even can document all the
recoverable exceptions that can be raised by a method call.  A broad
based handler is sometimes the programmer's only defense.

4.  As noted in another email, user defined control flow exceptions are
a key use case.  I believe that was the typical use for string
exceptions.  The idea is that user defined control flow exceptions
are one of the key means for exiting multiple layers of loops or
function calls.  If you have a bunch of these, it is reasonable to
expect that "except Exception" will catch them.
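
The "optical illusion" in point 2 can be sketched in Python 3 by
deriving a stand-in ControlFlowException from BaseException, which puts
it outside "except Exception" much as the PEP draft proposes.  All class
and function names below are assumptions made for illustration:

```python
class ControlFlowException(BaseException):
    """Stand-in for the PEP's proposed base class (an assumption here)."""


class LeaveLoops(ControlFlowException):
    """Hypothetical user-defined control-flow exception."""


def guarded(task):
    """A recovery handler that looks as if it catches everything."""
    try:
        task()
    except Exception as exc:
        return "logged " + type(exc).__name__
    return "ok"


def fails():
    raise ValueError("bad input")


def escapes():
    raise LeaveLoops()
```

Here guarded(fails) returns "logged ValueError", while guarded(escapes)
lets LeaveLoops pass straight through the handler.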

Summary:  It is a noble thought to save someone from shooting themselves
in the foot with a bare except.  However, bare excepts are clearly a
we-are-all-adults construct.  It has valid current use cases and its
current meaning is likely the intended meaning.  Making the change will
break some existing code and produce dubious benefits.  The change is at
odds with fundamental use cases for user defined control flow
exceptions.  The change introduces the serious risk of a hard-to-spot
optical illusion error where an "except Exception" doesn't catch
exceptions that were intended to be caught.

Nice try, but don't do anything this radical without validating that it
solves significant problems without introducing worse, unintended
effects.  Don't break existing code unless there is a darned good
reason.  Check with the Zope and Twisted people to see if this would
improve their lives or make things worse.  There are user constituencies
that are not being well represented in these discussions.


Raymond


From bcannon at gmail.com  Fri Aug  5 22:00:18 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 13:00:18 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <42F1FBAC.6020404@gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<42F1FBAC.6020404@gmail.com>
Message-ID: <bbaeab100508051300267a32ff@mail.gmail.com>

On 8/4/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Since I forgot to mention it in the last couple of messages - this version
> looks very good. The transition strategy section makes it a lot more meaningful.
> 

Great to hear!

> Brett Cannon wrote (in the PEP):
> > Renamed Exceptions
> >
> > Renamed exceptions will directly subclass the new names. When the old
> > exceptions are instantiated (which occurs when an exception is caught,
> > either by a try statement or by propagating to the top of the execution
> > stack), a PendingDeprecationWarning will be raised.
> 
> Nice trick with figuring out how to raise the deprecation warning :)
> (That line was going to read 'Why not just create an alias?', but then I
> worked out what you were doing, and why you were doing it)
> 

Thanks.

> One case that this doesn't completely address is NameError, as it is the only
> renamed exception which currently has a subclass. In this case, I think that
> during the transition phase, all three of the 'Unbound*Error' exceptions
> should inherit from NameError, with NameError inheriting from NamespaceError.
> 
> I believe it should still be possible to get the deprecation warning to work
> correctly in this case (by not raising the warning when a subclass is
> instantiated).
> 

Ah, didn't think about that issue.  Yeah, as long as you don't call a
superclass' __init__ it should still work.
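
A rough sketch of the transition trick, in Python 3 syntax.  The class
names are stand-ins (LegacyNameError plays the role of the old
NameError), and the type check is one possible way to keep subclasses
quiet; it is not necessarily what the PEP would end up doing:

```python
import warnings


class NamespaceError(Exception):
    """New name from the PEP draft."""


class LegacyNameError(NamespaceError):
    """Stand-in for the old name, which subclasses the new one."""

    def __init__(self, *args):
        # Warn only when the old class itself is instantiated, so that
        # existing subclasses do not trigger the warning.
        if type(self) is LegacyNameError:
            warnings.warn("LegacyNameError has been renamed NamespaceError",
                          PendingDeprecationWarning, stacklevel=2)
        super().__init__(*args)


class LegacyUnboundLocalError(LegacyNameError):
    """Existing subclass that keeps inheriting from the old name."""
```

Instantiating LegacyNameError emits one PendingDeprecationWarning, while
LegacyUnboundLocalError stays silent and is still caught by handlers for
either the old or the new name.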

> In the 'just a typo' category, WeakReferenceError should still be under
> StandardError in the hierarchy.
> 

Yeah, that is an error from trying to add StandardError back in.

-Brett

From bcannon at gmail.com  Fri Aug  5 22:13:21 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 13:13:21 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <D5ACD356-29D0-4F05-B073-0AAF8B26D10C@fuhm.net>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<D5ACD356-29D0-4F05-B073-0AAF8B26D10C@fuhm.net>
Message-ID: <bbaeab1005080513137158a1b9@mail.gmail.com>

On 8/4/05, James Y Knight <foom at fuhm.net> wrote:
> >        +-- NamespaceError (rename of NameError)
> >            +-- UnboundFreeError (new)
> >            +-- UnboundGlobalError (new)
> >            +-- UnboundLocalError
> >
> 
> What are these new exceptions for? Under what circumstances are they
> raised? Why is this necessary or an improvement?
>

Exceptions relating to when a name is not found in a specific
namespace (directly related to bytecode).  So UnboundFreeError is
raised when the interpreter cannot find a variable that is a free
variable.  UnboundLocalError already exists.  UnboundGlobalError is to
prevent NameError from being overloaded.  UnboundFreeError is to
prevent UnboundLocalError from being overloaded.
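
For reference, a sketch of how the three lookups behave in a current
Python 3 interpreter (not 2.4, and not the proposed hierarchy): the
global and free-variable cases both surface as plain NameError, and only
the local case gets its own subclass.  The function names are
hypothetical:

```python
def unbound_global():
    return definitely_not_defined_anywhere  # no such global or builtin


def unbound_local():
    value = spam  # read before the assignment below makes it bound
    spam = 1
    return value


def unbound_free():
    def inner():
        return eggs  # free variable, still unbound in the enclosing scope
    result = inner()
    eggs = 1
    return result


def kind(func):
    """Return the name of the NameError (sub)class raised by func."""
    try:
        func()
    except NameError as exc:
        return type(exc).__name__
    return None
```

Calling kind() on each function shows which exception class the
interpreter picks for that namespace.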
 
> > Renamed Exceptions
> >
> > Renamed exceptions will directly subclass the new names. When the
> > old exceptions are instantiated (which occurs when an exception is
> > caught, either by a try statement or by propagating to the top of
> > the execution stack), a PendingDeprecationWarning will be raised.
> >
> > This should properly preserve backwards-compatibility as old usage
> > won't change and the new names can be used to also catch exceptions
> > using the old name. The warning of the deprecation is also kept
> > simple.
> 
> This will cause problems when a library raises the exception under
> the new name and an app tries to catch the old name. So the standard
> lib (or any other lib) cannot raise the new names. Because the stdlib
> must raise the old names, people will see the old names, continue
> catching the old names, and the new names will never catch on.
> 

Crap, you're right.  Going to have to think about this more.
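
The asymmetry can be sketched with two placeholder classes (NewError and
OldError are invented names, in Python 3 syntax): when the old name
subclasses the new one, a handler for the old name misses anything
raised under the new name.

```python
class NewError(Exception):
    """Placeholder for a renamed exception's new name."""


class OldError(NewError):
    """Placeholder for the old name, subclassing the new one."""


def is_caught(raised, handler):
    """Report whether `except handler` catches an instance of `raised`."""
    try:
        raise raised("demo")
    except handler:
        return True
    except Exception:
        return False
```

is_caught(OldError, NewError) is true, but is_caught(NewError, OldError)
is false, which is the case of a library raising the new name while an
application still catches the old one.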

> Perhaps it'd work out better to have the new names subclass the old
> names. Then you have to continue catching the old name as long as
> anyone is raising it, but at least you can raise the new name with
> impunity. I expect not much code actually raises ReferenceError or
> NameError besides that internal to python. Thus it would be
> relatively safe to change all code to catch the new names for those
> immediately. Lots of code raises RuntimeError, but I bet not very
> much code explicitly catches it.
> 
> Oh, but if the stdlib starts raising under the new names, that'll
> break any code that checks the exact type of the exception against
> the old name. Boo.
> 
> It'd be better to somehow raise a DeprecationWarning upon access, yet
> still result in the same object. Unfortunately I don't think there's
> any way to do that in python. This lack of ability to deprecate
> module attributes has bit me several times in other projects as well.
> Matt Goodall wrote the hack attached at the end in order to move some
> whole modules around in Nevow. Amazingly it actually seemed to
> work. :) Something like that won't work for __builtins__, of course,
> since that's accessed directly with PyDict_Get.
> 
> All in all I don't really see a real need for these renamings and I
> don't see a way to do them compatibly so I'm -1 to the whole idea of
> renaming exceptions.
> 

Well, the new names can go into 2.x, with the old ones not removed until 3.0.

And there is always a solution.  We do control the implementation so
something as evil as hacking the exception system to do
class-specific checks could work.

> > Removal of Bare except Clauses
> >
> > A SemanticsWarning will be raised for all bare except clauses.
> 
> Does this mean that bare except clauses change meaning to "except
> Exception" immediately? Or (I hope) did you mean that in Py2.5 they
> continue doing as they do now, but print a warning to tell you they
> will be changing in the future?

They would have a warning for a version, and then change.

And this will not necessarily go into 2.5.

-Brett

From bcannon at gmail.com  Fri Aug  5 22:15:14 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 13:15:14 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <ca471dc2050804082335553bfd@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<f6bc9b49050804063739eab48@mail.gmail.com>
	<ca471dc2050804082335553bfd@mail.gmail.com>
Message-ID: <bbaeab100508051315307708a4@mail.gmail.com>

On 8/4/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> In general the PEP looks really good now!
> 

Glad you like it.

> On 8/4/05, Willem Broekema <metawilm at gmail.com> wrote:
> > On 8/4/05, Brett Cannon <bcannon at gmail.com> wrote:
> > > OK, once the cron job comes around and is run,
> > > http://www.python.org/peps/pep-0348.html will not be a 404 but be the
> > > latest version of the PEP.
> >
> > Currently, when the "recursion limit" is reached, a RuntimeError is
> > raised. RuntimeError is in the PEP renamed to UserError. UserError is
> > in the new hierarchy located below StandardError, below Exception.
> >
> > I think that in the new hierarchy this error should be in the same
> > "critical" category as MemoryError. (MemoryError includes general
> > stack overflow.)
> 
> No. Usually, a recursion error is a simple bug in the code, no
> different from a TypeError or NameError etc.
> 
> This does contradict my earlier claim that Python itself doesn't use
> RuntimeError; I think I'd be happier if it remained RuntimeError. (I
> think there are a few more uses of it inside Python itself; I don't
> think it's worth inventing new exceptions for all these.)
> 

OK, I will not propose renaming RuntimeError.

-Brett

From reinhold-birkenfeld-nospam at wolke7.net  Fri Aug  5 22:12:31 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Fri, 05 Aug 2005 22:12:31 +0200
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python  3.0
In-Reply-To: <002a01c599f7$f29d63e0$cbae2c81@oemcomputer>
References: <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com>
	<002a01c599f7$f29d63e0$cbae2c81@oemcomputer>
Message-ID: <dd0h7f$fck$1@sea.gmane.org>

Raymond Hettinger wrote:

> 2.  There is a lesson to be taken from a story in the ACM risks forum
> where a massive phone outage was traced to a single line of C code that
> ran a "break" to get out of a nested if-statement.  The interesting part
> is that this was known to be mission critical code yet the error
> survived multiple, independent code reviews.  The problem was that the
> code created an optical illusion.  We risk the same thing when an
> "except Exception" doesn't catch ControlFlowExceptions.  The
> recovery/logging handler will look like it ought to catch everything,
> but it won't.  That is a disaster for fault-tolerant coding and for
> keeping your sales demo from exploding in front of customers.

I think that ControlFlowException should inherit from Exception, because it is
an exception. As Raymond says, it's hard to spot this when in a hurry.

But looking at the current PEP 348, why not rename BaseException to Exception
and Exception to Error?

That way, you could say "except Error:" instead of most of today's bare "except:"
and it's clear that StopIteration or GeneratorExit won't be caught because they
are not errors.

Reinhold


-- 
Mail address is perfectly valid!


From gvanrossum at gmail.com  Fri Aug  5 22:16:31 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 5 Aug 2005 13:16:31 -0700
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com>
References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
	<002301c599ee$09dd6e60$cbae2c81@oemcomputer>
	<5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com>
Message-ID: <ca471dc2050805131669db968e@mail.gmail.com>

On 8/5/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> While I agree with most of your -1's on gratuitous changes, this particular
> problem isn't gratuitous.  A StopIteration that reaches a regular exception
> handler is a programming error; allowing StopIteration and other
> control-flow exceptions to be caught other than explicitly *masks*
> programming errors.

And your point is? If that was the reasoning behind this PEP, it
should move TypeError, NameError, AttributeError and a whole bunch of
others (even LookupError) out of the StandardError hierarchy too!
Those are all clear symptoms of programming errors and are frequently
masked by bare 'except:'.

The point is not to avoid bare 'except:' from hiding programming
errors. There's no hope to obtain that goal.

The point is to make *legitimate* uses of bare 'except:' easier -- the
typical use case is an application that has some kind of main loop
which uses bare 'except:' to catch gross programming errors in other
parts of the app, or in code received from an imperfect source (like
an end-user script) and recovers by logging the error and continuing.
(I was going to say "or clean up and exit", but that use case is
handled by 'finally:'.)

Those legitimate uses often need to make a special case of
KeyboardInterrupt and SystemExit -- KeyboardInterrupt because it's not
a bug in the code but a request from the user who is *running* the app
(and the appropriate default response is to exit with a stack trace);
SystemExit because it's not a bug but a deliberate attempt to exit the
program -- logging an error would be a mistake.

I think the use cases for moving other exceptions out of the way are
weak; MemoryError and SystemError are exceedingly rare and I've never
felt the need to exclude them; when GeneratorExit or StopIteration
reach the outer level of an app, it's a bug like all the others that
bare 'except:' WANTS to catch.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Fri Aug  5 22:25:45 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 13:25:45 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <ca471dc2050804082335553bfd@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<f6bc9b49050804063739eab48@mail.gmail.com>
	<ca471dc2050804082335553bfd@mail.gmail.com>
Message-ID: <bbaeab1005080513251ec8f8c6@mail.gmail.com>

On 8/4/05, Guido van Rossum <gvanrossum at gmail.com> wrote:

> This does contradict my earlier claim that Python itself doesn't use
> RuntimeError; I think I'd be happier if it remained RuntimeError. (I
> think there are a few more uses of it inside Python itself; I don't
> think it's worth inventing new exceptions for all these.)
> 

I just realized that keeping RuntimeError still does not resolve the
issue that the name kind of sucks for realizing intrinsically that it
is for quick-and-dirty exceptions (or am I the only one who thinks
this?).  Should we toss in a subclass called SimpleError?

-Brett

From gvanrossum at gmail.com  Fri Aug  5 22:28:36 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 5 Aug 2005 13:28:36 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080513251ec8f8c6@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<f6bc9b49050804063739eab48@mail.gmail.com>
	<ca471dc2050804082335553bfd@mail.gmail.com>
	<bbaeab1005080513251ec8f8c6@mail.gmail.com>
Message-ID: <ca471dc205080513281251604d@mail.gmail.com>

On 8/5/05, Brett Cannon <bcannon at gmail.com> wrote:
> On 8/4/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> 
> > This does contradict my earlier claim that Python itself doesn't use
> > RuntimeError; I think I'd be happier if it remained RuntimeError. (I
> > think there are a few more uses of it inside Python itself; I don't
> > think it's worth inventing new exceptions for all these.)
> >
> 
> I just realized that keeping RuntimeError still does not resolve the
> issue that the name kind of sucks for realizing intrinsically that it
> is for quick-and-dirty exceptions (or am I the only one who thinks
> this?).  Should we toss in a subclass called SimpleError?

I don't think so. People should feel free to use whatever pre-existing
exception they like, even Exception.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Fri Aug  5 22:31:52 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 13:31:52 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <001301c599db$9681be60$cbae2c81@oemcomputer>
References: <ca471dc2050805085344867d9c@mail.gmail.com>
	<001301c599db$9681be60$cbae2c81@oemcomputer>
Message-ID: <bbaeab10050805133148f515f5@mail.gmail.com>

On 8/5/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> [ Guido]
> > One more thing. Is renaming NameError to NamespaceError really worth
> > it? I'd say that NameError is just as clear.
> 
> +1 on NameError -- it's clear, easy to type, isn't a gratuitous change,
> and doesn't make you think twice about NamespaceError vs NameSpaceError.
> 

OK, I will remove the name change proposal.

-Brett

From python at discworld.dyndns.org  Fri Aug  5 22:39:04 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Fri, 5 Aug 2005 14:39:04 -0600
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080513251ec8f8c6@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<f6bc9b49050804063739eab48@mail.gmail.com>
	<ca471dc2050804082335553bfd@mail.gmail.com>
	<bbaeab1005080513251ec8f8c6@mail.gmail.com>
Message-ID: <20050805203904.GA30701@discworld.dyndns.org>

Brett Cannon <bcannon at gmail.com> wrote:
> On 8/4/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> 
> I just realized that keeping RuntimeError still does not resolve the
> issue that the name kind of sucks for realizing intrinsically that it
> is for quick-and-dirty exceptions (or am I the only one who thinks
> this?).  Should we toss in a subclass called SimpleError?

Much Python code I've looked at uses ValueError for this purpose.  Would
adding a special exception add much utility?

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python at discworld.dyndns.org>
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

From bcannon at gmail.com  Fri Aug  5 23:05:00 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 14:05:00 -0700
Subject: [Python-Dev] PEP,
	take 2: Exception Reorganization for Python 3.0
In-Reply-To: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
References: <ca471dc2050805085344867d9c@mail.gmail.com>
	<001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
Message-ID: <bbaeab1005080514052e336a71@mail.gmail.com>

On 8/5/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> Also strong -1 on renaming RuntimeWarning to SemanticsWarning.
> 
> Besides being another unnecessary change (trying to solve a non-existent
> problem), this isn't an improvement.  The phrase RuntimeWarning is
> sufficiently generic to allow it to be used for a number of purposes.
> In contrast, SemanticsWarning is less flexible.  Worse, it is not at all
> clear what a Semantics Warning would mean -- it suggests something much
> more ominous and complicated than it should.
> 

But the docs don't say that RuntimeWarning is meant as a generic
warning but for dubious runtime behavior being changed.  If it is
truly meant to be generic (I think of UserWarning for that), then
fine, I can let go of the name change.

But it just took a friend of mine with no exposure to the warning
system to understand what it meant.

> Another risk from gratuitous changes is the risk of unexpectedly
> introducing new problems.  In this case, I find myself remembering the
> name as SemanticWarning instead of SemanticsWarning.  These kind of
> changes suck -- they fail to take advantage of 15 years of field testing
> and risk introducing hard-to-change usability problems.
> 

OK, I can see the typos from that, but I still think RuntimeWarning
and Error, for use as a generic exception, suck as names.

> Likewise, am a strong -1 on renaming RuntimeError to UserError.  The
> latter name has some virtues but it is also misread as the User doing
> something wrong -- that is definitely not the intended meaning.  While
> RuntimeError is a less than perfect name, it should not be changed
> unless we have both 1) demonstrated that real world problems have
> occurred with the current name and 2) that we have a clearly superior
> alternative name (a test which UserError fails).  The only virtue to the
> name, UserError, is its symmetry with UserWarning.
> 

SimpleError?

> -0 on renaming ReferenceError to WeakReferenceError.  The new name does
> better suggest the cause.  OTOH, the context of the traceback would also
> make that perfectly clear.  I'm not aware of a single user having had a
> problem with the current name.  In general, we've avoided long names in
> favor of the short and pithy -- the theory was that only a mnemonic
> is needed.  Before adopting this one, there should be some discussion of
> 1) whether the current name is really that unclear, 2) whether shorter
> alternatives would serve (i.e. WeakrefError), and 3) whether the name
> suffers from capitalization ambiguity (WeakreferenceError vs
> WeakReferenceError).
> 

Well, I didn't know what the exception was for until I read the docs.
Granted this was just from looking at ``import exceptions;
dir(exceptions)``, but why shouldn't the names be that obvious?

And I don't see a capitalization ambiguity; if it was WeakrefError,
sure.  But not when the entire phrase is used.

> Summary:  Most of the proposed name changes are unnecessary, the new
> names are not necessarily better, and there is a high risk of
> introducing new usability problems.
> 

I still think RuntimeError (and RuntimeWarning if that is what it is
meant for) sucks as a name for a generic exception.  I didn't know
that was its use until I read the docs and Guido pointed it out during
the discussion in this thread.

I am willing to compromise with a new exception that inherits
RuntimeError named SimpleError (or the inheritance can be flipped).

-Brett

From bcannon at gmail.com  Fri Aug  5 23:09:30 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 14:09:30 -0700
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <002901c599f3$24bd1000$cbae2c81@oemcomputer>
References: <bbaeab1005080512025ca5a993@mail.gmail.com>
	<002901c599f3$24bd1000$cbae2c81@oemcomputer>
Message-ID: <bbaeab1005080514093651135d@mail.gmail.com>

On 8/5/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > > When a user creates their own exception for exiting multiple levels
> > > of loops or frames, should they inherit from ControlFlowException on
> > > the theory that it is no different in intent from StopIteration or
> > > should they inherit from UserError on the theory that it is a custom
> > > exception?
> >
> > I say ControlFlowException.  UserError is meant for quick-and-dirty
> > exception usage and not as a base for user error exceptions.  If the
> > name is confusing it can be changed to SimpleError.
> 
> Gads.  It sounds like you're just making this up on the fly.  The
> process should be disciplined, grounded in use cases, and aimed at
> known, real problems with the current hierarchy.
> 

It is based on a real use case; my own.  As I said in another email I
just sent, I had no clue that RuntimeError was meant to be used as a
generic exception until Guido pointed it out.

-Brett

From mdehoon at c2b2.columbia.edu  Fri Aug  5 23:18:46 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 5 Aug 2005 17:18:46 -0400
Subject: [Python-Dev] [ python-Patches-790710 ] breakpoint command
	listsinpdb
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC28B@cgcmail.cgc.cpmc.columbia.edu>

> "Grégoire Dooms" <dooms at info.ucl.ac.be> wrote in message 
> news:42F32EF7.6050208 at info.ucl.ac.be...
> >This patch is about to celebrate its second birthday  :-)
> >What should I do to get it reviewed further ?

> The guaranteed-by-a-couple-of-developers way is to review 5 other patches, 
> post a summary here, and name this as the one you want reviewed in 
> exchange.

> TJR

Speaking of the five-patch-review-rule, about two months ago I reviewed five
patches and posted a summary here in order to push patch #1049855. This patch
is still waiting for a verdict (this is also my own fault, since I needed
several iterations to get this patch straightened out; my apologies for
that). Is there anything else I can do for this patch?

--Michiel.

From bcannon at gmail.com  Fri Aug  5 23:20:34 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 14:20:34 -0700
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc2050805131669db968e@mail.gmail.com>
References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>
	<002301c599ee$09dd6e60$cbae2c81@oemcomputer>
	<5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com>
	<ca471dc2050805131669db968e@mail.gmail.com>
Message-ID: <bbaeab100508051420cb59071@mail.gmail.com>

On 8/5/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
[SNIP]
> Those legitimate uses often need to make a special case of
> KeyboardInterrupt and SystemExit -- KeyboardInterrupt because it's not
> a bug in the code but a request from the user who is *running* the app
> (and the appropriate default response is to exit with a stack trace);
> SystemExit because it's not a bug but a deliberate attempt to exit the
> program -- logging an error would be a mistake.
> 
> I think the use cases for moving other exceptions out of the way are
> weak; MemoryError and SystemError are exceedingly rare and I've never
> felt the need to exclude them; when GeneratorExit or StopIteration
> reach the outer level of an app, it's a bug like all the others that
> bare 'except:' WANTS to catch.
> 

So are you saying you would rather ditch all reorganization
suggestions and just have SystemExit and KeyboardInterrupt inherit
directly from BaseException, and keep the bare 'except' change and
required superclass inheritance suggestions?  Would this appease
everyone else?

If this is what people want, fine.  But I am still going to suggest
CriticalError stay since they are not caused by programmer error
directly (I am ignoring C extension module screw-ups that devour
memory).

-Brett

From bcannon at gmail.com  Fri Aug  5 23:21:51 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 14:21:51 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <ca471dc205080513281251604d@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<f6bc9b49050804063739eab48@mail.gmail.com>
	<ca471dc2050804082335553bfd@mail.gmail.com>
	<bbaeab1005080513251ec8f8c6@mail.gmail.com>
	<ca471dc205080513281251604d@mail.gmail.com>
Message-ID: <bbaeab100508051421189d84bc@mail.gmail.com>

On 8/5/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/5/05, Brett Cannon <bcannon at gmail.com> wrote:
> > On 8/4/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> >
> > > This does contradict my earlier claim that Python itself doesn't use
> > > RuntimeError; I think I'd be happier if it remained RuntimeError. (I
> > > think there are a few more uses of it inside Python itself; I don't
> > > think it's worth inventing new exceptions for all these.)
> > >
> >
> > I just realized that keeping RuntimeError still does not resolve the
> > issue that the name kind of sucks for realizing intrinsically that it
> > is for quick-and-dirty exceptions (or am I the only one who thinks
> > this?).  Should we toss in a subclass called SimpleError?
> 
> I don't think so. People should feel free to use whatever pre-existing
> exception they like, even Exception.
> 

Fine, the idea is pulled.

-Brett

From bcannon at gmail.com  Fri Aug  5 23:23:56 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 5 Aug 2005 14:23:56 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <003201c599fb$1e8c9d60$cbae2c81@oemcomputer>
References: <bbaeab1005080513137158a1b9@mail.gmail.com>
	<003201c599fb$1e8c9d60$cbae2c81@oemcomputer>
Message-ID: <bbaeab1005080514233912d994@mail.gmail.com>

On 8/5/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > On 8/4/05, James Y Knight <foom at fuhm.net> wrote:
> > > >        +-- NamespaceError (rename of NameError)
> > > >            +-- UnboundFreeError (new)
> > > >            +-- UnboundGlobalError (new)
> > > >            +-- UnboundLocalError
> > > >
> > >
> > > What are these new exceptions for? Under what circumstances are they
> > > raised? Why is this necessary or an improvement?
> > >
> >
> > Exceptions relating to when a name is not found in a specific
> > namespace (directly related to bytecode).  So UnboundFreeError is
> > raised when the interpreter cannot find a variable that is a free
> > variable.  UnboundLocalError already exists.  UnboundGlobalError is to
> > prevent NameError from being overloaded.  UnboundFreeError is to
> > prevent UnboundLocalError from being overloaded
> 
> Do we have any use cases for making the distinctions?  I have NEVER had
> a reason to write a different handler for the various types of
> NameError.
> 
> Also, everyone knows what a Global is.  Can the same be said for Free?
> I had thought that to be an implementation detail rather than part of the
> language spec.
> 

Perhaps then we should just ditch UnboundLocalError?  If we just make
sure we have good messages to go with the exceptions the reasons for
the exception should be obvious.

-Brett

From raymond.hettinger at verizon.net  Sat Aug  6 01:18:09 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 05 Aug 2005 19:18:09 -0400
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080514233912d994@mail.gmail.com>
Message-ID: <005f01c59a13$f1d8caa0$cbae2c81@oemcomputer>

> > > > >        +-- NamespaceError (rename of NameError)
> > > > >            +-- UnboundFreeError (new)
> > > > >            +-- UnboundGlobalError (new)
> > > > >            +-- UnboundLocalError
> > > > >
> > > >
> > > > > What are these new exceptions for? Under what circumstances are
> > > > > they raised? Why is this necessary or an improvement?

[James Y Knight]
> > > Exceptions relating to when a name is not found in a specific
> > > namespace (directly related to bytecode).  So UnboundFreeError is
> > > raised when the interpreter cannot find a variable that is a free
> > > variable.  UnboundLocalError already exists.  UnboundGlobalError
> > > is to prevent NameError from being overloaded.  UnboundFreeError is
> > > to prevent UnboundLocalError from being overloaded

[Raymond]
> > Do we have any use cases for making the distinctions?  I have NEVER
> > had a reason to write a different handler for the various types of
> > NameError.
> >
> > Also, everyone knows what a Global is.  Can the same be said for
> > Free?  I had thought that to be an implementation detail rather than
> > part of the language spec.

[Brett]
> Perhaps then we should just ditch UnboundLocalError? 

Perhaps the hierarchy should be left unchanged unless there is shown to
be something wrong with it.  "just ditching" something is not a
rationale that warrants a language change.  What problem is being solved
by making additions or deletions to subclasses of NameError?


> If we just make
> sure we have good messages to go with the exceptions the reasons for
> the exception should be obvious.

+1


Raymond


From ncoghlan at gmail.com  Sat Aug  6 11:33:45 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 06 Aug 2005 19:33:45 +1000
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <ca471dc2050805131669db968e@mail.gmail.com>
References: <001b01c599e7$b78d19e0$cbae2c81@oemcomputer>	<002301c599ee$09dd6e60$cbae2c81@oemcomputer>	<5.1.1.6.0.20050805150841.025a4170@mail.telecommunity.com>
	<ca471dc2050805131669db968e@mail.gmail.com>
Message-ID: <42F483F9.2080503@gmail.com>

Guido van Rossum wrote:
> The point is not to avoid bare 'except:' from hiding programming
> errors. There's no hope to obtain that goal.
> 
> The point is to make *legitimate* uses of bare 'except:' easier -- the
> typical use case is an application that has some kind of main loop
> which uses bare 'except:' to catch gross programming errors in other
> parts of the app, or in code received from an imperfect source (like
> an end-user script) and recovers by logging the error and continuing.
> (I was going to say "or clean up and exit", but that use case is
> handled by 'finally:'.)
> 
> Those legitimate uses often need to make a special case of
> KeyboardInterrupt and SystemExit -- KeyboardInterrupt because it's not
> a bug in the code but a request from the user who is *running* the app
> (and the appropriate default response is to exit with a stack trace);
> SystemExit because it's not a bug but a deliberate attempt to exit the
> program -- logging an error would be a mistake.
> 
> I think the use cases for moving other exceptions out of the way are
> weak; MemoryError and SystemError are exceedingly rare and I've never
> felt the need to exclude them; when GeneratorExit or StopIteration
> reach the outer level of an app, it's a bug like all the others that
> bare 'except:' WANTS to catch.

To try to turn this idea into a concrete example, the idea would be to make 
the following code work correctly:

   for job in joblist:
     try:
        job.exec()
     except: # or "except Exception:"
        failed_jobs.append((job, sys.exc_info()))

Currently, this code will make a user swear, as Ctrl-C will cause the program 
to move onto the next job, instead of exiting as you would expect (I have 
found Python scripts not exiting when I press Ctrl-C to be an all-too-common 
problem).

Additionally calling sys.exit() inside a job will fail. This may be deliberate 
(to prevent a job from exiting the whole application), but given only the code 
above, it looks like a bug.

The program will attempt to continue in the face of a MemoryError. This is 
actually reasonable, as memory may have been freed as the stack unwound to the 
level of the job execution loop, or the request that failed may have been for 
a ridiculously large amount of memory.

The program will also attempt to continue in the face of a SystemError. This 
is reasonable too, as SystemError is only used when the VM thinks the current 
operation needs to be aborted due to an internal problem in the VM, but the VM 
itself is still safe to use. If the VM thinks something is seriously wrong 
with the internal data structures, it will kill the process with Py_FatalError 
(to ensure that no further Python code is executed), rather than raise 
SystemError.

As others have pointed out, GeneratorExit and StopIteration should never reach 
the job execution loop - if they do, there's a bug in the job, and they should 
be caught and logged.

That covers the six exceptions that have been proposed to be moved out from 
under "Exception", and, as I see it, only two of them end up making the grade 
- SystemExit and KeyboardInterrupt, for exactly the reasons Guido gives in his 
message above.

This suggests a Py3k exception hierarchy that looks like:

   BaseException
   +-- CriticalException
       +-- SystemExit
       +-- KeyboardInterrupt
   +-- Exception
       +-- GeneratorExit
       +-- (Remainder as for Python 2.4, other than KeyboardInterrupt)

With a transitional 2.x hierarchy that looks like:

   BaseException
   +-- CriticalException
       +-- SystemExit
       +-- KeyboardInterrupt
   +-- Exception
       +-- GeneratorExit
       +-- (Remainder exactly as for Python 2.4)

The reason for the CriticalException parent is that Python 2.x code can be 
made 'correct' by doing:

   try:
       # whatever
   except CriticalException:
       raise
   except: # or 'except Exception'
       # Handle everything non-critical

And, the hypothetical job execution loop above can be updated to:

   for job in joblist:
     try:
        job.exec()
     except CriticalException:
        failed_jobs.append((job, sys.exc_info()))
        job_idx = joblist.index(job)
        skipped_jobs.extend(joblist[job_idx+1:])
        raise
     except: # or "except Exception:"
        failed_jobs.append((job, sys.exc_info()))
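
A minimal runnable sketch of the loop above, assuming a Python version in
which KeyboardInterrupt and SystemExit already sit outside Exception.
CriticalException is only a proposal in this thread, so the tuple CRITICAL
stands in for it; run_jobs and its list parameters are hypothetical names
used for illustration.

```python
import sys

# Stand-in for the proposed CriticalException (an assumption, not a real class).
CRITICAL = (KeyboardInterrupt, SystemExit)


def run_jobs(joblist, failed_jobs, skipped_jobs):
    """Run each callable job; log ordinary failures, re-raise exit requests."""
    for idx, job in enumerate(joblist):
        try:
            job()
        except CRITICAL:
            # Record the interrupted job and everything not yet run, then
            # let the exit request propagate to the caller.
            failed_jobs.append((job, sys.exc_info()))
            skipped_jobs.extend(joblist[idx + 1:])
            raise
        except Exception:
            # Ordinary failure: log it and move on to the next job.
            failed_jobs.append((job, sys.exc_info()))
```

Using enumerate() avoids looking the job up in the list again, which also
keeps the sketch correct if the same job appears twice.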


To tell the truth, if bare except is kept around for Py3k, I would prefer to 
see it catch BaseException rather than Exception. Failing that, I would prefer 
to see it removed. Having it catch something other than the root of the 
exception hierarchy would be just plain confusing.

Moving SystemExit and KeyboardInterrupt is the only change we've considered 
which seems to have a genuine motivating use case. The rest of the changes 
suggested don't seem to be solving an actual problem (or are solving a problem 
that is minor enough to be not worth any backward compatibility pain).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From nas at arctrix.com  Sat Aug  6 12:23:42 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 6 Aug 2005 04:23:42 -0600
Subject: [Python-Dev] PEP: Generalised String Coercion
Message-ID: <20050806102342.GA11309@mems-exchange.org>

The title is perhaps a little too grandiose but it's the best I
could think of.  The change is really not large.  Personally, I
would be happy enough if only %s was changed and the built-in was
not added.  Please comment.

  Neil


PEP: 349
Title: Generalised String Coercion
Version: $Revision: 1.2 $
Last-Modified: $Date: 2005/08/06 04:05:48 $
Author: Neil Schemenauer <nas at arctrix.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 02-Aug-2005
Post-History: 06-Aug-2005
Python-Version: 2.5


Abstract

    This PEP proposes the introduction of a new built-in function,
    text(), that provides a way of generating a string representation
    of an object without forcing the result to be a particular string
    type.  In addition, the behavior of the %s format specifier would be
    changed to call text() on the argument.  These two changes would
    make it easier to write library code that can be used by
    applications that use only the str type and by others that also
    use the unicode type.


Rationale

    Python has had a Unicode string type for some time now but use of
    it is not yet widespread.  There is a large amount of Python code
    that assumes that string data is represented as str instances.
    The long term plan for Python is to phase out the str type and use
    unicode for all string data.  Clearly, a smooth migration path
    must be provided.

    We need to upgrade existing libraries, written for str instances,
    to be made capable of operating in an all-unicode string world.
    We can't change to an all-unicode world until all essential
    libraries are made capable for it.  Upgrading the libraries in one
    shot does not seem feasible.  A more realistic strategy is to
    individually make the libraries capable of operating on unicode
    strings while preserving their current all-str environment
    behaviour.

    First, we need to be able to write code that can accept unicode
    instances without attempting to coerce them to str instances.  Let
    us label such code as Unicode-safe.  Unicode-safe libraries can be
    used in an all-unicode world.

    Second, we need to be able to write code that, when provided only
    str instances, will not create unicode results.  Let us label such
    code as str-stable.  Libraries that are str-stable can be used by
    libraries and applications that are not yet Unicode-safe.
    
    Sometimes it is simple to write code that is both str-stable and
    Unicode-safe.  For example, the following function just works:

        def appendx(s):
            return s + 'x'

    That's not too surprising since the unicode type is designed to
    make the task easier.  The principle is that when str and unicode
    instances meet, the result is a unicode instance.  One notable
    difficulty arises when code requires a string representation of an
    object; an operation traditionally accomplished by using the str()
    built-in function.
    
    Using str() makes the code not Unicode-safe.  Replacing a str()
    call with a unicode() call makes the code not str-stable.  Using a
    string format almost accomplishes the goal but not quite.
    Consider the following code:

        def text(obj):
            return '%s' % obj

    It behaves as desired except if 'obj' is not a basestring instance
    and needs to return a Unicode representation of itself.  In that
    case, the string format will attempt to coerce the result of
    __str__ to a str instance.  Defining a __unicode__ method does not
    help since it will only be called if the left-hand operand is a
    unicode instance.  Using a unicode instance for the left-hand
    operand does not work because the function is no longer str-stable
    (i.e. it will coerce everything to unicode).


Specification

    A Python implementation of the text() built-in follows:

        def text(s):
            """Return a nice string representation of the object.  The
            return value is a basestring instance.
            """
            if isinstance(s, basestring):
                return s
            r = s.__str__()
            if not isinstance(r, basestring):
                raise TypeError('__str__ returned non-string')
            return r
            
    Note that it is currently possible, although not very useful, to
    write __str__ methods that return unicode instances.
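
    The implementation above targets Python 2, where basestring covers
    both str and unicode.  As a rough sketch only, the same idea can be
    tried on a Python without basestring by assuming that the tuple
    (str, bytes) plays the role of basestring.  STRING_TYPES and the
    two demo classes are hypothetical names, not part of this proposal.

```python
# Assumption: (str, bytes) stands in for Python 2's basestring.
STRING_TYPES = (str, bytes)


def text(s):
    """Return a string representation without forcing one string type."""
    if isinstance(s, STRING_TYPES):
        return s
    r = s.__str__()  # call the method directly so its return type is kept
    if not isinstance(r, STRING_TYPES):
        raise TypeError('__str__ returned non-string')
    return r


class Label:
    def __str__(self):
        return 'label'


class RawLabel:
    def __str__(self):
        return b'raw'  # a different string type, passed through untouched
```

    In this sketch text(Label()) yields 'label' while text(RawLabel())
    yields b'raw': whichever string type the object produces is kept
    rather than coerced.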

    The %s format specifier for str objects would be changed to call
    text() on the argument.  Currently it calls str() unless the
    argument is a unicode instance (in which case the object is
    substituted as is and the % operation returns a unicode instance).

    The following function would be added to the C API and would be the
    equivalent of the text() function:

        PyObject *PyObject_Text(PyObject *o);

    A reference implementation is available on Sourceforge [1] as a
    patch.

                
Backwards Compatibility

    The change to the %s format specifier would result in some %
    operations returning a unicode instance rather than raising a
    UnicodeDecodeError exception.  It seems unlikely that the change
    would break currently working code.


Alternative Solutions

    Rather than adding the text() built-in, if PEP 246 were
    implemented then adapt(s, basestring) could be equivalent to
    text(s).  The advantage would be one less built-in function.  The
    problem is that PEP 246 is not implemented.

    Fredrik Lundh has suggested [2] that perhaps a new slot should be
    added (e.g. __text__), that could return any kind of string that's
    compatible with Python's text model.  That seems like an
    attractive idea but many details would still need to be worked
    out.

    Instead of providing the text() built-in, the %s format specifier
    could be changed and a string format could be used instead of
    calling text().  However, it seems like the operation is important
    enough to justify a built-in.

    Instead of providing the text() built-in, the basestring type
    could be changed to provide the same functionality.  That would
    possibly be confusing behaviour for an abstract base type.

    Some people have suggested [3] that an easier migration path would
    be to change the default encoding to be UTF-8.  Code that is not
    Unicode safe would then encode Unicode strings as UTF-8 and
    operate on them as str instances, rather than raising a
    UnicodeDecodeError exception.  Other code would assume that str
    instances were encoded using UTF-8 and decode them if necessary.
    While that solution may work for some applications, it seems
    unsuitable as a general solution.  For example, some applications
    get string data from many different sources and assuming that all
    str instances were encoded using UTF-8 could easily introduce
    subtle bugs.
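
    A minimal sketch of the kind of subtle bug meant here.  It is
    written for Python 3, where bytes and text are separate types,
    rather than for the str/unicode model this PEP discusses, and the
    sample word is an illustrative assumption.

```python
# Text that one source encodes as Latin-1 and another as UTF-8.
word = "caf\u00e9"

# Bytes from the Latin-1 source: assuming UTF-8 fails loudly.
latin1_bytes = word.encode("latin-1")          # b'caf\xe9'
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# The opposite wrong guess is silent: UTF-8 bytes decoded as Latin-1
# raise no error and simply produce the wrong characters.
utf8_bytes = word.encode("utf-8")              # b'caf\xc3\xa9'
mojibake = utf8_bytes.decode("latin-1")
```

    The second case is the dangerous one: nothing raises, so the
    wrong text can travel a long way before anyone notices.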


References

    [1] http://www.python.org/sf/1159501
    [2] http://mail.python.org/pipermail/python-dev/2004-September/048755.html
    [3] http://blog.ianbicking.org/illusive-setdefaultencoding.html


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

From amk at amk.ca  Sat Aug  6 14:10:01 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Sat, 6 Aug 2005 08:10:01 -0400
Subject: [Python-Dev] PEP 8: exception style
Message-ID: <20050806121001.GC16042@rogue.amk.ca>

PEP 8 doesn't express any preference between the 
two forms of raise statements:
raise ValueError, 'blah'
raise ValueError("blah")

I like the second form better, because if the exception arguments are
long or include string formatting, you don't need to use line
continuation characters because of the containing parens.  Grepping
through the library code, the first form is in the majority, used
roughly 60% of the time.
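
As a small illustration of the line-continuation point, here is a sketch
in which a long, formatted message is split across lines inside the
call's parentheses, with no backslashes.  The function name check_port
and the message text are made up for the example.

```python
def check_port(port):
    """Return port unchanged, or raise ValueError if it is out of range."""
    if not 0 < port < 65536:
        # The parentheses of the call let the message span two lines.
        raise ValueError("port %d is out of range; "
                         "expected a value between 1 and 65535" % port)
    return port
```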

Should PEP 8 take a position on this?  If yes, which one?

--amk

From raymond.hettinger at verizon.net  Sat Aug  6 18:12:35 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 06 Aug 2005 12:12:35 -0400
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <42F483F9.2080503@gmail.com>
Message-ID: <001201c59aa1$cb1fd3c0$441dc797@oemcomputer>

[Nick Coghlan]
> As others have pointed out, GeneratorExit and StopIteration should
> never reach the job execution loop - if they do, there's a bug in the
> job, and they should be caught and logged.

Please read my other, detailed post on this (8/5/2005 4:05pm).  It is a
mistake to bypass control flow exceptions like GeneratorExit and
StopIteration.  Those need to remain under Exception.  Focus on your
core use case of eliminating the common idiom:

   try: 
       block()
   except KeyboardInterrupt:
       raise
   except:
       pass    # or some handler/logger

In real code, I've never seen the above idiom used with StopIteration.
Read Guido's note and my note.  There are plenty of valid use cases for
a bare except intending to catch almost everything including programming
errors from NameError to StopIteration.  It is a consenting-adults
construct.  Your proposal breaks a major use case for it (preventing
your sales demos from crashing in front of your customers, writing
fault-tolerant programs, etc.)


Raymond


From fumanchu at amor.org  Sat Aug  6 18:33:05 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Sat, 6 Aug 2005 09:33:05 -0700
Subject: [Python-Dev] PEP 8: exception style
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3772711@exchange.hqamor.amorhq.net>

A.M. Kuchling wrote:
> PEP 8 doesn't express any preference between the 
> two forms of raise statements:
> raise ValueError, 'blah'
> raise ValueError("blah")
> 
> I like the second form better, because if the exception arguments are
> long or include string formatting, you don't need to use line
> continuation characters because of the containing parens.  Grepping
> through the library code, the first form is in the majority, used
> roughly 60% of the time.
> 
> Should PEP 8 take a position on this?  If yes, which one?

I like the second form better, because even intermediate Pythonistas
sometimes make a mistake between:

	raise ValueError, A

and

	raise (ValueError, A)

I'd like to see the first form removed in Python 3k, to help reduce the
ambiguity. But PEP 8 taking a stand on it would be a good start for now.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From gvanrossum at gmail.com  Sat Aug  6 19:10:54 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 6 Aug 2005 10:10:54 -0700
Subject: [Python-Dev] PEP 8: exception style
In-Reply-To: <20050806121001.GC16042@rogue.amk.ca>
References: <20050806121001.GC16042@rogue.amk.ca>
Message-ID: <ca471dc205080610104fb870ac@mail.gmail.com>

On 8/6/05, A.M. Kuchling <amk at amk.ca> wrote:
> PEP 8 doesn't express any preference between the
> two forms of raise statements:
> raise ValueError, 'blah'
> raise ValueError("blah")
> 
> I like the second form better, because if the exception arguments are
> long or include string formatting, you don't need to use line
> continuation characters because of the containing parens.  Grepping
> through the library code, the first form is in the majority, used
> roughly 60% of the time.
> 
> Should PEP 8 take a position on this?  If yes, which one?

Definitely ValueError('blah'). The other form will go away in Python
3000. Please update the PEP.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Sat Aug  6 19:10:09 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 06 Aug 2005 13:10:09 -0400
Subject: [Python-Dev] FW:  PEP 8: exception style
Message-ID: <001601c59aa9$b38deaa0$441dc797@oemcomputer>

> PEP 8 doesn't express any preference between the
> two forms of raise statements:
> raise ValueError, 'blah'
> raise ValueError("blah")
> 
> I like the second form better, because if the exception arguments are
> long or include string formatting, you don't need to use line
> continuation characters because of the containing parens.  Grepping
> through the library code, the first form is in the majority, used
> roughly 60% of the time.
> 
> Should PEP 8 take a position on this?  If yes, which one?

If we had to pick one, I would also choose the second form.  But why
bother inflicting our preference on others?  Both forms are readable, so
we won't gain anything by dictating a style.


Raymond


From tim.peters at gmail.com  Sat Aug  6 19:37:04 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 6 Aug 2005 13:37:04 -0400
Subject: [Python-Dev] FW: PEP 8: exception style
In-Reply-To: <001601c59aa9$b38deaa0$441dc797@oemcomputer>
References: <001601c59aa9$b38deaa0$441dc797@oemcomputer>
Message-ID: <1f7befae0508061037179151c0@mail.gmail.com>

[AMK]
>> PEP 8 doesn't express any preference between the
>> two forms of raise statements:
>> raise ValueError, 'blah'
>> raise ValueError("blah")
>>
>> I like the second form better, because if the exception arguments are
>> long or include string formatting, you don't need to use line
>> continuation characters because of the containing parens.  Grepping
>> through the library code, the first form is in the majority, used
>> roughly 60% of the time.
>>
>> Should PEP 8 take a position on this?  If yes, which one?

[Raymond Hettinger]
> If we had to pick one, I would also choose the second form.  But why
> bother inflicting our preference on others?  Both forms are readable, so
> we won't gain anything by dictating a style.

Ongoing cruft reduction -- TOOWTDI.  The first form was necessary at
Python's start because exceptions were strings, and strings aren't
callable, and there needed to be _some_ way to spell "and here's the
detail associated with the exception".  "raise" grew special syntax to
support that need.  In a Python without string exceptions, that syntax
isn't needed, and becomes (over time) an increasingly obscure way to
invoke an ordinary constructor -- ValueError("blah") does exactly the
same thing in a raise statement as it does in any other context, and
transforming `ValueError, 'blah'` into the former becomes a wart
unique to raise statements.
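
A short sketch of that point: the call expression builds the same kind of
object whether or not it appears in a raise statement.  The variable
names and the helper fail() are invented for the illustration.

```python
# Calling the exception class is an ordinary constructor call.
err = ValueError("blah")
built_outside_raise = (type(err), err.args)

def fail():
    # The identical expression, this time as the operand of raise.
    raise ValueError("blah")

try:
    fail()
except ValueError as caught:
    built_inside_raise = (type(caught), caught.args)
```

Both tuples describe the same thing, a ValueError whose args are
("blah",), so raise needs no syntax of its own to attach the detail.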

From tjreedy at udel.edu  Sat Aug  6 22:28:56 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 6 Aug 2005 16:28:56 -0400
Subject: [Python-Dev] PEP 8: exception style
References: <20050806121001.GC16042@rogue.amk.ca>
	<ca471dc205080610104fb870ac@mail.gmail.com>
Message-ID: <dd36i9$p7t$1@sea.gmane.org>


"Guido van Rossum" <gvanrossum at gmail.com> wrote in message 
news:ca471dc205080610104fb870ac at mail.gmail.com...
> On 8/6/05, A.M. Kuchling <amk at amk.ca> wrote:
>> PEP 8 doesn't express any preference between the
>> two forms of raise statements:
>> raise ValueError, 'blah'
>> raise ValueError("blah")
>>
>> I like the second form better, because if the exception arguments are
>> long or include string formatting, you don't need to use line
>> continuation characters because of the containing parens.  Grepping
>> through the library code, the first form is in the majority, used
>> roughly 60% of the time.
>>
>> Should PEP 8 take a position on this?  If yes, which one?
>
> Definitely ValueError('blah'). The other form will go away in Python
> 3000. Please update the PEP.

Great.  PEP 3000 could also be updated to add the line

The raise Error,'blah' syntax: use raise Error('blah') instead [14]

in the To be removed section after the line on string exceptions  and [14] 
<Guido's post> under references.

Terry J. Reedy


From tjreedy at udel.edu  Sat Aug  6 22:31:21 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 6 Aug 2005 16:31:21 -0400
Subject: [Python-Dev] Generalised String Coercion
References: <20050806102342.GA11309@mems-exchange.org>
Message-ID: <dd36mq$pjr$1@sea.gmane.org>

> PEP: 349
> Title: Generalised String Coercion
...
> Rationale
>    Python has had a Unicode string type for some time now but use of
>    it is not yet widespread.  There is a large amount of Python code
>    that assumes that string data is represented as str instances.
>    The long term plan for Python is to phase out the str type and use
>    unicode for all string data.

This PEP strikes me as premature, as putting the toy wagon before the 
horse, since it is premised on a major change to Python, possibly the most 
disruptive and controversial ever, being a done deal.  However, there is, as 
far as I could find, no PEP on Making Strings be Unicode, let alone a 
discussed, debated, and finalized PEP on the subject.

>   Clearly, a smooth migration path must be provided.

Of course.  But the path depends on the detailed final target, which has 
not, as far as I know, been finalized, and certainly not in the needed 
PEP.  Your proposal might be part of the transition section of such a PEP 
or of a separate migration path PEP.

Terry J. Reedy


From bcannon at gmail.com  Sun Aug  7 01:14:32 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sat, 6 Aug 2005 16:14:32 -0700
Subject: [Python-Dev] PEP 8: exception style
In-Reply-To: <ca471dc205080610104fb870ac@mail.gmail.com>
References: <20050806121001.GC16042@rogue.amk.ca>
	<ca471dc205080610104fb870ac@mail.gmail.com>
Message-ID: <bbaeab100508061614341e0c5c@mail.gmail.com>

On 8/6/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/6/05, A.M. Kuchling <amk at amk.ca> wrote:
> > PEP 8 doesn't express any preference between the
> > two forms of raise statements:
> > raise ValueError, 'blah'
> > raise ValueError("blah")
> >
> > I like the second form better, because if the exception arguments are
> > long or include string formatting, you don't need to use line
> > continuation characters because of the containing parens.  Grepping
> > through the library code, the first form is in the majority, used
> > roughly 60% of the time.
> >
> > Should PEP 8 take a position on this?  If yes, which one?
> 
> Definitely ValueError('blah'). The other form will go away in Python
> 3000. Please update the PEP.
> 

Done.  rev. 1.18 .

-Brett

From ncoghlan at gmail.com  Sun Aug  7 03:52:01 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 07 Aug 2005 11:52:01 +1000
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <001201c59aa1$cb1fd3c0$441dc797@oemcomputer>
References: <001201c59aa1$cb1fd3c0$441dc797@oemcomputer>
Message-ID: <42F56941.401@gmail.com>

Raymond Hettinger wrote:
> Please read my other, detailed post on this (8/5/2005 4:05pm).  It is a 
> mistake to bypass control flow exceptions like GeneratorExit and 
> StopIteration.  Those need to remain under Exception.

This is the paragraph after the one you replied to above:
[Nick Coghlan]
>> That covers the six exceptions that have been proposed to be moved out
>> from under "Exception", and, as I see it, only two of them end up making
>> the grade - SystemExit and KeyboardInterrupt, for exactly the reasons
>> Guido gives in his message above.

The remainder of my message then goes on to describe a hierarchy just as you 
suggest - SystemError, MemoryError, StopIteration and GeneratorExit are all 
still caught by "except Exception:". The only two exceptions which are no 
longer caught by "except Exception:" are KeyboardInterrupt and SystemExit.
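
A rough way to check such a split is to ask the interpreter directly.
The sketch below assumes a Python 3 interpreter, so it reports that
interpreter's hierarchy rather than any draft of the PEP, and it leaves
GeneratorExit out because its placement is still being argued in this
thread.

```python
import builtins

# Exceptions named in this thread, looked up in the running interpreter.
names = ["KeyboardInterrupt", "SystemExit", "StopIteration",
         "MemoryError", "SystemError"]

# True means a plain "except Exception:" would catch it.
caught_by_except_exception = {
    name: issubclass(getattr(builtins, name), Exception) for name in names
}
```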

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From gvanrossum at gmail.com  Sun Aug  7 03:56:39 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 6 Aug 2005 18:56:39 -0700
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <dd36mq$pjr$1@sea.gmane.org>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
Message-ID: <ca471dc20508061856c0cce4f@mail.gmail.com>

[Removed python-list CC]

On 8/6/05, Terry Reedy <tjreedy at udel.edu> wrote:
> > PEP: 349
> > Title: Generalised String Coercion
> ...
> > Rationale
> >    Python has had a Unicode string type for some time now but use of
> >    it is not yet widespread.  There is a large amount of Python code
> >    that assumes that string data is represented as str instances.
> >    The long term plan for Python is to phase out the str type and use
> >    unicode for all string data.
> 
> This PEP strikes me as premature, as putting the toy wagon before the
> horse, since it is premised on a major change to Python, possibly the most
> disruptive and controversial ever, being a done deal.  However, there is, as
> far as I could find, no PEP on Making Strings be Unicode, let alone a
> discussed, debated, and finalized PEP on the subject.

True. OTOH, Jython and IronPython already have this, and it is my
definite plan to make all strings Unicode in Python 3000. The rest
(such as a bytes datatype) is details, as they say. :-)

My first response to the PEP, however, is that instead of a new
built-in function, I'd rather relax the requirement that str() return
an 8-bit string -- after all, int() is allowed to return a long, so
why couldn't str() be allowed to return a Unicode string?

The main problem for a smooth Unicode transition remains I/O, in my
opinion; I'd like to see a PEP describing a way to attach an encoding
to text files, and a way to decide on a default encoding for stdin,
stdout, stderr.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Sun Aug  7 05:06:45 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 06 Aug 2005 23:06:45 -0400
Subject: [Python-Dev] PEP 348: Exception Reorganization for Python 3.0
In-Reply-To: <42F56941.401@gmail.com>
Message-ID: <000601c59afd$0b98e4e0$4a05a044@oemcomputer>

> The remainder of my message then goes on to describe a hierarchy just as
> you suggest - SystemError, MemoryError, StopIteration and GeneratorExit
> are all still caught by "except Exception:". The only two exceptions
> which are no longer caught by "except Exception:" are KeyboardInterrupt
> and SystemExit.

Ah, I was too quick on the draw.  It now appears that you were already
converted :-)  Now, if only the PEP would get updated ...

BTW, why did you exclude MemoryError?


Raymond


From bcannon at gmail.com  Sun Aug  7 06:26:28 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sat, 6 Aug 2005 21:26:28 -0700
Subject: [Python-Dev] Major revision of PEP 348 committed
Message-ID: <bbaeab1005080621266bcc87@mail.gmail.com>

Version 1.5 of PEP 348 (http://www.python.org/peps/pep-0348.html) just
got checked in.  This one is a *big* change compared to the previous
version:

* Renamings removed
* SystemExit and KeyboardInterrupt are the only exceptions *not*
inheriting from Exception
    + CriticalException has been renamed TerminalException so it is
more in line with the idea that the exceptions are meant to terminate
the interpreter, not that they are more critical than other exceptions
* Removed ControlFlowException
    + StopIteration and GeneratorExit inherit from Exception directly
* Added VMError which inherits Exception
    + SystemError and MemoryError subclass VMError
* Removed Unbound(Global|Free)Error
* other stuff I don't remember

This version addresses everyone's worries about
backwards-compatibility or changes that were not substantive enough to
break code.

The things I did on my own without thorough discussion are to remove
ControlFlowException and introduce VMError.  The former seemed
reasonable since catching control flow exceptions as a group is
probably rare and with StopIteration and GeneratorExit not falling
outside of Exception, ControlFlowException lost its usefulness.

VMError was introduced to allow the grouping of MemoryError and
SystemError since they are both exceptions relating to the VM.  The
name can be changed to InterpreterError, but VMError is shorter while
still getting the idea across.  Plus I just like VMError more.  =)

OK, guys, have at it.

-Brett

From reinhold-birkenfeld-nospam at wolke7.net  Sun Aug  7 09:46:31 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sun, 07 Aug 2005 09:46:31 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc20508061856c0cce4f@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
Message-ID: <dd4e8n$e7m$1@sea.gmane.org>

Guido van Rossum wrote:

> The main problem for a smooth Unicode transition remains I/O, in my
> opinion; I'd like to see a PEP describing a way to attach an encoding
> to text files, and a way to decide on a default encoding for stdin,
> stdout, stderr.

FWIW, I've already drafted a patch for the former. It lets you write to
file.encoding and honors this when writing Unicode strings to it.

http://www.python.org/sf/1214889

Reinhold

-- 
Mail address is perfectly valid!


From raymond.hettinger at verizon.net  Sun Aug  7 11:54:28 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 07 Aug 2005 05:54:28 -0400
Subject: [Python-Dev] Major revision of PEP 348 committed
In-Reply-To: <bbaeab1005080621266bcc87@mail.gmail.com>
Message-ID: <000401c59b36$01226de0$e410c797@oemcomputer>

VMError -- This is a new intermediate grouping so it won't break
anything and it does bring together two exceptions relating them by
source.  However, I recommend against introducing this new group.
Besides adding yet another thing to remember, it violates
Flat-Is-Better-Than-Nested (see FIBTN below).  Also, the new group is
short on use cases with MemoryErrors sometimes being recoverable and
SystemErrors generally not.  In the library, only cookielib catches
these and it does so along with KeyboardInterrupt in order to re-raise.
In general, you don't want to introduce a new grouping unless there is
some recurring need to catch that group. 

EOFError -- I recommend leaving this one alone.  IOError is generally
for real errors while EOF occurs in the normal course of reading a file
or filelike source.  The former is hard to recover and the latter is
normal.  The PEP's justification of "Since an EOF comes from I/O it only
makes sense that it be considered an I/O error" is somewhat shallow and
doesn't reflect thought about how those exceptions are actually used.
That information is readily attainable by scanning the standard library
with 57 instances of EOFError and 150 instances of IOError.  There are a
few cases of overlap where an except clause catches both; however, the
two are mostly used independent from one another.  The review of the
library gives a good indication of how much code would be broken by this
change.  Also, see the FIBTN comment below.

AnyDeprecationWarning -- This grouping makes some sense intuitively but
do we have much real code that has had occasion to catch both at the
same time?  If not, then we don't need this.   

FIBTN (flat-is-better-than-nested) -- This bit of Zen carries extra
significance for the exception hierarchy.  The core issue is that
exceptions are NOT inherently tree-structured.  Each may ultimately
carry its own set of meaningful attributes and those tend to not neatly
roll up into parent/subclass relationships without Liskov violations.


Likewise, it is a mistake to introduce nesting as a means of
categorization.  The problem is that many conflicting, though meaningful
groupings are possible.  (i.e. grouped by source (vm, user, data,
system), grouped by recoverability or transience, grouped by
module/container type (dictionary errors, weakref errors, net errors,
warnings module, xml module, email errors), etc.)   

The ONLY useful nestings are those for a cluster of exceptions that are
typically all handled together.  IOW, any new nesting needs to be
justified by a long list of real code examples that currently catch all
those exceptions at the same time.  Ideally, searching for that list
would also turn-up no competing instances where other, orthogonal
groupings are being used.

Vocabulary size -- At one time, python-dev exhibited a strong reluctance
to introduce any new builtins.  No matter how sensible the idea, there
was typically an immediate effort to jam the proposed function into some
other namespace.  It should be remembered that each of PEP 348's
proposed new exception groupings ARE new builtins.  Therefore, the bar
for admission should be relatively high (i.e. I would prefer Fredrik's
join() proposal to any of the above new proposals).   Every new word in
the vocabulary makes the language a little more complex, a little less
likely to fit in your brain, and a little harder to learn.  Nestings
make this more acute since learning the new word also entails
remembering how it fits in the structure (yet another good reason for
FIBTN).

Once again, my advice is to not introduce change unless it is solving a
specific, real problem in existing code.  

The groupings listed above feel like random ideas searching for a
justification rather than the product of an effort to solve known
issues.

If the PEP can't resist the urge to create new intermediate groupings,
then start by grepping through tons of Python code to find out which
exceptions are typically caught on the same line.  That would be a
worthwhile empirical study and may lead to useful insights.
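
One possible shape for such a study, sketched with the ast module instead
of grep so that multi-line except clauses are counted too.  The function
name caught_together and the SAMPLE source are hypothetical; a real
survey would feed it the text of each file in a source tree.

```python
import ast
from collections import Counter

def caught_together(source):
    """Count each group of exception names listed in one except clause."""
    groups = Counter()
    for node in ast.walk(ast.parse(source)):
        # Only clauses of the form "except (A, B, ...):" name a group.
        if isinstance(node, ast.ExceptHandler) and isinstance(node.type, ast.Tuple):
            names = tuple(sorted(
                elt.id for elt in node.type.elts if isinstance(elt, ast.Name)
            ))
            if len(names) > 1:
                groups[names] += 1
    return groups

SAMPLE = '''
try:
    run()
except (KeyboardInterrupt, SystemExit):
    raise
except (IOError, EOFError):
    pass
'''

counts = caught_together(SAMPLE)
```

Summing the counters over many files would show which pairs recur often
enough to deserve a shared parent, and which never appear together.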

Try to avoid reversing the process, staring at the existing tree, and
letting your mind arbitrarily impose patterns on it.  


Raymond


From ncoghlan at gmail.com  Sun Aug  7 14:17:16 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 07 Aug 2005 22:17:16 +1000
Subject: [Python-Dev] Major revision of PEP 348 committed
In-Reply-To: <000401c59b36$01226de0$e410c797@oemcomputer>
References: <000401c59b36$01226de0$e410c797@oemcomputer>
Message-ID: <42F5FBCC.80701@gmail.com>

Raymond Hettinger wrote:
> FIBTN (flat-is-better-than-nested) -- This bit of Zen carries extra
> significance for the exception hierarchy.  The core issue is that
> exceptions are NOT inherently tree-structured.  Each may ultimately
> carry its own set of meaningful attributes and those tend to not neatly
> roll up into parent/subclass relationships without Liskov violations.

I think this is a key point, because a Python except clause makes it easy to 
create an on-the-fly exception grouping, but it is more awkward to get rid of 
inheritance that is incorrect (you have to catch and reraise the ones you 
don't want handled before the real handler).
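
To make the asymmetry concrete, here is a small sketch; the function
names parse and lookup are invented for the example.  The first function
groups two unrelated exceptions with a tuple, while the second has to
catch and re-raise a subclass to keep it away from the handler for its
parent.

```python
def parse(text):
    """Group two unrelated exceptions on the fly with a tuple."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None

def lookup(table, key):
    """Handle LookupError in general, but let KeyError pass through.

    KeyError inherits from LookupError, so it must be caught and
    re-raised before the broader handler gets a chance to swallow it.
    """
    try:
        return table[key]
    except KeyError:
        raise
    except LookupError:
        return None
```

Here lookup([1], 5) returns None because IndexError is handled, whereas
lookup({}, "a") still raises KeyError.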

I think Raymond gives a good suggestion - new groupings should only be 
introduced for exceptions where we have reasonable evidence that they are 
already frequently caught together.

TerminalException is a good example of this. "except (KeyboardInterrupt, 
SystemExit): raise" is something that should be written often - there is a 
definite use case for catching them together. Those two are also examples of 
inappropriate inheritance causing obvious usability problems.

Cheers,
Nick.

P.S. Are there any other hardware control people around to understand what I 
mean when I say that python-dev discussions sometimes remind me of a poorly 
tuned PID loop? Particularly the with statement discussion and this one. . .

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Sun Aug  7 14:24:21 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 07 Aug 2005 22:24:21 +1000
Subject: [Python-Dev] Major revision of PEP 348 committed
In-Reply-To: <bbaeab1005080621266bcc87@mail.gmail.com>
References: <bbaeab1005080621266bcc87@mail.gmail.com>
Message-ID: <42F5FD75.50903@gmail.com>

Brett Cannon wrote:
> * SystemExit and KeyboardInterrupt are the only exceptions *not*
> inheriting from Exception
>     + CriticalException has been renamed TerminalException so it is
> more in line with the idea that the exceptions are meant to terminate
> the interpreter, not that they are more critical than other exceptions

I like TerminalException, although TerminatingException may be less ambiguous. 
("There's nothing wrong with my terminal, you moronic machine!")

> This version addresses everyone's worries about
> backwards-compatibility or changes that were not substantive enough to
> break code.

Well, I think you said from the start that the forces of 
backwards-compatibility would get you eventually ;)

> The things I did on my own without thorough discussion are to remove
> ControlFlowException and introduce VMError.

+1 on the former.
-1 on the latter.

Same reasons as Raymond, basically. These exceptions are builtins, so let's 
not add new ones without a strong use case.

Anyway, this is starting to look pretty good (but then, I thought that a few 
days ago, too).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From martin at v.loewis.de  Sun Aug  7 14:53:21 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 14:53:21 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc20508061856c0cce4f@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
Message-ID: <42F60441.8000007@v.loewis.de>

Guido van Rossum wrote:
> The main problem for a smooth Unicode transition remains I/O, in my
> opinion; I'd like to see a PEP describing a way to attach an encoding
> to text files, and a way to decide on a default encoding for stdin,
> stdout, stderr.

If stdin, stdout and stderr go to a terminal, there already is a
default encoding (actually, there always is a default encoding on
these, as it falls back to the system encoding if it's not a terminal,
or if the terminal's encoding is not supported or cannot be determined).

Regards,
Martin

From martin at v.loewis.de  Sun Aug  7 15:06:27 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 15:06:27 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <dd4e8n$e7m$1@sea.gmane.org>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<dd4e8n$e7m$1@sea.gmane.org>
Message-ID: <42F60753.8030309@v.loewis.de>

Reinhold Birkenfeld wrote:
> FWIW, I've already drafted a patch for the former. It lets you write to
> file.encoding and honors this when writing Unicode strings to it.

I don't like that approach. You shouldn't be allowed to change the
encoding mid-stream (except perhaps under very specific circumstances).

As I see it, the buffer of an encoded file becomes split, at least for
input: there are bytes which have been read and not yet decoded, and
there are characters which have been decoded but not yet consumed.
If you change the encoding mid-stream, you would have to undo decoding
that was already done, resetting the stream to the real "current"
position.
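
The "read but not yet decoded" part of that split can be seen with an
incremental decoder.  This is only a sketch: it uses the incremental
decoder interface of the codecs module in current Python, which is not
what the patch under discussion uses, and the two-byte input is an
arbitrary example.

```python
import codecs

decoder = codecs.getincrementaldecoder("utf-8")()

# U+00E9 is two bytes in UTF-8; feed them one at a time.
first = decoder.decode(b"\xc3")     # nothing can be decoded yet
pending = decoder.getstate()[0]     # the byte read but not yet decoded
second = decoder.decode(b"\xa9")    # now the character completes
```

Swapping the codec between the two calls would strand the pending byte,
which is the state that would have to be undone or replayed.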

For output, the situation is similar: before changing to a new encoding,
or before changing from unicode output to byte output, you have to
flush the codec first: it may be that the codec has buffered some
state which needs to be completely processed first before a new codec
can be applied to the stream.

Another issue is seeking: given the many different kinds of buffers,
seeking becomes fairly complex. Ideally, seeking should apply to
application-level positions, i.e. when you tell the current position,
it should be in terms of data already consumed by the application.
Perhaps seeking in an encoded stream should not be supported at all.

Finally, you also have to consider Universal Newlines: you can apply
them either on the byte stream, or on the character stream. I think
conceptually right would be to do universal newlines on the character
stream.

Regards,
Martin

From mal at egenix.com  Sun Aug  7 15:35:49 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 07 Aug 2005 15:35:49 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc20508061856c0cce4f@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
Message-ID: <42F60E35.9080809@egenix.com>

Guido van Rossum wrote:
> My first response to the PEP, however, is that instead of a new
> built-in function, I'd rather relax the requirement that str() return
> an 8-bit string -- after all, int() is allowed to return a long, so
> why couldn't str() be allowed to return a Unicode string?

The problem here is that strings and Unicode are used in different
ways, whereas integers and longs are very similar. Strings are used
for both arbitrary data and text data, Unicode can only be used
for text data.

The new text() built-in would help make a clear distinction
between "convert this object to a string of bytes" and
"please convert this to a text representation". We need to
start making the separation somewhere and I think this is
a good non-invasive start.

Furthermore, the text() built-in could be used to only
allow 8-bit strings with ASCII content to pass through
and require that all non-ASCII content be returned as
Unicode.

We wouldn't be able to enforce this in str().

I'm +1 on adding text().

I would also like to suggest a new formatting marker '%t'
to have the same semantics as text() - instead of changing
the semantics of %s as Neil suggests in the PEP. Again,
the reason is to make the difference between text and
arbitrary data explicit and visible in the code.

> The main problem for a smooth Unicode transition remains I/O, in my
> opinion; I'd like to see a PEP describing a way to attach an encoding
> to text files, and a way to decide on a default encoding for stdin,
> stdout, stderr.

Hmm, not sure why you need PEPs for this:

Open an encoded file:
---------------------
Use codecs.open() instead of open() or file().

Set the external encoding for stdin, stdout, stderr:
----------------------------------------------------
(also an example for adding encoding support to an
existing file object):

import codecs
import sys

def set_sys_std_encoding(encoding):
    # Load encoding support
    (encode, decode, streamreader, streamwriter) = codecs.lookup(encoding)
    # Wrap using stream writers and readers
    sys.stdin = streamreader(sys.stdin)
    sys.stdout = streamwriter(sys.stdout)
    sys.stderr = streamwriter(sys.stderr)
    # Add .encoding attribute for introspection
    sys.stdin.encoding = encoding
    sys.stdout.encoding = encoding
    sys.stderr.encoding = encoding

set_sys_std_encoding('rot-13')

Example session:
>>> print 'hello'
uryyb
>>> raw_input()
hello
h'hello'
>>> 1/0
Genpronpx (zbfg erprag pnyy ynfg):
  Svyr "<fgqva>", yvar 1, va ?
MrebQvivfvbaReebe: vagrtre qvivfvba be zbqhyb ol mreb

Note that the interactive session bypasses the sys.stdin
redirection, which is why you can still enter Python
commands in ASCII - not sure whether there's a reason
for this, or whether it's just a missing feature.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 07 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From martin at v.loewis.de  Sun Aug  7 15:47:49 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 15:47:49 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F25E36.5060103@egenix.com>
References: <42E93940.6080708@v.loewis.de>	<1122676547.10752.61.camel@geddy.wooz.org>	<42EB5891.6020008@egenix.com>	<42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com>	<42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com>	<42F11962.2070107@v.loewis.de>
	<42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de>
	<42F25E36.5060103@egenix.com>
Message-ID: <42F61105.1070806@v.loewis.de>

M.-A. Lemburg wrote:
> BTW, in one of your replies I read that you had a problem with
> how cvs2svn handles trunk, branches and tags. In reality, this
> is no problem at all, since Subversion is very good at handling
> moves within the repository: you can easily change the repository
> layout after the import to whatever layout you see fit - without
> losing any of the version history.

Yes, however, I recall that some clients have problems with displaying
history across renames (in particular, I believe viewcvs has this
problem); also, it becomes difficult to refer to an old version by
path name, since the old versions had all different path names.

Jim Fulton has suggested a different approach: cvs2svn can create
a dump file, and svnadmin load accepts a parent directory. Then,
no renames are necessary.

Regards,
Martin

From martin at v.loewis.de  Sun Aug  7 15:55:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 15:55:05 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com>
References: <42F256FC.7050606@v.loewis.de> <42E93940.6080708@v.loewis.de>
	<1122676547.10752.61.camel@geddy.wooz.org>
	<42EB5891.6020008@egenix.com> <42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com> <42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com> <42F11962.2070107@v.loewis.de>
	<42F1D72C.8070202@egenix.com> <42F256FC.7050606@v.loewis.de>
	<5.1.1.6.0.20050804143230.025c4138@mail.telecommunity.com>
Message-ID: <42F612B9.105@v.loewis.de>

Phillip J. Eby wrote:
> Yeah, in my use of SVN I find that this is more theoretical than actual
> for certain use cases.  You can see the history of a file including the
> history of any file it was copied from.  However, if you want to try to
> look at the whole layout, you can't easily get to the old locations. 
> This can be a royal pain, whereas at least in CVS you can use viewcvs to
> show you the "attic".  Subversion doesn't have an attic, which makes
> looking at structural history very difficult.

I guess this is a client issue also; in websvn, you can browse an older
revision to see what the structure looked like at that point. If you made
tags, you can also browse the tags through the standard HTTP interface.

I don't know a client, off-hand, which would answer the question
"which files have been moved since tag xyz?".

Regards,
Martin

From martin at v.loewis.de  Sun Aug  7 16:07:41 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 16:07:41 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <200507281956.03788.jeff@taupro.com>
References: <42E93940.6080708@v.loewis.de>	<1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
Message-ID: <42F615AD.7010008@v.loewis.de>

Jeff Rush wrote:
> BTW, re SSH access on python.org, using Apache's SSL support re https would 
> provide as good of security without the risk of giving out shell accounts.  
> SSL would encrypt the link and require a password or permit cert auth 
> instead, same as SSH.  Cert admin needn't be hard if only a single server 
> cert is used, with client passwords, instead of client certs.

That is the currently-proposed setup. However, with the current
subversion clients, you will have to save your password to disk, or type
it in every time. This is the real security risk: if somebody attacks
the client machine, they get access to the python source repository.

Regards,
Martin

From martin at v.loewis.de  Sun Aug  7 16:10:29 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 07 Aug 2005 16:10:29 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <dcc0rg$g9l$1@sea.gmane.org>
References: <42E93940.6080708@v.loewis.de>	<1f7befae050728172161d4a9e8@mail.gmail.com>
	<dcc0rg$g9l$1@sea.gmane.org>
Message-ID: <42F61655.3070101@v.loewis.de>

Fernando Perez wrote:
> I know Joe was in contact with the SVN devs to work on this, so perhaps he's
> using a patched version of cvs2svn, I simply don't know.  But I mention it in
> case it proves useful to the python.org conversion.

Thanks for the pointer. It turns out that I could resolve all my
conversion problems myself (following Jim Fulton's suggestion of
creating dump files). I found that somebody created a patch to support
different structures in cvs2svn directly, but these patches have not
been integrated yet.

Regards,
Martin

From martin at v.loewis.de  Sun Aug  7 16:34:43 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 16:34:43 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
Message-ID: <42F61C03.6050703@v.loewis.de>

I have placed a new version of the PEP on

http://www.python.org/peps/pep-0347.html

Changes to the previous version include:

- add more rationale for using svn (atomic changesets,
  fast tags and branches)

- changed conversion procedure to a single repository, with
  some reorganization. See

  http://www.dcl.hpi.uni-potsdam.de/pysvn/

  My proposal is that the repository is called

  http://svn.python.org/projects

- add discussion section (Nick Bastin's proposal of hosting
  a Perforce repository, single vs. multiple repositories,
  user authentication, admin overhead and alternative hosters)

- require python-cvsroot to be preserved forever.

Please let me know what else I should change in the PEP.

Regards,
Martin

From martin at v.loewis.de  Sun Aug  7 16:39:21 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 16:39:21 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1122918723.9680.33.camel@warna.corp.google.com>
References: <42E93940.6080708@v.loewis.de>	<1f7befae050728172161d4a9e8@mail.gmail.com>	<200507281956.03788.jeff@taupro.com>	<1f7befae05072819142c36e610@mail.gmail.com>	<1122605323.9670.11.camel@geddy.wooz.org>	<1f7befae0507281959abc2a7c@mail.gmail.com>	<1122607673.9665.38.camel@geddy.wooz.org>	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
Message-ID: <42F61D19.6090806@v.loewis.de>

Donovan Baarda wrote:
> Yeah. IMHO the saddest thing about SVN is it doesn't do branch/merge
> properly. All the other cool stuff like renames etc is kinda undone by
> that. For a definition of properly, see;
> 
> http://prcs.sourceforge.net/merge.html

Can you please elaborate? I read the page, and it seems to me that
subversion's merge command works exactly the way described on the
page.

Regards,
Martin

From mal at egenix.com  Sun Aug  7 16:48:13 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 07 Aug 2005 16:48:13 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F61105.1070806@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>	<1122676547.10752.61.camel@geddy.wooz.org>	<42EB5891.6020008@egenix.com>	<42EB5AD1.60703@v.loewis.de>
	<42EF436B.3050308@egenix.com>	<42EFE295.6040906@v.loewis.de>
	<42F11476.9000507@egenix.com>	<42F11962.2070107@v.loewis.de>
	<42F1D72C.8070202@egenix.com>	<42F256FC.7050606@v.loewis.de>
	<42F25E36.5060103@egenix.com> <42F61105.1070806@v.loewis.de>
Message-ID: <42F61F2D.7080604@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>BTW, in one of your replies I read that you had a problem with
>>how cvs2svn handles trunk, branches and tags. In reality, this
>>is no problem at all, since Subversion is very good at handling
>>moves within the repository: you can easily change the repository
>>layout after the import to whatever layout you see fit - without
>>losing any of the version history.
> 
> 
> Yes, however, I recall that some clients have problems with displaying
> history across renames (in particular, I believe viewcvs has this
> problem); also, it becomes difficult to refer to an old version by
> path name, since the old versions had all different path names.

Since I only use trac to view the source code (which doesn't
have this problem), I can't comment on this.

> Jim Fulton has suggested a different approach: cvs2svn can create
> a dump file, and svnadmin load accepts a parent directory. Then,
> no renames are necessary.

Good idea.

-- 
Marc-Andre Lemburg
eGenix.com

From mwh at python.net  Sun Aug  7 20:05:03 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 07 Aug 2005 19:05:03 +0100
Subject: [Python-Dev] PEP 8: exception style
In-Reply-To: <ca471dc205080610104fb870ac@mail.gmail.com> (Guido van Rossum's
	message of "Sat, 6 Aug 2005 10:10:54 -0700")
References: <20050806121001.GC16042@rogue.amk.ca>
	<ca471dc205080610104fb870ac@mail.gmail.com>
Message-ID: <2mwtmxy65s.fsf@starship.python.net>

Guido van Rossum <gvanrossum at gmail.com> writes:

> On 8/6/05, A.M. Kuchling <amk at amk.ca> wrote:
>> PEP 8 doesn't express any preference between the
>> two forms of raise statements:
>> raise ValueError, 'blah'
>> raise ValueError("blah")
>> 
>> I like the second form better, because if the exception arguments are
>> long or include string formatting, you don't need to use line
>> continuation characters because of the containing parens.  Grepping
>> through the library code, the first form is in the majority, used
>> roughly 60% of the time.
>> 
>> Should PEP 8 take a position on this?  If yes, which one?
>
> Definitely ValueError('blah'). The other form will go away in Python
> 3000. Please update the PEP.

How do you then supply a traceback to the raise statement?

Cheers,
mwh

-- 
  please realize that the Common  Lisp community is more than 40
  years old.  collectively, the community has already been where
  every clueless newbie  will be going for the next three years.
  so relax, please.                     -- Erik Naggum, comp.lang.lisp

From gvanrossum at gmail.com  Sun Aug  7 20:15:05 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 7 Aug 2005 11:15:05 -0700
Subject: [Python-Dev] PEP 8: exception style
In-Reply-To: <2mwtmxy65s.fsf@starship.python.net>
References: <20050806121001.GC16042@rogue.amk.ca>
	<ca471dc205080610104fb870ac@mail.gmail.com>
	<2mwtmxy65s.fsf@starship.python.net>
Message-ID: <ca471dc20508071115627971c1@mail.gmail.com>

> How do you then supply a traceback to the raise statement?

raise ValueError, ValueError("blah"), tb

Maybe in Py3K this could become

raise ValueError("bloop"), tb

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Sun Aug  7 20:27:20 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 20:27:20 +0200
Subject: [Python-Dev] [ python-Patches-790710 ] breakpoint
	command	listsinpdb
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE7AC28B@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE7AC28B@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <42F65288.8040901@v.loewis.de>

Michiel De Hoon wrote:
> Speaking of the five-patch-review-rule, about two months ago I reviewed five
> patches and posted a summary here in order to push patch #1049855. This patch
> is still waiting for a verdict (this is also my own fault, since I needed
> several iterations to get this patch straightened out; my apologies for
> that). Is there anything else I can do for this patch?

Sorry, I missed that message. I have now reviewed the patch, but at the
moment, I see little chance that the suggested feature is implementable,
except by using code that is specific to each stdio implementation.

Regards,
Martin

From raymond.hettinger at verizon.net  Sun Aug  7 20:25:17 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 07 Aug 2005 14:25:17 -0400
Subject: [Python-Dev] PEP 8: exception style
In-Reply-To: <ca471dc20508071115627971c1@mail.gmail.com>
Message-ID: <000401c59b7d$6d730940$05fecc97@oemcomputer>

> > How do you then supply a traceback to the raise statement?
> 
> raise ValueError, ValueError("blah"), tb
> 
> Maybe in Py3K this could become
> 
> raise ValueError("bloop"), tb

The instantiation and bindings need to be done in one step without
mixing two syntaxes.  Treat this case the same as everything else:

raise ValueError("blip", traceback=tb)


Raymond


From ilya at bluefir.net  Sun Aug  7 22:38:00 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Sun, 7 Aug 2005 13:38:00 -0700 (PDT)
Subject: [Python-Dev] pdb: should next command be extended?
Message-ID: <Pine.LNX.4.58.0508071312290.695@bagira>


Problem:
  When the code contains list comprehensions (or for that matter any other
looping construct), the only way to get quickly through this code in pdb
is to set a temporary breakpoint on the line after the loop, which is
inconvenient.
There is a SF bug report #1248119 about this behavior.

Solution:

Should pdb's next command accept an optional numeric argument? It would
specify how many actual lines of code (not "line events")
should  be skipped in the current frame before stopping,

i.e. "next 5" would mean stop when
   line>=line_where_next_N_happened+5
is reached.

This would make it easy to step over or out of loops in the debugger.
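
The proposed stopping rule can be sketched as a small predicate. This
is only an illustration of the rule, not pdb code: should_stop and its
parameters are hypothetical names, and treating a return from the frame
as a stop is an assumption borrowed from how the existing "next"
behaves.

```python
def should_stop(current_line, start_line, n, frame_returned=False):
    """Return True when a hypothetical "next n" should stop.

    start_line is the line where "next n" was issued; current_line is the
    line about to execute in the same frame.
    """
    if frame_returned:
        # Leaving the frame always ends the command (assumed behaviour).
        return True
    return current_line >= start_line + n

# "next 5" issued on line 10, with a loop spanning lines 10-12
trace = [10, 11, 12, 10, 11, 12, 15]
stops = [line for line in trace if should_stop(line, 10, 5)]
```

With this trace the loop body is passed over on both iterations and the
only line that satisfies the condition is 15.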

What do you think?

Ilya

From martin at v.loewis.de  Sun Aug  7 23:11:56 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 23:11:56 +0200
Subject: [Python-Dev] [C++-sig]  GCC version compatibility
In-Reply-To: <200507172321.31665.anthony@interlink.com.au>
References: <u8y0jz762.fsf@boost-consulting.com>	<200507171601.23780.anthony@interlink.com.au>	<20050717100609.GB3581@lap200.cdc.informatik.tu-darmstadt.de>
	<200507172321.31665.anthony@interlink.com.au>
Message-ID: <42F6791C.3030602@v.loewis.de>

Anthony Baxter wrote:
> I should probably add that I'm not flagging that I think there's a problem
> here. I'm mostly urging caution - I hate having to cut brown-paper-bag 
> releases <wink>. If possible, can the folks on c++-sig try this patch
> out and put their results in the patch discussion? If you're keen, you 
> could try jumping onto HP's testdrive systems (http://www.testdrive.hp.com/).
> From what I recall, they have a bunch of systems with non-gcc C++ compilers,
> including the DEC^WDigital^Compaq^WHP one on the alphas, and the HP C++ 
> compiler on the HP/UX boxes[1].

I've looked at the patch, and it looks fairly safe, so I committed it.

Regards,
Martin

From martin at v.loewis.de  Sun Aug  7 23:15:08 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 Aug 2005 23:15:08 +0200
Subject: [Python-Dev] pdb: should next command be extended?
In-Reply-To: <Pine.LNX.4.58.0508071312290.695@bagira>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
Message-ID: <42F679DC.6030705@v.loewis.de>

Ilya Sandler wrote:
> Should pdb's next command accept an optional numeric argument? It would
> specify how many actual lines of code (not "line events")
> should  be skipped in the current frame before stopping,
[...]
> What do you think?

That would differ from gdb's "next <n>", which does "next" n times.
It would be confusing if pdb accepted the same command, but it
meant something different. Plus, there is always a chance that
<current line>+n is never reached, which would also be confusing.

So I'm -1 here.

Regards,
Martin

From abo at minkirri.apana.org.au  Mon Aug  8 00:12:36 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Sun, 07 Aug 2005 15:12:36 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F61D19.6090806@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>	<1f7befae050728172161d4a9e8@mail.gmail.com>	<200507281956.03788.jeff@taupro.com>	<1f7befae05072819142c36e610@mail.gmail.com>	<1122605323.9670.11.camel@geddy.wooz.org>	<1f7befae0507281959abc2a7c@mail.gmail.com>	<1122607673.9665.38.camel@geddy.wooz.org>	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<42F61D19.6090806@v.loewis.de>
Message-ID: <42F68754.6090400@minkirri.apana.org.au>

Martin v. Löwis wrote:
> Donovan Baarda wrote:
> 
>>Yeah. IMHO the saddest thing about SVN is it doesn't do branch/merge
>>properly. All the other cool stuff like renames etc is kinda undone by
>>that. For a definition of properly, see;
>>
>>http://prcs.sourceforge.net/merge.html
> 
> 
> Can you please elaborate? I read the page, and it seems to me that
> subversion's merge command works exactly the way described on the
> page.

Maybe it's changed since I last looked at it, but last time I looked SVN
didn't track merge histories. From the svnbook:

"Unfortunately, Subversion is not such a system. Like CVS, Subversion 
1.0 does not yet record any information about merge operations. When you 
commit local modifications, the repository has no idea whether those 
changes came from running svn merge, or from just hand-editing the files."

What this means is SVN has no way of automatically identifying the 
common version. An svn merge requires you to manually identify and 
specify the "last common point" where the branch was created or last 
merged. PRCS automatically finds the common version from the 
branch/merge history, and even remembers the 
merge/replace/nothing/delete decision you make for each file as the 
default to use for future merges.

You can see this in the command line differences. For subversion:

# create and checkout branch my-calc-branch
$ svn copy http://svn.example.com/repos/calc/trunk \
            http://svn.example.com/repos/calc/branches/my-calc-branch \
       -m "Creating a private branch of /calc/trunk."
$ svn checkout http://svn.example.com/repos/calc/branches/my-calc-branch

# merge and commit changes from trunk
$ svn merge -r 341:HEAD http://svn.example.com/repos/calc/trunk
$ svn commit -m "Merged trunk changes to my-calc-branch."

# merge and commit more changes from trunk
$ svn merge -r 345:HEAD http://svn.example.com/repos/calc/trunk
$ svn commit -m "Merged trunk changes to my-calc-branch."

Note that 341 and 345 are "magic" version numbers which correspond to
the trunk version at the time of branch and first merge respectively. It
is up to the user to figure out these versions using either meticulous
use of tags or svn logs.

In PRCS:

# create and checkout branch my-calc-branch
$ prcs checkout calc -r 0
$ prcs checkin -r my-calc-branch -m "Creating my-calc-branch"

# merge and commit changes from trunk
$ prcs merge -r 0
$ prcs checkin -m " merged changes from trunk"

# merge and commit more changes from trunk
$ prcs merge -r 0
$ prcs checkin -m " merged changes from trunk"

Note that "-r 0" means "HEAD of trunk branch", and "-r my-calc-branch"
means "HEAD of my-calc-branch". There is no need to figure out what 
versions of those branches to use as the "changes from" point, because 
PRCS figures it out for you. Not only that, but if you chose to ignore 
changes in certain files during the first merge, the second merge will 
remember that as the default action for the second merge.

--
Donovan Baarda

From ilya at bluefir.net  Mon Aug  8 00:12:20 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Sun, 7 Aug 2005 15:12:20 -0700 (PDT)
Subject: [Python-Dev] pdb: should next command be extended?
In-Reply-To: <42F679DC.6030705@v.loewis.de>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
	<42F679DC.6030705@v.loewis.de>
Message-ID: <Pine.LNX.4.58.0508071435330.695@bagira>


On Sun, 7 Aug 2005, "Martin v. Löwis" wrote:

> Ilya Sandler wrote:
> > Should pdb's next command accept an optional numeric argument? It would
> > specify how many actual lines of code (not "line events")
> > should  be skipped in the current frame before stopping,
> [...]
> > What do you think?
>
> That would differ from gdb's "next <n>", which does "next" n times.
> It would be confusing if pdb accepted the same command, but it
> meant something different.

But as far as I can tell, pdb's next is
already different from gdb's next! gdb's next seems to always go to a
different source line, while pdb's next may stay on the current line.

The problem with "next <n>" meaning "repeat next n times" is that it
seems to be less useful than the original suggestion.

Any alternative suggestions for stepping over list comprehensions and
such? (SF 1248119)

> Plus, there is always a chance that
> <current line>+n is never reached, which would also be confusing.

That should not be a problem, returning from the current frame should be
treated as a stopping condition (similarly to the current "next"
behaviour)...

Ilya


> So I'm -1 here.
>
> Regards,
> Martin
>

From martin at v.loewis.de  Mon Aug  8 00:33:26 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 00:33:26 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F68754.6090400@minkirri.apana.org.au>
References: <42E93940.6080708@v.loewis.de>	<1f7befae050728172161d4a9e8@mail.gmail.com>	<200507281956.03788.jeff@taupro.com>	<1f7befae05072819142c36e610@mail.gmail.com>	<1122605323.9670.11.camel@geddy.wooz.org>	<1f7befae0507281959abc2a7c@mail.gmail.com>	<1122607673.9665.38.camel@geddy.wooz.org>	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>	<1122918723.9680.33.camel@warna.corp.google.com>	<42F61D19.6090806@v.loewis.de>
	<42F68754.6090400@minkirri.apana.org.au>
Message-ID: <42F68C36.4090208@v.loewis.de>

Donovan Baarda wrote:
> What this means is SVN has no way of automatically identifying the 
> common version.

Ah, ok. That's true. It doesn't mean you can't do proper merging
with subversion - it only means that it is harder, as you need to
figure out the revision range that you want to merge.

If this is too painful, you can probably use subversion to store
the relevant information. For example, you could define a custom
property on the directory, last_merge_from_trunk, which you
would always update after you have done a merge operation. Then
you don't have to look through history to find out when you
last merged.
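
A rough sketch of that bookkeeping, assuming a property named
last_merge_from_trunk as above. The helper names are made up, the
example URL and revision numbers are the ones quoted earlier in this
thread, and the functions only build the svn command lines (svn propget,
svn merge -r, svn propset) rather than running them:

```python
PROP = "last_merge_from_trunk"  # custom property name from the text above

def propget_cmd(path="."):
    """Command that reads the revision recorded by the previous merge."""
    return ["svn", "propget", PROP, path]

def merge_cmd(last_merged, trunk_url, path="."):
    """Command that merges everything on trunk since the recorded revision."""
    return ["svn", "merge", "-r", "%d:HEAD" % last_merged, trunk_url, path]

def propset_cmd(new_revision, path="."):
    """Command that records the trunk revision just merged."""
    return ["svn", "propset", PROP, str(new_revision), path]

trunk = "http://svn.example.com/repos/calc/trunk"
commands = [propget_cmd(), merge_cmd(341, trunk), propset_cmd(345)]
```

The property change has to be committed together with the merge result,
otherwise the recorded revision and the branch contents can drift apart.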

Regards,
Martin

From gvanrossum at gmail.com  Mon Aug  8 01:58:42 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 7 Aug 2005 16:58:42 -0700
Subject: [Python-Dev] PEP 8: exception style
In-Reply-To: <000401c59b7d$6d730940$05fecc97@oemcomputer>
References: <ca471dc20508071115627971c1@mail.gmail.com>
	<000401c59b7d$6d730940$05fecc97@oemcomputer>
Message-ID: <ca471dc20508071658193c4a27@mail.gmail.com>

> > Maybe in Py3K this could become
> >
> > raise ValueError("bloop"), tb
> 
> The instantiation and bindings need to be done in one step without
> mixing two syntaxes.  Treat this case the same as everything else:
> 
> raise ValueError("blip", traceback=tb)

That requires PEP 344. I have some vague feeling that the way we build
up the traceback by linking backwards, this may not necessarily work
right. I guess somebody has to try to implement PEP 344 in order to
find out.

(In fact, I think trying to implement PEP 344 would be an *excellent*
way to validate it.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Mon Aug  8 02:03:02 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 7 Aug 2005 17:03:02 -0700
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F60441.8000007@v.loewis.de>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<42F60441.8000007@v.loewis.de>
Message-ID: <ca471dc205080717033ef6510d@mail.gmail.com>

[me]
> > a way to decide on a default encoding for stdin,
> > stdout, stderr.

[Martin]
> If stdin, stdout and stderr go to a terminal, there already is a
> default encoding (actually, there always is a default encoding on
> these, as it falls back to the system encoding if it's not a terminal,
> or if the terminal's encoding is not supported or cannot be determined).

So there is. Wow! I never knew this. How does it work? Can we use this
for writing to files too?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Mon Aug  8 02:07:49 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 7 Aug 2005 17:07:49 -0700
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F60753.8030309@v.loewis.de>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<dd4e8n$e7m$1@sea.gmane.org> <42F60753.8030309@v.loewis.de>
Message-ID: <ca471dc205080717074d77b9d5@mail.gmail.com>

[Reinhold Birkenfeld]
> > FWIW, I've already drafted a patch for the former. It lets you write to
> > file.encoding and honors this when writing Unicode strings to it.

[Martin v L]
> I don't like that approach. You shouldn't be allowed to change the
> encoding mid-stream (except perhaps under very specific circumstances).

Right. IMO the encoding is something you specify when opening the
file, just like buffer size and text mode.

> Another issue is seeking: given the many different kinds of buffers,
> seeking becomes fairly complex. Ideally, seeking should apply to
> application-level positions, i.e. when you tell the current position,
> it should be in terms of data already consumed by the application.
> Perhaps seeking in an encoded stream should not be supported at all.

I'm not sure if it works for all encodings, but if possible I'd like
to extend the seeking semantics on text files: seek positions are byte
counts, and the application should consider them as "magic cookies".

> Finally, you also have to consider Universal Newlines: you can apply
> them either on the byte stream, or on the character stream. I think
> conceptually right would be to do universal newlines on the character
> stream.

Is there any reason not to do Universal Newline processing on *all*
text files? I can't think of a use case where you'd like text file
processing but you want to see the bare \r characters.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Mon Aug  8 02:24:34 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 7 Aug 2005 17:24:34 -0700
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F60E35.9080809@egenix.com>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<42F60E35.9080809@egenix.com>
Message-ID: <ca471dc205080717246d0c4919@mail.gmail.com>

[Guido]
> > My first response to the PEP, however, is that instead of a new
> > built-in function, I'd rather relax the requirement that str() return
> > an 8-bit string -- after all, int() is allowed to return a long, so
> > why couldn't str() be allowed to return a Unicode string?

[MAL]
> The problem here is that strings and Unicode are used in different
> ways, whereas integers and longs are very similar. Strings are used
> for both arbitrary data and text data; Unicode can only be used
> for text data.

Yes, that is the case in Python 2.x. In Python 3.x, I'd like to use a
separate "bytes" array type for non-text and for encoded text data,
just like Java; strings should always be considered text data.

We might be able to get there halfway in Python 2.x: we could
introduce the bytes type now, and provide separate APIs to read and
write them. (In fact, the array module and the f.readinto() method
make this possible today, but it's too clunky so nobody uses it.)
Perhaps a better API would be a new file-open mode ("B"?) to indicate
that a file's read* operations should return bytes instead of strings.
The bytes type could just be a very thin wrapper around array('b').
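
For reference, the array/readinto() combination looks roughly like
this. It is a sketch written for a current Python 3 interpreter (an
assumption; the 2.x spelling differs slightly), with io.BytesIO standing
in for a real file opened in binary mode:

```python
import array
import io

# Stand-in for a file opened in binary mode.
f = io.BytesIO(b"hello world")

# Pre-size a buffer of signed bytes, then let the file fill it in place.
buf = array.array("b", [0] * 5)
count = f.readinto(buf)

data = buf.tobytes()   # the bytes that were read
rest = f.read()        # the remainder of the stream
```

The caller has to size the buffer up front and check the returned count,
which is a large part of why this route feels awkward compared to a
plain read().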

> The new text() built-in would help make a clear distinction
> between "convert this object to a string of bytes" and
> "please convert this to a text representation". We need to
> start making the separation somewhere and I think this is
> a good non-invasive start.

I agree with the latter, but I would prefer that any new APIs we use
use a 'bytes' data type to represent non-text data, rather than having
two different sets of APIs to differentiate between the use of 8-bit
strings as text vs. data -- while we *currently* use 8-bit strings for
both text and data, in Python 3.0 we won't, so then the interim APIs
would have to change again. I'd rather introduce a new data type and
new APIs that work with it.

> Furthermore, the text() built-in could be used to only
> allow 8-bit strings with ASCII content to pass through
> and require that all non-ASCII content be returned as
> Unicode.
> 
> We wouldn't be able to enforce this in str().
> 
> I'm +1 on adding text().

I'm still -1.

> I would also like to suggest a new formatting marker '%t'
> to have the same semantics as text() - instead of changing
> the semantics of %s as Neil suggests in the PEP. Again,
> the reason is to make the difference between text and
> arbitrary data explicit and visible in the code.

Hm. What would be the use case for using %s with binary, non-text data?

> > The main problem for a smooth Unicode transition remains I/O, in my
> > opinion; I'd like to see a PEP describing a way to attach an encoding
> > to text files, and a way to decide on a default encoding for stdin,
> > stdout, stderr.
> 
> Hmm, not sure why you need PEPs for this:

I'd forgotten how far we've come. I'm still unsure how the default
encoding on stdin/stdout works.

But it still needs to be simpler; IMO the built-in open() function
should have an encoding keyword. (But it could return something whose
type is not 'file' -- once again making a distinction between open and
file.) Do these files support universal newlines? IMO they should.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From benji at benjiyork.com  Mon Aug  8 02:51:53 2005
From: benji at benjiyork.com (Benji York)
Date: Sun, 07 Aug 2005 20:51:53 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F68C36.4090208@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>	<1f7befae050728172161d4a9e8@mail.gmail.com>	<200507281956.03788.jeff@taupro.com>	<1f7befae05072819142c36e610@mail.gmail.com>	<1122605323.9670.11.camel@geddy.wooz.org>	<1f7befae0507281959abc2a7c@mail.gmail.com>	<1122607673.9665.38.camel@geddy.wooz.org>	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>	<1122918723.9680.33.camel@warna.corp.google.com>	<42F61D19.6090806@v.loewis.de>	<42F68754.6090400@minkirri.apana.org.au>
	<42F68C36.4090208@v.loewis.de>
Message-ID: <42F6ACA9.6030306@benjiyork.com>

Martin v. Löwis wrote:
> Donovan Baarda wrote:
>>What this means is SVN has no way of automatically identifying the 
>>common version.

> If this is too painful, you can probably use subversion to store
> the relevant information. For example, you could define a custom
> property on the directory

A script named "svnmerge" that does just that is included in the contrib 
directory of the Subversion tar.  We (ZC) have just started using it to 
track two-way merge operations, but I don't have much experience with it 
personally yet.
--
Benji York

From nick.bastin at gmail.com  Mon Aug  8 03:52:47 2005
From: nick.bastin at gmail.com (Nicholas Bastin)
Date: Sun, 7 Aug 2005 21:52:47 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F1AADE.50908@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
Message-ID: <66d0a6e105080718527939aa81@mail.gmail.com>

On 8/4/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Nicholas Bastin wrote:
> > Perforce is a commercial product, but it can be had for free for
> > verified Open Source projects, which Python shouldn't have any problem
> > with.  There are other problems, like you have to renew the agreement
> > every year, but it might be worth considering, given the fact that
> > it's an excellent system.
> 
> So we should consider it because it is an excellent system... I don't
> know what that means, in precise, day-to-day usage terms (i.e. what
> precisely would it do for us that, say, Subversion can't do).

It's a mature product.  I would hope that that would count for
something.  I've had enough corrupted subversion repositories that I'm
not crazy about the thought of using it in a production system.  I
know I'm not the only person with this experience.  Sure, you can keep
backups, and not really lose any work, but we're moving over because
we have uptime and availability problems, so let's not just create them
again.

> >>I think anything but Subversion is ruled out because:
> >>- there is no offer to host that anywhere (for subversion, there is
> >>  already svn.python.org)
> >
> >
> > We could host a Perforce repository just as easily, I would think.
> 
> Interesting offer. I'll add this to the PEP - who is "we" in this
> context?

Uh, the Python community.  Which is currently hosting a subversion
repository, so it doesn't seem like a stretch to imagine that
p4.python.org could exist just as easily.

> >>- there is no support for converting a CVS repository (for subversion,
> >>  there is cvs2svn)
> >
> >
> > I'd put $20 on the fact that cvs2svn will *not* work out of the box
> > for converting the python repository.  Just call it a hunch.
> 
> You could have read the PEP before losing that money :-) It did work
> out of the box.

Pardon me if I don't feel that I'd like to see a system in production
for a few weeks before we declare victory.  The problems with this
kind of conversion can be very subtle, and very painful.  I'm not
saying we shouldn't do this, I'm just saying that we should take an
appropriate measure of how much greener the grass really is on the
other side, and how much work we're willing to put in to make it that
way.

--
Nick

From bcannon at gmail.com  Mon Aug  8 03:57:15 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sun, 7 Aug 2005 18:57:15 -0700
Subject: [Python-Dev] Major revision of PEP 348 committed
In-Reply-To: <000401c59b36$01226de0$e410c797@oemcomputer>
References: <bbaeab1005080621266bcc87@mail.gmail.com>
	<000401c59b36$01226de0$e410c797@oemcomputer>
Message-ID: <bbaeab100508071857196347d8@mail.gmail.com>

On 8/7/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> VMError -- This is a new intermediate grouping so it won't break
> anything and it does bring together two exceptions relating them by
> source.  However, I recommend against introducing this new group.
> Besides adding yet another thing to remember, it violates
> Flat-Is-Better-Than-Nested (see FIBTN below).  Also, the new group is
> short on use cases with MemoryErrors sometimes being recoverable and
> SystemErrors generally not.  In the library, only cookielib catches
> these and it does so along with KeyboardInterrupt in order to re-raise.
> In general, you don't want to introduce a new grouping unless there is
> some recurring need to catch that group.
> 

And Nick didn't like it either.  Unless someone speaks up Monday, you
can consider it removed.

> EOFError -- I recommend leaving this one alone.  IOError is generally
> for real errors while EOF occurs in the normal course of reading a file
> or filelike source.  The former is hard to recover and the latter is
> normal.  The PEP's justification of "Since an EOF comes from I/O it only
> makes sense that it be considered an I/O error" is somewhat shallow and
> doesn't reflect thought about how those exceptions are actually used.
> That information is readily attainable by scanning the standard library
> with 57 instances of EOFError and 150 instances of IOError.  There are a
> few cases of overlap where an except clause catches both; however, the
> two are mostly used independent from one another.  The review of the
> library gives a good indication of how much code would be broken by this
> change.  Also, see the FIBTN comment below.
> 

Basically you are arguing that EOFError is practically not an error
and more of an exception signaling an event, like StopIteration for
file reading.  That makes sense, although it does suggest the name
breaks the naming scheme Guido suggested.  But I am not crazy enough
to try to suggest a name change at this point.  =)
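
A small sketch of the "EOFError as an event" usage described above:
reading pickled objects until the stream runs out. The helper name
load_all() and the sample values are assumptions made for
illustration; pickle.load() raising EOFError on exhausted input is
standard behaviour.

```python
import io
import pickle


def load_all(stream):
    """Return every pickled object in stream, stopping at end of input."""
    items = []
    while True:
        try:
            items.append(pickle.load(stream))
        except EOFError:
            # Reaching the end of the stream is the normal way out of
            # the loop, not a failure that needs recovery.
            break
    return items


buf = io.BytesIO()
for value in (1, "two", [3]):
    pickle.dump(value, buf)
buf.seek(0)
result = load_all(buf)
```

Here the except clause plays the same role that StopIteration plays
for iterators, which is why grouping it with real I/O failures would
change how such loops have to be written.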

> AnyDeprecationWarning -- This grouping makes some sense intuitively but
> do we have much real code that has had occasion to catch both at the
> same time?  If not, then we don't need this.
> 

Well, PendingDeprecationWarning is barely used in Lib/ it seems.  That
would suggest the grouping isn't worth it just because the need to
catch it will be minuscule.  That also kills the argument that it
would simplify warnings filters by cutting down on needing another
registration since the chance of that happening seems to be
microscopic.

> FIBTN (flat-is-better-than-nested) -- This bit of Zen carries extra
> significance for the exception hierarchy.  The core issue is that
> exceptions are NOT inherently tree-structured.  Each may ultimately
> carry its own set of meaningful attributes and those tend to not neatly
> roll-up into a parent/subclass relationships without Liskov violations.
> 
[SNIP]
> 
> Vocabulary size -- At one time, python-dev exhibited a strong reluctance
> to introduce any new builtins.  No matter how sensible the idea, there
> was typically an immediate effort to jam the proposed function into some
> other namespace.  It should be remembered that each of PEP 348's
> proposed new exception groupings ARE new builtins.  Therefore, the bar
> for admission should be relatively high (i.e. I would prefer Fredrik's
> join() proposal to any of the above new proposals).   Every new word in
> the vocabulary makes the language a little more complex, a little less
> likely to fit in your brain, and a little harder to learn.  Nestings
> make this more acute since learning the new word also entails
> remembering how it fits in the structure (yet another good reason for
> FIBTN).
> 

Now those are two arguments I can go with.

OK, so your points make sense.  I will wait until Monday evening after
work to make any changes to give people a chance to argue against
them, but VMError and AnyDeprecationWarning can be considered removed
and EOFError will be moved to inherit from EnvironmentError again.

Luckily you didn't say you hated TerminalException.  =)

-Brett

From bcannon at gmail.com  Mon Aug  8 04:02:30 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sun, 7 Aug 2005 19:02:30 -0700
Subject: [Python-Dev] Major revision of PEP 348 committed
In-Reply-To: <42F5FD75.50903@gmail.com>
References: <bbaeab1005080621266bcc87@mail.gmail.com>
	<42F5FD75.50903@gmail.com>
Message-ID: <bbaeab1005080719027fd5796f@mail.gmail.com>

On 8/7/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote:
> > * SystemExit and KeyboardInterrupt are the only exceptions *not*
> > inheriting from Exception
> >     + CriticalException has been renamed TerminalException so it is
> > more inline with the idea that the exceptions are meant to terminate
> > the interpreter, not that they are more critical than other exceptions
> 
> I like TerminalException, although TerminatingException may be less ambiguous.
> ("There's nothing wrong with my terminal, you moronic machine!")
> 

Maybe.  But the interpreter is not terminating quite yet; state is
still fine since the exceptions have not reached the top of the stack
if you caught it.  But then "terminal" sounds destined to die, which
is not true either since that only occurs if you catch the exceptions;
"terminating" portrays that the plan is the termination but that it is
not definite.

OK, TerminatingException it is.
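
For comparison, current Python 3 also keeps SystemExit and
KeyboardInterrupt outside Exception, directly under BaseException,
although it has no builtin called TerminatingException. The check
below is a minimal sketch; caught_by_except_exception() is a
hypothetical helper name.

```python
def caught_by_except_exception(exc_type):
    """Report whether a plain 'except Exception:' would catch exc_type."""
    return issubclass(exc_type, Exception)


# The two interpreter-terminating exceptions bypass 'except Exception:'.
terminating = [
    exc for exc in (SystemExit, KeyboardInterrupt)
    if not caught_by_except_exception(exc)
]

# An ordinary error such as EOFError is still caught.
ordinary = caught_by_except_exception(EOFError)
```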

> > This version addresses everyone's worries about
> > backwards-compatibility or changes that were not substantive enough to
> > break code.
> 
> Well, I think you said from the start that the forces of
> backwards-compatibility would get you eventually ;)
> 

=)  I should become a pundit for being able to tell what is going to happen.

> > The things I did on my own without thorough discussion are removing
> > ControlFlowException and introducing VMError.
> 
> +1 on the former.
> -1 on the latter.
> 
> Same reasons as Raymond, basically. These exceptions are builtins, so let's
> not add new ones without a strong use case.
> 
> Anyway, this is starting to look pretty good (but then, I thought that a few
> days ago, too).
> 

Yeah, and so did everyone else basically.  While Guido has his "let's
get all excited about a crazy idea, but then scale it back" mentality,
I guess I have the "let's change everything for the better, but then
realize other people actually use this language too".  =)

-Brett

From edcjones at comcast.net  Mon Aug  8 04:06:44 2005
From: edcjones at comcast.net (Edward C. Jones)
Date: Sun, 07 Aug 2005 22:06:44 -0400
Subject: [Python-Dev] PyTuple_Pack added references undocumented
Message-ID: <42F6BE34.3040104@comcast.net>

According to the source code, PyTuple_Pack returns a new reference (it 
calls PyTuple_New). It also Py_INCREF's all the objects in the new 
tuple. Is this unusual behavior? None of these added references are 
documented in the API Reference Manual.

From bcannon at gmail.com  Mon Aug  8 04:10:47 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sun, 7 Aug 2005 19:10:47 -0700
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <42F61C03.6050703@v.loewis.de>
References: <42F61C03.6050703@v.loewis.de>
Message-ID: <bbaeab100508071910622202e7@mail.gmail.com>

On 8/7/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> I have placed a new version of the PEP on
> 
> http://www.python.org/peps/pep-0347.html
> 
> Changes to the previous version include:
> 
> - add more rationale for using svn (atomic changesets,
>   fast tags and branches)
> 
> - changed conversion procedure to a single repository, with
>   some reorganization. See
> 
>   http://www.dcl.hpi.uni-potsdam.de/pysvn/
> 

What is going in under python/ ?  If it is what is currently
/dist/src/, then great and the renaming of the repository works.  But
if that is what src/ is going to be used for, then what is python/ for
and it would be nice to have a repository name that more directly
reflects that it is the Python source tree.

And I assume you are going to list the directory structure in the PEP
at some point.

-Brett

From gvanrossum at gmail.com  Mon Aug  8 04:14:43 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 7 Aug 2005 19:14:43 -0700
Subject: [Python-Dev] PyTuple_Pack added references undocumented
In-Reply-To: <42F6BE34.3040104@comcast.net>
References: <42F6BE34.3040104@comcast.net>
Message-ID: <ca471dc20508071914e5f49f0@mail.gmail.com>

On 8/7/05, Edward C. Jones <edcjones at comcast.net> wrote:
> According to the source code, PyTuple_Pack returns a new reference (it
> calls PyTuple_New). It also Py_INCREF's all the objects in the new
> tuple. Is this unusual behavior? None of these added references are
> documented in the API Reference Manual.

This seems the only sensible behavior given what it does.

I think the INCREFs don't need to be documented because you don't have
to worry about them -- they follow the normal pattern of reference
counts: if you owned an object before passing it to PyTuple_Pack(),
you still own it afterwards.

The docs say that it returns a new object, so that's in order too.

It's not listed in refcounts.dat; that seems an omission (or perhaps
the function's varargs signature doesn't fit in the pattern?).
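
The ownership rule can be observed from Python code, assuming CPython
(sys.getrefcount() is implementation-specific). This sketch uses an
ordinary tuple display rather than calling PyTuple_Pack() directly, so
it only illustrates the general container behaviour; the function name
is hypothetical.

```python
import sys


def refcounts_around_tuple():
    """Return the refcount of an object before, during and after
    it is held by a tuple."""
    obj = object()
    before = sys.getrefcount(obj)
    container = (obj,)          # the tuple takes its own reference
    during = sys.getrefcount(obj)
    del container               # the tuple releases that reference
    after = sys.getrefcount(obj)
    return before, during, after


before, during, after = refcounts_around_tuple()
```

The caller's own reference is untouched throughout: the count rises by
one while the tuple exists and returns to its old value afterwards.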

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Mon Aug  8 04:49:41 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 07 Aug 2005 22:49:41 -0400
Subject: [Python-Dev] PyTuple_Pack added references undocumented
In-Reply-To: <42F6BE34.3040104@comcast.net>
Message-ID: <000301c59bc3$d3ff9e80$05fecc97@oemcomputer>

> According to the source code, PyTuple_Pack returns a new reference (it
> calls PyTuple_New). It also Py_INCREF's all the objects in the new
> tuple. Is this unusual behavior? 

No.  That is how containers work.  Look at Py_BuildValue() for
comparison.  


> None of these added references are documented in the API Reference
> Manual.

The docs seem clear to me.  If the docs don't meet your needs, submit a
patch.


From nas at arctrix.com  Mon Aug  8 05:47:56 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Aug 2005 21:47:56 -0600
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc20508061856c0cce4f@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
Message-ID: <20050808034756.GA16756@mems-exchange.org>

On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote:
> My first response to the PEP, however, is that instead of a new
> built-in function, I'd rather relax the requirement that str() return
> an 8-bit string

Do you have any thoughts on what the C API would be?  It seems to me
that PyObject_Str cannot start returning a unicode object without a
lot of code breakage.  I suppose we could introduce a function
called something like PyObject_String.

  Neil

From pje at telecommunity.com  Mon Aug  8 05:49:25 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 07 Aug 2005 23:49:25 -0400
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc205080717246d0c4919@mail.gmail.com>
References: <42F60E35.9080809@egenix.com>
	<20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<42F60E35.9080809@egenix.com>
Message-ID: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>

At 05:24 PM 8/7/2005 -0700, Guido van Rossum wrote:
>Hm. What would be the use case for using %s with binary, non-text data?

Well, I could see using it to write things like netstrings, 
i.e.  sock.send("%d:%s," % (len(data),data)) seems like the One Obvious Way 
to write a netstring in today's Python at least.  But perhaps there's a 
subtlety I've missed here.
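
A sketch of the same netstring idea with an explicit byte string,
assuming Python 3.5 or later, where bytes objects support %
formatting. The function name netstring() is chosen here for
illustration only.

```python
def netstring(data: bytes) -> bytes:
    """Encode data as a netstring: <decimal length>:<payload>,"""
    # %d is rendered as ASCII digits; %s inserts the bytes unchanged.
    return b"%d:%s," % (len(data), data)


encoded = netstring(b"hello world!")
empty = netstring(b"")
```

The payload is never decoded or encoded, so arbitrary binary data
passes through untouched.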


From barry at python.org  Mon Aug  8 05:51:49 2005
From: barry at python.org (Barry Warsaw)
Date: Sun, 07 Aug 2005 23:51:49 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <66d0a6e105080718527939aa81@mail.gmail.com>
References: <42E93940.6080708@v.loewis.de>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
Message-ID: <1123473109.20293.35.camel@geddy.wooz.org>

On Sun, 2005-08-07 at 21:52, Nicholas Bastin wrote:

> I've had enough corrupted subversion repositories that I'm
> not crazy about the thought of using it in a production system.  I
> know I'm not the only person with this experience.  Sure, you can keep
> backups, and not really lose any work, but we're moving over because
> we have uptime and availability problems, so let's not just create them
> again.

Has anyone experienced svn corruptions with the fsfs backend?  I
haven't, across quite a few repositories.

> Uh, the Python community.  Which is currently hosting a subversion
> repository, so it doesn't seem like a stretch to imagine that
> p4.python.org could exist just as easily.

Unfortunately, I don't think "we" (meaning specifically the collective
python.org admins) have much if any operational experience with
Perforce.  We do with Subversion though and that's a big plus.  If "we"
were to host a Perforce repository, we'd need significant commitments
from several somebodies to get things set up, keep it running, and help
socialize long-term institutional knowledge amongst the other admins.  

We'd also have to teach the current crop of developers how to use the
client tools effectively.  I think it's fairly simple to teach a CVS
user how to use Subversion, but have no idea if translating CVS
experience to Perforce is as straightforward.

-Barry


From fdrake at acm.org  Mon Aug  8 07:07:51 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 8 Aug 2005 01:07:51 -0400
Subject: [Python-Dev] PyTuple_Pack added references undocumented
In-Reply-To: <ca471dc20508071914e5f49f0@mail.gmail.com>
References: <42F6BE34.3040104@comcast.net>
	<ca471dc20508071914e5f49f0@mail.gmail.com>
Message-ID: <200508080107.51852.fdrake@acm.org>

On Sunday 07 August 2005 22:14, Guido van Rossum wrote:
 > I think the INCREFs don't need to be documented because you don't have
 > to worry about them -- they follow the normal pattern of reference
 > counts: if you owned an object before passing it to PyTuple_Pack(),
 > you still own it afterwards.

That's right; the function doesn't affect the references you hold in any way, 
so there's no need to deal with them.

 > It's not listed in refcounts.dat; that seems an omission (or perhaps
 > the function's varargs signature doesn't fit in the pattern?).

It should and can be listed.  refcounts.dat won't deal with the varargs 
portion of the signature, but it can deal with the return value and normal 
arguments without worrying about varargs portions of the signature for any 
function.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From martin at v.loewis.de  Mon Aug  8 07:37:27 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 07:37:27 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc205080717033ef6510d@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>	
	<dd36mq$pjr$1@sea.gmane.org>	
	<ca471dc20508061856c0cce4f@mail.gmail.com>	
	<42F60441.8000007@v.loewis.de>
	<ca471dc205080717033ef6510d@mail.gmail.com>
Message-ID: <42F6EF97.50609@v.loewis.de>

Guido van Rossum wrote:
>>If stdin, stdout and stderr go to a terminal, there already is a
>>default encoding (actually, there always is a default encoding on
>>these, as it falls back to the system encoding if it's not a terminal,
>>or if the terminal's encoding is not supported or cannot be determined).
> 
> 
> So there is. Wow! I never knew this. How does it work? Can we use this
> for writing to files too?

On Unix, it uses nl_langinfo(CHARSET), which in turn looks at the
environment variables.

On Windows, it uses GetConsoleCP()/GetConsoleOutputCP().

On Mac, I'm still searching for a way to determine the encoding of
Terminal.app.

In IDLE, it uses locale.getpreferredencoding().

So no, this cannot easily be used for file output. Most likely, people
would use locale.getpreferredencoding() for file output. For socket
output, there should not be a standard way to encode Unicode.
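
The two sources of encoding information mentioned above can be
inspected as follows. This is only a sketch: describe_encodings() is a
hypothetical helper, and the stdout value may be None when the stream
has been replaced by an object without an encoding.

```python
import locale
import sys


def describe_encodings():
    """Collect the stream encoding and the locale's preferred encoding."""
    return {
        # Set by the interpreter from the terminal or the environment.
        "stdout": getattr(sys.stdout, "encoding", None),
        # What one would typically pick for file output.
        "preferred": locale.getpreferredencoding(False),
    }


info = describe_encodings()
```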

Regards,
Martin


From bob at redivi.com  Mon Aug  8 07:58:51 2005
From: bob at redivi.com (Bob Ippolito)
Date: Sun, 7 Aug 2005 19:58:51 -1000
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F6EF97.50609@v.loewis.de>
References: <20050806102342.GA11309@mems-exchange.org>	
	<dd36mq$pjr$1@sea.gmane.org>	
	<ca471dc20508061856c0cce4f@mail.gmail.com>	
	<42F60441.8000007@v.loewis.de>
	<ca471dc205080717033ef6510d@mail.gmail.com>
	<42F6EF97.50609@v.loewis.de>
Message-ID: <79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com>

On Aug 7, 2005, at 7:37 PM, Martin v. Löwis wrote:

> Guido van Rossum wrote:
>
>>> If stdin, stdout and stderr go to a terminal, there already is a
>>> default encoding (actually, there always is a default encoding on
>>> these, as it falls back to the system encoding if it's not a  
>>> terminal,
>>> or if the terminal's encoding is not supported or cannot be  
>>> determined).
>>>
>>
>>
>> So there is. Wow! I never knew this. How does it work? Can we use this
>> for writing to files too?
>>
>
> On Unix, it uses nl_langinfo(CHARSET), which in turn looks at the
> environment variables.
>
> On Windows, it uses GetConsoleCP()/GetConsoleOutputCP().
>
> On Mac, I'm still searching for a way to determine the encoding of
> Terminal.app.

It's UTF-8 by default, I highly doubt many people bother to change it.

-bob


From martin at v.loewis.de  Mon Aug  8 07:59:02 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 07:59:02 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc205080717074d77b9d5@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>	
	<dd36mq$pjr$1@sea.gmane.org>	
	<ca471dc20508061856c0cce4f@mail.gmail.com>	
	<dd4e8n$e7m$1@sea.gmane.org> <42F60753.8030309@v.loewis.de>
	<ca471dc205080717074d77b9d5@mail.gmail.com>
Message-ID: <42F6F4A6.8060605@v.loewis.de>

Guido van Rossum wrote:
> I'm not sure if it works for all encodings, but if possible I'd like
> to extend the seeking semantics on text files: seek positions are byte
> counts, and the application should consider them as "magic cookies".

If the seek position is merely a number, it won't work for all
encodings. For the ISO 2022 ones (iso-2022-jp etc), you need to know
the shift state: you can switch to a different encoding in the stream
using standard escape codes, and then the same bytes are interpreted
differently. For example, iso-2022-jp supports these escape codes:

ESC ( B           ASCII
ESC $ @           JIS X 0208-1978
ESC $ B           JIS X 0208-1983
ESC ( J           JIS X 0201-Roman
ESC $ A           GB2312-1980
ESC $ ( C         KSC5601-1987
ESC $ ( D         JIS X 0212-1990
ESC . A           ISO8859-1
ESC . F           ISO8859-7

So at a certain position in the stream, the same bytes could mean
different characters, depending on which "shift state" you are in.
That's why ISO C introduced fgetpos/fsetpos in addition to
ftell/fseek: an fpos_t is a truly opaque structure that can also
incorporate codec state.
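
The dependence on shift state can be shown with Python's iso2022_jp
codec. The sketch assumes the codec encodes a single hiragana
character as ESC $ B, two payload bytes, then ESC ( B; the variable
names are illustrative.

```python
HIRAGANA_A = "\u3042"

# Encoded form: switch to JIS X 0208, two payload bytes, back to ASCII.
encoded = HIRAGANA_A.encode("iso2022_jp")
payload = encoded[3:-3]

# Same payload bytes, decoded after a shift into JIS X 0208.
in_jis = (b"\x1b$B" + payload + b"\x1b(B").decode("iso2022_jp")

# Same payload bytes, decoded in the initial (ASCII) state.
in_ascii = payload.decode("iso2022_jp")
```

A byte offset that lands on the payload therefore cannot be decoded
correctly unless the decoder also knows which escape sequence was last
seen.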

If you follow this approach, you can get back most of seek;
you will lose the "whence" parameter, i.e. you cannot seek forth
and back, and you cannot position at the end of the file
(actually, iso-2022-jp still supports appending to a file, since
it requires that all data "shift out" back to ASCII at the end
of each line, and at the end of the file. So "correct" ISO 2022
files can still be concatenated)


> Is there any reason not to do Universal Newline processing on *all*
> text files?

Correct. However, this still might result in a full rewrite of the
universal newlines code: the code currently operates on byte streams,
when it "should" operate on character streams. In some encodings,
CRLF simply isn't represented by \x0d\x0a
(e.g. UTF-16-LE: \x0d\0\x0a\0)
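
The byte sequence can be checked directly; a byte-level search for
CR LF finds nothing in UTF-16-LE data. A minimal sketch:

```python
# CRLF as characters, encoded as little-endian UTF-16.
crlf_utf16 = "\r\n".encode("utf-16-le")

# A byte-oriented newline scanner looks for this pair and misses it.
contains_ascii_crlf = b"\r\n" in crlf_utf16

# Decoding first and then looking for newlines works as expected.
decoded_has_crlf = "\r\n" in crlf_utf16.decode("utf-16-le")
```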

Regards,
Martin

From martin at v.loewis.de  Mon Aug  8 08:05:15 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 08:05:15 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <66d0a6e105080718527939aa81@mail.gmail.com>
References: <42E93940.6080708@v.loewis.de>	
	<1122605323.9670.11.camel@geddy.wooz.org>	
	<1f7befae0507281959abc2a7c@mail.gmail.com>	
	<1122607673.9665.38.camel@geddy.wooz.org>	
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>	
	<1122918723.9680.33.camel@warna.corp.google.com>	
	<m24qa9f5v8.wl%gnn@neville-neil.com>
	<42EF2794.1000209@v.loewis.de>	
	<66d0a6e105080312181e25fa08@mail.gmail.com>	
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
Message-ID: <42F6F61B.1080505@v.loewis.de>

Nicholas Bastin wrote:
> It's a mature product.  I would hope that that would count for
> something.

Sure. But so is subversion.

> I've had enough corrupted subversion repositories that I'm
> not crazy about the thought of using it in a production system.

I had the last corrupted repository with subversion 0.23. It has
matured since then. Even then, invoking db_recover would restore
the operation, without losing data (i.e. I did not need to go
to backup).

>>Interesting offer. I'll add this to the PEP - who is "we" in this
>>context?
> 
> 
> Uh, the Python community.  Which is currently hosting a subversion
> repository, so it doesn't seem like a stretch to imagine that
> p4.python.org could exist just as easily.

Ah. But these people have no expertise with Perforce, and there
is no Debian Perforce package, so it *is* a stretch assuming that
they could also host a perforce directory.

So I should then remove your offer to host a perforce installation,
as you never made such an offer, right?

> Pardon me if I don't feel that I'd like to see a system in production
> for a few weeks before we declare victory.  The problems with this
> kind of conversion can be very subtle, and very painful.  I'm not
> saying we shouldn't do this, I'm just saying that we should take an
> appropriate measure of how much greener the grass really is on the
> other side, and how much work we're willing to put in to make it that
> way.

Yes. That's what this PEP is for. So I guess you are -1 on the
PEP.

Regards,
Martin

From martin at v.loewis.de  Mon Aug  8 08:08:34 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 08:08:34 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <bbaeab100508071910622202e7@mail.gmail.com>
References: <42F61C03.6050703@v.loewis.de>
	<bbaeab100508071910622202e7@mail.gmail.com>
Message-ID: <42F6F6E2.7020007@v.loewis.de>

Brett Cannon wrote:
> What is going in under python/ ?  If it is what is currently
> /dist/src/, then great and the renaming of the repository works.

Just have a look yourself :-) Yes, this is dist/src.

> But if that is what src/ is going to be used for

This is nondist/src. Perhaps I should just move nondist/src/Compiler,
and drop nondist/src.

> And I assume you are going to list the directory structure in the PEP
> at some point.

Please take a look at the PEP.

Regards,
Martin

From cludwig at cdc.informatik.tu-darmstadt.de  Mon Aug  8 09:07:13 2005
From: cludwig at cdc.informatik.tu-darmstadt.de (Christoph Ludwig)
Date: Mon, 8 Aug 2005 09:07:13 +0200
Subject: [Python-Dev] [C++-sig]  GCC version compatibility
In-Reply-To: <42F6791C.3030602@v.loewis.de>
References: <u8y0jz762.fsf@boost-consulting.com>
	<200507171601.23780.anthony@interlink.com.au>
	<20050717100609.GB3581@lap200.cdc.informatik.tu-darmstadt.de>
	<200507172321.31665.anthony@interlink.com.au>
	<42F6791C.3030602@v.loewis.de>
Message-ID: <20050808070712.GB3570@lap200.cdc.informatik.tu-darmstadt.de>

On Sun, Aug 07, 2005 at 11:11:56PM +0200, "Martin v. Löwis" wrote:
> I've looked at the patch, and it looks fairly safe, so I committed it.

Thanks. I did not forget my promise to look into a more comprehensive
approach to the C++ build issues. But I first need to better understand the
potential impact on distutils. And, foremost, I need to finish my thesis,
so my spare-time projects are going very slowly.

Regards

Christoph
-- 
http://www.informatik.tu-darmstadt.de/TI/Mitarbeiter/cludwig.html
LiDIA: http://www.informatik.tu-darmstadt.de/TI/LiDIA/Welcome.html


From martin at v.loewis.de  Mon Aug  8 09:21:41 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 09:21:41 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com>
References: <20050806102342.GA11309@mems-exchange.org>	
	<dd36mq$pjr$1@sea.gmane.org>	
	<ca471dc20508061856c0cce4f@mail.gmail.com>	
	<42F60441.8000007@v.loewis.de>
	<ca471dc205080717033ef6510d@mail.gmail.com>
	<42F6EF97.50609@v.loewis.de>
	<79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com>
Message-ID: <42F70805.2070107@v.loewis.de>

Bob Ippolito wrote:
> It's UTF-8 by default, I highly doubt many people bother to change it.

I think your doubts are unfounded. Many Japanese people change it to
EUC-JP (I believe), as UTF-8 support doesn't work well for them (or
at least didn't use to).

Regards,
Martin

From martin at v.loewis.de  Mon Aug  8 10:00:00 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 10:00:00 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc205080717246d0c4919@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>	<ca471dc20508061856c0cce4f@mail.gmail.com>	<42F60E35.9080809@egenix.com>
	<ca471dc205080717246d0c4919@mail.gmail.com>
Message-ID: <42F71100.7000401@v.loewis.de>

Guido van Rossum wrote:
> We might be able to get there halfway in Python 2.x: we could
> introduce the bytes type now, and provide separate APIs to read and
> write them. (In fact, the array module and the f.readinto() method
> make this possible today, but it's too klunky so nobody uses it.)
> Perhaps a better API would be a new file-open mode ("B"?) to indicate
> that a file's read* operations should return bytes instead of strings.
> The bytes type could just be a very thin wrapper around array('b').

That answers an important question: so you want the bytes type to be
mutable (and, consequently, unsuitable as a dictionary key).
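
A short sketch of the array/readinto() combination and of the
consequence noted above. It uses io.BytesIO as a stand-in for a real
file, which is an assumption made to keep the example self-contained.

```python
import io
from array import array

# Preallocate a mutable buffer of signed bytes and fill it in place.
buf = array("b", [0] * 5)
count = io.BytesIO(b"hello").readinto(buf)

# Mutability: individual bytes can be overwritten.
buf[0] = ord("j")
contents = buf.tobytes()

# The price of mutability: the buffer cannot be hashed, so it cannot
# serve as a dictionary key.
try:
    hash(buf)
    hashable = True
except TypeError:
    hashable = False
```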

Regards,
Martin

From martin at v.loewis.de  Mon Aug  8 10:07:37 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 10:07:37 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>
References: <42F60E35.9080809@egenix.com>	<20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>	<ca471dc20508061856c0cce4f@mail.gmail.com>	<42F60E35.9080809@egenix.com>
	<5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>
Message-ID: <42F712C9.9040000@v.loewis.de>

Phillip J. Eby wrote:
>>Hm. What would be the use case for using %s with binary, non-text data?
> 
> 
> Well, I could see using it to write things like netstrings, 
> i.e.  sock.send("%d:%s," % (len(data),data)) seems like the One Obvious Way 
> to write a netstring in today's Python at least.  But perhaps there's a 
> subtlety I've missed here.

As written, this would stop working when strings become Unicode. It's
pretty clear what '%d' means (format the number in decimal numbers,
using "\N{DIGIT ZERO}" .. "\N{DIGIT NINE}" as the digits). It's not
all that clear what %s means: how do you get a sequence of characters
out of data, when data is a byte string?

Perhaps there could be byte string literals, so that you would write

  sock.send(b"%d:%s," % (len(data),data))

but this would raise different questions:
- what does %d mean for a byte string formatting? str(len(data))
  returns a character string, how do you get a byte string?
  In the specific case of %d, encoding as ASCII would work, though.
- if byte strings are mutable, what about byte string literals?
  I.e. if I do

  x = b"%d:%s,"
  x[1] = b'f'

  and run through the code the second time, will the literal have
  changed? Perhaps these would be displays, not literals (although
  I never understood why Guido calls these displays)

Regards,
Martin
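
The byte-string formatting question raised above can be sketched with a
later Python. This is a minimal sketch, assuming Python 3.5 or newer,
where bytes literals exist and %-formatting on a bytes template renders
%d as ASCII digits. The helper name netstring is hypothetical and does
not come from this thread.

```python
def netstring(data: bytes) -> bytes:
    """Frame a payload as <decimal length>:<payload>, (hypothetical helper)."""
    # %d on a bytes template yields ASCII digits; %s takes a bytes-like object.
    return b"%d:%s," % (len(data), data)


payload = b"hello world!"
framed = netstring(payload)
print(framed)  # b'12:hello world!,'

# In this Python the bytes literal is immutable, so item assignment is
# rejected instead of changing the literal for the next pass through the code.
template = b"%d:%s,"
try:
    template[1] = ord("f")
    mutated = True
except TypeError:
    mutated = False
print(mutated)  # False
```

With an immutable bytes type the "will the literal have changed?"
question does not arise; a mutable type would need display-like
semantics, as the message suggests.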

From stephen at xemacs.org  Mon Aug  8 10:10:07 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 08 Aug 2005 17:10:07 +0900
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F70805.2070107@v.loewis.de> (Martin v.
	=?iso-8859-1?q?L=F6wis's?= message of "Mon, 08 Aug 2005 09:21:41 +0200")
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<42F60441.8000007@v.loewis.de>
	<ca471dc205080717033ef6510d@mail.gmail.com>
	<42F6EF97.50609@v.loewis.de>
	<79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com>
	<42F70805.2070107@v.loewis.de>
Message-ID: <87oe8897ds.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Martin" == Martin v Löwis <martin at v.loewis.de> writes:

    Martin> I think your doubts are unfounded. Many Japanese people
    Martin> change it to EUC-JP (I believe), as UTF-8 support doesn't
    Martin> work well for them (or at least didn't use to).

If you mean the UTF-8 support in Terminal, it's no better or worse
than the EUC-JP support.  The problem is that most Japanese Unix
systems continue to default to EUC-JP, and many Windows hosts
(including Samba file systems) default to Shift JIS.  So people using
Terminal tend to set it to match the default remote environment (few
of them use shells on the Mac).

All that is certainly true of my organization, for one example.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From martin at v.loewis.de  Mon Aug  8 10:26:37 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 10:26:37 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <87oe8897ds.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>	<ca471dc20508061856c0cce4f@mail.gmail.com>	<42F60441.8000007@v.loewis.de>	<ca471dc205080717033ef6510d@mail.gmail.com>	<42F6EF97.50609@v.loewis.de>	<79BBBAB6-6630-4AF0-A74B-0D712186A054@redivi.com>	<42F70805.2070107@v.loewis.de>
	<87oe8897ds.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <42F7173D.4000204@v.loewis.de>

Stephen J. Turnbull wrote:
> If you mean the UTF-8 support in Terminal, it's no better or worse
> than the EUC-JP support.  The problem is that most Japanese Unix
> systems continue to default to EUC-JP, and many Windows hosts
> (including Samba file systems) default to Shift JIS.  So people using
> Terminal tend to set it to match the default remote environment (few
> of them use shells on the Mac).

Right: that might be the biggest problem. ls(1) would not display
the file names of the remote servers in any readable way.

Thanks for the confirmation.

Regards,
Martin

From arigo at tunes.org  Mon Aug  8 10:31:06 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 8 Aug 2005 10:31:06 +0200
Subject: [Python-Dev] __traceback__ and reference cycles
Message-ID: <20050808083106.GA15924@code1.codespeak.net>

Hi all,

There are various proposals to add an attribute on exception instances
to store the traceback (see PEP 344).  A detail not discussed, which I
thought of historical interest only, is that today's exceptions try very
hard to avoid reference cycles, in particular the cycle

   'frame -> local variable -> traceback object -> frame'

which was important for pre-GC versions of Python.  A clause 'except
Exception, e' would not create a local reference to the traceback, only
to the exception instance.  If the latter grows a __traceback__
attribute, it is no longer true, and every such except clause typically
creates a cycle.

Of course, we don't care, we have a GC -- do we?  Well, there are cases
where we do: see the attached program...  In my opinion it should be
considered a bug of today's Python that this program leaks memory very
fast and takes longer and longer to run each loop (each loop takes half
a second longer than the previous one!).  (I don't know how this bug
could be fixed, though.)

Spoiling the fun of figuring out what is going on, the reason is that
'e_tb' creates a reference cycle involving the frame of __del__, which
keeps a reference to 'self' alive.  Python thinks 'self' was
resurrected.  The next time the GC runs, the cycle disappears, and the
refcount of 'self' drops to zero again, calling __del__ again -- which
gets resurrected again by a new cycle.  Etc...  Note that no cycle
actually contains 'self'; they just point to 'self'.  In summary, no X
instance ever gets freed, but they all have their destructors called
over and over again.

Attaching a __traceback__ will only make this "bug" show up more often,
as the 'except Exception, e' line in a __del__() method would be enough
to trigger it.

Not sure what to do about it.  I just thought I should share these
thoughts (I stumbled over almost this problem in PyPy).


A bientot,

Armin
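
The cycle 'frame -> local variable -> traceback object -> frame' can be
observed directly once exceptions carry a __traceback__ attribute. This
is a minimal sketch, assuming CPython 3 (where the attribute exists and
sys._getframe() is available); the function name holds_cycle is
hypothetical. Python 3 unbinds the 'as' name at the end of the except
clause to avoid exactly this cycle, so the sketch rebinds the exception
to another local to recreate it.

```python
import sys


def holds_cycle():
    try:
        raise ValueError("boom")
    except ValueError as exc:
        # Rebinding keeps the exception alive after the except clause ends.
        saved = exc
    # The traceback entry for this function refers back to this very frame,
    # and the frame's locals refer to the exception that owns the traceback.
    frame = saved.__traceback__.tb_frame
    return frame is sys._getframe() and frame.f_locals["saved"] is saved


print(holds_cycle())  # True
```

The cycle is only reclaimed by the cyclic garbage collector, which is
why anything it keeps alive (such as 'self' in a __del__ frame) outlives
its last ordinary reference.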

From arigo at tunes.org  Mon Aug  8 10:38:12 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 8 Aug 2005 10:38:12 +0200
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <20050808083106.GA15924@code1.codespeak.net>
References: <20050808083106.GA15924@code1.codespeak.net>
Message-ID: <20050808083812.GA16341@code1.codespeak.net>

Hi,

On Mon, Aug 08, 2005 at 10:31:06AM +0200, Armin Rigo wrote:
> see the attached program...

Oups.  Here it is...


Armin
-------------- next part --------------
import sys, time

def log(typ, val, tb):
    pass

class X:
    def __del__(self):
        try:
            typo
        except Exception, e:
            e_type, e_value, e_tb = sys.exc_info()
            log(e_type, e_value, e_tb)


t = time.time()
while True:
    lst = [X() for i in range(1000)]
    t1 = time.time()
    print t1 - t
    t = t1

From phd at mail2.phd.pp.ru  Mon Aug  8 10:47:56 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Mon, 8 Aug 2005 12:47:56 +0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1123473109.20293.35.camel@geddy.wooz.org>
References: <1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<1123473109.20293.35.camel@geddy.wooz.org>
Message-ID: <20050808084756.GB28977@phd.pp.ru>

On Sun, Aug 07, 2005 at 11:51:49PM -0400, Barry Warsaw wrote:
> Has anyone experienced svn corruptions with the fsfs backend?  I
> haven't, across quite a few repositories.

   I haven't. But I must admit that the repositories I'm working with
aren't big. The biggest is at svn.colorstudy.com (I am working on SQLObject),
and since Ian switched from dbfs to fsfs I don't remember any problems
with the repo.

   Speaking of merging: SVN relieved much of the pain that CVS gave me. With
CVS I had a lot of conflicts: if the code to be merged was already there (had
been merged from another branch), one got a conflict. If the code contained
CVS keywords (__version__ = "$Id$"), cvs merge always produced conflicts.
   SVN fixed both problems, so now I see only real conflicts. SVN just
ignores the code to be merged if it has already been merged, and SVN converts
keywords internally to their default form ($Id$ instead of
$Id: python.c 42 phd $) before merging.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From mwh at python.net  Mon Aug  8 11:29:50 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 08 Aug 2005 10:29:50 +0100
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F60E35.9080809@egenix.com> (M.'s message of "Sun, 07 Aug
	2005 15:35:49 +0200")
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<42F60E35.9080809@egenix.com>
Message-ID: <2mll3cydwx.fsf@starship.python.net>

"M.-A. Lemburg" <mal at egenix.com> writes:

> Set the external encoding for stdin, stdout, stderr:
> ----------------------------------------------------
> (also an example for adding encoding support to an
> existing file object):
>
> def set_sys_std_encoding(encoding):
>     # Load encoding support
>     (encode, decode, streamreader, streamwriter) = codecs.lookup(encoding)
>     # Wrap using stream writers and readers
>     sys.stdin = streamreader(sys.stdin)
>     sys.stdout = streamwriter(sys.stdout)
>     sys.stderr = streamwriter(sys.stderr)
>     # Add .encoding attribute for introspection
>     sys.stdin.encoding = encoding
>     sys.stdout.encoding = encoding
>     sys.stderr.encoding = encoding
>
> set_sys_std_encoding('rot-13')
>
> Example session:
>>>> print 'hello'
> uryyb
>>>> raw_input()
> hello
> h'hello'
>>>> 1/0
> Genpronpx (zbfg erprag pnyy ynfg):
>   Svyr "<fgqva>", yvar 1, va ?
> MrebQvivfvbaReebe: vagrtre qvivfvba be zbqhyb ol mreb
>
> Note that the interactive session bypasses the sys.stdin
> redirection, which is why you can still enter Python
> commands in ASCII - not sure whether there's a reason
> for this, or whether it's just a missing feature.

Um, I'm not quite sure how this would be implemented.  Interactive
input comes via PyOS_Readline which deals in FILE*s... this area of
the code always confuses me :(

Cheers,
mwh

-- 
 As it seems to me, in Perl you have to be an expert to correctly make
 a nested data structure like, say, a list of hashes of instances.  In
 Python, you have to be an idiot not  to be able to do it, because you
 just write it down.             -- Peter Norvig, comp.lang.functional

From mwh at python.net  Mon Aug  8 11:49:15 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 08 Aug 2005 10:49:15 +0100
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1123473109.20293.35.camel@geddy.wooz.org> (Barry Warsaw's
	message of "Sun, 07 Aug 2005 23:51:49 -0400")
References: <42E93940.6080708@v.loewis.de>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<1123473109.20293.35.camel@geddy.wooz.org>
Message-ID: <2mhde0yd0k.fsf@starship.python.net>

Barry Warsaw <barry at python.org> writes:

> Unfortunately, I don't think "we" (meaning specifically the collective
> python.org admins) have much if any operational experience with
> Perforce.

Also (from someone who is on the fringes of the pydotorg admin set): I
don't know that much about subversion administration.  But, if it
proves necessary, as it's an open source project and all, I'm willing
to put some time into learning about it.  I'm *much* less likely to do
this for a closed source package unless someone is paying me to do it.
Maybe I'm the only person who thinks this way, but if not, it's
something to think about.

Cheers,
mwh

-- 
42. You can measure a programmer's perspective by noting his
    attitude on the continuing vitality of FORTRAN.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html

From ncoghlan at gmail.com  Mon Aug  8 12:24:24 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 08 Aug 2005 20:24:24 +1000
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F71100.7000401@v.loewis.de>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>	<ca471dc20508061856c0cce4f@mail.gmail.com>	<42F60E35.9080809@egenix.com>	<ca471dc205080717246d0c4919@mail.gmail.com>
	<42F71100.7000401@v.loewis.de>
Message-ID: <42F732D8.9080505@gmail.com>

Martin v. Löwis wrote:
> Guido van Rossum wrote:
>>The bytes type could just be a very thin wrapper around array('b').
> 
> That answers an important question: so you want the bytes type to be
> mutable (and, consequently, unsuitable as a dictionary key).

I would suggest a bytes/frozenbytes pair, similar to set/frozenset and list/tuple.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com
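
The mutable/immutable split suggested here can be illustrated with the
pair that Python 3 later shipped, bytearray and bytes. Treating them as
stand-ins for the proposed bytes/frozenbytes names is an assumption made
for this sketch only.

```python
frozen = bytes(b"key")          # immutable, plays the "frozenbytes" role
mutable = bytearray(b"key")     # mutable, plays the "bytes" role

# Immutable byte strings hash like str or tuple, so they work as dict keys.
table = {frozen: "ok"}

# The mutable variant is unhashable and cannot be used as a key.
try:
    table[mutable] = "nope"
    bytearray_hashable = True
except TypeError:
    bytearray_hashable = False

# In-place modification is what the mutable variant is for.
mutable[0] = ord("K")

print(bytearray_hashable)  # False
print(mutable)             # bytearray(b'Key')
```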

From mal at egenix.com  Mon Aug  8 12:42:26 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 08 Aug 2005 12:42:26 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc205080717246d0c4919@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>	
	<dd36mq$pjr$1@sea.gmane.org>	
	<ca471dc20508061856c0cce4f@mail.gmail.com>	
	<42F60E35.9080809@egenix.com>
	<ca471dc205080717246d0c4919@mail.gmail.com>
Message-ID: <42F73712.9040904@egenix.com>

Guido van Rossum wrote:
> [Guido]
> 
>>>My first response to the PEP, however, is that instead of a new
>>>built-in function, I'd rather relax the requirement that str() return
>>>an 8-bit string -- after all, int() is allowed to return a long, so
>>>why couldn't str() be allowed to return a Unicode string?
> 
> 
> [MAL]
> 
>>The problem here is that strings and Unicode are used in different
>>ways, whereas integers and longs are very similar. Strings are used
>>for both arbitrary data and text data, Unicode can only be used
>>for text data.
> 
> Yes, that is the case in Python 2.x. In Python 3.x, I'd like to use a
> separate "bytes" array type for non-text and for encoded text data,
> just like Java; strings should always be considered text data.
>
> We might be able to get there halfway in Python 2.x: we could
> introduce the bytes type now, and provide separate APIs to read and
> write them.
>
> (In fact, the array module and the f.readinto()  method
> make this possible today, but it's too klunky so nobody uses it.
> Perhaps a better API would be a new file-open mode ("B"?) to indicate
> that a file's read* operations should return bytes instead of strings.
> The bytes type could just be a very thin wrapper around array('b').

I'd prefer to keep such bytes type immutable (arrays are mutable),
otherwise, as Martin already mentioned, they wouldn't be usable
as dictionary keys and the transition from the current string
implementation would be made more difficult than necessary.

Since we won't have any use for the string type in Py3k,
why not simply strip it down to a plain bytes type ?

(I wouldn't want to lose or have to reinvent all the
optimizations that went into its implementation and which
are missing in the array implementation.)

About the file-type idea:

We already have text mode and binary mode - with their implementation
being platform dependent. I don't think that this is a particularly
good area to add new functionality.

If you use codecs.open() to open a file, you could easily
write a codec which implements what you have in mind.

>>The new text() built-in would help make a clear distinction
>>between "convert this object to a string of bytes" and
>>"please convert this to a text representation". We need to
>>start making the separation somewhere and I think this is
>>a good non-invasive start.
> 
> 
> I agree with the latter, but I would prefer that any new APIs we use
> use a 'bytes' data type to represent non-text data, rather than having
> two different sets of APIs to differentiate between the use of 8-bit
> strings as text vs. data -- while we *currently* use 8-bit strings for
> both text and data, in Python 3.0 we won't, so then the interim APIs
> would have to change again. I'd rather introduce a new data type and
> new APIs that work with it.

Well, let's put it this way: it all really depends on
what str() should mean in Py3k.

Given that str() is used for mixed content data strings,
simply aliasing str() to unicode() in Py3k would cause a
lot of breakage, due to changed semantics.

Aliasing str() to bytes() would also cause breakage, due
to the fact that bytes types wouldn't have string methods
such as .lower(), .upper(), etc.

Perhaps str() in Py3k should become a helper that
converts bytes() to Unicode, provided the content is
ASCII-only.

In any case, Py3k would only have unicode() for text
and bytes() for data, so there's no real need to continue
using str().

If we add the text() API in Py2k and with the above
meaning, then we could rename unicode() to text()
in Py3k - only a cosmetic change, but one that I would
find useful: text() and bytes() are more intuitive to
understand than unicode() and bytes().

>>Furthermore, the text() built-in could be used to only
>>allow 8-bit strings with ASCII content to pass through
>>and require that all non-ASCII content be returned as
>>Unicode.
>>
>>We wouldn't be able to enforce this in str().
>>
>>I'm +1 on adding text().
> 
> 
> I'm still -1.
> 
> 
>>I would also like to suggest a new formatting marker '%t'
>>to have the same semantics as text() - instead of changing
>>the semantics of %s as Neil suggests in the PEP. Again,
>>the reason is to make the difference between text and
>>arbitrary data explicit and visible in the code.
> 
> 
> Hm. What would be the use case for using %s with binary, non-text data?

I guess we'd only keep it for backwards compatibility and
map it to the str() helper.

>>>The main problem for a smooth Unicode transition remains I/O, in my
>>>opinion; I'd like to see a PEP describing a way to attach an encoding
>>>to text files, and a way to decide on a default encoding for stdin,
>>>stdout, stderr.
>>
>>Hmm, not sure why you need PEPs for this:
> 
> 
> I'd forgotten how far we've come. I'm still unsure how the default
> encoding on stdin/stdout works.

Codecs in general work like this: they take an existing file-like
object and wrap it with new versions of .read(), .write(),
.readline(), etc. which filter the data through encoding and/or
decoding functions.

Once a file is wrapped with a codec StreamWriter/Reader,
you can continue using it as if it were a standard file-like
object.
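
The wrapping described above can be sketched with in-memory streams.
This is a minimal illustration, assuming Python 3, where the underlying
stream deals in bytes; io.BytesIO stands in for a real file, and the
variable names are chosen for this sketch only.

```python
import codecs
import io

# Wrap a byte stream with a StreamWriter: .write() takes text and
# filters it through the encoder before it reaches the raw stream.
raw = io.BytesIO()
writer = codecs.getwriter("utf-8")(raw)
writer.write("na\u00efve")
encoded = raw.getvalue()
print(encoded)  # b'na\xc3\xafve'

# Wrap a byte stream with a StreamReader: .read() decodes on the way out.
reader = codecs.getreader("utf-8")(io.BytesIO(encoded))
decoded = reader.read()
print(decoded == "na\u00efve")  # True
```

codecs.getwriter() and codecs.getreader() return the same stream classes
that appear as the last two items of the codecs.lookup() tuple used in
the quoted set_sys_std_encoding() example.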

> But it still needs to be simpler; IMO the built-in open() function
> should have an encoding keyword. (But it could return something whose
> type is not 'file' -- once again making a distinction between open and
> file.) 

Right, because it would then return a wrapped file object.

> Do these files support universal newlines? IMO they should. 

Since the codecs wrap the underlying file object which does
support universal newlines, this should be the case.

However, you should be aware of the fact that Unicode
defines a lot more line break characters than just \r,
\r\n, \n.

The codecs use the .splitlines() methods of strings and
Unicode - which support all of them transparently, so you
don't need to enable universal newlines support at all -
it's sort-of enabled per default.
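
The difference in line-break handling can be seen by comparing
.splitlines() on text and on encoded data. This is a small sketch,
assuming Python 3, where str is the Unicode type and bytes holds the
encoded form; the sample string is made up for illustration.

```python
text = "one\ntwo\r\nthree\rfour\u2028five\x85six"

# The Unicode method treats LINE SEPARATOR (U+2028) and NEL (U+0085)
# as line breaks, in addition to \n, \r\n and \r.
unicode_lines = text.splitlines()
print(unicode_lines)  # ['one', 'two', 'three', 'four', 'five', 'six']

# The byte-string method only knows the ASCII line breaks, so the
# UTF-8 encoded U+2028 and U+0085 stay inside the last line.
byte_lines = text.encode("utf-8").splitlines()
print(len(byte_lines))  # 4
```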

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 08 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From mal at egenix.com  Mon Aug  8 13:06:31 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 08 Aug 2005 13:06:31 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <2mll3cydwx.fsf@starship.python.net>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>	<ca471dc20508061856c0cce4f@mail.gmail.com>	<42F60E35.9080809@egenix.com>
	<2mll3cydwx.fsf@starship.python.net>
Message-ID: <42F73CB7.2090007@egenix.com>

Michael Hudson wrote:
> "M.-A. Lemburg" <mal at egenix.com> writes:
> 
> 
>>Set the external encoding for stdin, stdout, stderr:
>>----------------------------------------------------
>>(also an example for adding encoding support to an
>>existing file object):
>>
>>def set_sys_std_encoding(encoding):
>>    # Load encoding support
>>    (encode, decode, streamreader, streamwriter) = codecs.lookup(encoding)
>>    # Wrap using stream writers and readers
>>    sys.stdin = streamreader(sys.stdin)
>>    sys.stdout = streamwriter(sys.stdout)
>>    sys.stderr = streamwriter(sys.stderr)
>>    # Add .encoding attribute for introspection
>>    sys.stdin.encoding = encoding
>>    sys.stdout.encoding = encoding
>>    sys.stderr.encoding = encoding
>>
>>set_sys_std_encoding('rot-13')
>>
>>Example session:
>>
>>>>>print 'hello'
>>
>>uryyb
>>
>>>>>raw_input()
>>
>>hello
>>h'hello'
>>
>>>>>1/0
>>
>>Genpronpx (zbfg erprag pnyy ynfg):
>>  Svyr "<fgqva>", yvar 1, va ?
>>MrebQvivfvbaReebe: vagrtre qvivfvba be zbqhyb ol mreb
>>
>>Note that the interactive session bypasses the sys.stdin
>>redirection, which is why you can still enter Python
>>commands in ASCII - not sure whether there's a reason
>>for this, or whether it's just a missing feature.
> 
> 
> Um, I'm not quite sure how this would be implemented.  Interactive
> input comes via PyOS_Readline which deals in FILE*s... this area of
> the code always confuses me :(

Me too.

It appears that this part of the Python code
has undergone so many iterations and patches that the
structure has suffered a lot, e.g. the main() function calls
PyRun_AnyFileFlags(stdin, "<stdin>", &cf),
but the fp argument stdin is subsequently
ignored if the tok_nextc() function finds that
a prompt is set.

Anyway, hacking along the same lines, I think
the above can be had by changing tok_stdin_decode()
to use a possibly available sys.stdin.decode()
method for the decoding of the data read by
PyOS_Readline(). This would then return Unicode
which tok_stdin_decode() could then encode to
UTF-8 which is the encoding that the tokenizer
can work on.

-- 
Marc-Andre Lemburg
eGenix.com


From pje at telecommunity.com  Mon Aug  8 15:54:20 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 08 Aug 2005 09:54:20 -0400
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F712C9.9040000@v.loewis.de>
References: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>
	<42F60E35.9080809@egenix.com>
	<20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<42F60E35.9080809@egenix.com>
	<5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050808094640.025b8d98@mail.telecommunity.com>

At 10:07 AM 8/8/2005 +0200, Martin v. Löwis wrote:
>Phillip J. Eby wrote:
> >>Hm. What would be the use case for using %s with binary, non-text data?
> >
> >
> > Well, I could see using it to write things like netstrings,
> > i.e.  sock.send("%d:%s," % (len(data),data)) seems like the One Obvious Way
> > to write a netstring in today's Python at least.  But perhaps there's a
> > subtlety I've missed here.
>
>As written, this would stop working when strings become Unicode. It's
>pretty clear what '%d' means (format the number in decimal numbers,
>using "\N{DIGIT ZERO}" .. "\N{DIGIT NINE}" as the digits). It's not
>all that clear what %s means: how do you get a sequence of characters
>out of data, when data is a byte string?
>
>Perhaps there could be byte string literals, so that you would write
>
>   sock.send(b"%d:%s," % (len(data),data))

Actually, thinking about it some more, it seems to me it's actually more 
like this:

    sock.send(("%d:%s," % (len(data), data.decode('latin1'))).encode('latin1'))

That is, if all we have is unicode and bytes, and 'data' is bytes, then 
encoding and decoding from latin1 is the right way to do a netstring.  It's 
a bit more painful, but still doable.


>but this would raise different questions:
>- what does %d mean for a byte string formatting? str(len(data))
>   returns a character string, how do you get a byte string?
>   In the specific case of %d, encoding as ASCII would work, though.
>- if byte strings are mutable, what about byte string literals?
>   I.e. if I do
>
>   x = b"%d:%s,"
>   x[1] = b'f'
>
>   and run through the code the second time, will the literal have
>   changed? Perhaps these would be displays, not literals (although
>   I never understood why Guido calls these displays)

I'm thinking that bytes.decode and unicode.encode are the correct way to 
convert between the two, and there's no such thing as a bytes literal.  We 
can always optimize "constant.encode(constant)" to a bytes display 
internally if necessary, although it will be a pain for programs that have 
lots of bytestring constants.  OTOH, we've previously discussed having a 
'bytes()' constructor, and perhaps it should use latin1 as its default 
encoding.
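
The latin1 round trip works because that encoding maps each of the 256
byte values to the code point with the same number. This is a minimal
sketch, assuming Python 3's bytes and str as the "bytes" and "unicode"
of the discussion; the helper name netstring_via_latin1 is hypothetical.

```python
def netstring_via_latin1(data: bytes) -> bytes:
    """Format as text, then encode back (hypothetical helper)."""
    # Decoding as latin-1 never fails and maps each byte to one character.
    text = "%d:%s," % (len(data), data.decode("latin-1"))
    # Encoding as latin-1 maps every character back to its original byte.
    return text.encode("latin-1")


# The round trip is lossless for every possible byte value.
every_byte = bytes(range(256))
round_trip = every_byte.decode("latin-1").encode("latin-1")
print(round_trip == every_byte)  # True

framed = netstring_via_latin1(b"\x00\xff caf\xe9")
print(framed)  # b'7:\x00\xff caf\xe9,'
```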


From aahz at pythoncraft.com  Mon Aug  8 17:41:57 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 8 Aug 2005 08:41:57 -0700
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <20050808034756.GA16756@mems-exchange.org>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<20050808034756.GA16756@mems-exchange.org>
Message-ID: <20050808154157.GA28005@panix.com>

On Sun, Aug 07, 2005, Neil Schemenauer wrote:
> On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote:
>>
>> My first response to the PEP, however, is that instead of a new
>> built-in function, I'd rather relax the requirement that str() return
>> an 8-bit string
> 
> Do you have any thoughts on what the C API would be?  It seems to me
> that PyObject_Str cannot start returning a unicode object without a
> lot of code breakage.  I suppose we could introduce a function
> called something like PyObject_String.

OTOH, should Guido change his -1 on text(), that leads to the obvious
PyObject_Text.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From aahz at pythoncraft.com  Mon Aug  8 17:45:03 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 8 Aug 2005 08:45:03 -0700
Subject: [Python-Dev] pdb: should next command be extended?
In-Reply-To: <Pine.LNX.4.58.0508071312290.695@bagira>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
Message-ID: <20050808154503.GB28005@panix.com>

On Sun, Aug 07, 2005, Ilya Sandler wrote:
>
> Solution:
> 
> Should pdb's next command accept an optional numeric argument? It would
> specify how many actual lines of code (not "line events")
> should  be skipped in the current frame before stopping,

At OSCON, Anthony Baxter made the point that pdb is currently one of the
more unPythonic modules.  If you're feeling a lot of energy about this,
rewriting pdb might be more productive.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From gvanrossum at gmail.com  Mon Aug  8 18:14:18 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon, 8 Aug 2005 09:14:18 -0700
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <20050808154157.GA28005@panix.com>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<20050808034756.GA16756@mems-exchange.org>
	<20050808154157.GA28005@panix.com>
Message-ID: <ca471dc2050808091443147b6e@mail.gmail.com>

Ouch. Too much discussion to respond to it all. Please remember that
in Jython and IronPython, str and unicode are already synonyms. That's
how Python 3.0 will do it, except unicode will disappear as being
redundant. I like the bytes/frozenbytes pair idea. Streams could grow
a getpos()/setpos() API pair that can be used for stateful encodings
(although it sounds like seek()/tell() would be okay to use in most
cases as long as you read in units of whole lines). For sockets,
send()/recv() would deal in bytes, and makefile() would get an
encoding parameter. I'm not going to change my mind on text() unless
someone explains what's so attractive about it.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Mon Aug  8 18:16:22 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon, 8 Aug 2005 09:16:22 -0700
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <20050808083106.GA15924@code1.codespeak.net>
References: <20050808083106.GA15924@code1.codespeak.net>
Message-ID: <ca471dc205080809162940cd51@mail.gmail.com>

On 8/8/05, Armin Rigo <arigo at tunes.org> wrote:
> Attaching a __traceback__ will only make this "bug" show up more often,
> as the 'except Exception, e' line in a __del__() method would be enough
> to trigger it.

Hmm... We can blame this on __del__ as much as on __traceback__, of
course... But it is definitely of concern.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Mon Aug  8 18:33:39 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 08 Aug 2005 12:33:39 -0400
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc2050808091443147b6e@mail.gmail.com>
References: <20050808154157.GA28005@panix.com>
	<20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<20050808034756.GA16756@mems-exchange.org>
	<20050808154157.GA28005@panix.com>
Message-ID: <5.1.1.6.0.20050808122851.025d3e90@mail.telecommunity.com>

At 09:14 AM 8/8/2005 -0700, Guido van Rossum wrote:
>I'm not going to change my mind on text() unless
>someone explains what's so attractive about it.

1. It's obvious to non-programmers what it's for (str and unicode aren't)

2. It's more obvious to programmers that it's a *text* string rather than a 
string of bytes

3. It's easier to type than "unicode", but less opaque than "str"

4. Switching to 'text' and 'bytes' allows for a clean break from any mental 
baggage now associated with 'unicode' and 'str'.

Of course, the flip side to these arguments is that in today's Python, one 
rarely has use for the string type names, except for coercion and some 
occasional type checking.  On the other hand, if we end up with type 
declarations, then these issues become a bit more important.


From mal at egenix.com  Mon Aug  8 18:51:47 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 08 Aug 2005 18:51:47 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc2050808091443147b6e@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>	<ca471dc20508061856c0cce4f@mail.gmail.com>	<20050808034756.GA16756@mems-exchange.org>	<20050808154157.GA28005@panix.com>
	<ca471dc2050808091443147b6e@mail.gmail.com>
Message-ID: <42F78DA3.80605@egenix.com>

Guido van Rossum wrote:
> Ouch. Too much discussion to respond to it all. Please remember that
> in Jython and IronPython, str and unicode are already synonyms.

I know, but I don't understand that argument: aren't we talking
about Python in general, not some particular implementation?

Why should CPython applications break just to permit Jython and
IronPython applications not to break?

> That's how Python 3.0 will do it, except unicode will disappear as being
> redundant. I like the bytes/frozenbytes pair idea. Streams could grow
> a getpos()/setpos() API pair that can be used for stateful encodings
> (although it sounds like seek()/tell() would be okay to use in most
> cases as long as you read in units of whole lines). 

Please don't confuse the raw bytes position in a file or stream
with e.g. an index into the possibly decoded data. Those are
two different pairs of shoes.

Since the position into decoded data depends on what type of
encoding you're using and how you decode, the "position" would
not be defined across streams, but depend on the features of
a particular stream.
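
A minimal sketch of the distinction between the two kinds of position,
assuming Python 3 and UTF-8 (the message above names no particular
encoding; the sample string is also an assumption):

```python
# Assumption: UTF-8; any multi-byte encoding shows the same effect.
text = "na\u00efve caf\u00e9"
data = text.encode("utf-8")

char_index = text.index("c")    # index into the decoded text
byte_offset = data.index(b"c")  # offset into the raw byte stream

print(char_index, byte_offset)  # the two positions differ

# A raw byte offset may even fall inside a multi-byte character,
# in which case the prefix cannot be decoded on its own.
try:
    data[:3].decode("utf-8")
except UnicodeDecodeError:
    print("byte offset 3 is not a character boundary")
```

With a stateful or variable-width encoding, a byte offset handed to
seek() therefore says nothing directly about an index into the decoded
text.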

> For sockets, send()/recv() would deal in bytes, and makefile() would get an
> encoding parameter. I'm not going to change my mind on text() unless
> someone explains what's so attractive about it.

Please read my reply for some reasoning and also Phillip's
answer to your posting.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 08 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From aahz at pythoncraft.com  Mon Aug  8 18:56:20 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 8 Aug 2005 09:56:20 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1123473109.20293.35.camel@geddy.wooz.org>
References: <1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<1123473109.20293.35.camel@geddy.wooz.org>
Message-ID: <20050808165620.GA8064@panix.com>

On Sun, Aug 07, 2005, Barry Warsaw wrote:
>
> We'd also have to teach the current crop of developers how to use the
> client tools effectively.  I think it's fairly simple to teach a CVS
> user how to use Subversion, but have no idea if translating CVS
> experience to Perforce is as straightforward.

The impression I got from Alex Martelli is that it's not particularly
straightforward.  (Google apparently uses Perforce.)
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From bcannon at gmail.com  Mon Aug  8 19:14:30 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 8 Aug 2005 10:14:30 -0700
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <42F6F6E2.7020007@v.loewis.de>
References: <42F61C03.6050703@v.loewis.de>
	<bbaeab100508071910622202e7@mail.gmail.com>
	<42F6F6E2.7020007@v.loewis.de>
Message-ID: <bbaeab10050808101440b958ef@mail.gmail.com>

On 8/7/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Brett Cannon wrote:
> > What is going in under python/ ?  If it is what is currently
> > /dist/src/, then great and the renaming of the repository works.
> 
> Just have a look yourself :-) Yes, this is dist/src.
> 

Ah, OK.  I didn't drill far enough down.  Not enough experience with
svn to realize that the directory was not just filled with default
directories.

> > But if that is what src/ is going to be used for
> 
> This is nondist/src. Perhaps I should just move nondist/src/Compiler,
> and drop nondist/src.
> 

Wouldn't hurt.  Since svn allows directory deletion there doesn't seem
to be a huge need to worry about the projects/ directory getting too
large.

> > And I assume you are going to list the directory structure in the PEP
> > at some point.
> 
> Please take a look at the PEP.
> 

OK, now I see it.  I scanned the PEP initially but I didn't see it;
guess I was expecting a more literal directory list than a paragraph
on it.

-Brett

From trentm at ActiveState.com  Mon Aug  8 20:51:00 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 8 Aug 2005 11:51:00 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42EF2794.1000209@v.loewis.de>
References: <1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
Message-ID: <20050808185100.GJ16963@ActiveState.com>

> > Since Python is Open Source are you looking at Per Force which you can
> > use for free and seems to be a happy medium between something like CVS
> > and something horrific like Clear Case?
> 
> No. The PEP is only about Subversion. Why should we be looking at Per
> Force? Only because Python is Open Source?

Perforce offers free licensing to open source projects.


> I think anything but Subversion is ruled out because:
> - there is no offer to host that anywhere (for subversion, there is
>   already svn.python.org)
> - there is no support for converting a CVS repository (for subversion,
>   there is cvs2svn)

There *is* support for converting a CVS repository to Perforce [1].

Perforce is very good: stable, solid, reliable, good tools, etc. etc.,
but I'd tend to support SVN over Perforce for Python development.
Perforce usage is quite different from CVS (it would be a painful
re-learning for old CVS hands) and SVN tends to better support highly
distributed development: most SVN operations don't need to talk to the
server, whereas with Perforce (aka p4) almost *all* operations talk to
the server. This can be somewhat mitigated with "p4proxy" (a tool that
Perforce also provides) but people would be happier with SVN, I'd bet.

[1] It is a project called VCP. Some details here (I didn't look too
    hard):
    http://www.cpan.org/modules/by-module/LWP/AUTRIJUS/VCP-autrijus-snapshot-0.9-20041020.readme

Trent

-- 
Trent Mick
TrentM at ActiveState.com

From trentm at ActiveState.com  Mon Aug  8 20:58:06 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 8 Aug 2005 11:58:06 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808165620.GA8064@panix.com>
References: <1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<1123473109.20293.35.camel@geddy.wooz.org>
	<20050808165620.GA8064@panix.com>
Message-ID: <20050808185806.GK16963@ActiveState.com>

[Aahz wrote]
> On Sun, Aug 07, 2005, Barry Warsaw wrote:
> >
> > We'd also have to teach the current crop of developers how to use the
> > client tools effectively.  I think it's fairly simple to teach a CVS
> > user how to use Subversion, but have no idea if translating CVS
> > experience to Perforce is as straightforward.
> 
> The impression I got from Alex Martelli is that it's not particularly
> straightforward. 

Agreed.


> (Google apparently uses Perforce.)

We do at ActiveState as well. *The* Perl source code repository is a
Perforce one (hosted separately here at ActiveState as well). Microsoft
licenses the Perforce code and uses it (with some slight modifications I
hear) internally.

Trent

-- 
Trent Mick
TrentM at ActiveState.com

From falcon at intercable.ru  Thu Aug  4 09:09:33 2005
From: falcon at intercable.ru (falcon)
Date: Thu, 4 Aug 2005 11:09:33 +0400
Subject: [Python-Dev] PEP-343 - Context Managment variant
Message-ID: <313379598.20050804110933@intercable.ru>

I know I am arriving after the battle, but I have another view of context management.

Simple context management might look like this in Python 2.4.1:

Synchronized example:

def Synhronised(lock,func):
    lock.acquire()
    try:
        func()
    finally:
        lock.release()
....
lock=Lock()
def Some():
    local_var1=x
    local_var2=y
    local_var3=Z
    def Work():
        global local_var3
        local_var3=Here_I_work(local_var1,local_var2,local_var3)
    Synhronised(lock,Work)
    return asd(local_var3)

FileOpenClose Example:

def FileOpenClose(name,mode,func):
    f=file(name,mode)
    try:
        func(f)
    finally:
        f.close()

....

def Another(name):
    local_var1=x
    local_var2=y
    local_var3=None
    def Work(f):
        global local_var3
        local_var3=[[x,y(i)] for i in f]
    FileOpenClose(name,'r',Work)
    return local_var3
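
A runnable variant of the first example, offered as a hedged sketch
rather than as the author's code. Note that a `global` statement inside
the nested function binds a module-level name, not the local variable of
the enclosing function, so this sketch uses a one-element list as a
mutable cell instead. The names `synchronised`, `some`, `cell` and the
placeholder arithmetic are assumptions made for illustration.

```python
import threading

def synchronised(lock, func):
    """Run func() while holding lock, releasing it even on error."""
    lock.acquire()
    try:
        func()
    finally:
        lock.release()

lock = threading.Lock()

def some(x, y, z):
    # A one-element list acts as a mutable cell: the closure changes
    # its contents without having to rebind any outer name.
    cell = [z]

    def work():
        # Placeholder computation standing in for Here_I_work(...).
        cell[0] = x + y + cell[0]

    synchronised(lock, work)
    return cell[0]

print(some(1, 2, 3))  # prints 6
```

The closure still has to be defined before it is passed, which is the
first complication listed below.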

It is complicated because:
    1. We must declare the closure (local function) before using it
    2. We must declare, in a "global" statement, the local vars to which we wish to assign
    3. We cannot create a variable local to the outer function from within the closure, so we
       must create it beforehand and declare it in "global"
So one can say: "that is because there is no block lambda". (I may be wrong.)
I think it is possible to introduce a block object by analogy with the lambda object (and function object).
Its difference from a function:
   it has no local variables truly its own; it takes all global (free) and local variables from the
   outer scope. And all local vars introduced in the block are really introduced in the outer scope
   (I think, during the parse stage). So its global and local dicts are really the corresponding
   dicts of the outer scope. (Excuse my English.)
So maybe it would be faster than a function call. I don't know.

The syntax could be the following:

lock=Lock()
def Some():
    local_var1=x
    local_var2=y
    local_var3=Z
    Synhronised(lock,block)
        local_var3=Here_I_work(local_var1,local_var2,local_var3)
    return asd(local_var3)

def Another(name):
    local_var1=x
    local_var2=y
    FileOpenClose(name,'r',block{f})
        local_var3=[[x,y(i)] for i in f]
    return local_var3


And here is a sample of returning a value:

def Foo(context,proc):
    context.enter()
    try:
        res=proc(context.x,context.y)*2
    except Exception,Err:
        context.throw(Err)
    finally:
        context.exit()
    return res
...
context=MyContext()
...
def Bar():
    result=Foo(context,block{x,y})
        continue x+y
    return result
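
For comparison, a hedged sketch of the same two helpers written with the
`with` statement of PEP 343 and `contextlib.contextmanager`. This is a
swapped-in technique, not the block syntax proposed in this message, and
the names `synchronised`, `file_open_close`, `some` and `another` are
assumptions made for illustration.

```python
import contextlib
import threading

@contextlib.contextmanager
def synchronised(lock):
    """Hold lock for the duration of the with-block."""
    lock.acquire()
    try:
        yield
    finally:
        lock.release()

@contextlib.contextmanager
def file_open_close(name, mode):
    """Open a file and close it when the with-block ends."""
    f = open(name, mode)
    try:
        yield f
    finally:
        f.close()

lock = threading.Lock()

def some(x, y, z):
    with synchronised(lock):
        # Assignments here bind names in some()'s own scope.
        result = x + y + z
    return result

def another(name):
    with file_open_close(name, "r") as f:
        lines = [line.rstrip("\n") for line in f]
    return lines

print(some(1, 2, 3))  # prints 6
```

The body of a with-block runs in the enclosing function's scope, so no
closure has to be declared and no outer variable has to be named in
advance.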

The idea was taken from Ruby. But I think a good idea is good wherever it comes from.
It may not be Pythonic.


From aahz at pythoncraft.com  Mon Aug  8 22:09:28 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 8 Aug 2005 13:09:28 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808185100.GJ16963@ActiveState.com>
References: <200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com>
Message-ID: <20050808200928.GA24381@panix.com>

On Mon, Aug 08, 2005, Trent Mick wrote:
>Martin:
>> 
>> No. The PEP is only about Subversion. Why should we be looking at Per
>> Force? Only because Python is Open Source?
> 
> Perforce offers free licensing to open source projects.

So did BitKeeper.  Linux got bitten by that.  We'd need a strong
incentive to consider Perforce over Subversion just because of that
issue.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From martin at v.loewis.de  Mon Aug  8 23:37:48 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 23:37:48 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808185100.GJ16963@ActiveState.com>
References: <1f7befae050728172161d4a9e8@mail.gmail.com>
	<200507281956.03788.jeff@taupro.com>
	<1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com>
Message-ID: <42F7D0AC.5020003@v.loewis.de>

Trent Mick wrote:
>>No. The PEP is only about Subversion. Why should we be looking at Per
>>Force? Only because Python is Open Source?
> 
> 
> Perforce offers free licensing to open source projects.

Ok, so I now got "it's mature", and "it would be without charges".
Given that it is now running against Subversion, I would still be
interested in the advantages that it offers compared to svn.

The biggest disadvantage, to me, is that few people know how
to use it (myself included). I don't trust tools I've never used.

So for me, as the author of this PEP, usage of the revision control
software is non-negotiable (selection of hoster, to a limited degree,
is). If you want to see Perforce used for the Python development,
you should write a counter-PEP, so we could let the BDFL decide.
[This is a theoretical "you" here, since you then explain that
you would still prefer to use subversion]

Regards,
Martin

From martin at v.loewis.de  Mon Aug  8 23:42:01 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 Aug 2005 23:42:01 +0200
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <5.1.1.6.0.20050808094640.025b8d98@mail.telecommunity.com>
References: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>
	<42F60E35.9080809@egenix.com>
	<20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<42F60E35.9080809@egenix.com>
	<5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>
	<5.1.1.6.0.20050808094640.025b8d98@mail.telecommunity.com>
Message-ID: <42F7D1A9.4000909@v.loewis.de>

Phillip J. Eby wrote:
> Actually, thinking about it some more, it seems to me it's actually more
> like this:
> 
>    sock.send( ("%d:%s," %
> (len(data),data.decode('latin1'))).encode('latin1') )

While this would work, it would still feel wrong: the binary data
are *not* latin1 (most likely), so declaring them to be latin1 would
be confusing. Perhaps a synonym '8bit' for latin1 could be introduced.
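
A minimal sketch of why the latin1 round trip works, assuming Python 3
`bytes`/`str` semantics; the helper name `netstring` is an assumption
and does not appear in the thread:

```python
def netstring(data: bytes) -> bytes:
    """Frame data as b"<length>:<data>," by going through text and back."""
    # latin-1 maps byte values 0..255 one-to-one onto code points 0..255,
    # so decode followed by encode returns the original bytes unchanged.
    text = "%d:%s," % (len(data), data.decode("latin-1"))
    return text.encode("latin-1")

payload = bytes(range(256))
assert payload.decode("latin-1").encode("latin-1") == payload

print(netstring(b"\xff\x00ab"))  # prints b'4:\xff\x00ab,'
```

The round trip is lossless for any byte string, which is what makes it
work; the label "latin1" still says nothing about what the bytes
actually mean.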

Regards,
Martin


From trentm at ActiveState.com  Tue Aug  9 00:49:12 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 8 Aug 2005 15:49:12 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F7D0AC.5020003@v.loewis.de>
References: <1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com>
	<42F7D0AC.5020003@v.loewis.de>
Message-ID: <20050808224912.GA11584@ActiveState.com>

One feature I like in Perforce (which Subversion doesn't have) is the
ability to have pending changesets. A changeset is, as with subversion,
something you check-in atomically. Pending changesets in Perforce allow
you to (1) group related files in a source tree where you might be
working on multiple things at once to ensure and (2) to build a change
description as you go. In a large source tree this can be useful for
separating chunks of work.

There are other little things, like Subversion not letting you trim the
check-in filelist when editing the check-in message.  For example, say
you have 10 files checked out scattered around the Python source tree
and you want to check 9 of those in. Currently with svn you have to
manually specify those 9 to be sure to not include the remaining one.
With p4 you just say to check in the whole tree and then remove that one
from the list given to you in your editor when entering the check-in
message. Not that big of a deal.

[Martin v. Löwis on Perforce]
> The biggest disadvantage, to me, is that few people know how
> to use it (myself included). 

Granted. For that reason and for a couple of others I mentioned (SVN
will probably work better for offline and distributed developers) I
think Subversion wins over Perforce. That is presuming, of course, that
we find Subversion to be acceptably stable/robust/manageable.


Trent

-- 
Trent Mick
TrentM at ActiveState.com

From nas at arctrix.com  Tue Aug  9 00:51:52 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 8 Aug 2005 16:51:52 -0600
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc20508061856c0cce4f@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
Message-ID: <20050808225151.GA19090@mems-exchange.org>

On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote:
> My first response to the PEP, however, is that instead of a new
> built-in function, I'd rather relax the requirement that str() return
> an 8-bit string -- after all, int() is allowed to return a long, so
> why couldn't str() be allowed to return a Unicode string?

I've played with this idea a bit and it seems viable.  I modified my
original patch to have string_new call PyObject_Text instead of
PyObject_Str.  That change breaks only two tests, both in
test_email.  The tracebacks are attached.  Both problems seem
relatively shallow.  Do you think such a change could go into 2.5?

  Neil


Traceback (most recent call last):
  File "/home/nas/Python/py_cvs/Lib/email/test/test_email.py", line 2844, in test_encoded_adjacent_nonencoded
    h = make_header(decode_header(s))
  File "/home/nas/Python/py_cvs/Lib/email/Header.py", line 123, in make_header
    charset = Charset(charset)
  File "/home/nas/Python/py_cvs/Lib/email/Charset.py", line 190, in __init__
    input_charset = unicode(input_charset, 'ascii').lower()
TypeError: decoding Unicode is not supported

Traceback (most recent call last):
  File "/home/nas/Python/py_cvs/Lib/email/test/test_email.py", line 2750, in test_multilingual
    eq(decode_header(enc),
  File "/home/nas/Python/py_cvs/Lib/email/Header.py", line 85, in decode_header
    dec = email.quopriMIME.header_decode(encoded)
  File "/home/nas/Python/py_cvs/Lib/email/quopriMIME.py", line 319, in header_decode
    return re.sub(r'=\w{2}', _unquote_match, s)
  File "/home/nas/Python/py_cvs/Lib/sre.py", line 142, in sub
    return _compile(pattern, 0).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 0: ordinal not in range(128)
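
The first failure comes from decoding a value that is already text. A
hedged sketch of the kind of guard that avoids it, assuming Python 3
(where `str` plays the role of `unicode`); the helper `to_text` is
hypothetical and is not how email.Charset was actually changed:

```python
def to_text(value, encoding="ascii"):
    """Return value as text, decoding only when it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Python 3 analogue of unicode(u"...", "ascii"): decoding text fails.
try:
    str("us-ascii", "ascii")
except TypeError as exc:
    print("TypeError:", exc)

print(to_text(b"US-ASCII").lower(), to_text("US-ASCII").lower())
```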

From tim.peters at gmail.com  Tue Aug  9 01:29:07 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 8 Aug 2005 19:29:07 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808224912.GA11584@ActiveState.com>
References: <1f7befae05072819142c36e610@mail.gmail.com>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com>
	<42F7D0AC.5020003@v.loewis.de>
	<20050808224912.GA11584@ActiveState.com>
Message-ID: <1f7befae05080816294bbc1100@mail.gmail.com>

[Trent Mick]
> ...
> There are other little things, like not being able to trim the check-in
> filelist when editing the check-in message.  For example, say you have
> 10 files checked out scattered around the Python source tree and you
> want to check 9 of those in.

This seems dubious, since you're not checking in the state you
actually have locally, and you were careful to run the full Python
test suite with your full local state ;-)

> Currently with svn you have to manually specify those 9 to be sure to not
> include the remaining one. With p4 you just say to check-in the whole tree
> and then remove that one from the list given to you in your editor when
> entering the check-in message. Not that big of a deal.

As a purely theoretical exercise <wink>, the last time I faced that
under SVN, I opened the single file I didn't want to check-in in my
editor, did "svn revert" on it from the cmdline, checked in the whole
tree, and then hit the editor's "save" button.  This doesn't scale
well to skipping 25 of 50, but it's effective enough for 1 or 2.

From pinard at iro.umontreal.ca  Tue Aug  9 01:41:58 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Mon, 8 Aug 2005 19:41:58 -0400
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <5.1.1.6.0.20050808122851.025d3e90@mail.telecommunity.com>
References: <20050808154157.GA28005@panix.com>
	<20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<20050808034756.GA16756@mems-exchange.org>
	<20050808154157.GA28005@panix.com>
	<5.1.1.6.0.20050808122851.025d3e90@mail.telecommunity.com>
Message-ID: <20050808234158.GA19081@alcyon.progiciels-bpi.ca>

[Phillip J. Eby]

> At 09:14 AM 8/8/2005 -0700, Guido van Rossum wrote:

> > I'm not going to change my mind on text() unless someone explains
> > what's so attractive about it.

> 2. It's more obvious to programmers that it's a *text* string rather
> than a string of bytes

I've no opinion on the proposal in itself, except maybe that "text",
that precise word or name, is a pretty bad choice.  It is far too likely
that people already use or want to use that precise identifier.

There once was a suggestion for naming "text" the module now known
as "textwrap", under the premise that it could be later extended for
holding many other various text-related functions.  Happily enough, this
idea was not retained. "textwrap" is much more reasonable as a name.

I found Python 1.5.2's "string" to be especially prone to clashing.  I
still find "socket" obtrusive in that respect.  Consider "len" as an
example of a clever choice, while "length" would not have been. "str" is
also a good choice. "object" is a bit more annoying theoretically, yet
we almost never need it in practice. "type" is annoying as a name (yet
very nice as a concept): if it were free to use, it would often serve
to label our own things.  The fact is we often need the built-in.

Python should not choose common English words for its built-ins without
very careful thought, and should resist any compulsion in this area.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From abo at minkirri.apana.org.au  Tue Aug  9 02:32:16 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 08 Aug 2005 17:32:16 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808224912.GA11584@ActiveState.com>
References: <1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de>
	<20050808224912.GA11584@ActiveState.com>
Message-ID: <1123547536.3700.109.camel@warna.corp.google.com>

On Mon, 2005-08-08 at 15:49, Trent Mick wrote:
> One feature I like in Perforce (which Subversion doesn't have) is the
> ability to have pending changesets. A changeset is, as with subversion,
> something you check-in atomically. Pending changesets in Perforce allow
> you to (1) group related files in a source tree where you might be
> working on multiple things at once to ensure and (2) to build a change
> description as you go. In a large source tree this can be useful for
> separating chunks of work.

This seems like a poor workaround for crappy branch/merge support. 

I'm new to perforce, but the pending changesets seem dodgy to me... you
are accumulating changes gradually without recording any history during
the process... ie, no checkins until the end.

Even worse, perforce seems to treat clients like "unversioned branches",
allowing you to review and test pending changesets in other clients.
This supposedly allows people to review/test each other's changes before
they are committed. The problem is, since these changes are not
committed, there is no firm history of what was reviewed/tested vs
what gets committed... ie they could be different.

Having multiple different pending changesets in one large source tree
also feels like a workaround for high client overheads. Trying to
develop and test a mixture of different changes in one source tree is
asking for trouble... they can interact.

Maybe I just haven't grokked perforce yet... which might be considered a
black mark against its learning curve :-)

For me, the logical way to group a collection of changes is in a branch.
This allows you to commit and track history of the collection of
changes. You check out each branch into different directories and
develop/test them independently. The branch can then be reviewed and
merged when it is complete.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From trentm at ActiveState.com  Tue Aug  9 02:33:45 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 8 Aug 2005 17:33:45 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1f7befae05080816294bbc1100@mail.gmail.com>
References: <1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com>
	<42F7D0AC.5020003@v.loewis.de>
	<20050808224912.GA11584@ActiveState.com>
	<1f7befae05080816294bbc1100@mail.gmail.com>
Message-ID: <20050809003345.GB23158@ActiveState.com>

[Tim Peters wrote]
> [Trent Mick]
> > ...
> > There are other little things, like not being able to trim the check-in
> > filelist when editing the check-in message.  For example, say you have
> > 10 files checked out scattered around the Python source tree and you
> > want to check 9 of those in.
> 
> This seems dubious, since you're not checking in the state you
> actually have locally,

Say that 10th file is a documentation fix for a module unrelated to the
other 9 files.

> and you were careful to run the full Python
> test suite with your full local state ;-)

Absolutely. Er. Always. :)

Trent

-- 
Trent Mick
TrentM at ActiveState.com

From trentm at ActiveState.com  Tue Aug  9 02:51:23 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 8 Aug 2005 17:51:23 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1123547536.3700.109.camel@warna.corp.google.com>
References: <1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com>
	<42F7D0AC.5020003@v.loewis.de>
	<20050808224912.GA11584@ActiveState.com>
	<1123547536.3700.109.camel@warna.corp.google.com>
Message-ID: <20050809005123.GC23158@ActiveState.com>


Who made me the Perforce-bitch? Here I am screaming "Subversion!
Subversion!" and y'all think I'm just using that as cover for a p4 love
affair. :)

[Donovan Baarda wrote]
> On Mon, 2005-08-08 at 15:49, Trent Mick wrote:
> > One feature I like in Perforce (which Subversion doesn't have) is the
> > ability to have pending changesets. A changeset is, as with subversion,
> > something you check-in atomically. Pending changesets in Perforce allow
> > you to (1) group related files in a source tree where you might be
> > working on multiple things at once to ensure and (2) to build a change
> > description as you go. In a large source tree this can be useful for
> > separating chunks of work.
> 
> This seems like a poor workaround for crappy branch/merge support. 

More like a pretty nice independent self-organizing feature that was
necessitated as a workaround for a crappy solution (clientspecs) for
huge data trees.

> I'm new to perforce, but the pending changesets seem dodgy to me... you
> are accumulating changes gradually without recording any history during
> the process... ie, no checkins until the end.

You want to do checkins of code in a consistent state. Some large changes
take a couple of days to write, during which one may have to do a couple
minor things in unrelated sections of a project. Having some mechanism
to capture some thoughts and be able to say "these are the relevant
source files for this work" is handy. Creating a branch for something
that takes a couple of days is overkill.

Perforce branching is pretty good in my experience. For very long
projects one can easily create a branch.


> Even worse, perforce seems to treat clients like "unversioned branches",
> allowing you to review and test pending changesets in other clients.

I'm not sure what you are talking about here. Yes, client information is
stored on the server, but the *changes* (i.e. the diffs) on the client
aren't so you must be talking about some other tool.

Actually, if there *were* such a feature that would be quite handy. I'd
love to be able to easily transfer my diffs developed on my Windows box
to my Linux or Mac OS X box to quickly test changes there before
checking in.

> This supposedly allows people to review/test each others changes before
> they are committed. The problem is, since these changes are not
> committed, there is no firm history of what was reviewed/tested vs
> what gets committed... ie they could be different.

The alternative being either that you have separate branches for
everything (can be a pain) or just check-in for review (possibly
breaking the build or functionality for other developers until the
review is done). Actually the Perl guys working on PureMessage
downstairs have two branches going in Perforce: one for checking into
right away and then a cleaner tree to which only reviewed check-ins from
the first are integrated.

I'm not saying I am awash in pending changelists here. Nor that they
should be used for what is better handled with branching.  It is a tool
(and a minor one).

> Trying to develop and test a mixture of different changes in one
> source tree is asking for trouble... they can interact.

...within reason.

Trent

-- 
Trent Mick
TrentM at ActiveState.com

From foom at fuhm.net  Tue Aug  9 02:56:51 2005
From: foom at fuhm.net (James Y Knight)
Date: Mon, 8 Aug 2005 20:56:51 -0400
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <ca471dc2050808091443147b6e@mail.gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<20050808034756.GA16756@mems-exchange.org>
	<20050808154157.GA28005@panix.com>
	<ca471dc2050808091443147b6e@mail.gmail.com>
Message-ID: <2B24D218-9919-4CF7-AEF2-7335B8360878@fuhm.net>

On Aug 8, 2005, at 12:14 PM, Guido van Rossum wrote:
> Ouch. Too much discussion to respond to it all. Please remember that
> in Jython and IronPython, str and unicode are already synonyms. That's
> how Python 3.0 will do it, except unicode will disappear as being
> redundant. I like the bytes/frozenbytes pair idea. Streams could grow
> a getpos()/setpos() API pair that can be used for stateful encodings
> (although it sounds like seek()/tell() would be okay to use in most
> cases as long as you read in units of whole lines). For sockets,
> send()/recv() would deal in bytes, and makefile() would get an
> encoding parameter. I'm not going to change my mind on text() unless
> someone explains what's so attractive about it.

Files no more have an encoding than sockets do. Reading/writing them  
should ideally (by default) result in bytes. codecs.open and  
codecs.StreamReaderWriter provide the character-converting wrapper  
around file-like objects.
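
A minimal sketch of that layering, assuming an in-memory byte stream
(io.BytesIO) stands in for a file or socket; the sample text is only an
illustration. The raw stream hands back bytes, and a codecs stream
reader wrapped around it does the character conversion:

```python
import codecs
import io

# A byte stream: this is all a file or socket really gives you.
raw = io.BytesIO("caf\u00e9\n".encode("utf-8"))

data = raw.read()            # bytes, no encoding involved
raw.seek(0)

# The character-converting wrapper around the file-like object.
reader = codecs.getreader("utf-8")(raw)
text = reader.read()         # decoded characters

print(type(data).__name__, len(data))
print(type(text).__name__, len(text))
```

The byte count and the character count differ because the accented
character takes two bytes in UTF-8, which is the distinction between a
byte stream and a character stream.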

I agree that getpos/setpos may be a useful addition to the API, but  
only because it would allow StreamReaderWriter to override it to do  
something useful. For normal files it could simply be an alias for  
tell/seek. Of course, someone would have to actually implement the  
ability to save and restore state for every codec...

Hum, actually, it somewhat makes sense for the "open" builtin to  
become what is now "codecs.open", for convenience's sake, although it  
does blur the distinction between a byte stream and a character  
stream somewhat. If that happens, I suppose it does actually make  
sense to give "makefile" the same signature.

James

From tim.peters at gmail.com  Tue Aug  9 03:12:49 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 8 Aug 2005 21:12:49 -0400
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <20050808083106.GA15924@code1.codespeak.net>
References: <20050808083106.GA15924@code1.codespeak.net>
Message-ID: <1f7befae05080818127ad30e63@mail.gmail.com>

[Armin Rigo]
> There are various proposals to add an attribute on exception instances
> to store the traceback (see PEP 344).  A detail not discussed, which I
> thought of historical interest only, is that today's exceptions try very
> hard to avoid reference cycles, in particular the cycle
>
>   'frame -> local variable -> traceback object -> frame'
>
> which was important for pre-GC versions of Python.  A clause 'except
> Exception, e' would not create a local reference to the traceback, only
> to the exception instance.  If the latter grows a __traceback__
> attribute, it is no longer true, and every such except clause typically
> creates a cycle.
>
> Of course, we don't care, we have a GC -- do we?  Well, there are cases
> where we do: see the attached program...  In my opinion it should be
> considered a bug of today's Python that this program leaks memory very
> fast and takes longer and longer to run each loop (each loop takes half
> a second longer than the previous one!).  (I don't know how this bug
> could be fixed, though.)
>
> Spoiling the fun of figuring out what is going on, the reason is that
> 'e_tb' creates a reference cycle involving the frame of __del__, which
> keeps a reference to 'self' alive.  Python thinks 'self' was
> resurrected.  The next time the GC runs, the cycle disappears, and the
> refcount of 'self' drops to zero again, calling __del__ again -- which
> gets resurrected again by a new cycle.  Etc...  Note that no cycle
> actually contains 'self'; they just point to 'self'.  In summary, no X
> instance gets ever freed, but they all have their destructors called
> over and over again.
>
> Attaching a __traceback__ will only make this "bug" show up more often,
> as the 'except Exception, e' line in a __del__() method would be enough
> to trigger it.
>
> Not sure what to do about it.  I just thought I should share these
> thoughts (I stumbled over almost this problem in PyPy).

I can't think of a Python feature with a higher aggregate
braincell_burned / benefit ratio than __del__ methods.  If P3K retains
them-- or maybe even before --we should consider taking "the Java
dodge" on this one.  That is, decree that henceforth a __del__ method
will get invoked by magic at most once on any given object O, no
matter how often O is resurrected.

It's been mentioned before, but it's at least theoretically
backward-incompatible, so "it's scary".  I can guarantee I don't have
any code that would care, including all the ZODB code I watch over
these days.  For ZODB it's especially easy to be sure of this:  the
only __del__ method in the whole thing appears in the test suite,
verifying that ZODB's object cache no longer gets into an infinite
loop when a user-defined persistent object has a brain-dead __del__
method that reloads self from the database.  (Interestingly enough, if
Python guaranteed to call __del__ at most once, the infinite loop in
ZODB's object cache never would have appeared in this case.)

From abo at minkirri.apana.org.au  Tue Aug  9 03:13:10 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 08 Aug 2005 18:13:10 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050809005123.GC23158@ActiveState.com>
References: <1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de>
	<20050808224912.GA11584@ActiveState.com>
	<1123547536.3700.109.camel@warna.corp.google.com>
	<20050809005123.GC23158@ActiveState.com>
Message-ID: <1123549990.3695.119.camel@warna.corp.google.com>

On Mon, 2005-08-08 at 17:51, Trent Mick wrote:
[...]
> [Donovan Baarda wrote]
> > On Mon, 2005-08-08 at 15:49, Trent Mick wrote:
[...]
> You want to do checkins of code in a consistent state. Some large changes
> take a couple of days to write. During which one may have to do a couple
> minor things in unrelated sections of a project. Having some mechanism
> to capture some thoughts and be able to say "these are the relevant

I don't need to checkin in a consistent state if I'm working on a
separate branch. I can checkin any time I want to record a development
checkpoint... I can capture the thoughts in the version history of the
branch.

> source files for this work" is handy. Creating a branch for something
> that takes a couple of days is overkill.
[...]
> The alternative being either that you have separate branches for
> everything (can be a pain) or just check-in for review (possibly

It all comes down to how painless branch/merge is. Many esoteric
"features" of version control systems feel like they are there to
work around the absence of proper branch/merge histories.

Note: SVN doesn't have branch/merge histories either.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From gvanrossum at gmail.com  Tue Aug  9 03:18:06 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon, 8 Aug 2005 18:18:06 -0700
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <1f7befae05080818127ad30e63@mail.gmail.com>
References: <20050808083106.GA15924@code1.codespeak.net>
	<1f7befae05080818127ad30e63@mail.gmail.com>
Message-ID: <ca471dc205080818185eea9cd1@mail.gmail.com>

On 8/8/05, Tim Peters <tim.peters at gmail.com> wrote:
> I can't think of a Python feature with a higher aggregate
> braincell_burned / benefit ratio than __del__ methods.  If P3K retains
> them-- or maybe even before --we should consider taking "the Java
> dodge" on this one.  That is, decree that henceforth a __del__ method
> will get invoked by magic at most once on any given object O, no
> matter how often O is resurrected.

I'm sympathetic to this one. Care to write a PEP? It could be really
short and sweet as long as it provides enough information to implement
the feature.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ilya at bluefir.net  Tue Aug  9 03:15:26 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Mon, 8 Aug 2005 18:15:26 -0700 (PDT)
Subject: [Python-Dev] an alternative suggestion,
 Re:  pdb: should next command be extended?
In-Reply-To: <Pine.LNX.4.58.0508071435330.695@bagira>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
	<42F679DC.6030705@v.loewis.de>
	<Pine.LNX.4.58.0508071435330.695@bagira>
Message-ID: <Pine.LNX.4.58.0508081754010.1008@bagira>


> > Should pdb's next command accept an optional numeric argument? It would
> > specify how many actual lines of code (not "line events")
> > should  be skipped in the current frame before stopping,

> That would differ from gdb's "next <n>", which does "next" n times.
> It would be confusing if pdb accepted the same command, but it
> meant something different.

So, would implementing gdb's "until" command instead of "next N" be a
better idea? In its simplest form (without arguments) "until" advances to
the next (textually) source line... This would solve the original problem of
 getting over list comprehensions...

There is a bit of a problem with abbreviation of "until": gdb abbreviates
it as "u", while in pdb "u" means "up"... It might be a good idea to have the
same abbreviations.

Ilya


Problem:
  When the code contains list comprehensions (or for that matter any other
looping construct), the only way to get quickly through this code in pdb
is to set a temporary breakpoint on the line after the loop, which is
inconvenient..
There is a SF bug report #1248119 about this behavior.
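
To illustrate the difference between source lines and "line events",
here is a rough sketch that counts the line events a loop generates,
using sys.settrace (the hook pdb is built on). The helper name
count_line_events and the sample function are made up for the
illustration:

```python
import sys


def count_line_events(func):
    """Run func() and count 'line' trace events per line of its body.

    Keys are line offsets relative to the 'def' line of func.
    """
    code = func.__code__
    counts = {}

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None          # do not trace other frames
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            counts[offset] = counts.get(offset, 0) + 1
        return tracer            # keep tracing this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)   # restore whatever tracer was active
    return counts


def sample():
    total = 0
    for i in range(3):
        total += i
    return total


events = count_line_events(sample)
print(events)
```

The loop header and the loop body each produce more than one line
event, so a "next" that stops on every line event keeps landing on the
same two source lines until the loop finishes.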


On Sun, 7 Aug 2005, Ilya Sandler wrote:

>
>
> On Sun, 7 Aug 2005, "Martin v. Löwis" wrote:
>
> > Ilya Sandler wrote:
> > > Should pdb's next command accept an optional numeric argument? It would
> > > specify how many actual lines of code (not "line events")
> > > should  be skipped in the current frame before stopping,
> > [...]
> > > What do you think?
> >
> > That would differ from gdb's "next <n>", which does "next" n times.
> > It would be confusing if pdb accepted the same command, but it
> > meant something different.
>
> But as far as I can tell, pdb's next is
> already different from gdb's next! gdb's next seems to always go to the
> different source line, while pdb's next may stay on the current line.
>
> The problem with "next <n>" meaning "repeat next n times" is that it
> seems to be less useful than the original suggestion.
>
> Any alternative suggestions to allow to step over list comprehensions and
> such? (SF 1248119)
>
> > Plus, there is always a chance that
> > <current line>+n is never reached, which would also be confusing.
>
> That should not be a problem, returning from the current frame should be
> treated as a stopping condition (similarly to the current "next"
> behaviour)...
>
> Ilya
>
>
>
> > So I'm -1 here.
> >
> > Regards,
> > Martin
> >

From pje at telecommunity.com  Tue Aug  9 03:45:27 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 08 Aug 2005 21:45:27 -0400
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <1f7befae05080818127ad30e63@mail.gmail.com>
References: <20050808083106.GA15924@code1.codespeak.net>
	<20050808083106.GA15924@code1.codespeak.net>
Message-ID: <5.1.1.6.0.20050808214339.02611248@mail.telecommunity.com>

At 09:12 PM 8/8/2005 -0400, Tim Peters wrote:
>I can't think of a Python feature with a higher aggregate
>braincell_burned / benefit ratio than __del__ methods.  If P3K retains
>them-- or maybe even before --we should consider taking "the Java
>dodge" on this one.  That is, decree that henceforth a __del__ method
>will get invoked by magic at most once on any given object O, no
>matter how often O is resurrected.

How does that help?  Doesn't it mean that we'll have to have some way of 
keeping track of which items' __del__ methods were called?


From bcannon at gmail.com  Tue Aug  9 04:02:40 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 8 Aug 2005 19:02:40 -0700
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <1f7befae05080818127ad30e63@mail.gmail.com>
References: <20050808083106.GA15924@code1.codespeak.net>
	<1f7befae05080818127ad30e63@mail.gmail.com>
Message-ID: <bbaeab1005080819022647d9c8@mail.gmail.com>

On 8/8/05, Tim Peters <tim.peters at gmail.com> wrote:

> I can't think of a Python feature with a higher aggregate
> braincell_burned / benefit ratio than __del__ methods.  If P3K retains
> them-- or maybe even before --we should consider taking "the Java
> dodge" on this one.  That is, decree that henceforth a __del__ method
> will get invoked by magic at most once on any given object O, no
> matter how often O is resurrected.
> 

Wasn't there talk of getting rid of __del__ a little while ago and
instead use weakrefs to functions to handle cleaning up?  Is that
still feasible?  And if so, would this alleviate the problem?

-Brett

From tim.peters at gmail.com  Tue Aug  9 04:37:56 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 8 Aug 2005 22:37:56 -0400
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <5.1.1.6.0.20050808214339.02611248@mail.telecommunity.com>
References: <20050808083106.GA15924@code1.codespeak.net>
	<1f7befae05080818127ad30e63@mail.gmail.com>
	<5.1.1.6.0.20050808214339.02611248@mail.telecommunity.com>
Message-ID: <1f7befae05080819373910d1c2@mail.gmail.com>

[Tim Peters]
>> If P3K retains them [__del__]-- or maybe even before --we should
>> consider taking "the Java dodge" on this one.  That is, decree that
>> henceforth a __del__ method will get invoked by magic at most
>> once on any given object O, no matter how often O is resurrected.

[Phillip J. Eby]
> How does that help?

You have to dig into Armin's example (or read his explanation):  every
time __del__ is called on one of his X objects, it creates a cycle by
binding sys.exc_info()[2] to the local vrbl `e_tb`.  `self` is
reachable from that cycle, so self's refcount does not fall to 0 when
__del__ returns.  The object is resurrected.  When cyclic gc next
runs, it determines that the cycle is trash, and runs around
decref'ing the objects in the cycle.  That eventually makes the
refcount on the X object fall to 0 again too, but then its __del__
method also runs again, and creates an isomorphic cycle, resurrecting
`self` again.  Etc.

Armin didn't point this out explicitly, but it's important to realize
that gc.garbage remains empty the entire time you let his program run.
 The object with the __del__ method isn't _in_ a cycle here, it's
hanging _off_ a cycle, which __del__ keeps recreating.  Cyclic gc
isn't inhibited by a __del__ on an object hanging off a trash cycle
(but not in a trash cycle itself), but in this case it's ineffective
anyway.

If __del__ were invoked only the first time cyclic gc ran, the
original cycle would go away during the next cyclic gc run, and a new
cycle would not take its place.

>  Doesn't it mean that we'll have to have some way of keeping track of
> which items' __del__ methods were called?

Yes, by hook or by crook; and yup too, that may be unattractive.
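
A stripped-down sketch of the mechanism, not Armin's actual program
(that was an attachment and isn't reproduced here).  The class name X
comes from his description; the del_calls counter is added purely for
illustration:

```python
import gc
import sys


class X:
    del_calls = 0            # how many times any __del__ has run

    def __del__(self):
        X.del_calls += 1
        try:
            raise ValueError("boom")
        except ValueError:
            # frame -> local e_tb -> traceback -> frame: a cycle.
            # The frame also holds `self`, so `self` hangs off the
            # cycle and survives the return from __del__.
            e_tb = sys.exc_info()[2]


x = X()
del x                        # refcount hits 0, __del__ runs
gc.collect()                 # cyclic gc tears down the frame cycle
gc.collect()
print(X.del_calls)
```

How often __del__ ends up running depends on the interpreter.  Under
the at-most-once rule the counter stays at 1; with the behaviour
described in this thread it gets bumped again by each collection.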

From tim.peters at gmail.com  Tue Aug  9 05:02:59 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 8 Aug 2005 23:02:59 -0400
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <bbaeab1005080819022647d9c8@mail.gmail.com>
References: <20050808083106.GA15924@code1.codespeak.net>
	<1f7befae05080818127ad30e63@mail.gmail.com>
	<bbaeab1005080819022647d9c8@mail.gmail.com>
Message-ID: <1f7befae050808200272961928@mail.gmail.com>

[Brett Cannon]
> Wasn't there talk of getting rid of __del__ a little while ago and
> instead use weakrefs to functions to handle cleaning up?

There was from me, yes, with an eye toward P3K.

> Is that still feasible?

It never was, really.  The combination of finalizers, cycles and
resurrection is a freakin' mess, "even in theory".  The way things are
right now, Python's weakref gc endcase behavior is even more
mystically implementation-driven than its __del__ gc endcase behavior,
and nobody has had time to try to dream up a cleaner approach.

> And if so, would this alleviate the problem?

Absolutely <wink>.  The underlying reason for optimism is that

    weakrefs in Python are designed to, at worst, let *other* objects
    learn that a given object has died, via a callback function.  The weakly
    referenced object itself is not passed to the callback, and the presumption
    is that the weakly referenced object is unreachable trash at the time the
    callback is invoked.

IOW, resurrection was "obviously" impossible, making endcase life very
much simpler.  That paragraph is from Modules/gc_weakref.txt, and you
can read there all about why optimism hasn't worked yet ;-)
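
A small sketch of that contract, assuming a throwaway class (Resource)
and callback (on_death) invented for the example.  The callback is
handed the weakref, not the dead object, and dereferencing the weakref
inside the callback yields None:

```python
import gc
import weakref


class Resource:
    pass


events = []


def on_death(ref):
    # `ref` is the weakref object itself; the referent is already gone,
    # so there is nothing here that could be resurrected.
    events.append(ref() is None)


obj = Resource()
watcher = weakref.ref(obj, on_death)

del obj          # last strong reference goes away
gc.collect()     # not needed under refcounting, harmless otherwise

print(events, watcher())
```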

From ilya at bluefir.net  Tue Aug  9 05:13:41 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Mon, 8 Aug 2005 20:13:41 -0700 (PDT)
Subject: [Python-Dev] pdb: should next command be extended?
In-Reply-To: <20050808154503.GB28005@panix.com>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
	<20050808154503.GB28005@panix.com>
Message-ID: <Pine.LNX.4.58.0508081926010.2814@bagira>


> At OSCON, Anthony Baxter made the point that pdb is currently one of the
> more unPythonic modules.

What is unpythonic about pdb? Is this part of Anthony's presentation
online? (Google found a summary and slides from presentation but they
don't say anything about pdb's deficiencies)

Ilya

On Mon, 8 Aug 2005, Aahz wrote:

> On Sun, Aug 07, 2005, Ilya Sandler wrote:
> >
> > Solution:
> >
> > Should pdb's next command accept an optional numeric argument? It would
> > specify how many actual lines of code (not "line events")
> > should  be skipped in the current frame before stopping,
>
> At OSCON, Anthony Baxter made the point that pdb is currently one of the
> more unPythonic modules.  If you're feeling a lot of energy about this,
> rewriting pdb might be more productive.
> --
> Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/
>
> The way to build large Python applications is to componentize and
> loosely-couple the hell out of everything.
>

From bcannon at gmail.com  Tue Aug  9 06:32:49 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 8 Aug 2005 21:32:49 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
Message-ID: <bbaeab100508082132252dc7bf@mail.gmail.com>

version 1.7 scales the proposal back once more
(http://www.python.org/peps/pep-0348.html).  At this point the only
changes to the hierarchy are the addition of BaseException and
TerminatingException, and the change of inheritance for
KeyboardInterrupt, SystemExit, and NotImplementedError.  At this point
I don't think MAL or Raymond will have any major complaints.  =)

Assuming no one throws a fit over this version, discussing transition
is the next step.  I think the transition plan is fine, but if anyone
has any specific input that would be great.  I could probably stand to
do a more specific timeline in terms of 2.x, 2.x+1, 3.0-1, etc., but
that will have to wait for another day this week.

And once that is settled I guess it is either time for pronouncement
or it just sits there until Python 3.0 actually starts to come upon
us.

-Brett

From stephen at xemacs.org  Tue Aug  9 07:15:48 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 09 Aug 2005 14:15:48 +0900
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1123549990.3695.119.camel@warna.corp.google.com> (Donovan
	Baarda's message of "Mon, 08 Aug 2005 18:13:10 -0700")
References: <1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com>
	<42F7D0AC.5020003@v.loewis.de>
	<20050808224912.GA11584@ActiveState.com>
	<1123547536.3700.109.camel@warna.corp.google.com>
	<20050809005123.GC23158@ActiveState.com>
	<1123549990.3695.119.camel@warna.corp.google.com>
Message-ID: <87k6iv7ksb.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Donovan" == Donovan Baarda <abo at minkirri.apana.org.au> writes:

    Donovan> It all comes down to how painless branch/merge is. Many
    Donovan> esoteric "features" of version control systems feel like
    Donovan> they are there to work around the absence of proper
    Donovan> branch/merge histories.

It's not that simple.  I've followed both the Arch and the darcs
lists---they handle a lot more branch/merge scenarios than Subversion
does, but you still can't get away with zero discipline.  On the other
hand, for the purpose of the main repository for a well-factored
project with disciplined workflow like Python, it's not obvious to me
that the middle-complexity scenarios are that important.

Furthermore, taking good advantage of the more complex branch/merge
scenarios will require a change to Python workflow (for example, push-
to-tracker will no longer be a convenient way to submit patches for
most developers); that will be costly.  More important, since none of
the core Python people have spoken up strongly in favor of an advanced
system, I would guess there's little experience to support planning a
new workflow.  (Cf. the Linux case, where Linus opted to roll his own.)

I know many people in the Emacs communities who are successfully using
CVS for the main repositories and various advanced systems (prcs,
darcs, arch at least) for local branching and small group project
communication.  It seems fairly easy to automate that (much easier
than extracting changeset information from CVS!)  I think that as
developers find they have need for such capabilities, the practice
will grow in Python too, and then there may be a case to be built for
moving the main repository to such a system.


-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From stephen at xemacs.org  Tue Aug  9 07:28:08 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 09 Aug 2005 14:28:08 +0900
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F7D1A9.4000909@v.loewis.de> (Martin v.
	=?iso-8859-1?q?L=F6wis's?= message of "Mon, 08 Aug 2005 23:42:01 +0200")
References: <5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>
	<42F60E35.9080809@egenix.com>
	<20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<42F60E35.9080809@egenix.com>
	<5.1.1.6.0.20050807234701.025b5490@mail.telecommunity.com>
	<5.1.1.6.0.20050808094640.025b8d98@mail.telecommunity.com>
	<42F7D1A9.4000909@v.loewis.de>
Message-ID: <87fytj7k7r.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Martin" == Martin v Löwis <martin at v.loewis.de> writes:

    Martin> While this would work, it would still feel wrong: the
    Martin> binary data are *not* latin1 (most likely), so declaring
    Martin> them to be latin1 would be confusing. Perhaps a synonym
    Martin> '8bit' for latin1 could be introduced.

Be careful.  This alias has caused Emacs some amount of pain, as
binary data escapes into contexts (such as Universal Newline
processing) where it gets interpreted as character data.  We've also
had some problems in codec implementation, because latin1 and (eg)
latin9 have some differences in semantics other than changing the
coded character set for the GR register---controls are treated
differently, for example, because they _are_ binary (alias latin1)
octets, but not in the range of the latin9 code.

I won't go so far as to say it won't work, but it will require careful
design.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From python at rcn.com  Tue Aug  9 07:43:32 2005
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 9 Aug 2005 01:43:32 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <bbaeab100508082132252dc7bf@mail.gmail.com>
Message-ID: <000a01c59ca5$47627960$803dc797@oemcomputer>

[Brett Cannon]  
> At this point the only
> changes to the hierarchy are the addition of BaseException and
> TerminatingException, and the change of inheritance for
> KeyboardInterrupt, SystemExit, and NotImplementedError.  

TerminatingException
--------------------

The rationale for adding TerminatingException needs to be developed or
reconsidered.  AFAICT, there hasn't been an exploration of existing code
bases to determine that there is going to be even minimal use of "except
TerminatingException".

Are KeyboardInterrupt and SystemExit often caught together on the same
line and handled in the same way?  

If so, isn't "except TerminatingException" less explicit, clear, and
flexible than "except (KeyboardInterrupt, SystemExit)"?  Do we need a
second way to do it?

Doesn't the new meaning of Exception already offer a better idiom:

   try:
      suite()
   except Exception:
      log_or_recover()
   except:
      handle_terminating_exceptions()
   else:

Are there any benefits sufficient to warrant yet another new built-in?
Does it also warrant violating FIBTN by introducing more structure?
While I'm clear on why KeyboardInterrupt and SystemExit were moved from
under Exception, it is not at all clear what problem is being solved by
adding a new intermediate grouping.

The PEP needs to address all of the above.  Right now, it contains a
definition rather than justification, research, and analysis.


WindowsError
------------

This should be kept.  Unlike module specific exceptions, this exception
occurs in multiple places and diverse applications.  It is appropriate
to list as a builtin.

"Too O/S specific" is not a reason for eliminating this.  Looking at the
codebase there does not appear to be a good substitute.  Eliminating
this one would break code, decrease clarity, and cause modules to grow
competing variants.

After the change, nothing would be better and many things would be
worse.


NotImplementedError
-------------------
Moving this is fine.  Removing unnecessary nesting is a step forward.
The PEP should list that as a justification.


Bare excepts defaulting to Exception
------------------------------------

After further thought, I'm not as sure about this one and whether it is
workable.  The code fragment above highlights the issue.  In a series of
except clauses, each line only matches what was not caught by a previous
clause.  This is a useful and basic part of the syntax.  It leaves a
bare except to have the role of a final catchall (much like a default in
C's switch-case).  If one line uses "except Exception", then a
subsequent bare except should probably catch KeyboardInterrupt and
SystemExit.  Otherwise, there is a risk of creating optical illusion
errors (code that looks like it should work but is actually broken).
I'm not certain on this one, but the PEP does need to fully explore the
implications and think out the consequent usability issues.
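
A small sketch of the ordering rule, assuming the hierarchy the PEP
proposes (KeyboardInterrupt and SystemExit outside Exception) and a bare
except that keeps its catch-all meaning.  The helper name classify is
made up for the illustration:

```python
def classify(exc):
    """Raise exc and report which except clause handled it."""
    try:
        raise exc
    except Exception:
        # Ordinary errors are matched here first.
        return "ordinary"
    except:
        # Only what the clause above did not catch arrives here.
        return "terminating"


print(classify(ValueError("bad value")))
print(classify(KeyboardInterrupt()))
print(classify(SystemExit()))
```

If a bare except were instead shorthand for "except Exception", the
second clause could never run, even though the code reads as if it
handles the terminating exceptions.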


> And once that is settled I guess it is either time for pronouncement
> or it just sits there until Python 3.0 actually starts to come upon
> us.

What happened to "don't take this too seriously, I'm just trying to get
the ball rolling"?

This PEP or any Py3.0 PEP needs to sit a good while before
pronouncement.  Because 3.0 is not an active project, the PEP is
unlikely to be a high priority review item by many of Python's best
minds.  It should not be stamped as accepted until they've had a chance
to think it through.  Because 3.0 is still somewhat ethereal, it is not
reasonable to expect them to push aside their other work to look at this
right now.

The PEP needs to be kicked around on the newsgroup (naming and grouping
discussions are easy and everyone will have an opinion).  Also the folks
with PyPy, BitTorrent, Zope, Twisted, IronPython, Jython, and such need
to have a chance to have their say.

Because of Py3.0's low visibility, these PEPs could easily slide through
prematurely.  Were the project imminent, it is likely that this PEP
would have had significantly more discussion.


Try not to get frustrated at these reviews.  Because there was no
research into existing code, working to solve known problems, evaluation
of alternatives, or usability analysis, it is no surprise Sturgeon's Law
would apply.  Since Python has been around so long, it is also no
surprise that what we have now is pretty good and that improvements
won't be trivially easy to come by.


Raymond

From steven.bethard at gmail.com  Tue Aug  9 08:28:08 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Tue, 9 Aug 2005 00:28:08 -0600
Subject: [Python-Dev] Major revision of PEP 348 committed
In-Reply-To: <000401c59b36$01226de0$e410c797@oemcomputer>
References: <bbaeab1005080621266bcc87@mail.gmail.com>
	<000401c59b36$01226de0$e410c797@oemcomputer>
Message-ID: <d11dcfba050808232832d1626e@mail.gmail.com>

Raymond Hettinger wrote:
> If the PEP can't resist the urge to create new intermediate groupings,
> then start by grepping through tons of Python code to find-out which
> exceptions are typically caught on the same line.  That would be a
> worthwhile empirical study and may lead to useful insights.

I was curious, so I did a little grepping (ok, os.walking and
re.findalling) ;-) through the Python source.  The only exceptions
that were caught together more than 5 times were:

AttributeError and TypeError (23 instances) in
code.py
doctest.py
linecache.py
mailbox.py
idlelib/rpc.py
lib-old/newdir.py
lib-tk/Tkinter.py
test/test_descr.py
test/test_file.py
test/test_funcattrs.py
test/test_os.py
Though these occur in a few different contexts, one relatively common
one was when the code tried to set a possibly read-only attribute.

ImportError and AttributeError (9 instances), in
getpass.py
locale.py
pydoc.py
tarfile.py
xmlrpclib.py
lib-tk/tkFileDialog.py
test/test_largefile.py
test/test_tarfile.py
This seemed to be used when an incompatible module might be present. 
(Attributes were tested to make sure the module was the right one.) 
Also used when code tried to use "private" module attributes (e.g.
_getdefaultlocale()).

OverflowError and ValueError (9 instances), in
csv.py
ftplib.py
mhlib.py
mimify.py
warnings.py
test/test_resource.py
These were generally around a call to int(x).  I assume they're
generally unnecessary now that int() silently converts to longs.

IOError and OSError (6 instances), in
pty.py
tempfile.py
whichdb.py
distutils/dir_util.py
idlelib/configHandler.py
test/test_complex.py
These were all around file/directory handling that I didn't study in
too much detail.  With the current hierarchy, there's no reason these
couldn't just be catching EnvironmentError anyway.

As you can see, even for the most common pairs of exceptions, the
total number of times these pairs were caught was pretty small.  Even
ignoring the low counts, we see that the last two pairs of exceptions
aren't really necessary, thanks to int/long unification and the
existence of EnvironmentError, and the former two pairs argue
*against* added nesting as it is unclear whether to group
AttributeError with ImportError or TypeError.  So it doesn't really
look like the stdlib's going to provide much of a case for adding
nesting to the exception hierarchy.
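
To illustrate why the last two pairs look redundant, here is a small
sketch.  It assumes an interpreter with int/long unification and with
EnvironmentError as a common base of IOError and OSError; the helper
names parse_int() and safe_size() are hypothetical and come from none of
the modules listed above.

```python
import os


def parse_int(text):
    """Convert text to an integer, returning None for malformed input."""
    try:
        # A very large value simply becomes a big integer, so only
        # ValueError needs to be caught for string input.
        return int(text)
    except ValueError:
        return None


def safe_size(path):
    """Return the size of path in bytes, or None if it cannot be read."""
    try:
        return os.path.getsize(path)
    except EnvironmentError:
        # One clause covers both the IOError and the OSError case.
        return None
```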

Anyway, I know PEP 348's been scaled back at this point anyway, but I
figured I might as well post my findings in case anyone was curious.
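
For anyone who wants to repeat the survey on their own code, here is a
rough sketch of the os.walk plus re approach.  It is not the script
behind the numbers above: the regex only sees parenthesised tuples
written on a single line, and every function name is invented for the
example.

```python
import os
import re
from collections import Counter
from itertools import combinations

# Matches "except (A, B, ...)" at the start of a line.
EXCEPT_TUPLE = re.compile(r"^\s*except\s*\(([^)]*)\)", re.MULTILINE)


def caught_together(source):
    """Yield a sorted tuple of names for each multi-exception clause."""
    for match in EXCEPT_TUPLE.finditer(source):
        names = [n.strip() for n in match.group(1).split(",") if n.strip()]
        if len(names) > 1:
            yield tuple(sorted(names))


def count_pairs(sources):
    """Count how often each pair of exceptions shares an except clause."""
    counts = Counter()
    for source in sources:
        for names in caught_together(source):
            counts.update(combinations(names, 2))
    return counts


def python_sources(root):
    """Yield the text of every .py file found below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                with open(path, errors="replace") as handle:
                    yield handle.read()
```

Something like count_pairs(python_sources("Lib")).most_common(10) would
then list the most frequent pairs, where "Lib" stands in for whatever
directory is being surveyed.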

STeVe
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From martin at v.loewis.de  Tue Aug  9 08:44:43 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 09 Aug 2005 08:44:43 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <20050808224912.GA11584@ActiveState.com>
References: <1f7befae05072819142c36e610@mail.gmail.com>
	<1122605323.9670.11.camel@geddy.wooz.org>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com>
	<42F7D0AC.5020003@v.loewis.de>
	<20050808224912.GA11584@ActiveState.com>
Message-ID: <42F850DB.8000406@v.loewis.de>

Trent Mick wrote:
> One feature I like in Perforce (which Subversion doesn't have) is the
> ability to have pending changesets.

That sounds useful.

> Currently with svn you have to manually
> specify those 9 to be sure to not include the remaining one. With p4 you
> just say to check-in the whole tree and then remove that one from the
> list given to you in your editor when entering the check-in message. Not
> that big of a deal.

Depends on the client also. With Tortoise SVN, you do get a checkbox
list where you can exclude files from the checkin.

Regards,
Martin

From martin at v.loewis.de  Tue Aug  9 08:52:01 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 09 Aug 2005 08:52:01 +0200
Subject: [Python-Dev] an alternative suggestion,
 Re:  pdb: should next command be extended?
In-Reply-To: <Pine.LNX.4.58.0508081754010.1008@bagira>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
	<42F679DC.6030705@v.loewis.de>
	<Pine.LNX.4.58.0508071435330.695@bagira>
	<Pine.LNX.4.58.0508081754010.1008@bagira>
Message-ID: <42F85291.9070605@v.loewis.de>

Ilya Sandler wrote:
> So, would implementing gdb's "until" command instead of "next N" be a
> better idea? In its simplest form (without arguments) "until" advances to
> the next (textually) source line... This would solve the original problem of
>  getting over list comprehensions...

I like that idea.

> There is a bit of a problem with abbreviation of "until": gdb abbreviates
> it as "u", while in pdb "u" means "up"...It might be a good idea to have the
> same abbreviations

Indeed. I don't know much about pdb internals, but I think "u" should
become unbound, and "up" and "unt" should become the shortest
abbreviations.

Regards,
Martin

From ncoghlan at gmail.com  Tue Aug  9 11:28:01 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 09 Aug 2005 19:28:01 +1000
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <2B24D218-9919-4CF7-AEF2-7335B8360878@fuhm.net>
References: <20050806102342.GA11309@mems-exchange.org>	<dd36mq$pjr$1@sea.gmane.org>	<ca471dc20508061856c0cce4f@mail.gmail.com>	<20050808034756.GA16756@mems-exchange.org>	<20050808154157.GA28005@panix.com>	<ca471dc2050808091443147b6e@mail.gmail.com>
	<2B24D218-9919-4CF7-AEF2-7335B8360878@fuhm.net>
Message-ID: <42F87721.5020603@gmail.com>

James Y Knight wrote:
> Hum, actually, it somewhat makes sense for the "open" builtin to  
> become what is now "codecs.open", for convenience's sake, although it  
> does blur the distinction between a byte stream and a character  
> stream somewhat. If that happens, I suppose it does actually make  
> sense to give "makefile" the same signature.

We could always give the text mode/binary mode distinction in "open" a real 
meaning - text mode deals with character sequences, binary mode deals with 
byte sequences.
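
A sketch of what that distinction could look like in practice.  It
assumes an "open" that accepts an encoding argument in text mode, which
is how the builtin behaves in Python 3 but was only a proposal when this
was written.  The function name roundtrip() is made up.

```python
import os
import tempfile


def roundtrip(text, encoding="utf-8"):
    """Write text in text mode, then read it back in both modes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as handle:
            handle.write(text)          # text mode takes characters
        with open(path, "r", encoding=encoding) as handle:
            as_text = handle.read()     # ... and returns characters
        with open(path, "rb") as handle:
            as_bytes = handle.read()    # binary mode returns raw bytes
    finally:
        os.remove(path)
    return as_text, as_bytes
```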

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From mwh at python.net  Tue Aug  9 11:50:08 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 09 Aug 2005 10:50:08 +0100
Subject: [Python-Dev] Major revision of PEP 348 committed
In-Reply-To: <d11dcfba050808232832d1626e@mail.gmail.com> (Steven Bethard's
	message of "Tue, 9 Aug 2005 00:28:08 -0600")
References: <bbaeab1005080621266bcc87@mail.gmail.com>
	<000401c59b36$01226de0$e410c797@oemcomputer>
	<d11dcfba050808232832d1626e@mail.gmail.com>
Message-ID: <2my87bwib3.fsf@starship.python.net>

Steven Bethard <steven.bethard at gmail.com> writes:

> Raymond Hettinger wrote:
>> If the PEP can't resist the urge to create new intermediate groupings,
>> then start by grepping through tons of Python code to find-out which
>> exceptions are typically caught on the same line.  That would be a
>> worthwhile empirical study and may lead to useful insights.
>
> I was curious, so I did a little grepping (ok, os.walking and
> re.findalling) ;-) through the Python source.  The only exceptions
> that were caught together more than 5 times were:
>
> AttributeError and TypeError (23 instances) in
> code.py
> doctest.py
> linecache.py
> mailbox.py
> idlelib/rpc.py
> lib-old/newdir.py
> lib-tk/Tkinter.py
> test/test_descr.py
> test/test_file.py
> test/test_funcattrs.py
> test/test_os.py
> Though these occur in a few different contexts, one relatively common
> one was when the code tried to set a possibly read-only attribute.

This TypeError/AttributeError one is long known, and a bit of a mess,
really.  Finding an attribute usually fails because the object is not
of the expected type, after all.
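
A small sketch of the read-only attribute case, to show why both names
end up in one clause.  Point and try_set() are hypothetical; the
behaviour shown is what I would expect from CPython, where assigning to
a property without a setter raises AttributeError and assigning to an
attribute of a built-in type raises TypeError.

```python
class Point(object):
    """Toy class with a read-only property."""

    def __init__(self, x):
        self._x = x

    @property
    def x(self):
        return self._x


def try_set(obj, name, value):
    """Try to set an attribute; return the exception name, or None."""
    try:
        setattr(obj, name, value)
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__
    return None
```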

> ImportError and AttributeError (9 instances), in
> getpass.py
> locale.py
> pydoc.py
> tarfile.py
> xmlrpclib.py
> lib-tk/tkFileDialog.py
> test/test_largefile.py
> test/test_tarfile.py
> This seemed to be used when an incompatible module might be present. 
> (Attributes were tested to make sure the module was the right one.) 
> Also used when code tried to use "private" module attributes (e.g.
> _getdefaultlocale()).

This seems like ever-so-faintly lazy programming to me, but maybe
that's overly purist.

> OverflowError and ValueError (9 instances), in
> csv.py
> ftplib.py
> mhlib.py
> mimify.py
> warnings.py
> test/test_resource.py
> These were generally around a call to int(x).  I assume they're
> generally unnecessary now that int() silently converts to longs.

Yes, I think so.

> IOError and OSError (6 instances), in
> pty.py
> tempfile.py
> whichdb.py
> distutils/dir_util.py
> idlelib/configHandler.py
> test/test_complex.py
> These were all around file/directory handling that I didn't study in
> too much detail.  With the current hierarchy, there's no reason these
> couldn't just be catching EnvironmentError anyway.

Heh.  I'd have to admit that I rarely know which of IOError or OSError
I should be expecting in a given situation, and I didn't know that
EnvironmentError is a common base class that I could catch instead...

[...]

> Anyway, I know PEP 348's been scaled back at this point anyway, but I
> figured I might as well post my findings in case anyone was curious.

Was interesting, thanks!

Cheers,
mwh

-- 
  <freeside> On a scale of One to AWESOME, twisted.web is PRETTY
             ABSTRACT!!!!                       -- from Twisted.Quotes

From ncoghlan at gmail.com  Tue Aug  9 12:30:03 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 09 Aug 2005 20:30:03 +1000
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <000a01c59ca5$47627960$803dc797@oemcomputer>
References: <000a01c59ca5$47627960$803dc797@oemcomputer>
Message-ID: <42F885AB.8040104@gmail.com>

Raymond Hettinger wrote:
> TerminatingException
> --------------------
> 
> The rationale for adding TerminatingException needs to be developed or
> reconsidered.  AFAICT, there hasn't been an exploration of existing code
> bases to determine that there is going to be even minimal use of "except
> TerminatingException".
> 
> Are KeyboardInterrupt and SystemExit often caught together on the same
> line and handled in the same way?  

Yes, to avoid the current overbroad inheritance of "except Exception:" by 
intercepting and reraising these two terminating exceptions.
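
For reference, the intercept-and-reraise idiom looks roughly like this.
It is only a sketch: run_guarded() and its log argument are invented
for the example.

```python
def run_guarded(task, log):
    """Run task(), logging ordinary errors but letting exits through."""
    try:
        return task()
    except (KeyboardInterrupt, SystemExit):
        # Re-raise so the interpreter can still shut down.  This clause
        # is only needed while these two inherit from Exception.
        raise
    except Exception as exc:
        log.append(repr(exc))
        return None
```

With TerminatingException the first clause would name one class instead
of a tuple; once the two exceptions are moved out from under Exception
it could be dropped entirely.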

> If so, isn't "except TerminatingException" less explicit, clear, and
> flexible than "except (KeyboardInterrupt, SystemExit)"?

No, TerminatingException makes it explicit to the reader what is going on - 
special handling is being applied to any exceptions that indicate the 
interpreter is expected to exit as a result of the exception. Using "except 
(KeyboardInterrupt, SystemExit):" is less explicit, as it relies on the reader 
knowing that these two exceptions share the common characteristic that they 
are generally meant to terminate the Python interpreter.

> Are there any benefits sufficient to warrant yet another new built-in?
> Does it also warrant violating FIBTN by introducing more structure?
> While I'm clear on why KeyboardInterrupt and SystemExit were moved from
> under Exception, it is not at all clear what problem is being solved by
> adding a new intermediate grouping.

The main benefits of TerminatingException lie in easing the transition to 
Py3k. After transition, "except Exception:" will already do the right thing.
However, TerminatingException will still serve a useful documentational 
purpose, as it sums up in two words the key characteristic that caused 
KeyboardInterrupt and SystemExit to be moved out from underneath Exception.

> Bare excepts defaulting to Exception
> ------------------------------------
> 
> After further thought, I'm not as sure about this one and whether it is
> workable.  The code fragment above highlights the issue.  In a series of
> except clauses, each line only matches what was not caught by a previous
> clause.  This is a useful and basic part of the syntax.  It leaves a
> bare except to have the role of a final catchall (much like a default in
> C's switch-case).  If one line uses "except Exception", then a
> subsequent bare except should probably catch KeyboardInterrupt and
> SystemExit.  Otherwise, there is a risk of creating optical illusion
> errors (code that looks like it should work but is actually broken).
> I'm not certain on this one, but the PEP does need to fully explore the
> implications and think-out the consequent usability issues. 

I'm also concerned about this one. IMO, bare excepts in Python 3k should 
either not be allowed at all (use "except BaseException:" instead), or they 
should be synonyms for "except BaseException:".

Having a bare except that doesn't actually catch everything just seems wrong - 
and we already have style guides that say "except Exception:" is to be 
generally preferred over a bare except. Consenting adults and all that. . .

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From pedronis at strakt.com  Tue Aug  9 13:37:17 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Tue, 09 Aug 2005 13:37:17 +0200
Subject: [Python-Dev] __traceback__ and reference cycles
In-Reply-To: <1f7befae05080818127ad30e63@mail.gmail.com>
References: <20050808083106.GA15924@code1.codespeak.net>
	<1f7befae05080818127ad30e63@mail.gmail.com>
Message-ID: <42F8956D.9060702@strakt.com>

Tim Peters wrote:
> 
> I can't think of a Python feature with a higher aggregate
> braincell_burned / benefit ratio than __del__ methods.  If P3K retains
> them-- or maybe even before --we should consider taking "the Java
> dodge" on this one.  That is, decree that henceforth a __del__ method
> will get invoked by magic at most once on any given object O, no
> matter how often O is resurrected.
> 

Jython __del__ support is already layered on Java finalize, so that's
what one gets.
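
For readers unfamiliar with resurrection, here is a sketch of the
situation being discussed.  The class name Phoenix is made up, and the
timing of the first call relies on CPython's reference counting; other
implementations may run the finalizer later or not at all.

```python
class Phoenix(object):
    """Object whose finalizer stores a new reference to itself."""

    survivors = []
    finalizer_calls = 0

    def __del__(self):
        Phoenix.finalizer_calls += 1
        Phoenix.survivors.append(self)   # resurrects the object


bird = Phoenix()
del bird                                 # last reference goes away
first_round = Phoenix.finalizer_calls

del Phoenix.survivors[:]                 # drop the resurrected reference
second_round = Phoenix.finalizer_calls
```

Under "the Java dodge" second_round would equal first_round, because
the finalizer is not run again for the resurrected object.  Whether a
given interpreter behaves that way is implementation dependent.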

From barry at python.org  Tue Aug  9 14:38:14 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 09 Aug 2005 08:38:14 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1f7befae05080816294bbc1100@mail.gmail.com>
References: <1f7befae05072819142c36e610@mail.gmail.com>
	<1f7befae0507281959abc2a7c@mail.gmail.com>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<20050808185100.GJ16963@ActiveState.com> <42F7D0AC.5020003@v.loewis.de>
	<20050808224912.GA11584@ActiveState.com>
	<1f7befae05080816294bbc1100@mail.gmail.com>
Message-ID: <1123591094.11608.431.camel@presto.wooz.org>

On Mon, 2005-08-08 at 19:29, Tim Peters wrote:

> > Currently with svn you have to manually specify those 9 to be sure to not
> > include the remaining one. With p4 you just say to check-in the whole tree
> > and then remove that one from the list given to you in your editor when entering
> > the check-in message. Not that big of a deal.
> 
> As a purely theoretical exercise <wink>, the last time I faced that
> under SVN, I opened the single file I didn't want to check-in in my
> editor, did "svn revert" on it from the cmdline, checked in the whole
> tree, and then hit the editor's "save" button.  This doesn't scale
> well to skipping 25 of 50, but it's effective enough for 1 or 2.

Or one could use a decent client, like say psvn under XEmacs <wink>
which presents you a list of all modified files and lets you select
which ones you want to commit.

The one thing I dislike about svn (in my day-to-day use of it) is that
it can take a VERY long time to do updates at the roots of very large
trees.  I once tried to check out the root of our dev tree, which
contains all branches and tags.  Of course the initial checkout took
forever.  But an update at the root made this approach unusable.  svn
would sit there, seemingly idle for 30-45 minutes and then take another
30-45 minutes updating the changes, which typically consisted of maybe
50 files out of thousands.  And this on a gig LAN with fast h/w all
around (and for Tim's sake I won't even complain about how some
operating systems appear to perform much worse than others :).

The smaller you can keep your working copies, the better.

-Barry


From gvanrossum at gmail.com  Tue Aug  9 17:03:12 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue, 9 Aug 2005 08:03:12 -0700
Subject: [Python-Dev] Generalised String Coercion
In-Reply-To: <42F87721.5020603@gmail.com>
References: <20050806102342.GA11309@mems-exchange.org>
	<dd36mq$pjr$1@sea.gmane.org>
	<ca471dc20508061856c0cce4f@mail.gmail.com>
	<20050808034756.GA16756@mems-exchange.org>
	<20050808154157.GA28005@panix.com>
	<ca471dc2050808091443147b6e@mail.gmail.com>
	<2B24D218-9919-4CF7-AEF2-7335B8360878@fuhm.net>
	<42F87721.5020603@gmail.com>
Message-ID: <ca471dc20508090803213e3e0e@mail.gmail.com>

On 8/9/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> We could always give the text mode/binary mode distinction in "open" a real
> meaning - text mode deals with character sequences, binary mode deals with
> byte sequences.

I thought that's what I proposed before. I'm still for it.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Tue Aug  9 19:17:42 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 9 Aug 2005 10:17:42 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <000a01c59ca5$47627960$803dc797@oemcomputer>
References: <bbaeab100508082132252dc7bf@mail.gmail.com>
	<000a01c59ca5$47627960$803dc797@oemcomputer>
Message-ID: <bbaeab1005080910171d6ba70@mail.gmail.com>

On 8/8/05, Raymond Hettinger <python at rcn.com> wrote:
> [Brett Cannon]
> > At this point the only
> > changes to the hierarchy are the addition of BaseException and
> > TerminatingException, and the change of inheritance for
> > KeyboardInterrupt, SystemExit, and NotImplementedError.
> 
> TerminatingException
> --------------------
> 
> The rationale for adding TerminatingException needs to be developed or
> reconsidered.  AFAICT, there hasn't been an exploration of existing code
> bases to determine that there is going to be even minimal use of "except
> TerminatingException".
> 
> Are KeyboardInterrupt and SystemExit often caught together on the same
> line and handled in the same way?
> 

The problem with existing code checking for this situation is that the
situation itself is not the same as it will be if bare 'except's
change::

 try:
   ...
 except:
   ...
 except TerminatingException:
   ...

has never really been possible before, but will be if the PEP goes forward.

> If so, isn't "except TerminatingException" less explicit, clear, and
> flexible than "except (KeyboardInterrupt, SystemExit)"?  Do we need a
> second way to do it?
> 

But what if we add other exceptions that don't inherit from Exception
that we typically want to propagate up?  Having a catch-all for
exceptions that a bare 'except' will skip that is more explicit than
``except BaseException`` seems reasonable to me.  As Nick said in
another email, it provides a more obvious self-documentation point to
catch TerminatingException than ``(KeyboardInterrupt, SystemExit)``,
plus you get some future-proofing on top of it in case we add more
exceptions that are not caught by a bare 'except'.

> Doesn't the new meaning of Exception already offer a better idiom:
> 
>    try:
>       suite()
>    except Exception:
>       log_or_recover()
>    except:
>       handle_terminating_exceptions()
>    else:
> 
> Are there any benefits sufficient to warrant yet another new built-in?
> Does it also warrant violating FIBTN by introducing more structure?
> While I'm clear on why KeyboardInterrupt and SystemExit were moved from
> under Exception, it is not at all clear what problem is being solved by
> adding a new intermediate grouping.
> 
> The PEP needs to address all of the above.  Right now, it contains a
> definition rather than justification, research, and analysis.
> 
> 
> 
> WindowsError
> ------------
> 
> This should be kept.  Unlike module specific exceptions, this exception
> occurs in multiple places and diverse applications.  It is appropriate
> to list as a builtin.
> 
> "Too O/S specific" is not a reason for eliminating this.  Looking at the
> codebase there does not appear to be a good substitute.  Eliminating
> this one would break code, decrease clarity, and cause modules to grow
> competing variants.
> 

I unfortunately forgot to add that the exception would be moved under
os, so it would be more of a renaming than a removal.

The reason I pulled it was that Guido said UnixError and MacError
didn't belong, so why should WindowsError stay?  Obviously there are
backwards-compatibility issues with removing it, but why should we
have this platform-specific thing in the built-in namespace?  Nothing
else is platform-specific in the language until you go into the
stdlib.  The language itself is supposed to be platform-agnostic, and
yet here is this exception that is not meant to be used by anyone but
by a specific OS.  Seems like a contradiction to me.

> After the change, nothing would be better and many things would be
> worse.
> 
> 
> 
> NotImplementedError
> -------------------
> Moving this is fine.  Removing unnecessary nesting is a step forward.
> The PEP should list that as a justification.
> 

Yay, something uncontroversial!  =)

> 
> 
> Bare excepts defaulting to Exception
> ------------------------------------
> 
> After further thought, I'm not as sure about this one and whether it is
> workable.  The code fragment above highlights the issue.  In a series of
> except clauses, each line only matches what was not caught by a previous
> clause.  This is a useful and basic part of the syntax.  It leaves a
> bare except to have the role of a final catchall (much like a default in
> C's switch-case).  If one line uses "except Exception", then a
> subsequent bare except should probably catch KeyboardInterrupt and
> SystemExit.  Otherwise, there is a risk of creating optical illusion
> errors (code that looks like it should work but is actually broken).
> I'm not certain on this one, but the PEP does need to fully explore the
> implications and think-out the consequent usability issues.
> 

This is Guido's thing.  You will have to convince him of the change. 
I can flesh out the PEP to argue for whichever result he wants, but
that part of the proposal is in there because Guido wanted it.  I am
just a PEP lackey in this case.  =)

> 
> > And once that is settled I guess it is either time for pronouncement
> > or it just sits there until Python 3.0 actually starts to come upon
> > us.
> 
> What happened to "don't take this too seriously, I'm just trying to get
> the ball rolling"?
> 

Nothing; it's called writing the email when I was tired and then, while
trying to fall asleep, realizing what I had done.  =)

It still needs to go out to c.l.py and will probably sit for a long
while unpronounced.  That's the reason I was saying that the
transition plan needs to be fleshed out with 2.x, 2.x+1 version
numbers instead of concrete ones like 2.5.

-Brett

From jack at performancedrivers.com  Tue Aug  9 19:53:38 2005
From: jack at performancedrivers.com (Jack Diederich)
Date: Tue, 9 Aug 2005 13:53:38 -0400
Subject: [Python-Dev] Major revision of PEP 348 committed
In-Reply-To: <d11dcfba050808232832d1626e@mail.gmail.com>
References: <bbaeab1005080621266bcc87@mail.gmail.com>
	<000401c59b36$01226de0$e410c797@oemcomputer>
	<d11dcfba050808232832d1626e@mail.gmail.com>
Message-ID: <20050809175338.GB1365@performancedrivers.com>

On Tue, Aug 09, 2005 at 12:28:08AM -0600, Steven Bethard wrote:
> Raymond Hettinger wrote:
> > If the PEP can't resist the urge to create new intermediate groupings,
> > then start by grepping through tons of Python code to find-out which
> > exceptions are typically caught on the same line.  That would be a
> > worthwhile empirical study and may lead to useful insights.
> 
> I was curious, so I did a little grepping (ok, os.walking and
> re.findalling) ;-) through the Python source.  The only exceptions
> that were caught together more than 5 times were:
> 
> AttributeError and TypeError (23 instances)
> ImportError and AttributeError (9 instances)
> OverflowError and ValueError (9 instances)
> IOError and OSError (6 instances)

I grepped my own source (ok, find, xargs, and grep'd ;) and here is
what I found.  40 KLOCs, it is a web app so I mainly catch multiple
exceptions when interpreting URLs and doing type conversions.  Unexpected
quacks from inside the app are allowed to rise to the top because at
that point all the input should be in a good state.

All of these arise because more than one operation is happening
in the try/except each of which could raise an exception (even if it
is a one-liner).

ValueError, TypeError (6 instances)
Around calls to int() like
  foo = int(cgi_dict.get('foo', None))

This is pretty domain specific, cgi variables are in a dict-alike object
that returns None for missing keys.  If it was a proper dict instead
this pairing would be (ValueError, KeyError).

The rest are a variation on the above where the result is used in the
same couple lines to do some kind of a lookup in a dict, list, or
namespace.
  client_id = int(cgi_dict.get('foo', None))
  client_name = names[client_id]

ValueError, TypeError, AttributeError (2 instances)
ValueError, TypeError, KeyError (3 instances)
ValueError, TypeError, IndexError (3 instances)
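
Spelled out, the three-exception variant looks something like the
sketch below.  The names client_name() and "client_id" are placeholders
rather than the real application code, and a plain dict stands in for
the cgi object.

```python
def client_name(cgi_dict, names):
    """Look up a client name from untrusted form input, or return None."""
    try:
        # int(None) raises TypeError (missing key), int("abc") raises
        # ValueError (malformed value) ...
        client_id = int(cgi_dict.get("client_id", None))
        # ... and an unknown id raises KeyError on the lookup.
        return names[client_id]
    except (ValueError, TypeError, KeyError):
        return None
```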

And finally this one because bsddb can say "Failed" in more than one way.

IOError, bsddb.error (2 instances)
  btree = bsddb.btopen(self.filename, open_type)


-Jack


From nas at arctrix.com  Tue Aug  9 21:19:13 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 9 Aug 2005 13:19:13 -0600
Subject: [Python-Dev] Sourceforge CVS down?
Message-ID: <20050809191913.GB22038@mems-exchange.org>

I've been getting:

  ssh: connect to host cvs.sourceforge.net port 22: Connection refused

for the past few hours.  Their "Site News" doesn't say anything
about downtime.

  Neil

From martin at v.loewis.de  Tue Aug  9 22:05:58 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 09 Aug 2005 22:05:58 +0200
Subject: [Python-Dev] Sourceforge CVS down?
In-Reply-To: <20050809191913.GB22038@mems-exchange.org>
References: <20050809191913.GB22038@mems-exchange.org>
Message-ID: <42F90CA6.5090704@v.loewis.de>

Neil Schemenauer wrote:
> I've been getting:
> 
>   ssh: connect to host cvs.sourceforge.net port 22: Connection refused
> 
> for the past few hours.  Their "Site News" doesn't say anything
> about downtime.

I'm seeing the same.

Martin

From tim.peters at gmail.com  Tue Aug  9 22:06:45 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 9 Aug 2005 16:06:45 -0400
Subject: [Python-Dev] Sourceforge CVS down?
In-Reply-To: <20050809191913.GB22038@mems-exchange.org>
References: <20050809191913.GB22038@mems-exchange.org>
Message-ID: <1f7befae050809130638398dbe@mail.gmail.com>

[Neil Schemenauer]
> I've been getting:
> 
>  ssh: connect to host cvs.sourceforge.net port 22: Connection refused
> 
> for the past few hours.  Their "Site News" doesn't say anything
> about downtime.

A cvs update doesn't work for me either now.  I did finish one
sometime before noon (EDT) today, though.

From eric.nieuwland at xs4all.nl  Tue Aug  9 22:32:50 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Tue, 9 Aug 2005 22:32:50 +0200
Subject: [Python-Dev] PEP 348 and ControlFlow
Message-ID: <c0a4f62addae7083a10901c4738339f9@xs4all.nl>

Dear all,

Sorry to bring this up again, but I think there is an inconsistency in 
PEP 348 in its current formulation.

 From PEP: "In Python 2.4, a bare except clause will catch any and all 
exceptions. Typically, though, this is not what is truly desired. More 
often than not one wants to catch all error exceptions that do not 
signify a "bad" interpreter state. In the new exception hierarchy this 
condition is embodied by Exception. Thus bare except clauses will 
catch only exceptions inheriting from Exception."

So, a bare except will catch anything that is an Exception. This 
includes GeneratorExit and StopIteration, which contradicts: "It has 
been suggested that ControlFlowException should inherit from Exception. 
This idea has been rejected based on the thinking that control flow 
exceptions typically should not be caught by bare except clauses, 
whereas Exception subclasses should be."

To me this means GeneratorExit and StopIteration are to be taken out of 
the Exception subtree. It seems to me rather awkward to put them at the 
same level as Exception and TerminatingException. So there comes the 
old (yeah, I know REJECTED) idea of a ControlFlowException class, right 
next to Exception and TerminatingException:

BaseException
	+ TerminatingException
		+ ...
	+ Exception
		+ ...
	+ ControlFlowException
		+ GeneratorExit
		+ StopIteration
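
The tree above can be mocked up to check the behaviour I have in mind.
This is only a sketch: the Proposed* classes are stand-ins so that the
real built-ins are not shadowed, and "except Exception" plays the part
of the bare except described in the PEP.

```python
class ProposedTerminatingException(BaseException):
    """Stand-in for TerminatingException."""


class ProposedControlFlowException(BaseException):
    """Stand-in for ControlFlowException, a sibling of Exception."""


class ProposedGeneratorExit(ProposedControlFlowException):
    pass


class ProposedStopIteration(ProposedControlFlowException):
    pass


def caught_by_exception(exc):
    """Return True if an 'except Exception' clause would handle exc."""
    try:
        raise exc
    except Exception:
        return True
    except BaseException:
        return False
```

With this layout the control flow exceptions pass straight through an
"except Exception" clause, while ordinary errors are still caught.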

Is my logic flawed (again ;-)?

--eric
Eric Nieuwland


From gvwilson at cs.utoronto.ca  Tue Aug  9 15:05:14 2005
From: gvwilson at cs.utoronto.ca (Greg Wilson)
Date: Tue, 9 Aug 2005 09:05:14 -0400 (EDT)
Subject: [Python-Dev] PSF grant / contacts
Message-ID: <Pine.GSO.4.58.0508090902540.18474@dvp.cs>

Hi,

I'm working with support from the Python Software Foundation to develop an
open source course on basic software development skills for people with
backgrounds in science and engineering.  I have a beta version of the
course notes ready for review, and would like to pull in Python-friendly
people in sci&eng to look it over and give me feedback.  If you know
people who fit this bill (particularly people who might be interested in
following along with a trial run of the course this fall), I'd be grateful
for pointers.

Thanks,
Greg Wilson

From raymond.hettinger at verizon.net  Wed Aug 10 01:15:17 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 09 Aug 2005 19:15:17 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <bbaeab1005080910171d6ba70@mail.gmail.com>
Message-ID: <001701c59d38$3517aa80$302ac797@oemcomputer>

[Brett]
> The problem with existing code checking for this situation is that the
> situation itself is not the same as it will be if bare 'except's
> change::
> 
>  try:
>    ...
>  except:
>    ...
>  except TerminatingException:
>    ...
> 
> has never really been possible before, but will be if the PEP goes
> forward.

That's not an improvement.  The above code fragment should trigger a gag
reflex indicating that something is wrong with the proposed default for
a bare except.
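
As things stand, the compiler already refuses that ordering.  A quick
check, assuming CPython's rule that a default "except:" must be the last
clause; compiles() is a throwaway helper, and TerminatingException is
only a name here since nothing is executed.

```python
FRAGMENT = """
try:
    pass
except:
    pass
except TerminatingException:
    pass
"""

REORDERED = """
try:
    pass
except TerminatingException:
    pass
except:
    pass
"""


def compiles(source):
    """Return True if source compiles, False on a SyntaxError."""
    try:
        compile(source, "<fragment>", "exec")
    except SyntaxError:
        return False
    return True
```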


> Having a catch-all for
> exceptions that a bare 'except' will skip that is more explicit than
> ``except BaseException`` seems reasonable to me.  

The data gathered by Jack and Steven's research indicate that the number
of cases where TerminatingException would be useful is ZERO.  Try not to
introduce a new builtin that no one will ever use.  Try not to add a new
word whose only function is to replace a two-word tuple (TOOWTDI).  Try
not to unnecessarily nest the tree (FITBN).  Try not to propose
solutions to problems that don't exist (PBP).  
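
A minimal sketch of the "two-word tuple" alternative, assuming the tuple
in question is (KeyboardInterrupt, SystemExit).  The function names are
made up, and the 'as' form is current Python 3 syntax rather than what
was available in 2005.

```python
def run_guarded(task):
    """Run `task` and report which terminating exception stopped it, if any."""
    try:
        task()
    except (KeyboardInterrupt, SystemExit) as exc:
        # One clause covers both exceptions without a new builtin.
        return type(exc).__name__
    return "completed"


def interrupt():
    raise KeyboardInterrupt


def leave():
    raise SystemExit(0)


for task in (interrupt, leave, lambda: None):
    print(run_guarded(task))
```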


Raymond


From stephan.richter at tufts.edu  Wed Aug 10 00:24:44 2005
From: stephan.richter at tufts.edu (Stephan Richter)
Date: Tue, 9 Aug 2005 18:24:44 -0400
Subject: [Python-Dev] PSF grant / contacts
In-Reply-To: <Pine.GSO.4.58.0508090902540.18474@dvp.cs>
References: <Pine.GSO.4.58.0508090902540.18474@dvp.cs>
Message-ID: <200508091824.45192.stephan.richter@tufts.edu>

On Tuesday 09 August 2005 09:05, Greg Wilson wrote:
> I'm working with support from the Python Software Foundation to develop an
> open source course on basic software development skills for people with
> backgrounds in science and engineering.  I have a beta version of the
> course notes ready for review, and would like to pull in Python-friendly
> people in sci&eng to look it over and give me feedback.  If you know
> people who fit this bill (particularly people who might be interested in
> following along with a trial run of the course this fall), I'd be grateful
> for pointers.

Yeah, I would be interested. I taught Python to my fellow grad students
last semester, but the docs out there were not that good for teaching
scientific data analysis. I am planning to repeat the course with Physics
undergrad students this Fall. If you could send me the material, I would
appreciate it.

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

From theller at python.net  Wed Aug 10 09:22:32 2005
From: theller at python.net (Thomas Heller)
Date: Wed, 10 Aug 2005 09:22:32 +0200
Subject: [Python-Dev] Sourceforge CVS down?
References: <20050809191913.GB22038@mems-exchange.org>
	<1f7befae050809130638398dbe@mail.gmail.com>
Message-ID: <7jeu6ytj.fsf@python.net>

Tim Peters <tim.peters at gmail.com> writes:

> [Neil Schemenauer]
>> I've been getting:
>> 
>>  ssh: connect to host cvs.sourceforge.net port 22: Connection refused
>> 
>> for the past few hours.  Their "Site News" doesn't say anything
>> about downtime.
>
> A cvs update doesn't work for me either now.  I did finish one
> sometime before noon (EDT) today, though.

They've been upgrading the CVS server hardware.  See the 'site status'
page http://sourceforge.net/docman/display_doc.php?group_id=1&docid=2352

Thomas


From fredrik at pythonware.com  Wed Aug 10 12:53:28 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 10 Aug 2005 12:53:28 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
References: <42E93940.6080708@v.loewis.de><1122605323.9670.11.camel@geddy.wooz.org><1f7befae0507281959abc2a7c@mail.gmail.com><1122607673.9665.38.camel@geddy.wooz.org><87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp><1122918723.9680.33.camel@warna.corp.google.com><m24qa9f5v8.wl%gnn@neville-neil.com>
	<42EF2794.1000209@v.loewis.de><66d0a6e105080312181e25fa08@mail.gmail.com><42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
Message-ID: <ddcmb9$vus$1@sea.gmane.org>

Nicholas Bastin wrote:

> It's a mature product.  I would hope that that would count for
> something.  I've had enough corrupted subversion repositories that I'm
> not crazy about the thought of using it in a production system.  I
> know I'm not the only person with this experience.

compared to Perforce, SVN is extremely fragile.  I've used both for
years, and I've never had a Perforce repository break down on me.  our
SVN repositories are relatively stable these days, but the clients are
still buggy as hell (mostly along the "I don't feel like doing this today,
despite the fact that it worked yesterday, and I don't feel like telling
you what's wrong either" lines.  having to nuke workspaces from time
to time gets boring, quickly.)

in contrast, Perforce just runs and runs and runs.  the clients always
do what you tell them.  and server maintenance is trivial; just make sure
that the server starts when the host computer boots, and if you have
enough disk, just leave it running.  if you're tight on disk space, trim
away some log files now and then.  that's it.

but despite this, if all you need is a better CVS, I'd say SVN is good
enough for today's python-dev.

I'd still think that a more distributed, mail-driven system (built on
top of Mercurial, Bazaar-NG, or some such (*)) would speed up
both development and patch processing, and also make it a lot easier
for "casual contributors" and "drive-by developers" to help develop
Python, but that's another story.

</F>

*) being able to ship a fully working Python-powered SCM with the
Python source code would be an extra coolness bonus, of course. 


From gvanrossum at gmail.com  Wed Aug 10 16:32:27 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 10 Aug 2005 07:32:27 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <ddcmb9$vus$1@sea.gmane.org>
References: <42E93940.6080708@v.loewis.de>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<ddcmb9$vus$1@sea.gmane.org>
Message-ID: <ca471dc2050810073226b0ee7a@mail.gmail.com>

On 8/10/05, Fredrik Lundh <fredrik at pythonware.com> wrote:

> in contrast, Perforce just runs and runs and runs.  the clients always
> do what you tell them.  and server maintenance is trivial; just make sure
> that the server starts when the host computer boots, and if you have
> enough disk, just leave it running.  if you're tight on disk space, trim
> away some log files now and then.  that's it.

We've used P4 at Elemental for two years now; I mostly agree with this
assessment, although occasionally the server becomes unbearably slow
and a sysadmin does some painful magic to rescue it. Maybe that's just
because the box is underpowered.

More troublesome is that I've seen a few client repositories getting
out of sync; one developer spent a lot of time tracking down
mysterious compilation errors that went away after forced resync'ing.
We never figured out the cause, but (since he swears he didn't touch
the affected files) most likely hitting ^C during a previous sync
could've broken some things.

Another problem with P4 is that local operation is lousy -- if you
can't reach the server, you can't do *anything* -- while svn always
lets you edit and diff. Also, P4 has *no* command to tell you which
files you've created without adding them to the repository yet -- so
the most frequent build breakage is caused by missing new files.

Finally, while I hear that P4's branching support is superior over
SVN's, I find it has a steep learning curve -- almost every developer
needs some serious hand-holding before they understand P4 branches
correctly.

I'm intrigued by Linus Torvalds' preference for extremely distributed
source control, but I have no experience and it seems a bit, um,
experimental. Someone should contact Steve Alexander, who I believe is
excited about Bazaar-NG.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at discworld.dyndns.org  Wed Aug 10 17:03:14 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Wed, 10 Aug 2005 09:03:14 -0600
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <ca471dc2050810073226b0ee7a@mail.gmail.com>
References: <1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<ddcmb9$vus$1@sea.gmane.org>
	<ca471dc2050810073226b0ee7a@mail.gmail.com>
Message-ID: <20050810150313.GA7757@discworld.dyndns.org>

Guido van Rossum <gvanrossum at gmail.com> wrote:
> 
> I'm intrigued by Linus Torvalds' preference for extremely distributed
> source control, but I have no experience and it seems a bit, um,
> experimental.

"git", which is Linus' home-grown replacement for BitKeeper, quickly attracted
a development community and has grown into a reasonably full-featured
distributed RCS.  It is apparently already stable enough for serious use.
If I was trying to pick an RCS for a large, distributed project, I would at
least investigate it as a possibility.

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python at discworld.dyndns.org>
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

From trentm at ActiveState.com  Wed Aug 10 20:47:40 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Wed, 10 Aug 2005 11:47:40 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <ca471dc2050810073226b0ee7a@mail.gmail.com>
References: <1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<ddcmb9$vus$1@sea.gmane.org>
	<ca471dc2050810073226b0ee7a@mail.gmail.com>
Message-ID: <20050810184740.GK15991@ActiveState.com>

[Guido van Rossum wrote]
> Also, P4 has *no* command to tell you which
> files you've created without adding them to the repository yet -- so
> the most frequent build breakage is caused by missing new files.

This one is a frequent complaint from CVS-heads here at ActiveState.
I have a p4 wrapper called "px" that extends some p4 commands (and adds
a couple). One of the commands that it extends is "diff" to add a "-sn"
(new) option similar to the "-se" (edit), "-sd" (delete).

$ px help diff
...the usual 'p4 help diff'...
    new px options:   [-sn -c changelist#]

        Px adds another -s<flag> option:
                -sn     Local files not in the p4 client.

        Px also adds the --skip option (which only makes sense
        together with -sn) to specify that regularly skipped file
        (CVS control files, *~) should be skipped.

        The '-c' option can be used to limit diff'ing to files in
        the given changelist. '-c' cannot be used with any of the
        '-s' options.


'px' should grow a "px status" a la "svn|cvs status" to give a quick
summary of local differences.  Other additions:

$ px help px

    'px' extensions to 'p4':

    px --help
        Add px-specific help output to the usual 'p4 -h' and 'p4 -?'.
        See 'px help usage'.

    px -V, --version
        Print px-specific version information in addition to the usage
        'p4 -V' output.  See 'px help usage'.

    px -g ...
        Format input/output as *un*marshalled Python objects. Compare to
        the usual 'p4 -G ...'.  See 'px help usage'.

    px annotate ...
        Identify last change to each line in given file, like 'cvs
        annotate' or 'p4pr.pl'.  See 'px help annotate'.

    px backout ...
        Provide all the general steps for rolling back a perforce
        change as described in Perforce technote 14.  See 'px help
        backout'.

    px changes -d ...
        Print the full 'p4 describe -du' output for each listed change.
        See 'px help changes'.

    px diff -sn --skip ...
        List local files not in the p4 depot. Useful for importing new
        files into a depot via 'px diff -sn --skip ./... | px -x - add'.
        See 'px help diff'.

    px diff -c <change> ...
        Limit diffing to files opened in the given pending change.  See
        'px help diff'.

    px genpatch [<change>]
        Generate a patch (usable by the GNU 'patch' program) from a
        pending or submitted changelist.  See 'px help genpatch'.

Available here:
    http://starship.python.net/~tmick/#px

Pure python. Works on Python >=2.2. Windows, Linux, Mac OS X, Unix.
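
The idea behind listing "local files not in the depot" can be sketched
roughly as follows.  This is not px's implementation: the function name,
the skip patterns, and the assumption that the caller already has the
set of tracked paths are all made up for illustration.

```python
import fnmatch
import os

# Names usually treated as noise (editor backups, CVS control dirs);
# this list is an assumption, not px's actual skip list.
SKIP_PATTERNS = ("*~", "CVS")


def local_only_files(root, tracked, skip=SKIP_PATTERNS):
    """Return sorted relative paths under `root` that are not in `tracked`.

    `tracked` is a set of root-relative paths using "/" separators that
    the SCM already knows about; obtaining it is left to the caller.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames
                       if not any(fnmatch.fnmatch(d, p) for p in skip)]
        for name in filenames:
            if any(fnmatch.fnmatch(name, p) for p in skip):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")
            if rel not in tracked:
                found.append(rel)
    return sorted(found)
```

Calling local_only_files(".", tracked) before a submit would list the
new files that still need to be added.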


Trent

-- 
Trent Mick
TrentM at ActiveState.com

From joseh.martins at gmail.com  Wed Aug 10 20:51:36 2005
From: joseh.martins at gmail.com (Joseh Martins)
Date: Wed, 10 Aug 2005 15:51:36 -0300
Subject: [Python-Dev] Python + Ping
Message-ID: <e2568d8c050810115139b782a9@mail.gmail.com>

Hello Everybody,

I'm a beginner in Python development.

Well, I need to run an external ping command and get the results
to view the output. How can I do that?

For example, I need to ping an IP address and find out whether the
host is down or up.

Thanks a lot.
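
One possible approach, sketched under a few assumptions: the system
"ping" program is on the PATH, it takes "-c" for the packet count
(or "-n" on Windows), and an exit status of 0 means a reply was
received.  The exit status is not fully reliable on every platform, and
the function names here are made up.

```python
import platform
import subprocess


def ping_command(host, count=1):
    """Build the ping command line: -n on Windows, -c elsewhere."""
    flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", flag, str(count), host]


def host_is_up(host, run=subprocess.call):
    """Return True if ping exits with status 0 for `host`.

    `run` is injectable so the check can be tried without network access;
    by default ping's own output goes straight to the console.
    """
    return run(ping_command(host)) == 0
```

For example, host_is_up("127.0.0.1") would run one ping against the
local machine and return True or False.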

From trentm at ActiveState.com  Wed Aug 10 21:00:31 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Wed, 10 Aug 2005 12:00:31 -0700
Subject: [Python-Dev] pdb: should next command be extended?
In-Reply-To: <Pine.LNX.4.58.0508081926010.2814@bagira>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
	<20050808154503.GB28005@panix.com>
	<Pine.LNX.4.58.0508081926010.2814@bagira>
Message-ID: <20050810190031.GM15991@ActiveState.com>

[Ilya Sandler wrote]
> 
> > At OSCON, Anthony Baxter made the point that pdb is currently one of the
> > more unPythonic modules.
> 
> What is unpythonic about pdb? Is this part of Anthony's presentation
> online? (Google found a summary and slides from presentation but they
> don't say anything about pdb's deficiencies)

Kevin Altis was policing him to 5 minutes for his lightning talk so he
didn't have a lot of time to elaborate. :) His slides were more of the
Lawrence Lessig, quick and pithy style rather than lots of explanatory
text.

I think overridability, i.e. being able to subclass the Pdb stuff to do
useful things, or lack of it was the main beef.  Mostly Anthony was
echoing comments from others' experiences with trying to work with the
Pdb code. 
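
As a small illustration of the kind of subclassing being discussed,
here is a sketch that adds one command to pdb.Pdb.  The class and the
"hello" command are made up, and the code assumes Python 3; it shows
only the simplest extension point, not the harder cases people have
complained about.

```python
import io
import pdb


class GreetingPdb(pdb.Pdb):
    """Pdb subclass with one extra command, 'hello' (a made-up name)."""

    def do_hello(self, arg):
        # Pdb builds on cmd.Cmd, so any do_<name> method becomes a command.
        print("hello %s" % (arg or "world"), file=self.stdout)


# Drive the command directly instead of starting an interactive session.
buffer = io.StringIO()
debugger = GreetingPdb(stdout=buffer)
debugger.do_hello("python-dev")
print(buffer.getvalue().strip())
```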

Trent

-- 
Trent Mick
TrentM at ActiveState.com

From tdelaney at avaya.com  Wed Aug 10 22:16:49 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Thu, 11 Aug 2005 06:16:49 +1000
Subject: [Python-Dev] Python + Ping
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F05CC00@au3010avexu1.global.avaya.com>

Joseh Martins wrote:

> I'm a beginner in Python development.
> 
> Well, I need to run an external ping command and get the results
> to view the output. How can I do that?
> 
> For example, I need to ping an IP address and find out whether the
> host is down or up.

python-dev is for discussion of the development *of* python, not development *with* python.

This question should be posted to the python-list at python.org discussion list (or comp.lang.python newsgroup - they're the same thing) or possibly even the tutor at python.org mailing list.

Tim Delaney

From jcarlson at uci.edu  Wed Aug 10 21:06:01 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 10 Aug 2005 12:06:01 -0700
Subject: [Python-Dev] Python + Ping
In-Reply-To: <e2568d8c050810115139b782a9@mail.gmail.com>
References: <e2568d8c050810115139b782a9@mail.gmail.com>
Message-ID: <20050810120458.7812.JCARLSON@uci.edu>


Your email is off-topic for python-dev, which is for the development OF
Python.  Repost your question on python-list.

 - Josiah


Joseh Martins <joseh.martins at gmail.com> wrote:
> 
> Hello Everybody,
> 
> I'm a beginner in Python development.
> 
> Well, I need to run an external ping command and get the results
> to view the output. How can I do that?
> 
> For example, I need to ping an IP address and find out whether the
> host is down or up.
> 
> Thanks a lot.


From raymond.hettinger at verizon.net  Wed Aug 10 23:27:47 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 10 Aug 2005 17:27:47 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <bbaeab1005080910171d6ba70@mail.gmail.com>
Message-ID: <000001c59df2$5bc96960$60b9958d@oemcomputer>

> > WindowsError
> > ------------
> >
> > This should be kept.  Unlike module specific exceptions, this
> > exception occurs in multiple places and diverse applications.  It is
> > appropriate to list as a builtin.
> >
> > "Too O/S specific" is not a reason for eliminating this.  Looking at
> > the codebase there does not appear to be a good substitute.
> > Eliminating this one would break code, decrease clarity, and cause
> > modules to grow competing variants.

[Brett]
> I unfortunately forgot to add that the exception would be moved under
> os, so it would be more of a renaming than a removal.

Isn't OSError already used for another purpose (non-platform dependent
exceptions raised by the os module)?


Raymond


From bcannon at gmail.com  Wed Aug 10 23:34:00 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 10 Aug 2005 14:34:00 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <000001c59df2$5bc96960$60b9958d@oemcomputer>
References: <bbaeab1005080910171d6ba70@mail.gmail.com>
	<000001c59df2$5bc96960$60b9958d@oemcomputer>
Message-ID: <bbaeab1005081014344a8cfd57@mail.gmail.com>

On 8/10/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > > WindowsError
> > > ------------
> > >
> > > This should be kept.  Unlike module specific exceptions, this
> > > exception occurs in multiple places and diverse applications.  It is
> > > appropriate to list as a builtin.
> > >
> > > "Too O/S specific" is not a reason for eliminating this.  Looking at
> > > the codebase there does not appear to be a good substitute.
> > > Eliminating this one would break code, decrease clarity, and cause
> > > modules to grow competing variants.
> 
> [Brett]
> > I unfortunately forgot to add that the exception would be moved under
> > os, so it would be more of a renaming than a removal.
> 
> Isn't OSError already used for another purpose (non-platform dependent
> exceptions raised by the os module)?
> 

Don't quite follow what that has to do with making WindowsError become
os.WindowsError.  Yes, OSError is meant for platform-agnostic OS
errors by the os module, but how does that affect the proposed move of
WindowsError?

-Brett

From raymond.hettinger at verizon.net  Thu Aug 11 01:45:40 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 10 Aug 2005 19:45:40 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <bbaeab100508101524538e097c@mail.gmail.com>
Message-ID: <000801c59e05$9e054de0$6f14c797@oemcomputer>

> > Then I don't follow what you mean by "moved under os".
> 
> In other words, to get the exception, do ``from os import
> WindowsError``.  Unfortunately we don't have a generic win module to
> put it under.  Maybe in the platform module instead?

-1 on either.  The WindowsError exception needs to be in the main exception
tree.  It occurs in too many different modules and applications.  That
is a good reason for being in the main tree.

If the name bugs you, I would support renaming it to PlatformError or
somesuch.  That would make it free for use with Mac errors and Linux
errors.  Also, it wouldn't tie a language feature to the name of an MS
product.
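
To make the portability concern concrete, a minimal sketch of code that
wants to catch the Windows-specific exception without failing on other
platforms.  The name PlatformSpecificError is made up for this example,
and the lookup through builtins assumes Python 3, where WindowsError is
defined only on Windows and OSError is the general fallback.

```python
import builtins

# WindowsError is only defined on Windows builds; fall back to OSError.
PlatformSpecificError = getattr(builtins, "WindowsError", OSError)


def describe_failure(action):
    """Run `action` and describe any OS-level error it raises."""
    try:
        action()
    except PlatformSpecificError as exc:
        return "OS-level failure: %s" % type(exc).__name__
    return "ok"


def missing_file():
    # The path is a placeholder that is assumed not to exist.
    open("/no/such/dir/no-such-file.txt").close()


print(describe_failure(missing_file))
print(describe_failure(lambda: None))
```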


Raymond


From bcannon at gmail.com  Thu Aug 11 02:06:22 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 10 Aug 2005 17:06:22 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <000801c59e05$9e054de0$6f14c797@oemcomputer>
References: <bbaeab100508101524538e097c@mail.gmail.com>
	<000801c59e05$9e054de0$6f14c797@oemcomputer>
Message-ID: <bbaeab1005081017063a4f9b15@mail.gmail.com>

On 8/10/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > > Then I don't follow what you mean by "moved under os".
> >
> > In other words, to get the exception, do ``from os import
> > WindowsError``.  Unfortunately we don't have a generic win module to
> > put it under.  Maybe in the platform module instead?
> 
> -1 on either.  The WindowsError exception needs to be in the main exception
> tree.  It occurs in too many different modules and applications.  That
> is a good reason for being in the main tree.
> 

Where is it used so much?  In the stdlib, grepping for WindowsError
recursively in Lib in 2.4 turns up only one module raising it
(subprocess) and only two modules with a total of three places of
catching it (ntpath once, urllib twice).  In Modules, there are no
hits.

> If the name bugs you, I would support renaming it to PlatformError or
> somesuch.  That would make it free for use with Mac errors and Linux
> errors.  Also, it wouldn't tie a language feature to the name of an MS
> product.
> 

I can compromise to this if others prefer this alternative.  Anybody
else have an opinion?

-Brett

From aahz at pythoncraft.com  Thu Aug 11 02:16:06 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 10 Aug 2005 17:16:06 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <bbaeab1005081017063a4f9b15@mail.gmail.com>
References: <bbaeab100508101524538e097c@mail.gmail.com>
	<000801c59e05$9e054de0$6f14c797@oemcomputer>
	<bbaeab1005081017063a4f9b15@mail.gmail.com>
Message-ID: <20050811001606.GA14208@panix.com>

On Wed, Aug 10, 2005, Brett Cannon wrote:
> On 8/10/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
>>
>> If the name bugs you, I would support renaming it to PlatformError or
>> somesuch.  That would make it free for use with Mac errors and Linux
>> errors.  Also, it wouldn't tie a language feature to the name of an MS
>> product.
> 
> I can compromise to this if others prefer this alternative.  Anybody
> else have an opinion?

Googling for "windowserror python" produces 800 hits.  So yes, it does
seem to be widely used.  I'm -0 on renaming; +1 on leaving things as-is.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From trentm at ActiveState.com  Thu Aug 11 02:25:38 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Wed, 10 Aug 2005 17:25:38 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <bbaeab1005081017063a4f9b15@mail.gmail.com>
References: <bbaeab100508101524538e097c@mail.gmail.com>
	<000801c59e05$9e054de0$6f14c797@oemcomputer>
	<bbaeab1005081017063a4f9b15@mail.gmail.com>
Message-ID: <20050811002538.GA27433@ActiveState.com>

[Brett Cannon wrote]
> Where is it used so much?  In the stdlib, grepping for WindowsError
> recursively in Lib in 2.4 turns up only one module raising it
> (subprocess) and only two modules with a total of three places of
> catching it (ntpath once, urllib twice).  In Modules, there are no
> hits.

Just a data point (not really following this thread): The PyWin32
sources raise WindowsError twice (one of them is
win32\Demos\winprocess.py which is probably where subprocess got it
from) and catches it in 11 places.

Trent

-- 
Trent Mick
TrentM at ActiveState.com

From bcannon at gmail.com  Thu Aug 11 02:47:39 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 10 Aug 2005 17:47:39 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <20050811001606.GA14208@panix.com>
References: <bbaeab100508101524538e097c@mail.gmail.com>
	<000801c59e05$9e054de0$6f14c797@oemcomputer>
	<bbaeab1005081017063a4f9b15@mail.gmail.com>
	<20050811001606.GA14208@panix.com>
Message-ID: <bbaeab10050810174771335836@mail.gmail.com>

On 8/10/05, Aahz <aahz at pythoncraft.com> wrote:
> On Wed, Aug 10, 2005, Brett Cannon wrote:
> > On 8/10/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> >>
> >> If the name bugs you, I would support renaming it to PlatformError or
> >> somesuch.  That would make it free for use with Mac errors and Linux
> >> errors.  Also, it wouldn't tie a language feature to the name of an MS
> >> product.
> >
> > I can compromise to this if others prefer this alternative.  Anybody
> > else have an opinion?
> 
> Googling for "windowserror python" produces 800 hits.  So yes, it does
> seem to be widely used.  I'm -0 on renaming; +1 on leaving things as-is.

But Googling for "attributeerror python" turns up 94,700, a factor of
over 118.  OSError turns up 20,300 hits; a factor of 25.  Even
EnvironmentError turns up more at 5,610 and I would expect most people
don't use this class directly that often.
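
The ratios can be checked with a few lines; the hit counts are the ones
quoted above, and the variable names are only for illustration.

```python
# Google hit counts as quoted in this message.
hits = {
    "AttributeError": 94700,
    "OSError": 20300,
    "EnvironmentError": 5610,
    "WindowsError": 800,
}

baseline = hits["WindowsError"]
ratios = {name: count / baseline for name, count in hits.items()}

# Print each exception with its hit count relative to WindowsError.
for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print("%-16s %6.1fx" % (name, ratio))
```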

While 800 might seem large, it's puny compared to other exceptions. 
Plus, if you look at the first 10 hits, 4 are from PEP 348, one of
which is the top hit.  =)

-Brett

From kbk at shore.net  Thu Aug 11 03:34:43 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Wed, 10 Aug 2005 21:34:43 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200508110134.j7B1YhHG024463@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  352 open ( -2) /  2896 closed ( +8) /  3248 total ( +6)
Bugs    :  913 open ( +4) /  5162 closed (+10) /  6075 total (+14)
RFE     :  191 open ( +0) /   178 closed ( +0) /   369 total ( +0)

New / Reopened Patches
______________________

compiler package: "global a; a=5"  (2005-08-04)
       http://python.org/sf/1251748  opened by  Armin Rigo

Simplying Tkinter's event loop  (2005-08-05)
       http://python.org/sf/1252236  opened by  Michiel de Hoon

modulefinder misses modules  (2005-08-05)
       http://python.org/sf/1252550  opened by  Thomas Heller

poplib list() docstring fix  (2005-08-05)
CLOSED http://python.org/sf/1252706  opened by  Steve Greenland

QuickTime API needs corrected object types  (2005-08-09)
       http://python.org/sf/1254695  opened by  Christopher K Davis

GCC detection for runtime_library_dirs when ccache is used  (2005-08-09)
       http://python.org/sf/1254718  opened by  Seo Sanghyeon

Patches Closed
______________

Faster commonprefix in macpath, ntpath, etc.  (2005-01-20)
       http://python.org/sf/1105730  closed by  birkenfeld

poplib list() docstring fix  (2005-08-05)
       http://python.org/sf/1252706  closed by  birkenfeld

absolute paths cause problems for MSVC  (2003-10-21)
       http://python.org/sf/827386  closed by  loewis

Fix LINKCC (Bug #1189330)  (2005-07-15)
       http://python.org/sf/1239112  closed by  loewis

file.encoding support for file.write and file.writelines  (2005-06-04)
       http://python.org/sf/1214889  closed by  birkenfeld

st_gen and st_birthtime support for FreeBSD  (2005-04-11)
       http://python.org/sf/1180695  closed by  loewis

Add unicode for sys.argv, os.environ, os.system  (2005-07-02)
       http://python.org/sf/1231336  closed by  loewis

Refactoring Python/import.c  (2004-12-30)
       http://python.org/sf/1093253  closed by  theller

New / Reopened Bugs
___________________

cgitb gives wrong lineno inside try:..finally:  (2005-08-03)
       http://python.org/sf/1251026  opened by  Rob W.W. Hooft

Decoding with unicode_internal segfaults on UCS-4 builds  (2005-08-03)
       http://python.org/sf/1251300  opened by  nhaldimann

smtplib and email.py  (2005-08-03)
       http://python.org/sf/1251528  opened by  Cosmin Nicolaescu

Python 2.4.1 crashes when importing the attached script  (2005-08-04)
       http://python.org/sf/1251631  opened by  Viktor Ferenczi

Fail codecs.lookup() on 'mbcs' and 'tactis'  (2005-08-04)
       http://python.org/sf/1251921  reopened by  lemburg

Fail codecs.lookup() on 'mbcs' and 'tactis'  (2005-08-04)
       http://python.org/sf/1251921  opened by  liturgist

Issue with telnetlib read_until not timing out  (2005-08-04)
       http://python.org/sf/1252001  opened by  padded

IOError after normal write  (2005-08-04)
       http://python.org/sf/1252149  opened by  Patrick Gerken

os.system on win32 can't handle pathnames with spaces  (2005-08-05)
CLOSED http://python.org/sf/1252733  opened by  Ori Avtalion

non-admin install may fail (win xp pro)  (2005-07-05)
CLOSED http://python.org/sf/1232947  reopened by  loewis

raw_input() displays wrong unicode prompt  (2005-01-10)
       http://python.org/sf/1099364  reopened by  prikryl

Python interpreter unnecessarily linked against c++ runtime  (2005-08-08)
       http://python.org/sf/1254125  opened by  Zak Kipling

parser fails on long non-ascii lines if coding declared  (2005-08-08)
CLOSED http://python.org/sf/1254248  opened by  Oleg Noga

Docs for list.extend() are incorrect  (2005-08-08)
CLOSED http://python.org/sf/1254362  opened by  Kent Johnson

"appropriately decorated" is undefined in MultiFile.push doc  (2005-08-09)
       http://python.org/sf/1255218  opened by  Alan

float('-inf')  (2005-08-10)
       http://python.org/sf/1255395  opened by  Steven Bird

bug in use of __getattribute__ ?  (2005-08-10)
CLOSED http://python.org/sf/1256010  opened by  sylvain ferriol

Bugs Closed
___________

numarray in debian python 2.4.1  (2005-08-02)
       http://python.org/sf/1249903  closed by  birkenfeld

incorrect description of range function  (2005-08-02)
       http://python.org/sf/1250306  closed by  birkenfeld

isinstance() fails depending on how modules imported  (2005-08-01)
       http://python.org/sf/1249615  closed by  hgibson50

set of pdb breakpoint fails  (2005-07-30)
       http://python.org/sf/1248127  closed by  birkenfeld

Fail codecs.lookup() on 'mbcs' and 'tactis'  (2005-08-04)
       http://python.org/sf/1251921  closed by  loewis

os.system on win32 can't handle pathnames with spaces  (2005-08-05)
       http://python.org/sf/1252733  closed by  salty-horse

distutils: MetroWerks support can go  (2005-07-17)
       http://python.org/sf/1239948  closed by  jackjansen

non-admin install may fail (win xp pro)  (2005-07-05)
       http://python.org/sf/1232947  closed by  loewis

segfault in os module  (2005-06-24)
       http://python.org/sf/1226969  closed by  loewis

LINKCC incorrect  (2005-04-25)
       http://python.org/sf/1189330  closed by  loewis

parser fails on long non-ascii lines if coding declared  (2005-08-08)
       http://python.org/sf/1254248  closed by  doerwalter

Docs for list.extend() are incorrect  (2005-08-08)
       http://python.org/sf/1254362  closed by  birkenfeld

bug in use of __getattribute__ ?  (2005-08-10)
       http://python.org/sf/1256010  closed by  birkenfeld


From raymond.hettinger at verizon.net  Thu Aug 11 03:44:05 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 10 Aug 2005 21:44:05 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <20050811001606.GA14208@panix.com>
Message-ID: <000301c59e16$28d453c0$8b33c797@oemcomputer>

[Brett]
> I can compromise to this if others prefer this alternative.  Anybody
> else have an opinion?

We're not opinion shopping -- we're looking for analysis.  Py3.0 is not
supposed to be just a Python variant -- it is supposed to be better.  It is
not about making compromises -- it is about only making changes that are
clear improvements.  First, do no harm.

It is an abuse of the PEP process to toss up one random idea after
another with whimsical justifications, zero research, zero analysis of
the implications, no respect for existing code, no recognition that the
current design is somewhat successful, and contravention of basic design
principles (Zen of Python).

The only thing worse is wasting everyone's time by sticking to the
proposals like glue when others take the time to think it through and
offer sound reasons why the proposal is not a good idea.


[Aahz]
> Googling for "windowserror python" produces 800 hits.  So yes, it does
> seem to be widely used.  I'm -0 on renaming; +1 on leaving things
> as-is.

Well said.  Squirreling WindowsError away in another namespace harms
existing code, reduces clarity, and offers no offsetting gains.  It is
simply crummy design to take a multi-module, multi-application exception
and push it down into a module namespace.

+0 on renaming; +1 on leaving as-is.


Raymond


From raymond.hettinger at verizon.net  Thu Aug 11 05:32:57 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 10 Aug 2005 23:32:57 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <bbaeab1005081019302e7755ac@mail.gmail.com>
Message-ID: <000901c59e25$5e823fa0$7c06a044@oemcomputer>

>  There
> is a reason you listed writing a PEP on your own on the "School of
> Hard Knocks" list; it isn't easy.  I am trying my best here.

Hang in there.  Do what you can to make sure we get a result we can live
with.


-- R


From foom at fuhm.net  Thu Aug 11 16:57:21 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu, 11 Aug 2005 10:57:21 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <001701c59d38$3517aa80$302ac797@oemcomputer>
References: <001701c59d38$3517aa80$302ac797@oemcomputer>
Message-ID: <97674DB3-344F-4271-8857-8F798FA29D50@fuhm.net>

On Aug 9, 2005, at 7:15 PM, Raymond Hettinger wrote:
> The data gathered by Jack and Steven's research indicate that the number
> of cases where TerminatingException would be useful is ZERO.  Try not to
> introduce a new builtin that no one will ever use.  Try not to add a new
> word whose only function is to replace a two-word tuple (TOOWTDI).  Try
> not to unnecessarily nest the tree (FITBN).  Try not to propose
> solutions to problems that don't exist (PBP).

I disagree. TerminatingException is useful. For the immediate future,  
I'd like to be able to write code like this (I'm assuming that  
"except:" means what it means now, because changing that for Py2.5  
would be insane):
   try:
     TerminatingException
   except NameError:
     # compatibility with python < 2.5
     TerminatingException = (KeyboardInterrupt, SystemExit)

   try:
     foo....
   except TerminatingException:
     raise
   except:
     print "error message"

What this gets me:
  1) easy backwards compatibility with earlier pythons which still  
have KeyboardInterrupt and SystemExit under Exception and don't  
provide TerminatingException
  2) I still catch string exceptions, in case anyone raises one
  3) Forward compatibility with pythons that add more types of  
terminating exceptions.
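
A self-contained variant of the pattern above, as a sketch only and
written in present-day Python 3 syntax rather than the 2.x syntax of the
snippet.  The helper name run_guarded is hypothetical; the tuple fallback
is the one from the snippet, and TerminatingException is assumed to be
absent, since it is only a proposed builtin.

```python
# Fall back to a tuple when the proposed builtin does not exist.
try:
    TerminatingException
except NameError:
    TerminatingException = (KeyboardInterrupt, SystemExit)


def run_guarded(func):
    """Call func(); let terminating exceptions through, report the rest."""
    try:
        return func()
    except TerminatingException:
        raise                      # never swallow Ctrl-C or sys.exit()
    except BaseException as exc:   # stands in for the bare "except:" above
        return "error message: %r" % (exc,)
```

Because the fallback is an ordinary tuple, the same except clause keeps
working if a later interpreter supplies a real TerminatingException class.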

James

From foom at fuhm.net  Thu Aug 11 19:10:28 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu, 11 Aug 2005 13:10:28 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <000801c59e05$9e054de0$6f14c797@oemcomputer>
References: <000801c59e05$9e054de0$6f14c797@oemcomputer>
Message-ID: <A9572A67-94C4-45D9-9AA2-3FF3AA94889E@fuhm.net>


On Aug 10, 2005, at 7:45 PM, Raymond Hettinger wrote:

>>> Then I don't follow what you mean by "moved under os".
>>>
>>
>> In other words, to get the exception, do ``from os import
>> WindowsError``.  Unfortunately we don't have a generic win module to
>> put it under.  Maybe in the platform module instead?
>>
>
> -1 on either.  The WindowsError exception needs to be in the main
> exception tree.  It occurs in too many different modules and
> applications.  That is a good reason for being in the main tree.
>
> If the name bugs you, I would support renaming it to PlatformError or
> somesuch.  That would make it free for use with Mac errors and Linux
> errors.  Also, it wouldn't tie a language feature to the name of an MS
> product.

WindowsError is an important distinction because its error codes are  
to be interpreted as being from Microsoft's windows error code list.
That is a useful meaning. PlatformError is completely meaningless.  
Whether or not Python should really be raising errors with error  
numbers from the MS error number list instead of translating them to  
standard error codes is another issue...but as long as it does so, it  
should do so using WindowsError.

James

From foom at fuhm.net  Thu Aug 11 23:19:29 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu, 11 Aug 2005 17:19:29 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <20050811114121.781B.JCARLSON@uci.edu>
References: <001701c59d38$3517aa80$302ac797@oemcomputer>
	<97674DB3-344F-4271-8857-8F798FA29D50@fuhm.net>
	<20050811114121.781B.JCARLSON@uci.edu>
Message-ID: <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net>

On Aug 11, 2005, at 2:41 PM, Josiah Carlson wrote:
> Remember, the Exception reorganization is for Python 3.0/3k/whatever,
> not for 2.5 .

Huh, I could *swear* we were talking about fixing things for  
2.5...but I see at least the current version of the PEP says it's  
talking about 3.0. If that's true, this is hardly worth discussing as  
3.0 is never going to happen anyways.

And here I was hoping this was an actual proposal. Ah well, then.

James

From jimjjewett at gmail.com  Thu Aug 11 23:21:19 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 11 Aug 2005 17:21:19 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Objects
	setobject.c, 1.45, 1.46
In-Reply-To: <20050811075856.1C4B31E4004@bag.python.org>
References: <20050811075856.1C4B31E4004@bag.python.org>
Message-ID: <fb6fbf5605081114213b56ca8@mail.gmail.com>

(1)  Is there a reason that you never shrink sets for discard/remove/pop?  

(set difference will do a one-time shrink, if there are enough dummy
entries, but even then, it doesn't look at the %filled, so a
merge-related overallocation will stick around)

I note that you do the same with dicts, but I think sets are a more
natural candidate for "this is the set of things I still have to
process, in any order".  (I suppose enforcing an order with deque may
be faster -- unless I'm worried about duplicates.)

(2)  When adding an element, you check that 

    if (!(so->used > n_used && so->fill*3 >= (so->mask+1)*2))

Is there any reason to use that +1?  Without it, resizes will happen one
element sooner, but probably not much more often -- and you could
avoid an add on every insert.
(I suppose dictionaries have the same question.)

(3)  In set_merge, when finding the new size, you use (so->fill + other->used)

Why so->fill?  If it is different from so->used, then the extras are
dummy entries that it would be good to replace.

(I note that dictobject does use ->used.)

-jJ

From bcannon at gmail.com  Thu Aug 11 23:28:01 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 11 Aug 2005 14:28:01 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net>
References: <001701c59d38$3517aa80$302ac797@oemcomputer>
	<97674DB3-344F-4271-8857-8F798FA29D50@fuhm.net>
	<20050811114121.781B.JCARLSON@uci.edu>
	<7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net>
Message-ID: <bbaeab1005081114283b115fd9@mail.gmail.com>

On 8/11/05, James Y Knight <foom at fuhm.net> wrote:
> On Aug 11, 2005, at 2:41 PM, Josiah Carlson wrote:
> > Remember, the Exception reorganization is for Python 3.0/3k/whatever,
> > not for 2.5 .
> 
> Huh, I could *swear* we were talking about fixing things for
> 2.5...but I see at least the current version of the PEP says it's
> talking about 3.0. If that's true, this is hardly worth discussing as
> 3.0 is never going to happen anyways.
> 

And why do you think it will never happen?  Guido has already said
publicly multiple times that the 2.x branch will not go past 2.9, so
unless Python goes stale there will be a 3.0 release.  Python 3.0
might not be around the corner, but will come eventually and this
stuff needs to get done at some point.

-Brett

From gvanrossum at gmail.com  Fri Aug 12 00:10:24 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 11 Aug 2005 15:10:24 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net>
References: <001701c59d38$3517aa80$302ac797@oemcomputer>
	<97674DB3-344F-4271-8857-8F798FA29D50@fuhm.net>
	<20050811114121.781B.JCARLSON@uci.edu>
	<7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net>
Message-ID: <ca471dc20508111510574b7a79@mail.gmail.com>

On 8/11/05, James Y Knight <foom at fuhm.net> wrote:
>  If that's true, this is hardly worth discussing as
> 3.0 is never going to happen anyways.

You are wrong. So wrong.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Fri Aug 12 01:36:20 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 11 Aug 2005 19:36:20 -0400
Subject: [Python-Dev] [Python-checkins]
 python/dist/src/Objects setobject.c, 1.45, 1.46
In-Reply-To: <fb6fbf5605081114213b56ca8@mail.gmail.com>
Message-ID: <000801c59ecd$7afe5620$a838c797@oemcomputer>

[Jim Jewett]
> (1)  Is there a reason that you never shrink sets for
> discard/remove/pop?

Yes, to avoid adding an O(n) step to what would otherwise be an O(1)
operation.  These tight, granular methods are so fast that even checking
for potential resizes would impact their performance.

Also, I was keeping the dict philosophy of shrinking only when an item
is added.  That approach prevents thrashing in the face of a series of
alternating add/pop operations.

OTOH, practicality beats purity.  Set differencing demands some
downsizing code (see below).


> (set difference will do a one-time shrink, if there are enough dummy
> entries, but even then, it doesn't look at the %filled, so a
> merge-related overallocation will stick around)
> 
> I note that you do the same with dicts, but I think sets are a more
> natural candidate for "this is the set of things I still have to
> process, in any order".  (I suppose enforcing an order with deque may
> be faster -- unless I'm worried about duplicates.)

It is all about balancing trade-offs.  Dummies have very little impact
on iteration speed, it is the used/(mask+1) sparseness ratio that
matters.  Also, they have very little impact on lookup time unless the
table is nearly full (and it affects not-found searches more than
successful searches).  Resizing is not a cheap operation.  The right
balance is very likely application dependent.  For now, my goal is to
deviate from dict code only for clear improvements (i.e. lookups based
on entry rather than just the key).


> (2)  When adding an element, you check that
> 
>     if (!(so->used > n_used && so->fill*3 >= (so->mask+1)*2))
> 
> Is there any reason to use that +1?  Without it, resizes will happen one
> element sooner, but probably not much more often -- and you could
> avoid an add on every insert.
> (I suppose dictionaries have the same question.)

Without the +1, small dicts and sets could only hold four entries
instead of five (which has shown itself to be a better cutoff point).
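
The four-versus-five claim can be checked with a little arithmetic.  The
sketch below assumes the smallest table has 8 slots (mask == 7) and that
the test is applied after each insertion, exactly as in the quoted
condition; the function name is made up for illustration.

```python
def max_fill_without_resize(table_size, plus_one=True):
    """Largest fill for which fill*3 >= limit is still false."""
    mask = table_size - 1
    # With the +1 the limit is two thirds of the real table size;
    # without it, two thirds of the mask.
    limit = (mask + 1) * 2 if plus_one else mask * 2
    fill = 0
    while (fill + 1) * 3 < limit:
        fill += 1
    return fill


with_plus_one = max_fill_without_resize(8, plus_one=True)
without_plus_one = max_fill_without_resize(8, plus_one=False)
```

For an 8-slot table the limits are 16 and 14, so the sixth insertion
(18 >= 16) triggers a resize with the +1, and the fifth (15 >= 14) does
without it.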

Even if this didn't apply to sets, I still aspire to keep true to
dictobject.c.  That code has been thoroughly tested and tuned.  By
starting with mature code, I've saved years of evolution.  Also, there
is a maintenance benefit -- developers familiar with dictobject.c will
find setobject.c to be instantly recognizable.  There is only one new
trick, set_swap_bodies(), and that is thoroughly commented.


> (3)  In set_merge, when finding the new size, you use
> (so->fill + other->used)
> 
> Why so->fill?  If it is different from so->used, then the extras are
> dummy entries that it would be good to replace.
> (I note that dictobject does use ->used.)

The cop-out answer is that this is what is done in PyDict_Merge().

I believe the reasoning behind that design was to provide the best guess
as to the maximum amount of space that could be consumed by the
impending insertions.  If they will all fit, then resizing is skipped.
The approach reflects a design that values avoiding resizes more than it
values eliminating dummy entries.  AFAICT, dummy elimination is a
by-product of resizing rather than its goal.
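
To make the fill/used distinction concrete, here is a toy table in
Python.  It is a sketch under loud assumptions: it uses linear probing and
never resizes, which is not how setobject.c or dictobject.c probe, and the
class and sentinel names are invented.  It only shows that a deletion
leaves a dummy behind (used drops, fill does not) and that a later
insertion can reuse the dummy.

```python
_EMPTY = object()   # slot never used
_DUMMY = object()   # slot whose entry was deleted


class ToyTable:
    def __init__(self, size=8):
        self.slots = [_EMPTY] * size
        self.used = 0   # live entries
        self.fill = 0   # live entries plus dummies

    def _probe(self, key):
        # Linear probing; the caller must keep at least one slot empty.
        i = hash(key) % len(self.slots)
        while True:
            yield i
            i = (i + 1) % len(self.slots)

    def add(self, key):
        first_dummy = None
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _EMPTY:
                if first_dummy is not None:
                    self.slots[first_dummy] = key   # reuse: fill unchanged
                else:
                    self.slots[i] = key
                    self.fill += 1
                self.used += 1
                return
            if slot is _DUMMY:
                if first_dummy is None:
                    first_dummy = i
            elif slot == key:
                return                              # already present

    def discard(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is _EMPTY:
                return                              # not found
            if slot is not _DUMMY and slot == key:
                self.slots[i] = _DUMMY              # used drops, fill stays
                self.used -= 1
                return

    def merge_estimate(self, other):
        # Worst-case slot count after merging, as in (so->fill + other->used).
        return self.fill + other.used
```

Using fill in the estimate is the pessimistic choice: it assumes none of
the incoming keys lands on a dummy, which matches the "best guess as to
the maximum" reasoning above.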

With sets, I followed that design except for set differencing.  In
dictionaries, there is no parallel operation of mass deletion.  I had to
put in some control so that s-=t wouldn't leave a giant set with only a
handful of non-dummy entries.  This reflects the space saving goal for
the Py2.5 updates.  There was also a goal to eliminate redundant calls
to PyObject_Hash().  The nice performance improvement was an unexpected
bonus.


Your questions are good.  Thanks for reading the code and thinking about
it.  Hope you enjoy the new implementation which for the first time can
outperform dicts in terms of both space and speed.


Raymond


From raymond.hettinger at verizon.net  Fri Aug 12 01:44:54 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 11 Aug 2005 19:44:54 -0400
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <7E57B044-7521-4ADE-8FB4-1C3A8759117D@fuhm.net>
Message-ID: <000901c59ece$acdddd40$a838c797@oemcomputer>

[James Y Knight]
> Huh, I could *swear* we were talking about fixing things for
> 2.5...but I see at least the current version of the PEP says it's
> talking about 3.0. If that's true, this is hardly worth discussing as
> 3.0 is never going to happen anyways.
> 
> And here I was hoping this was an actual proposal. Ah well, then.

Whenever a 3.0 aimpoint is agreed upon, as much as possible will be
introduced before then (pretty much everything that doesn't break code).

Ideally, the final step to 3.0 will consist primarily of dropping obsolete
things that had been kept only for backwards compatibility.


Raymond


From anthony at interlink.com.au  Fri Aug 12 02:51:55 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu, 11 Aug 2005 17:51:55 -0700
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
Message-ID: <200508111751.56899.anthony@interlink.com.au>

So I'm currently planning for a 2.4.2 sometime around mid September. I figure 
we cut a release candidate either on the 7th or 14th, and a final a week 
later. 

In addition, I'd like to suggest we think about a first alpha of 2.5 sometime 
during March 2006, with a final release sometime around May-June. This would 
mean (assuming people are happy with this) we need to make a list of what's 
still outstanding for 2.5. There's a bunch of accepted PEPs that are waiting 
for code. Once that's done, there will be a final 2.4.3 sometime after or 
close to the 2.5 final release.

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From anthony at interlink.com.au  Fri Aug 12 03:02:42 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu, 11 Aug 2005 18:02:42 -0700
Subject: [Python-Dev] pdb: should next command be extended?
In-Reply-To: <Pine.LNX.4.58.0508081926010.2814@bagira>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
	<20050808154503.GB28005@panix.com>
	<Pine.LNX.4.58.0508081926010.2814@bagira>
Message-ID: <200508111802.44357.anthony@interlink.com.au>

On Monday 08 August 2005 20:13, Ilya Sandler wrote:
> > At OSCON, Anthony Baxter made the point that pdb is currently one of the
> > more unPythonic modules.
>
> What is unpythonic about pdb? Is this part of Anthony's presentation
> online? (Google found a summary and slides from presentation but they
> don't say anything about pdb's deficiencies)

It was a lightning talk, I'll put the slides up somewhere at some point.
My experience with pdb is that it's more or less impossible to extend or
subclass it in any way, and the code is pretty nasty. In addition, pretty
much everyone I asked "which modules in the std lib need to be seriously 
fixed" listed pdb first (and sometimes first, second and third). 

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From anthony at interlink.com.au  Fri Aug 12 03:14:10 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu, 11 Aug 2005 18:14:10 -0700
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F68C36.4090208@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>
	<42F68754.6090400@minkirri.apana.org.au>
	<42F68C36.4090208@v.loewis.de>
Message-ID: <200508111814.13072.anthony@interlink.com.au>

On Sunday 07 August 2005 15:33, Martin v. Löwis wrote:
> Ah, ok. That's true. It doesn't mean you can't do proper merging
> with subversion - it only means that it is harder, as you need to
> figure out the revision range that you want to merge.
>
> If this is too painful, you can probably use subversion to store
> the relevant information. For example, you could define a custom
> property on the directory, last_merge_from_trunk, which you
> would always update after you have done a merge operation. Then
> you don't have to look through history to find out when you
> last merged.

This is what I do with shtoom - I have properties branchURI and branchRev on 
the root of the branch. I can then use these when landing the branch. It 
seems to work well enough for me. 

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From bob at redivi.com  Fri Aug 12 03:18:30 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu, 11 Aug 2005 15:18:30 -1000
Subject: [Python-Dev] pdb: should next command be extended?
In-Reply-To: <200508111802.44357.anthony@interlink.com.au>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
	<20050808154503.GB28005@panix.com>
	<Pine.LNX.4.58.0508081926010.2814@bagira>
	<200508111802.44357.anthony@interlink.com.au>
Message-ID: <24EEDE5B-4511-40D4-9C16-8A33C4ACE1C8@redivi.com>


On Aug 11, 2005, at 3:02 PM, Anthony Baxter wrote:

> On Monday 08 August 2005 20:13, Ilya Sandler wrote:
>
>>> At OSCON, Anthony Baxter made the point that pdb is currently one of the
>>> more unPythonic modules.
>>
>> What is unpythonic about pdb? Is this part of Anthony's presentation
>> online? (Google found a summary and slides from presentation but they
>> don't say anything about pdb's deficiencies)
>
> It was a lightning talk, I'll put the slides up somewhere at some point.
> My experience with pdb is that it's more or less impossible to extend or
> subclass it in any way, and the code is pretty nasty. In addition, pretty
> much everyone I asked "which modules in the std lib need to be seriously
> fixed" listed pdb first (and sometimes first, second and third).

One thing PDB needs is a mode that runs as a background thread and  
opens up a socket so that another Python process can talk to it, for  
embedded/remote/GUI debugging.  This is what IDLE, Wing, and WinPDB
(haven't tried it yet
<http://www.digitalpeers.com/pythondebugger/index.html>) do.
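
A minimal sketch of the socket half of that idea, in present-day Python 3.
It relies only on pdb.Pdb accepting stdin and stdout file objects; the
function names, host and port are hypothetical, and it does not cover the
background-thread part: the debugger below runs in the thread that calls
it.

```python
import pdb
import socket


def make_socket_pdb(conn):
    """Return a Pdb instance that reads commands from and writes to conn."""
    stream = conn.makefile("rw")   # text-mode file object over the socket
    return pdb.Pdb(stdin=stream, stdout=stream)


def wait_for_debugger(host="127.0.0.1", port=4444):
    """Block until one client connects, then return a debugger wired to it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    conn, _addr = listener.accept()
    listener.close()
    return make_socket_pdb(conn)
```

The debugged process would call wait_for_debugger().set_trace() at the
point of interest, and any line-oriented TCP client in another process
could then send pdb commands.  Binding to 127.0.0.1 is deliberate, since
anyone who can connect gets to execute code.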

Unfortunately, most of the other Python IDEs run interpreters and
debuggers in-process, so it makes them unsuitable for developing GUI  
and embedded apps and opens you up for crashing the IDE as well as  
whatever code you're trying to fix.

-bob


From theller at python.net  Fri Aug 12 10:42:09 2005
From: theller at python.net (Thomas Heller)
Date: Fri, 12 Aug 2005 10:42:09 +0200
Subject: [Python-Dev] Exception Reorg PEP revised yet again
References: <bbaeab100508101524538e097c@mail.gmail.com>
	<000801c59e05$9e054de0$6f14c797@oemcomputer>
	<bbaeab1005081017063a4f9b15@mail.gmail.com>
Message-ID: <ek8zv95q.fsf@python.net>

Brett Cannon <bcannon at gmail.com> writes:

> On 8/10/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
>> > > Then I don't follow what you mean by "moved under os".
>> >
>> > In other words, to get the exception, do ``from os import
>> > WindowsError``.  Unfortunately we don't have a generic win module to
>> > put it under.  Maybe in the platform module instead?
>> 
>> -1 on either.  The WindowsError exception needs to be in the main exception
>> tree.  It occurs in too many different modules and applications.  That
>> is a good reason for being in the main tree.
>> 
>
> Where is it used so much?  In the stdlib, grepping for WindowsError
> recursively in Lib in 2.4 turns up only one module raising it
> (subprocess) and only two modules with a total of three places of
> catching it (ntpath once, urllib twice).  In Module, there are no
> hits.
>

I don't know how you've been grepping, but the Python api functions to
raise WindowsErrors are named like PyErr_SetFromWindowsErr() or so.

Typically, WindowsErrors are raised when Win32 API functions fail.
In the core extension modules, I find at least mmapmodule.c,
posixmodule.c, _subprocess.c, and _winreg.c raising them.  It may be a
bit hidden, because the docs for _winreg mention only EnvironmentError,
but they are wrong:

C:\>py
Python 2.5a0 (#60, Jul  4 2005, 19:53:27) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import _winreg
>>> _winreg.OpenKey(_winreg.HKEY_CLASSES_ROOT, "blah")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
WindowsError: [Errno 2] Das System kann die angegebene Datei nicht finden
>>>

>> If the name bugs you, I would support renaming it to PlatformError or
>> somesuch.  That would make it free for use with Mac errors and Linux
>> errors.  Also, it wouldn't tie a language feature to the name of an
>> MS product.
>> 
>
> I can compromise to this if others prefer this alternative.  Anybody
> else have an opinion?

Win32 has the FormatError() api to convert error codes into descriptions
- these descriptions are very useful, as are the error codes when you
catch errors in client code.
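
As an illustration of using the code and the description together in
client code, here is a portable sketch in present-day Python 3, where
WindowsError is an alias of OSError on Windows.  The helper names are
hypothetical, and the winerror attribute is assumed to be present only on
Windows, hence the getattr.

```python
import os


def describe_os_error(exc):
    """Format an OSError with its numeric code(s) and description."""
    parts = ["errno=%s" % exc.errno]
    winerror = getattr(exc, "winerror", None)   # only set on Windows
    if winerror is not None:
        parts.append("winerror=%s" % winerror)
    parts.append(exc.strerror or "unknown error")
    return ", ".join(parts)


def stat_or_report(path):
    """Return "ok" if path can be stat'ed, else a formatted error string."""
    try:
        os.stat(path)
        return "ok"
    except OSError as exc:
        return describe_os_error(exc)
```

The numeric code is what a program would branch on; the description is
what ends up in front of the user.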

I would say as long as the Python core contains win32 specific modules
like _winreg WindowsError should stay.  For the name, I have no
preference but I see no need to change it.

Thomas

PS: For ctypes, it doesn't matter if WindowsError stays or not.  No
problem to invent my own WindowsError if it goes away.


From tzot at mediconsa.com  Fri Aug 12 10:44:19 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Fri, 12 Aug 2005 11:44:19 +0300
Subject: [Python-Dev] Terminology for PEP 343
References: <000d01c57dbc$71df2420$1330cb97@oemcomputer>
	<2macl7xxpa.fsf@starship.python.net>
Message-ID: <ddhnhi$so$1@sea.gmane.org>

"Michael Hudson" <mwh at python.net> wrote in message 
news:2macl7xxpa.fsf at starship.python.net...
>
> Guard?  Monitor?  Don't really like either of these.
>

I know I am late, but since guard means something else, 'sentinel' (in the 
line of __enter__ and __exit__ interpretation) could be an alternative. 
Tongue in cheek. 


From dw at botanicus.net  Fri Aug 12 11:05:46 2005
From: dw at botanicus.net (David Wilson)
Date: Fri, 12 Aug 2005 10:05:46 +0100
Subject: [Python-Dev] dev listinfo page (was: Re:  Python + Ping)
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F05CC00@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F05CC00@au3010avexu1.global.avaya.com>
Message-ID: <42FC666A.90206@botanicus.net>

Hello,

Would it perhaps be an idea, given the number of users posting to the 
dev list, to put a rather obvious warning on the listinfo page:

    http://mail.python.org/mailman/listinfo/python-dev

Something like

<div style="margin: 1em; background: red; color: white; padding: 1ex; 
font-weight: bold;">
     Do not post general Python questions to this list! For help
     with Python please see the Python help page!
</div>


David.

From mwh at python.net  Fri Aug 12 13:29:49 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 12 Aug 2005 12:29:49 +0100
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
In-Reply-To: <200508111751.56899.anthony@interlink.com.au> (Anthony Baxter's
	message of "Thu, 11 Aug 2005 17:51:55 -0700")
References: <200508111751.56899.anthony@interlink.com.au>
Message-ID: <2m8xz7wfyq.fsf@starship.python.net>

Anthony Baxter <anthony at interlink.com.au> writes:

> So I'm currently planning for a 2.4.2 sometime around mid September. I figure 
> we cut a release candidate either on the 7th or 14th, and a final a week 
> later. 

Cool.  I'm not sure how many outstanding bugs should be fixed before
2.4.2.  Some stuff to do with files with PEP 263 style declarations?
(Walter?  I've lost track of these).

I think I should probably just check my fix for "PyXxx_Check(x) trusts
x->ob_type->tp_mro" (http://python.org/sf/1153075) in to both
branches, unless someone can think of a good reason not to (Armin?).
(The whole area could do with some work, really, but that's another
story).

> In addition, I'd like to suggest we think about a first alpha of 2.5 sometime 
> during March 2006, with a final release sometime around May-June. This would 
> mean (assuming people are happy with this) we need to make a list of what's 
> still outstanding for 2.5. There's a bunch of accepted PEPs that are waiting 
> for code. Once that's done, there will be a final 2.4.3 sometime after or 
> close to the 2.5 final release.

I have some outstanding patches:

1) My PEP 343 implementation (http://python.org/sf/1235943).  Needs
   reviewing, but docs are in another patch.  I also recently realized
   that my patch is incomplete, we should accept stuff like this:

      with cm as (a,b,c): ...

   where cm.__enter__ returns a 3-sequence.  My patch just allows a
   NAME after the 'as' pseudo-keyword (if anyone else wants to fix
   this, be my guest :)

2) The new-style exceptions patch (http://python.org/sf/1104669).
   This mostly needs documentation, but could also do with
   review/testing.

3) "explicit sign variable for longs" (http://python.org/sf/1177779).
   This is a user-invisible patch, really, so I'm not so concerned
   about it (I'd like to follow it up by emitting DeprecationWarnings
   on ob_size abuse in 2.6 and disallowing it in 2.7 -- or maybe we
   could even emit DeprecationWarnings in 2.5 already).

4) "__slots__ for subclasses of variable length types"
   (http://python.org/sf/1173475) -- this is very pie-in-the-sky and
   in fact the attached patch is completely broken, but I think work
   in this area would still be a good thing.  Review the others before
   looking at this one, please :)
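
For item 1, the tuple target looks like this once it is supported.  This
is a sketch assuming the PEP 343 semantics as they eventually shipped; the
names open_triple and unpack_in_with are invented, and contextlib's
contextmanager decorator stands in for a hand-written __enter__/__exit__
pair.

```python
from contextlib import contextmanager


@contextmanager
def open_triple():
    # The yielded value is what __enter__ returns: a 3-sequence.
    yield (1, 2, 3)


def unpack_in_with():
    # The parentheses around the target list are required.
    with open_triple() as (a, b, c):
        return a + b + c
```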

... then there's the ast-branch, of course ...

Is there a 2.5 release PEP yet?

Cheers,
mwh

-- 
  If i don't understand lisp, it would be wise to not bray about
  how lisp is stupid or otherwise criticize, because my stupidity
  would be archived and open for all in the know to see.
                                                -- Xah, comp.lang.lisp

From walter at livinglogic.de  Fri Aug 12 15:34:35 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 12 Aug 2005 15:34:35 +0200
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
In-Reply-To: <2m8xz7wfyq.fsf@starship.python.net>
References: <200508111751.56899.anthony@interlink.com.au>
	<2m8xz7wfyq.fsf@starship.python.net>
Message-ID: <42FCA56B.5060604@livinglogic.de>

Michael Hudson wrote:

> Anthony Baxter <anthony at interlink.com.au> writes:
> 
>>So I'm currently planning for a 2.4.2 sometime around mid September. I figure 
>>we cut a release candidate either on the 7th or 14th, and a final a week 
>>later. 
> 
> Cool.  I'm not sure how many outstanding bugs should be fixed before
> 2.4.2.  Some stuff to do with files with PEP 263 style declarations?
> (Walter?  I've lost track of these).

True, there's a whole bunch of them (mostly duplicates):

Bug #1076985: Incorrect behaviour of StreamReader.readline leads to 
crash (fixed)
Bug #1089395: segfault/assert in tokenizer (fixed)
Bug #1098990: codec readline() splits lines apart (fixed)
Bug #1163244: Syntax error on large file with MBCS encoding (open)
Bug #1175396: codecs.readline sometimes removes newline chars (open)
Bug #1178484: Erroneous line number error in Py2.4.1 (open)
Bug #1200686: SyntaxError raised on win32 for correct files (open, 
probably duplicate)
Bug #1211639: parser tells invalid syntax with correct code (duplicate)
Bug #1218930: Parser chokes on big files (duplicate)
Bug #1225059: Line endings problem with Python 2.4.1 cause imports to 
fail (duplicate)
Bug #1241507: StreamReader broken for byte string to byte string codecs 
(fixed)
Bug #1251631: Python 2.4.1 crashes when importing the attached script 
(open, probably duplicate)
Patch #1101726: Patch for potential buffer overrun in tokenizer.c (applied)

Most of them are fixed. #1178484 is waiting for a final OK.

Bye,
    Walter Dörwald

From barry at python.org  Fri Aug 12 15:40:22 2005
From: barry at python.org (Barry Warsaw)
Date: Fri, 12 Aug 2005 09:40:22 -0400
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
In-Reply-To: <200508111751.56899.anthony@interlink.com.au>
References: <200508111751.56899.anthony@interlink.com.au>
Message-ID: <1123854022.10627.2.camel@geddy.wooz.org>

On Thu, 2005-08-11 at 20:51, Anthony Baxter wrote:
> So I'm currently planning for a 2.4.2 sometime around mid September. I figure 
> we cut a release candidate either on the 7th or 14th, and a final a week 
> later. 

Cool.  I'd like to commit the patches in this bug report:

https://sourceforge.net/tracker/index.php?func=detail&aid=900092&group_id=5470&atid=105470

which fixes a long standing hotshot bug.  Any objections?

-Barry

From paolo_veronelli at libero.it  Fri Aug 12 17:14:24 2005
From: paolo_veronelli at libero.it (Paolino)
Date: Fri, 12 Aug 2005 17:14:24 +0200
Subject: [Python-Dev] set.remove feature/bug
Message-ID: <42FCBCD0.1000406@libero.it>

I can't contact sourceforge bug tracker sorry.

set.remove is trying to freeze sets when they are used as keys, no matter
whether a __hash__ method is defined.

This is inconsistent with Set.remove and dict.__delitem__ & co.

If this is a feature, then I strongly ask that the sets module be kept in
the stdlib forever.

Or if there is a workaround, please tell me here because python-list 
didn't help.

class H(set):
    def __hash__(self):return id(self)
s=H()

f=set()
f.add(s)
f.remove(s) #  this fails

Regards Paolino

From raymond.hettinger at verizon.net  Fri Aug 12 17:44:38 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 12 Aug 2005 11:44:38 -0400
Subject: [Python-Dev] set.remove feature/bug
In-Reply-To: <42FCBCD0.1000406@libero.it>
Message-ID: <002701c59f54$c0074ba0$dd1ecb97@oemcomputer>

[Paolino]
> I can't contact sourceforge bug tracker sorry.

I've added a bug report for you:
   www.python.org/sf/1257731


> set.remove is trying to freeze sets when they are used as keys. No
> matter if an __hash__ method is defined.

Will fix.  Feel free to email me off-list with any questions.
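
A minimal sketch of the behaviour being asked for, assuming a current
CPython 3 interpreter (where, as far as I can tell, the call no longer
fails).  The class H comes from the report; nothing here reflects the
actual patch:

```python
class H(set):
    # Hash by identity so that instances can be members of another set.
    def __hash__(self):
        return id(self)

s = H()

f = set()
f.add(s)
f.remove(s)   # expected to succeed, because H is hashable as it stands
print(len(f))
```

The point is only that a set subclass which supplies its own __hash__
should be looked up by that hash rather than frozen first.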


Raymond


From tzot at mediconsa.com  Fri Aug 12 18:21:57 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Fri, 12 Aug 2005 19:21:57 +0300
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
References: <200508111751.56899.anthony@interlink.com.au>
	<2m8xz7wfyq.fsf@starship.python.net>
Message-ID: <ddiib8$f07$1@sea.gmane.org>

"Michael Hudson" <mwh at python.net> wrote in message 
news:2m8xz7wfyq.fsf at starship.python.net...
> Anthony Baxter <anthony at interlink.com.au> writes:
>
>> So I'm currently planning for a 2.4.2 sometime around mid September. I 
>> figure
>> we cut a release candidate either on the 7th or 14th, and a final a week
>> later.
>
> Cool.  I'm not sure how many outstanding bugs should be fixed before
> 2.4.2.  Some stuff to do with files with PEP 263 style declarations?
> (Walter?  I've lost track of these).

This is a serious issue (spurious syntax errors).

One bug about files with encoding declarations is www.python.org/sf/1163244.
So far, it seems that source files having a size of f*n+x (where f is a
power of 2 like 512 or 1024, and x is some small indeterminate value)
occasionally fail to compile with spurious syntax errors.  (I once had a
file where the line reported with the "syntax error" was composed half of
the failing line and half of the line above; unfortunately I kept the file
for examination on a USB key that a colleague formatted.)  The syntax
errors disappear if the coding declaration is removed or if some blank
lines are inserted before the failing line.

I think this occurs only on Windows, so it should be something to do with 
line endings and buffering.

At the moment I'm trying to create a minimal file that fails when imported
with 2.4.1.  I'll update the case as soon as I have one, but I wanted to
draw some attention on python-dev in case it rings a bell.


From tzot at mediconsa.com  Fri Aug 12 18:32:16 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Fri, 12 Aug 2005 19:32:16 +0300
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
References: <200508111751.56899.anthony@interlink.com.au><2m8xz7wfyq.fsf@starship.python.net>
	<ddiib8$f07$1@sea.gmane.org>
Message-ID: <ddiiuj$gst$1@sea.gmane.org>

> At the moment I'm trying to create a minimal file that when imported fails
> with 2.4.1 .  I'll update the case as soon as I have one, but I wanted to
> draw some attention in python-dev in case it rings a bell.

Please ignore my previous message: through gmane I saw only mwh's message,
and after sending my reply, I got Walter's message.


From jcarlson at uci.edu  Fri Aug 12 18:44:09 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 12 Aug 2005 09:44:09 -0700
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
In-Reply-To: <200508111751.56899.anthony@interlink.com.au>
References: <200508111751.56899.anthony@interlink.com.au>
Message-ID: <20050812091538.7836.JCARLSON@uci.edu>

For 2.5a1...

Some exposure of _PyLong_AsByteArray() and _PyLong_FromByteArray() to
Python. There was a discussion about this almost a year ago
(http://python.org/sf/1023290), and no mechanism (struct format code
addition, binascii.tolong/fromlong, long.tostring/fromstring, ...)
actually made it into Python 2.4.  At this point, I'd be happy to get
/any/ mechanism, with a preference for struct and/or binascii (I'd put
them in both, if only because different groups of people may look
for them in both places, people who use one tend to like to use that
one for as much as possible, and the code additions in both are
minor).
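
For readers on a current interpreter: the conversion requested here is
available in Python 3.2 and later as int.to_bytes() and
int.from_bytes().  Neither existed in 2.4.  A minimal sketch follows;
the helper names long_to_bytes and bytes_to_long are made up for
illustration:

```python
def long_to_bytes(value, byteorder="big", signed=False):
    # Pick a byte count large enough to hold the value (at least one byte).
    bits = value.bit_length() + (1 if signed else 0)
    length = max(1, (bits + 7) // 8)
    return value.to_bytes(length, byteorder, signed=signed)

def bytes_to_long(data, byteorder="big", signed=False):
    # Inverse of long_to_bytes for the same byteorder and signedness.
    return int.from_bytes(data, byteorder, signed=signed)

n = 2**70 + 12345
packed = long_to_bytes(n)
print(len(packed), bytes_to_long(packed) == n)
```

The byte order and signedness have to be chosen by the caller, which is
much the same decision a struct format code would have encoded.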

 - Josiah


Anthony Baxter <anthony at interlink.com.au> wrote:
> 
> So I'm currently planning for a 2.4.2 sometime around mid September. I figure 
> we cut a release candidate either on the 7th or 14th, and a final a week 
> later. 
> 
> In addition, I'd like to suggest we think about a first alpha of 2.5 sometime 
> during March 2006, with a final release sometime around May-June. This would 
> mean (assuming people are happy with this) we need to make a list of what's 
> still outstanding for 2.5. There's a bunch of accepted PEPs that are waiting 
> for code. Once that's done, there will be a final 2.4.3 sometime after or 
> close to the 2.5 final release.
> 
> Anthony
> -- 
> Anthony Baxter     <anthony at interlink.com.au>
> It's never too late to have a happy childhood.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/jcarlson%40uci.edu


From bcannon at gmail.com  Fri Aug 12 19:00:38 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 12 Aug 2005 10:00:38 -0700
Subject: [Python-Dev] Exception Reorg PEP revised yet again
In-Reply-To: <ek8zv95q.fsf@python.net>
References: <bbaeab100508101524538e097c@mail.gmail.com>
	<000801c59e05$9e054de0$6f14c797@oemcomputer>
	<bbaeab1005081017063a4f9b15@mail.gmail.com> <ek8zv95q.fsf@python.net>
Message-ID: <bbaeab1005081210002035cf20@mail.gmail.com>

On 8/12/05, Thomas Heller <theller at python.net> wrote:
> Brett Cannon <bcannon at gmail.com> writes:
> 
> > On 8/10/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> >> > > Then I don't follow what you mean by "moved under os".
> >> >
> >> > In other words, to get the exception, do ``from os import
> >> > WindowsError``.  Unfortunately we don't have a generic win module to
> >> > put it under.  Maybe in the platform module instead?
> >>
> >> -1 on either.  The WindowsError exception needs to be in the main exception
> >> tree.  It occurs in too many different modules and applications.  That
> >> is a good reason for being in the main tree.
> >>
> >
> > Where is it used so much?  In the stdlib, grepping for WindowsError
> > recursively in Lib in 2.4 turns up only one module raising it
> > (subprocess) and only two modules with a total of three places of
> > catching it (ntpath once, urllib twice).  In Modules, there are no
> > hits.
> >
> 
> I don't know how you've been grepping, but the Python api functions to
> raise WindowsErrors are named like PyErr_SetFromWindowsErr() or so.
> 

Forgot to add that to the grep statement after I discovered that.

> Typically, WindowsErrors are raised when Win32 API functions fail.
> In the core extension modules, I find at least mmapmodule.c,
> posixmodule.c, _subprocess.c, and _winreg.c raising them.  It may be a
> bit hidden, because the docs for _winreg mention only EnvironmentError,
> but they are wrong:
> 
> C:\>py
> Python 2.5a0 (#60, Jul  4 2005, 19:53:27) [MSC v.1310 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import _winreg
> >>> _winreg.OpenKey(_winreg.HKEY_CLASSES_ROOT, "blah")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> WindowsError: [Errno 2] Das System kann die angegebene Datei nicht finden
> >>>
> 
> >> If the name bugs you, I would support renaming it to PlatformError or
> >> somesuch.  That would make it free for use with Mac errors and Linux
> >> errors.  Also, it wouldn't tie a language feature to the name of an
> >> MS product.
> >>
> >
> > I can compromise on this if others prefer this alternative.  Anybody
> > else have an opinion?
> 
> Win32 has the FormatError() api to convert error codes into descriptions
> - these descriptions are very useful, as are the error codes when you
> catch errors in client code.
> 
> I would say as long as the Python core contains win32 specific modules
> like _winreg, WindowsError should stay.  For the name, I have no
> preference, but I see no need to change it.
> 

OK, then it will just stay as-is.

People can expect an updated PEP sometime this weekend.

-Brett

From python at rcn.com  Fri Aug 12 19:14:11 2005
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 12 Aug 2005 13:14:11 -0400
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
Message-ID: <001001c59f61$43a72a00$9023a044@oemcomputer>

[Josiah]
> At this point, I'd be happy to get
> /any/ mechanism, with a preference to struct and/or binascii 

Assign 1023290 to me and I'll get it done in the next month or so.


Raymond

From martin at v.loewis.de  Fri Aug 12 23:51:35 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 12 Aug 2005 23:51:35 +0200
Subject: [Python-Dev] plans for 2.4.2 and 2.5a1
In-Reply-To: <200508111751.56899.anthony@interlink.com.au>
References: <200508111751.56899.anthony@interlink.com.au>
Message-ID: <42FD19E7.6060108@v.loewis.de>

Anthony Baxter wrote:
> So I'm currently planning for a 2.4.2 sometime around mid September. I figure 
> we cut a release candidate either on the 7th or 14th, and a final a week 
> later. 

I'm returning only on Sep 12 from vacation, so the Windows binaries of a
release candidate would have to wait until that Monday; the 14th would
suit me better. Unfortunately, I'm likely also travelling on 19..23,
so the final release would have to wait until Sep 24 or so.

Regards,
Martin

From skip at pobox.com  Sat Aug 13 02:51:38 2005
From: skip at pobox.com (skip@pobox.com)
Date: Fri, 12 Aug 2005 19:51:38 -0500
Subject: [Python-Dev] Hosting svn.python.org
In-Reply-To: <20050812231416.638DB1E4005@bag.python.org>
References: <20050812231416.638DB1E4005@bag.python.org>
Message-ID: <17149.17434.157348.230440@montanaro.dyndns.org>


    martin> Log Message:
    martin> Add wush.net hosting.
    ...
    martin> +  * Greg Stein suggested http://www.wush.net/subversion.php. 
    ...

I will enthusiastically cast my vote for tummy.com, Sean Reifschneider's
company.  Mojam leases a server there (Mojam & Musi-Cal websites running
CentOS 4, Apache+mod_perl, Python, Mason, MySQLdb, Mailman, etc).  Their
service has been absolutely awesome.  Sean is one of the python.org
webmasters to boot, so he knows our culture very well already.

Skip

From goodger at python.org  Sat Aug 13 03:57:46 2005
From: goodger at python.org (David Goodger)
Date: Fri, 12 Aug 2005 21:57:46 -0400
Subject: [Python-Dev] new PEP type: Process
Message-ID: <42FD539A.1060407@python.org>

Barry Warsaw and I, the PEP editors, have been discussing the need for a new PEP
type lately.  Martin von Löwis' PEP 347 was a prime example of a PEP that didn't
fit into the existing Standards Track and Informational categories.  We agreed
upon a new "Process" PEP type.  For more information, please see PEP 1
(http://www.python.org/peps/pep-0001.html) -- the type of which has also been
changed to Process.

Other good examples of Process PEPs are the release schedule PEPs, and I
understand there may be a new one soon.

(Please cc: any PEP-related mail to peps at python.org)

-- 
David Goodger <http://python.net/~goodger>


From kbenoit at opersys.com  Thu Aug 11 04:59:12 2005
From: kbenoit at opersys.com (Kristian Benoit)
Date: Wed, 10 Aug 2005 22:59:12 -0400
Subject: [Python-Dev] xml.parsers.expat no userdata in callback functions
Message-ID: <1123729152.23755.3.camel@localhost>

In the C version of expat, handlers receive a void *userdata, but this is
not the case in the Python version.

This means one can't parse multiple files at the same time using the same
handlers. You can't pass the current context to the handler; you must
base your code on global variables, which is not so nice.

Thanks

Please leave the cc in the mail header as I'm not subscribed to the
list.

Kristian


From senko.rasic at gmail.com  Fri Aug 12 01:40:03 2005
From: senko.rasic at gmail.com (Senko Rasic)
Date: Fri, 12 Aug 2005 01:40:03 +0200
Subject: [Python-Dev] Extension to dl module to allow passing strings from
	native function
Message-ID: <48bbc5810508111640a6bd03e@mail.gmail.com>

Hi all,

recently I tried to use the dl module to interface with an external C
function.  (It was a quick hack, so I didn't want to write wrapper code.)
To my dismay I learned that the call method doesn't allow passing data by
reference (since strings are immutable in Python), but passing pointers
around and modifying the caller's data is done all the time in C, so that
makes dl practically useless.

I've hacked the method to allow mutable data, by allocating temporary
buffers for all string arguments passed to it, calling the C function,
and then constructing new strings from the data in those buffers and
returning them in a tuple together with the function's return code.

Combined with pack/unpack from the struct module, this allows passing any
structure to and from the external C function, so, imho, it's a useful
thing to have. To my knowledge, this functionality can't be achieved in
pure Python programs, and there's no alternative dynamic loader module
that can do it.

More info with examples:
   http://ptlo.blogspot.com/2005/08/pyinvoke.html
Source:
   http://software.senko.net/pub/python-dl2.tgz

(The tarball contains setup.py and my version of dlmodule.c, for
experimenting without patching the official module, and a patch made
against a fairly recent CVS version of dlmodule.c.)

Thoughts, comments? Could this be put in the standard module, does it make
sense, etc.?

Regards,
Senko

-- 
Senko Rasic <senko at senko dot net>

From aahz at pythoncraft.com  Sat Aug 13 06:51:26 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 12 Aug 2005 21:51:26 -0700
Subject: [Python-Dev] new PEP type: Process
In-Reply-To: <42FD539A.1060407@python.org>
References: <42FD539A.1060407@python.org>
Message-ID: <20050813045125.GA1985@panix.com>

On Fri, Aug 12, 2005, David Goodger wrote:
>
> Barry Warsaw and I, the PEP editors, have been discussing the
> need for a new PEP type lately.  Martin von Löwis' PEP 347 was
> a prime example of a PEP that didn't fit into the existing
> Standards Track and Informational categories.  We agreed upon a
> new "Process" PEP type.  For more information, please see PEP 1
> (http://www.python.org/peps/pep-0001.html) -- the type of which has
> also been changed to Process.

Go ahead and make PEP 6 a Process PEP.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From raymond.hettinger at verizon.net  Sat Aug 13 13:34:34 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 13 Aug 2005 07:34:34 -0400
Subject: [Python-Dev] Discussion draft: Proposed Py2.5 C API for set and
	frozenset objects
Message-ID: <000701c59ffa$fb54b480$772dc797@oemcomputer>

The object and types
--------------------
PySetObject                     subclass of object
                                used for both sets and frozensets
PySet_Type                      a basetype
PyFrozenSet_Type                a basetype

The type check macros
---------------------
PyFrozenSet_CheckExact(ob)      a frozenset
PyAnySet_CheckExact(ob)         a set or frozenset
PyAnySet_Check(ob)              a set, frozenset, or subclass of either

The constructors
----------------
obj PySet_New(it)               takes an iterable or NULL; returns new ref
obj PyFrozenSet_New(it)         takes an iterable or NULL; returns new ref

The fine grained methods
------------------------
int PySet_Size(so)

int PySet_Contains(so, key)     1 for yes; 0 for no; -1 for error
                                raises TypeError for unhashable key
                                does not automatically convert to frozenset

int PySet_Add(so, key)          0 for success; -1 for error
                                raises TypeError for unhashable key
                                raises MemoryError if no room to grow

obj PySet_Pop(so)               return new ref or NULL on failure
                                raises KeyError if set is empty

int PySet_Discard(so, key)      1 if found and removed
                                0 if not found (does not raise KeyError)
                                -1 on error
                                raises TypeError for unhashable key
                                does not automatically convert to frozenset


Coarse grained methods left for access through PyObject_CallMethod()
--------------------------------------------------------------------
copy, clear, union, intersection, difference, symmetric_difference, update,
intersection_update, difference_update, symmetric_difference_update,
issubset, issuperset, __reduce__

Other functions left for access through the existing abstract API
-----------------------------------------------------------------
PyObject_RichCompareBool()
PyObject_Hash()
PyObject_Repr()
PyObject_IsTrue()
PyObject_Print()
PyObject_GetIter()
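
For orientation, a rough Python-level sketch of the semantics listed
above.  It exercises the ordinary set methods, not the proposed C
functions, and the pairing of each method with a C function is an
assumption made for illustration only:

```python
s = {1, 2, 3}

# Like PySet_Discard: a missing key is not an error.
s.discard(99)

# Like PySet_Contains and PySet_Add: an unhashable key raises TypeError.
try:
    [] in s
    unhashable_rejected = False
except TypeError:
    unhashable_rejected = True

# Like PySet_Pop: popping from an empty set raises KeyError.
try:
    set().pop()
    empty_pop_rejected = False
except KeyError:
    empty_pop_rejected = True

# Python-level lookups convert a set key to a frozenset automatically;
# the draft states that the C functions do not.
converted_lookup = {1, 2} in {frozenset({1, 2})}
print(unhashable_rejected, empty_pop_rejected, converted_lookup)
```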


From martin at v.loewis.de  Sat Aug 13 13:47:10 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 Aug 2005 13:47:10 +0200
Subject: [Python-Dev] Hosting svn.python.org
In-Reply-To: <17149.17434.157348.230440@montanaro.dyndns.org>
References: <20050812231416.638DB1E4005@bag.python.org>
	<17149.17434.157348.230440@montanaro.dyndns.org>
Message-ID: <42FDDDBE.3040903@v.loewis.de>

skip at pobox.com wrote:
> I will enthusiastically cast my vote for tummy.com, Sean Reifschneider's
> company.  Mojam leases a server there (Mojam & Musi-Cal websites running
> CentOS 4, Apache+mod_perl, Python, Mason, MySQLdb, Mailman, etc).  Their
> service has been absolutely awesome.

But we don't want to lease a server - we are looking for a Subversion
hoster. If we *just* wanted a server, there would be no reason to
drop (*) the current svn.python.org.

So what precisely is the Subversion offer of tummy.com ($ per month
for what disk limit, monthly download limit, number of developers
limit, backup service, email notification, ability for offsite
download of the repository tarball, what access method (is svn+ssh
supported, anonymous WebDAV))?

In case this isn't clear yet: several people are concerned that
running the Python svn repository by volunteers will risk service
outage, and unnecessarily consume volunteer resources. So just replacing
the machine we get for free now with a machine we have to pay for
won't do any good.

I understand that I could now go to tummy.com, contact them, and
research all details myself. But I'm not willing to: everybody
who wants to suggest a different service should find out all the
details of that service, and report them so I can include them
in the PEP.

Regards,
Martin

(*) This PEP is actually not at all about svn.python.org, and the
pydotorg SVN repository. Those are in the realm of the infrastructure
committee, and they do a great job. The PEP is *only* about
migrating the Python source code proper from CVS (along with
the other code snippets that are in that CVS).

From martin at v.loewis.de  Sat Aug 13 13:50:06 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 Aug 2005 13:50:06 +0200
Subject: [Python-Dev] xml.parsers.expat no userdata in callback functions
In-Reply-To: <1123729152.23755.3.camel@localhost>
References: <1123729152.23755.3.camel@localhost>
Message-ID: <42FDDE6E.2050309@v.loewis.de>

Kristian Benoit wrote:
> This means one can't parse multiple files at the same time using the same
> handlers. You can't pass the current context to the handler; you must
> base your code on global variables, which is not so nice.

This is not true. You can create multiple parsers, and then make the
callback functions bound methods, using self to store parse-specific
data. There is no need to have extra callback data.
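
A minimal sketch of that approach, assuming a current Python 3
interpreter.  The class name ElementCounter and its methods are
hypothetical; only ParserCreate, StartElementHandler and Parse come from
xml.parsers.expat:

```python
import xml.parsers.expat

class ElementCounter:
    """Per-parse state lives on the instance, not in globals."""

    def __init__(self):
        self.counts = {}
        self.parser = xml.parsers.expat.ParserCreate()
        # The bound method carries self along, playing the role of userdata.
        self.parser.StartElementHandler = self.start_element

    def start_element(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1

    def feed(self, data, final=False):
        self.parser.Parse(data, final)

a = ElementCounter()
b = ElementCounter()
# Interleave two documents; each parser keeps its own counts.
a.feed("<root><item/>")
b.feed("<doc><p/><p/>")
a.feed("<item/></root>", True)
b.feed("</doc>", True)
print(a.counts)
print(b.counts)
```

Both parsers use the same handler code, yet neither sees the other's
state, which is the job userdata does in the C API.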

Regards,
Martin

From martin at v.loewis.de  Sat Aug 13 13:56:48 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 Aug 2005 13:56:48 +0200
Subject: [Python-Dev] Extension to dl module to allow passing strings
 from	native function
In-Reply-To: <48bbc5810508111640a6bd03e@mail.gmail.com>
References: <48bbc5810508111640a6bd03e@mail.gmail.com>
Message-ID: <42FDE000.9080508@v.loewis.de>

Senko Rasic wrote:
> Thoughts, comments? Could this be put in standard module, does it make
> sense, etc?

Are you aware of the ctypes module?

http://starship.python.net/crew/theller/ctypes/
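
A minimal sketch of the buffer-passing idea with ctypes (a third-party
package at the time, part of the standard library since Python 2.5),
assuming a current Python 3 interpreter.  ctypes.memmove stands in here
for any external C function that fills a caller-supplied buffer, and the
record layout is invented for the example:

```python
import ctypes
import struct

# Pack a C-style record { int32 id; int16 x; int16 y; } into bytes.
record = struct.pack("<ihh", 7, -2, 300)

# A mutable buffer that native code is allowed to write into.
buf = ctypes.create_string_buffer(len(record))

# memmove is called as a foreign function and writes into buf in place.
ctypes.memmove(buf, record, len(record))

# Read the modified buffer back and decode it with struct.
fields = struct.unpack("<ihh", buf.raw)
print(fields)
```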

Regards,
Martin

From goodger at python.org  Sat Aug 13 14:38:36 2005
From: goodger at python.org (David Goodger)
Date: Sat, 13 Aug 2005 08:38:36 -0400
Subject: [Python-Dev] new PEP type: Process
In-Reply-To: <20050813045125.GA1985@panix.com>
References: <42FD539A.1060407@python.org> <20050813045125.GA1985@panix.com>
Message-ID: <42FDE9CC.8030008@python.org>

[Aahz]
> Go ahead and make PEP 6 a Process PEP.

Done!

--
David Goodger <http://python.net/~goodger>

From wsanchez at wsanchez.net  Sat Aug 13 18:02:00 2005
From: wsanchez at wsanchez.net (=?ISO-8859-1?Q?Wilfredo_S=E1nchez_Vega?=)
Date: Sat, 13 Aug 2005 09:02:00 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <bbaeab1005080320431cfca77@mail.gmail.com>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
Message-ID: <6AA5C3D1-6AEC-4CE4-AEB9-84FBDA10EFA9@wsanchez.net>

   I'm curious about why Python lacks FileNotFoundError,  
PermissionError and the like as subclasses of IOError.

   Catching IOError and looking at errno to figure out what went  
wrong seems pretty unpythonic, and I've often wished for built-in  
subclasses of IOError.

   I sometimes subclass them myself, but a lot of the time, I'm  
catching such exceptions as thrown by the standard library.
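
For comparison, a minimal sketch of the two styles, assuming Python 3.3
or later, where PEP 3151 added FileNotFoundError and PermissionError as
built-in subclasses of OSError (of which IOError is now an alias).  The
function names are hypothetical:

```python
import errno
import os
import tempfile

def classify_by_errno(path):
    # The style described above: catch broadly, then inspect errno.
    try:
        with open(path):
            return "ok"
    except IOError as exc:
        if exc.errno == errno.ENOENT:
            return "missing"
        if exc.errno == errno.EACCES:
            return "denied"
        raise

def classify_by_subclass(path):
    # The style wished for: catch the specific subclass directly.
    try:
        with open(path):
            return "ok"
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "denied"

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing.txt")
    print(classify_by_errno(missing), classify_by_subclass(missing))
```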

     -wsv


From gvanrossum at gmail.com  Sat Aug 13 23:02:54 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 13 Aug 2005 14:02:54 -0700
Subject: [Python-Dev] xml.parsers.expat no userdata in callback functions
In-Reply-To: <42FDDE6E.2050309@v.loewis.de>
References: <1123729152.23755.3.camel@localhost> <42FDDE6E.2050309@v.loewis.de>
Message-ID: <ca471dc2050813140273d32115@mail.gmail.com>

> Kristian Benoit wrote:
> > This means one can't parse multiple files at the same time using the same
> > handlers. You can't pass the current context to the handler; you must
> > base your code on global variables, which is not so nice.
>
"Martin v. Löwis" replied:
> This is not true. You can create multiple parsers, and then make the
> callback functions bound methods, using self to store parse-specific
> data. There is no need to have extra callback data.

What he said. Kristian's complaint is probably a common misconception
about Python -- not too many languages have unified the concepts of
"bound methods" and "callables" so completely as Python. Every
callable is in a sense a closure (or can be). Nested functions are
other examples.
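
A minimal sketch of the nested-function case, assuming a current
Python 3 interpreter.  The function name collect_text is hypothetical;
the handler closes over a local list instead of taking a userdata
argument:

```python
import xml.parsers.expat

def collect_text(document):
    chunks = []  # state captured by the nested function below

    def on_text(data):
        # expat may deliver text in several pieces; keep them all.
        chunks.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = on_text
    parser.Parse(document, True)
    return "".join(chunks)

print(collect_text("<a>Hello, <b>world</b>!</a>"))
```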

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Sat Aug 13 23:27:22 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 13 Aug 2005 14:27:22 -0700
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FBA376.5030605@canonical.com>
References: <42FBA376.5030605@canonical.com>
Message-ID: <ca471dc20508131427b6aa0d4@mail.gmail.com>

With permission, I'm forwarding an email from Mark Shuttleworth about
Bazaar-2 (aka Bazaar-NG), a distributed source control system (not
entirely unlike bitkeeper, I presume) written in Python and in use by
the Ubuntu system. What do people think of using this for Python? Is
it the right model? Do we want to encourage widespread experimentation
with the Python source code?

--Guido van Rossum (home page: http://www.python.org/~guido/)

---------- Forwarded message ----------
From: Mark Shuttleworth <mark at canonical.com>
Date: Aug 11, 2005 12:13 PM
Subject: Distributed RCS
To: Guido van Rossum <gvanrossum at gmail.com>
Cc: Steve Alexander <steve at canonical.com>, Martin Pool
<mbp at canonical.com>, Fredrik Lundh <fredrik at pythonware.com>


Hi Guido

Steve forwarded your mail to me regarding distributed revision control,
so I thought I'd follow up with some thoughts on why I agree with
Fredrik Lundh that it's important, and where we are going with the
Bazaar project.

First, distributed RCS systems reduce the barrier to participation.
Anybody can create their own branches, and begin work on features, with
full revision control support, without having to coordinate with the
core RCS server sysadmin. So, for example, if someone gets an idea to
work on PEP X, they can simply create a branch, and start hacking on it
locally, with full RCS functionality like commit and undo, and logs of
their changes over time. They can easily merge continually from the
trunk, to keep their branch up to date. And they can publish their
branch using only a web server.

With Bazaar, these branches can be personal or shared group branches.

The net effect of this is to make branching a core part of the
development process. Each feature gets developed on a branch, and then
merged when it's ready. Instead of passing patches around in email, you
find yourself passing out branch references, which are much easier to
deal with since they are always "up to date". In Launchpad, we have
evolved to work around this branch-per-feature approach, and built a
review process so that each branch gets a review before the code is
merged to the trunk.

It also has a positive social impact, because groups that are interested
in a feature can begin to collaborate on it immediately rather than
waiting to get consensus from everybody else. They just start their
branch and get more testing when it is reaching a reasonable state of
maturity; then the project decides whether or not it lands. That
results in less argument about whether or not a feature is a good idea
before anybody really knows what it's going to look like. Those who are
interested participate, and those who aren't reserve judgement till
it's done.

As for Bazaar, we have just wrapped up our latest sprint, where we
decided that bazaar-ng (bzr), which is being written in Python by Martin
Pool, will become Bazaar 2.x, in the first quarter of 2006. The current
1.x line of development has served us well, but the ideas we developed
and which have been implemented as a working bazaar-ng reference by
Martin are now proven enough that I'm committing the project (Ubuntu,
and all of Launchpad) to it. Martin will continue to work on it full
time, and will be joined by the current Bazaar 1.x team, Robert Collins,
David Allouche and James Blackwell. That makes for a substantial chunk
of resources but I think it's worth it because we need a truly superb
free revision control system when dealing with something as large and
complex as an entire distribution.

The whole of Ubuntu will be in Bazaar in due course. Currently, we have
about 500 upstreams published in the Bazaar 1.x format (see
http://bazaar.ubuntu.com/ for the list). All of those will be converted
to Bazaar 2.x, and in addition we will continue to publish more and more
upstreams in the 2.x archive format. We actively convert CVS and SVN
upstreams and publish them in the Bazaar format to allow us to use a
single, distributed revision control system across all of those
packages. So there's a lot of real-world data and real-world coding
going on with Bazaar as the RCS holding it all together.

Perhaps more importantly, we are integrating Bazaar tightly with the
other Launchpad applications, Rosetta and Malone. This means that bug
tracking and translation will be "branch aware". You will be able to
close a bug by noting that a commit in one of your branches fixes the
bug, then merging it into the relevant mainline branch, and have the
Launchpad bug tracker automatically mark the bug as closed, if you wish.
Similarly you will be able to get the latest translations just by
merging from the branch published by Rosetta that has the latest
translations in it for your application.

The combination of distributed revision control, and ultimately
integrated bug tracking and translations, will I think be a very
efficient platform for collaborative development.

Bazaar is free, and the use of Launchpad is free though we have not yet
released the code to the web services for bug tracking and translation.

I hope that puts Bazaar into perspective for you. Give it a spin: the
development 2.x codebase is robust enough now to handle a line of
development and do basic merging. We are switching our own development
to the pre-release 2.x line in October, and we will switch over all the
public archives we maintain around March next year.

Cheers,
Mark

From skip at pobox.com  Sun Aug 14 01:00:37 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 13 Aug 2005 18:00:37 -0500
Subject: [Python-Dev] cvs to bzr?
Message-ID: <17150.31637.180169.877441@montanaro.dyndns.org>

Based on the message Guido forwarded, I installed bazaar-ng.  From Mark's
note it seems they convert cvs repositories to bzr repositories, but I
didn't see any mention in the bzr docs of any sort of cvs2bzr tool.
Likewise, Google didn't turn up anything obvious.  Anyone know of something?

Thx,

Skip


From nas at arctrix.com  Sun Aug 14 02:02:40 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 13 Aug 2005 18:02:40 -0600
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <ca471dc20508131427b6aa0d4@mail.gmail.com>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
Message-ID: <20050814000240.GA5470@mems-exchange.org>

On Sat, Aug 13, 2005 at 02:27:22PM -0700, Guido van Rossum wrote:
> What do people think of using this for Python?

I think it deserves consideration.  One idea would be to have a
Bazaar-NG repository that tracks the CVS SF repository.  I haven't
tried it yet but there is a tool called Tailor[1] that automates the
task.  That would give people a chance to experiment with Bazaar-NG
(and still work when SF is down) without committing to it.

> Is it the right model? Do we want to encourage widespread
> experimentation with the Python source code?

I think Python works fairly well with the centralized model.
However, I expect it's hard to know what we are missing.

  Neil

1. http://darcs.net/DarcsWiki/Tailor

From nas at arctrix.com  Sun Aug 14 02:03:46 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 13 Aug 2005 18:03:46 -0600
Subject: [Python-Dev] cvs to bzr?
In-Reply-To: <17150.31637.180169.877441@montanaro.dyndns.org>
References: <17150.31637.180169.877441@montanaro.dyndns.org>
Message-ID: <20050814000346.GB5470@mems-exchange.org>

On Sat, Aug 13, 2005 at 06:00:37PM -0500, skip at pobox.com wrote:
> Based on the message Guido forwarded, I installed bazaar-ng.  From Mark's
> note it seems they convert cvs repositories to bzr repositories, but I
> didn't see any mention in the bzr docs of any sort of cvs2bzr tool.

Haven't tried it but should work:

    http://darcs.net/DarcsWiki/Tailor 

From gvanrossum at gmail.com  Sun Aug 14 02:11:05 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 13 Aug 2005 17:11:05 -0700
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FD2A13.8000900@canonical.com>
References: <42FBA376.5030605@canonical.com> <42FD2A13.8000900@canonical.com>
Message-ID: <ca471dc205081317119d41003@mail.gmail.com>

Another fwd, describing how Steve Alexander's group uses Bazaar.

--Guido van Rossum (home page: http://www.python.org/~guido/)

---------- Forwarded message ----------
From: Steve Alexander <steve at canonical.com>
Date: Aug 12, 2005 4:00 PM
Subject: Re: Distributed RCS
To: Guido van Rossum <gvanrossum at gmail.com>
Cc: Mark Shuttleworth <mark at canonical.com>, Martin Pool
<mbp at canonical.com>, Fredrik Lundh <fredrik at pythonware.com>


Hi Guido,

I'm not going to post to python-dev just now, because I'm leaving on 1.5
weeks vacation tomorrow, and I'd rather be absent than unable to answer
questions promptly.

Martin Pool will be around next week, and will be able to take part in
discussions on the list.

Feel free to post all or part of Mark's or my emails to the python lists.

Mark wrote:
>
> I hope that puts bazaar into perspective for you. Give it a spin - the
> development 2.x codebase is robust enough now to handle a line of
> development and do basic merging, we are switching our own development
> to the pre-release 2.x line in October, and we will switch over all the
> public archives we maintain in around March next year.

A large part of the internal development at Canonical is the Launchpad
system.  This is about 30-40 kloc of Python code, including various
Twisted services, cron scripts, a Zope 3 web application, database
tools, ...

It's being worked on by 20 software developers.  Everyone uses bazaar
1.4 or 1.5, and around October, we'll be switching to use bazaar 2.x.

I'll describe how we work on Launchpad using Bazaar.  This is all from
the Bazaar 1.x perspective, and some things will become simpler when we
change to using Bazaar 2.x.

I've left the description quite long, as I hope it will give you some of
the flavour of working with a distributed RCS.


== Two modes of working: shared branches and PQM ==

Bazaar supports two different modes of working for a group like the
Launchpad team.

1. There's a shared read/write place that all the developers have access
to.  This contains the branches we release from, and represents the
"trunk" of the codebase.

2. A "virtual person" called the "patch queue manager" (PQM) has
exclusive write access to a collection of branches.  PQM takes
instructions as GPG signed emails from launchpad developers, to merge
their code into PQM's branches.

We use the latter mode because we have PQM configured not only to accept
requests to merge code into PQM's codebase, but to run all the tests
first and refuse to merge if any test fails.
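
A minimal sketch of the gating behaviour described above, in Python.
The names ('MergeRequest', 'process_request', and the 'run_tests' and
'merge' callables) are hypothetical and are not taken from PQM itself;
the sketch only illustrates that a merge lands when, and only when, the
test run succeeds.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MergeRequest:
    """Hypothetical stand-in for one signed merge request sent to PQM."""
    source_branch: str
    target_branch: str


def process_request(
    request: MergeRequest,
    run_tests: Callable[[MergeRequest], bool],
    merge: Callable[[MergeRequest], None],
) -> str:
    """Merge only if the full test run passes; otherwise refuse."""
    if not run_tests(request):
        return "refused: tests failed"
    merge(request)
    return "merged"


# Usage with stub callables standing in for the real test suite and merge.
landed: List[str] = []
req = MergeRequest(
    "steve.alexander at canonical.com/launchpad--ImproveLogins--0",
    "rocketfuel at canonical.com/launchpad--devel--0",
)
ok = process_request(req, lambda r: True, lambda r: landed.append(r.source_branch))
bad = process_request(req, lambda r: False, lambda r: landed.append(r.source_branch))
```

Passing the test runner and the merge step in as callables keeps the
gate itself trivial to test without a real archive.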


== The typical flow of work on Launchpad ==

Say I want to work on some new feature for Launchpad.  What do I do?

1. I use 'baz switch' to change my working tree from whatever I was
working on last, and make it become PQM's latest code.

  baz switch rocketfuel at canonical.com/launchpad--devel--0

 "rocketfuel" is the code-name for the branches we release our
 code from.  PQM manages the rocketfuel branches.  In Bazaar 1.x,
 collections of branches are called "archives" and are identified
 by an email address plus some other optional information.
 So, "rocketfuel at canonical.com" is PQM's email address.
 "launchpad--devel--0" is simply the name of the main launchpad
 branch.  The format of branch names is very strict in Bazaar 1.x.
 It is much more open in Bazaar 2.x.


2. I use 'baz branch' to create my own branch of this code that I can
commit changes to.

  baz branch steve.alexander at canonical.com/launchpad--ImproveLogins--0

 My archive is called "steve.alexander at canonical.com".  The branch
 will be used to work on the login functionality of Launchpad, so
 I have named the branch "launchpad--ImproveLogins--0".


3. I hack on the code, and from time to time commit my changes.  I need
to 'baz add' new files and directories, and 'baz rm' to remove files,
and 'baz mv' to move files around.

  # hack hack hack
  baz commit -s "Refactored the whatever.py module."
  # hack hack hack
  baz del whatever_deprecated.py
  baz commit -s "Removed deprecated whatevers."
  # hack hack hack


4. Let's say I hacked on some stuff, but I didn't commit it.  I don't
like what I did, and I want to start again.

  # hack hack hack
  baz undo

 'baz undo' puts the source code back into the state it was in after the
last commit, and puts the changes somewhere.  If I change my mind again,
I can say 'baz redo', and get my changes back.


5. All this hacking and committing has been happening on my own
workstation, without a connection to the internet.  Perhaps I've been on
a plane or at a cafe.  When I have a connection again, I can make my
work available for others to see by mirroring my code to a central
location.  Each Launchpad developer has a mirror of the archive they use
for Launchpad work on a central machine at the Canonical data centre.
In our case, the mirror command uses sftp to copy the latest changes I
have made into the mirror on this central server.

  baz archive-mirror


6. Because we have a strict code review process for Launchpad
development, I can't (or rather, shouldn't) submit my changes to PQM
yet.  I should get it reviewed.  But, let's say Andrew wants to do some
work that depends on my work, before my work has made its way into PQM's
rocketfuel "Trunk".  He can simply merge from me.

  # in Andrew's working tree, on his workstation.
  baz merge steve.alexander at canonical.com/launchpad--ImproveLogins--0
  baz commit -s "Merged steve's ImproveLogins work."

 When Andrew eventually gets his work reviewed, and sends it on to PQM
to be merged into Rocketfuel, the Bazaar merging algorithms will work
out that Andrew merged from me, and will sort things out.  Of course,
there can be conflicts when people have worked in divergent ways on the
same code.  These are resolved in a similar way to CVS or SVN.


7. I want to get my code reviewed by a member of the review team.  I add
the details of my branch to the PendingReviews page on the launchpad
development wiki.  This wiki is publicly readable.

  https://wiki.launchpad.canonical.com/PendingReviews

There is a script that periodically reads the PendingReviews page,
attempts to merge the branches listed there into rocketfuel (just as PQM
would do), and produces a diff for use by the review team.  The diff
represents what changes would be made to the rocketfuel Trunk were the
branch in question to be sent to PQM.  This diff is often enough for the
reviewers to work with.  If they need to see more context, they can
simply check out the branch in question using 'baz get branchname'.

The script also highlights whether there were any conflicts that would
prevent a merge, and gives an indication of the size of the change.

The script's output is accessible only to Launchpad developers.
However, I've made a couple of screenshots to give you some idea of what
it looks like.

This is the summary page, that uses information taken from the
PendingReviews wiki page.

  http://people.ubuntu.com/~stevea/branch-summary.png

This is a typical diff representing what is to be merged.

  http://people.ubuntu.com/~stevea/branch-diff.png


The reviewer sends an email to the author of the code, cc the
launchpad-reviews mailing list.  The review email typically has sections
of code included, each line prefixed with '> ', with comments, questions
and requests for improvement beneath each section of code.  The reviewer
will either approve the code for merging, approve the code providing
certain remedial actions are taken, or reject the code, requiring a new
review later.


8. My code has been successfully reviewed by JamesH, so I send a signed
mail to PQM asking to merge my work into rocketfuel.

  submit-merge "r=JamesH, Improvements to logging in." pqm at pqm.ubuntu.com

PQM checks that each merge request has r=someone in the message, as a
reminder that launchpad developers need to have their code reviewed.

The submit-merge script takes the archive name, the branch name,
and the "patch level" that the branch is at, composes an email saying

  "pqm, please merge
   steve.alexander at canonical.com/launchpad--ImproveLogins--0--patch-18
   into rocketfuel at canonical.com/launchpad--devel--0."

Signs it with my gpg key, and mails it.

Some time later, once PQM has merged the code and successfully run all
the launchpad tests, an email will go out to me, and to a pqm-commits
mailing list, saying that the merge was successful.  If it was
unsuccessful, I get an email with the error output.

An irc robot listens to the pqm-commits mailing list, and announces new
landings to the rocketfuel Trunk on irc.
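
The request text and the "r=someone" check from step 8 can be sketched
in Python as follows.  This is an illustration only: the function names
and the regular expression are assumptions, not code from submit-merge
or PQM, and the GPG signing and mailing steps are left out.

```python
import re

# Hypothetical pattern: the description only says that the message
# must contain r=someone.
REVIEWER_RE = re.compile(r"\br=(\w+)")


def reviewer_of(message: str):
    """Return the reviewer named in an r=... tag, or None if there is none."""
    match = REVIEWER_RE.search(message)
    return match.group(1) if match else None


def merge_request_text(archive: str, branch: str, patch_level: int, target: str) -> str:
    """Compose the request text from archive, branch name and patch level."""
    return f"pqm, please merge {archive}/{branch}--patch-{patch_level} into {target}."


message = "r=JamesH, Improvements to logging in."
text = merge_request_text(
    "steve.alexander at canonical.com",
    "launchpad--ImproveLogins--0",
    18,
    "rocketfuel at canonical.com/launchpad--devel--0",
)
```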


== Naming branches ==

The Launchpad team is distributed around the world.  To cope with this,
and also to get our community of users involved in the development of
the software, Launchpad development emphasises writing specifications
and proposals, and implementing features based on these proposals.

You can read all the launchpad proposals on the launchpad development wiki.

  https://wiki.launchpad.canonical.com/

So, we usually name branches after the specification that is being
implemented on that branch.  The branch is named near the top of the
specification, so someone reading the specification who has access to
the source code can see what's happening with the implementation.

Branches are also often named after bugs.  For example,
launchpad--bug123--0.

The use of '--' in branch names, and the '--0' thing at the end is
occasionally useful, but more of a hangover from the 'tla' system that
bazaar is based on.  This strict branch naming format is not being
carried over into bazaar 2.
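
To make the strict Bazaar 1.x format concrete, here is a small Python
sketch that splits a full name such as the ones used in this mail into
its parts.  The field labels 'project' and 'topic' are illustrative
choices made for this example, and 'parse_branch' is a hypothetical
helper, not part of baz.

```python
from typing import NamedTuple


class BranchName(NamedTuple):
    archive: str   # the email-like archive name before the '/'
    project: str   # illustrative label for the first '--' field
    topic: str     # illustrative label for the second '--' field
    version: str   # the trailing field, e.g. '0'


def parse_branch(full_name: str) -> BranchName:
    """Split 'archive/project--topic--version' into its parts."""
    archive, _, branch = full_name.partition("/")
    parts = branch.split("--")
    if not archive or len(parts) != 3 or not all(parts):
        raise ValueError(f"not a three-part branch name: {full_name!r}")
    return BranchName(archive, *parts)


name = parse_branch("steve.alexander at canonical.com/launchpad--ImproveLogins--0")
```

A name without the archive part, such as 'launchpad--bug123--0' on its
own, is rejected by this sketch with a ValueError.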


== External contributors ==

The source code to Launchpad is not available at this time.  We intend
to make it open source at some point in the future, but I'm not sure
when that will be.

Let's consider what would happen if we decided to make the Launchpad
code fully open source tomorrow.

Someone from outside of Canonical could get a copy of the main launchpad
"rocketfuel" branch, make their own branch by branching from the
rocketfuel branch, do a bunch of work, mirror it to their own website,
and email a Canonical launchpad developer to ask that it be reviewed, or
merged into that launchpad developer's branch.

This way, even though an outside contributor doesn't have rights with
PQM, they could still make fine-grained commits, merge from a variety of
places, and participate at the same level as someone employed by Canonical.

--
Steve Alexander

From tjreedy at udel.edu  Sun Aug 14 05:07:49 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 13 Aug 2005 23:07:49 -0400
Subject: [Python-Dev] Distributed RCS
References: <42FBA376.5030605@canonical.com> <42FD2A13.8000900@canonical.com>
	<ca471dc205081317119d41003@mail.gmail.com>
Message-ID: <ddmci6$901$1@sea.gmane.org>

> Another fwd, describing how Steve Alexander's group uses bazaar.

I found this rather clear and easy to understand even without having 
directly used CVS (other than to browse).  Some of the automation features 
seem useful but I don't know whether they are specific to bazaar.  Anyway, 
my thoughts.

It seems to me that auto testing of the tentatively updated trunk before 
final commitment would avoid the 'who checked in test-breaking code' 
messages that appear here occasionally.  But it requires that the update + 
test-suite time be less than the average inter-update interval.
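
The condition in the last sentence can be written as a one-line check.
The sketch below is in Python, assumes a queue that handles one merge
at a time, and uses made-up numbers purely for illustration.

```python
def queue_keeps_up(minutes_per_merge: float, merges_per_day: float) -> bool:
    """True if a one-at-a-time merge queue can absorb the average load."""
    average_interval = 24 * 60 / merges_per_day  # minutes between requests
    return minutes_per_merge < average_interval


# Made-up numbers: a 30-minute merge-and-test cycle.
light = queue_keeps_up(30, 20)   # one request every 72 minutes on average
heavy = queue_keeps_up(30, 60)   # one request every 24 minutes on average
```

With these numbers the queue keeps up in the first case and falls
behind in the second.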

I understand the main difference between baz and cvs (and similar) to be 
that checked-out-to-developers copies remain 'within' the distributed 
system and accessible to the master system rather than becoming external 
(and lost track of) copies.  In consequence (again if I understand 
correctly), pre- and post-review diffs and merges are done directly between 
the developer's branch and the current system trunk rather than (for diffs)
with a possibly out-of-date master on the developer's machine, leading to 
trunk updates with a possibly out-of-date diff.  If so, this would 
eliminate reviewers having to make requests such as 'please run a new diff 
against the current CVS head' that I remember sometimes seeing on the SF 
tracker.

The current bottleneck in Python development appears to be patch reviews. 
So merely making submission and commitment easier will not help much.  An 
alternative to more reviewers is more automation to make more effective use 
of existing reviewers.  (And this might also encourage more reviewers.) 
The Launchpad group seems to be ahead in this regard, but I don't know how 
much this is due to using bazaar.  In any case, ease of improving the 
review process might be a criterion for choosing a source code system.  But 
I leave this to ML.

*Other things being equal*, using a state-of-the-art development system 
written in Python to develop Python would be a marketing plus.

Terry J. Reedy


From anthony at interlink.com.au  Sun Aug 14 06:36:33 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sun, 14 Aug 2005 14:36:33 +1000
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <ca471dc20508131427b6aa0d4@mail.gmail.com>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
Message-ID: <200508141436.36913.anthony@interlink.com.au>

I have great hopes for baz-ng, but I don't know that it's really ready for 
production use just yet. I don't know that we want to be right out on the 
bleeding edge of revision control systems for Python. 

The current bazaar, last time I looked (a few months ago) did not work on 
Windows. This is a complete deal-breaker for us, unless we can agree to dump 
that Windows support (who needs it, really? <wink>) I *hope* that baz-ng will
work fine on Windows - I haven't looked too closely at that side of it.

Anthony 
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From skip at pobox.com  Sun Aug 14 13:44:25 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 14 Aug 2005 06:44:25 -0500
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <200508141436.36913.anthony@interlink.com.au>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
	<200508141436.36913.anthony@interlink.com.au>
Message-ID: <17151.11929.958753.192768@montanaro.dyndns.org>

    Anthony> The current bazaar, last time I looked (a few months ago) did
    Anthony> not work on Windows. This is a complete deal-breaker for us,

I assume it would be a deal breaker for many people.  According to the
Bazaar-NG website it works on "Linux, Windows and Mac OS X, or any system
with a Python interpreter".  If it's that platform-independent, perhaps it
will work on some systems that don't support CVS.  It does require Python
2.4, though I doubt that would be a great hardship for many people
interested in Python development.

Skip

From martin at v.loewis.de  Sun Aug 14 14:01:46 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Aug 2005 14:01:46 +0200
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <ca471dc20508131427b6aa0d4@mail.gmail.com>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
Message-ID: <42FF32AA.7040506@v.loewis.de>

Guido van Rossum wrote:
> With permission, I'm forwarding an email from Mark Shuttleworth about
> Bazaar-2 (aka Bazaar-NG), a distributed source control system (not
> entirely unlike bitkeeper, I presume) written in Python and in use by
> the Ubuntu system. What do people think of using this for Python? Is
> it the right model? 

Like Skip, I tried experimenting with it. While that may be the right
model, I don't think it is the right software. In bazaar-ng 0.0.5 (which
is what Debian unstable currently has), bzr commit would not open
a text editor, but require the commit message on the command line;
selective commit of only some of the changed files is also not
supported. bzr diff cannot show the changes between two revisions,
and cannot show revisions across branches.

So I assume that using bazaar-ng right now would cause problems in
day-to-day usage.

Regards,
Martin

From skip at pobox.com  Sun Aug 14 14:37:55 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 14 Aug 2005 07:37:55 -0500
Subject: [Python-Dev] cvs to bzr?
In-Reply-To: <20050814000346.GB5470@mems-exchange.org>
References: <17150.31637.180169.877441@montanaro.dyndns.org>
	<20050814000346.GB5470@mems-exchange.org>
Message-ID: <17151.15139.655967.250675@montanaro.dyndns.org>


    >> ... I didn't see any mention in the bzr docs of any sort of cvs2bzr
    >> tool.

    Neil> Haven't tried it but should work:

    Neil>     http://darcs.net/DarcsWiki/Tailor 

Thanks Neil.  I downloaded it last night and played around a bit.  What
follows is a description that will hopefully keep others from stepping in
the same booby traps I did.

It doesn't appear to work at this point, both based on attempts to use it
and hints on the above page.  After some reading, I was able to pull the
latest version with

    wget --exclude-directories=/~lele/projects/tailor/_darcs --mirror \
         --no-parent  --no-host-directories --cut-dirs=3 -e robots=off \
         http://nautilus.homeip.net/~lele/projects/tailor/

(The wget example at the top of the wiki page points to an older version.)

Unfortunately, it (like the older version) is missing

    if __name__ == "__main__":
        main()

in tailor.py and has no call to its main() function anywhere in the source.
This seemed very odd to me, so I added one, as well as a #! line.  I
eventually noticed that there is a vcpx package and in the directory above,
a number of other files, tailor, a bunch of index.html files and a README.
It was reminiscent of expanding a tar file in the bad old days (or unzipping
a zip file nowadays) before everybody got the idea that it would be a good
idea to create a top-level directory...

Anyway, I'm still struggling with it.  If I get further I'll post my
results.  If others have gotten further, tips would be appreciated.

Skip

From skip at pobox.com  Sun Aug 14 14:46:16 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 14 Aug 2005 07:46:16 -0500
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FF32AA.7040506@v.loewis.de>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
	<42FF32AA.7040506@v.loewis.de>
Message-ID: <17151.15640.173982.961359@montanaro.dyndns.org>


    Martin> Like Skip, I tried experimenting with it.  While that may be the
    Martin> right model, I don't think it is the right software. [problems
    Martin> elided]

    Martin> So I assume that using bazaar-ng right now would cause problems
    Martin> in day-to-day usage.

Granted.  What is the cost of waiting a bit longer to see if it (or
something else) gets more usable and would hit the mark better than svn?  I
presume that once we switch away from cvs to something else, it's unlikely
we would switch again unless some huge roadblock appeared that made the
initial change the wrong one.

I was amazed at the number of different version control systems out there
now.  CVS, while enormously successful from a practical standpoint, clearly
has its detractors.  That there are so many alternatives suggests that it's
not clear yet what the "correct" feature set for a version control system
is.

Skip

From martin at v.loewis.de  Sun Aug 14 18:16:11 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Aug 2005 18:16:11 +0200
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <17151.15640.173982.961359@montanaro.dyndns.org>
References: <42FBA376.5030605@canonical.com>	<ca471dc20508131427b6aa0d4@mail.gmail.com>	<42FF32AA.7040506@v.loewis.de>
	<17151.15640.173982.961359@montanaro.dyndns.org>
Message-ID: <42FF6E4B.4000206@v.loewis.de>

skip at pobox.com wrote:
> Granted.  What is the cost of waiting a bit longer to see if it (or
> something else) gets more usable and would hit the mark better than svn?  I
> presume that once we switch away from cvs to something else, it's unlikely
> we would switch again unless some huge roadblock appeared that made the
> initial change the wrong one.

It depends on what "a bit" is. Waiting a month would be fine; waiting
two years might be pointless.

So I think I will personally pursue PEP 347 (switching to SVN); it will
be then an issue of BDFL pronouncement.

Regards,
Martin

From nas at arctrix.com  Sun Aug 14 19:12:59 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 14 Aug 2005 11:12:59 -0600
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FF6E4B.4000206@v.loewis.de>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
	<42FF32AA.7040506@v.loewis.de>
	<17151.15640.173982.961359@montanaro.dyndns.org>
	<42FF6E4B.4000206@v.loewis.de>
Message-ID: <20050814171259.GA8200@mems-exchange.org>

On Sun, Aug 14, 2005 at 06:16:11PM +0200, "Martin v. Löwis" wrote:
> It depends on what "a bit" is. Waiting a month would be fine; waiting
> two years might be pointless.

It looks like the process of converting a CVS repository to
Bazaar-NG does not yet work well (to be kind).  The path
CVS->SVN->bzr would probably work better.  I suspect cvs2svn has
been used on quite a few CVS repositories already.  I don't think
going to SVN first would lose any information.

My vote is to continue with the migration to SVN.  We can
re-evaluate Bazaar-NG at a later time.

  Neil

From ronaldoussoren at mac.com  Sun Aug 14 19:17:00 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sun, 14 Aug 2005 19:17:00 +0200
Subject: [Python-Dev] build problems on macosx (CVS HEAD)
Message-ID: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com>

Hi,

I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a  
checkout that is less than two hours old. I'm building a standard  
unix tree (no framework install):

$ ./configure --prefix=/opt/python/2.5
...
$ make
...
ar cr libpython2.5.a Modules/config.o Modules/getpath.o Modules/main.o Modules/gcmodule.o
ar cr libpython2.5.a Modules/threadmodule.o  Modules/signalmodule.o  Modules/posixmodule.o  Modules/errnomodule.o  Modules/_sre.o  Modules/_codecsmodule.o  Modules/zipimport.o  Modules/symtablemodule.o  Modules/xxsubtype.o
ranlib libpython2.5.a
c++  -u _PyMac_Error -o python.exe \
                 Modules/python.o \
                 libpython2.5.a -ldl
case $MAKEFLAGS in \
*-s*)  CC='gcc' LDSHARED='gcc  -bundle -undefined dynamic_lookup' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python.exe -E ./setup.py -q build;; \
*)  CC='gcc' LDSHARED='gcc  -bundle -undefined dynamic_lookup' OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python.exe -E ./setup.py build;; \
esac
make: *** [sharedmods] Error 139

This is a segmentation fault when trying to build extensions:

$ ./python.exe
Python 2.5a0 (#5, Aug 14 2005, 18:20:08)
[GCC 4.0.0 (Apple Computer, Inc. build 5026)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
 >>> import setup
Segmentation fault

The minimal import that causes a crash is 'import distutils.sysconfig'.
I've rebuilt using --enable-debug and --with-pydebug to check if gdb
could tell me more.

The start of the stacktrace:
(gdb) r -c 'import distutils.sysconfig'
Starting program: /Volumes/Data/Users/ronald/Python/python-HEAD/dist/src/python.exe -c 'import distutils.sysconfig'
Reading symbols for shared libraries . done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xcbcbcbd3
0x001500b0 in structseq_dealloc (obj=0x3c3878) at Objects/structseq.c:47
47                      Py_XDECREF(obj->ob_item[i]);
(gdb) where
#0  0x001500b0 in structseq_dealloc (obj=0x3c3878) at Objects/structseq.c:47
#1  0x0002fdb0 in _Py_Dealloc (op=0x3c3878) at Objects/object.c:1883
#2  0x000eedb4 in frame_dealloc (f=0x1816c18) at Objects/frameobject.c:394
#3  0x0002fdb0 in _Py_Dealloc (op=0x1816c18) at Objects/object.c:1883
#4  0x000dd1d0 in fast_function (func=0x390038, pp_stack=0xbfffd788, n=1, na=1, nk=0) at Python/ceval.c:3654
#5  0x000dcdc8 in call_function (pp_stack=0xbfffd788, oparg=1) at Python/ceval.c:3590
#6  0x000d6aa4 in PyEval_EvalFrameEx (f=0x610358, throw=0) at Python/ceval.c:2181
#7  0x000d98d0 in PyEval_EvalCodeEx (co=0x5180d8, globals=0x5146c8, locals=0x5146c8, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2748
#8  0x000ce270 in PyEval_EvalCode (co=0x5180d8, globals=0x5146c8, locals=0x5146c8) at Python/ceval.c:490
#9  0x0001643c in PyImport_ExecCodeModuleEx (name=0xbfffe808 "distutils.sysconfig", co=0x5180d8, pathname=0xbfffde4c "/Volumes/Data/Users/ronald/Python/python-HEAD/dist/src/Lib/distutils/sysconfig.pyc") at Python/import.c:620

At the DECREF, i == 17, size == 18 and obj->ob_item[i] == 0xcbcbcbcb,  
and obj is a posix.stat_result.
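
One possible reading of these numbers, offered as a guess rather than a
diagnosis: 0xCB is the byte the pydebug memory allocator writes into
freshly allocated memory, so a pointer of 0xcbcbcbcb looks like a slot
that was allocated but never filled in.  The Python sketch below
reproduces the faulting address under two assumptions that are not
stated in the report: pointers are 4 bytes wide (the addresses in the
backtrace are 32-bit), and the debug build places two extra pointers in
front of ob_refcnt for reference tracing.

```python
CLEANBYTE = 0xCB          # fill byte used by the debug allocator for fresh memory
POINTER_SIZE = 4          # assumption: 32-bit build, as the addresses suggest

# A pointer-sized slot that was allocated but never written.
uninitialized_pointer = int.from_bytes(bytes([CLEANBYTE]) * POINTER_SIZE, "big")

# Assumption: with reference tracing enabled, two pointers precede ob_refcnt.
refcnt_offset = 2 * POINTER_SIZE

# Py_XDECREF on the bogus pointer would touch ob_refcnt at this address.
faulting_address = uninitialized_pointer + refcnt_offset
print(hex(uninitialized_pointer), hex(faulting_address))
```

Under those assumptions the computed address is 0xcbcbcbd3, the same
address gdb reports, which would be consistent with the last item of
the stat_result never having been initialized.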


From nas at arctrix.com  Sun Aug 14 19:21:07 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 14 Aug 2005 11:21:07 -0600
Subject: [Python-Dev] cvs to bzr?
In-Reply-To: <20050814000346.GB5470@mems-exchange.org>
References: <17150.31637.180169.877441@montanaro.dyndns.org>
	<20050814000346.GB5470@mems-exchange.org>
Message-ID: <20050814172107.GB8200@mems-exchange.org>

On Sat, Aug 13, 2005 at 06:03:46PM -0600, Neil Schemenauer wrote:
> Haven't tried it but should work:
> 
>     http://darcs.net/DarcsWiki/Tailor 

After applying the attached patch, this command seemed to work for
converting the initial revision:

~/src/cvsync/tailor.py --source-kind cvs --target-kind bzr \
    --bootstrap --repository ~/Python/python-cvsroot -m python \
    --revision r16b1 py_bzr

After that, I think this command is supposed to bring the bzr repository
up-to-date:

    cd py_bzr; ~/src/cvsync/tailor.py -v 

It does not seem to work for me (it only updates one file and then
quits).  cvs2svn seems to be much more mature.

  Neil
-------------- next part --------------
diff -rN -u old-cvsync/vcpx/bzr.py new-cvsync/vcpx/bzr.py
--- old-cvsync/vcpx/bzr.py	2005-08-14 09:43:15.000000000 -0600
+++ new-cvsync/vcpx/bzr.py	2005-08-14 10:38:02.000000000 -0600
@@ -29,14 +29,23 @@
 
     ## SyncronizableTargetWorkingDir
 
-    def _addEntries(self, root, entries):
-        """
-        Add a sequence of entries.
-        """
+    def _addPathnames(self, root, entries):
+        c = SystemCommand(working_dir=root, command="bzr add --no-recurse"
+                                                    " %(entries)s")
+        c(entries=' '.join([shrepr(e) for e in entries]))
 
+    def _addSubtree(self, root, *entries):
         c = SystemCommand(working_dir=root, command="bzr add %(entries)s")
-        c(entries=' '.join([shrepr(e.name) for e in entries]))
+        c(entries=' '.join([shrepr(e) for e in entries]))
 
+    def _removePathnames(self, root, names):
+        pass # bzr handles this itself
+
+    def _renamePathname(self, root, oldname, newname):
+        c = SystemCommand(working_dir=root,
+                          command="bzr mv %(old)s %(new)s")
+        c(old=shrepr(oldname), new=shrepr(newname))
+        
     def _commit(self,root, date, author, remark, changelog=None, entries=None):
         """
         Commit the changeset.
@@ -112,7 +121,7 @@
 
         # Create the .bzrignore file, that contains a glob per line,
         # with all known VCs metadirs to be skipped.
-        ignore = open(join(root, '.hgignore'), 'w')
+        ignore = open(join(root, '.bzrignore'), 'w')
         ignore.write('\n'.join(['(^|/)%s($|/)' % md
                                 for md in IGNORED_METADIRS]))
         ignore.write('\ntailor.log\ntailor.info\n')


From gvanrossum at gmail.com  Sun Aug 14 20:11:46 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 14 Aug 2005 11:11:46 -0700
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <20050814171259.GA8200@mems-exchange.org>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
	<42FF32AA.7040506@v.loewis.de>
	<17151.15640.173982.961359@montanaro.dyndns.org>
	<42FF6E4B.4000206@v.loewis.de>
	<20050814171259.GA8200@mems-exchange.org>
Message-ID: <ca471dc20508141111fb5fe3c@mail.gmail.com>

On 8/14/05, Neil Schemenauer <nas at arctrix.com> wrote:
> It looks like the process of converting a CVS repository to
> Bazaar-NG does not yet work well (to be kind).  The path
> CVS->SVN->bzr would probably work better.  I suspect cvs2svn has
> been used on quite a few CVS repositories already.  I don't think
> going to SVN first would lose any information.
> 
> My vote is to continue with the migration to SVN.  We can
> re-evaluate Bazaar-NG at a later time.

That looks like a fair assessment -- although it means twice the
conversion pain for developers.

It sounds as if bazaar-NG can use a bit of its own medicine -- I hope
everybody who found a bug in their tools submitted a patch! :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Sun Aug 14 20:13:29 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 14 Aug 2005 11:13:29 -0700
Subject: [Python-Dev] Fwd:  PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1123986783.21455.35.camel@linux.site>
References: <1123986783.21455.35.camel@linux.site>
Message-ID: <ca471dc205081411131c89135c@mail.gmail.com>

Here's another POV. (Why does everybody keep emailing me personally?)

--Guido van Rossum (home page: http://www.python.org/~guido/)

---------- Forwarded message ----------
From: Daniel Berlin <dberlin at dberlin.org>
Date: Aug 13, 2005 7:33 PM
Subject: Re: [Python-Dev] PEP: Migrating the Python CVS to Subversion
To: gvanrossum at gmail.com


(Sorry for the lack of proper References: headers, this is a reply to
the email archive).

It's been a couple years since i've been in the world of python-dev, but
apparently i'm rejoining the mailing list at just the right time.

Take all of this for what it is worth:

I'm currently responsible for GCC's bugzilla, wiki, in addition to
maintaining several optimization areas of the compiler :P.
In addition, i'm responsible for pushing GCC (my main love and world :P)
towards Subversion.  I should note my bias at this point.  I now have
full commit access to Subversion.  However, I've also submitted patches
to monotone, etc.

We had a long thread about the various alternatives (arch, bzr, etc),
and besides "freeness" constraints on what we can run on gcc.gnu.org as
an FSF project, it wouldn't have mattered anyway.  This has been in the
planning for about a year now (mainly waiting for new hardware).

Originally, we were hoping to move GCC to monotone, but it didn't mature
fast enough (it's way too slow), and we couldn't make it centralized
enough for our tastes (more later).

The rest of the free tools other than subversion (arch, monotone, git,
darcs, etc) simply couldn't handle our repository with reasonable
speed/hardware.  GCC has project history dating back to 1987.  It's a 4
gig CVS repo containing > 1000 tags, and > 300 branches.  The changelog
alone has 30k trunk revisions.   Those distributed systems that carry
full history often can't deal with this fast enough or in any space
efficient way.  arch was an example of this.  It had a retarded
mechanism that forced users to care about caching certain revisions to
speed it up, instead of doing it on its own.  I've never tried
converting this repo to bazaar-ng, it wasn't far enough along when i
started.  It also had no cvs2bzr type program, and we aren't about to
lose all our history.  Except for monotone (builtin cvs_import) and
subversion (cvs2svn), none of the cvs2* programs i've run across either
run in reasonable time (those that don't actually understand how to
extract rcs revisions would take weeks to convert our repo, literally),
or could handle all the complexities a repository with our history
presents (branches of branches, etc).  Most simply crash in weird ways or
run out of memory :).

Anyway:
Monotone took 45 minutes just to diff two old revisions that are one
revision away from each other.
CVS takes about 2 minutes for the same operation.
SVN on fsfs takes 4 seconds.

The converted SVN repo has > 100000 revisions, and is only ~15% bigger
than the cvs repo (mostly due to stupid copies it has to do to handle
some tag fuckery people were doing in some cases.  If it had been
subversion from the start, it would have been smaller).

We have cvs2svn speedup patches that were done with the KDE folks that
make cvs2svn io bound again instead of cpu bound (it was O(N^2) in
extracting cvs revision texts before).  It takes 17 hours to convert the
gcc repository now, only 45 minutes of cpu time :).  It used to take 52
hours.

I've also talked with Linus about version control before.  He believes
extreme distributed development is the way to go.  I believe heavily
that in most cases where you have a mix of corporations and free
developers, it ends up causing people to "hide the ball" more than they
should.  This is particularly prevalent in GCC.  We don't want design
and development done and then sent as mega-patches presented as fait
accompli, then watch these people whine as their designs get torn apart.
We'd rather have the discussion on the mailing list and the work done in
a visible place (i.e., a CVS branch stored in some central place) than
get patch bombs.  As a result (and there are many other reasons; I'm
just presenting one of them), we actually don't *want* to move from a
centralized model, in order to help control the social and political
problems we'd have to face if we went fully distributed.

Python may not face any of these problems to the degree that GCC does (I
doubt many projects do, actually.  GCC is a very weird and tense political
situation :P), because of size, etc, in which case a distributed model
may make more sense.  However, you need to be careful to make sure that
people understand that it hasn't actually changed your real development
process (PEPs, etc), only the workflow used to implement it.

From bcannon at gmail.com  Sun Aug 14 20:37:43 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sun, 14 Aug 2005 11:37:43 -0700
Subject: [Python-Dev] build problems on macosx (CVS HEAD)
In-Reply-To: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com>
References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com>
Message-ID: <bbaeab10050814113758030ff9@mail.gmail.com>

On 8/14/05, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> Hi,
> 
> I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a
> checkout that is less than two hours old. I'm building a standard
> unix tree (no framework install):
> 
> $ ./configure --prefix=/opt/python/2.5
> ...
> $ make
> ...
> ar cr libpython2.5.a Modules/config.o Modules/getpath.o Modules/
> main.o Modules/gcmodule.o
> ar cr libpython2.5.a Modules/threadmodule.o  Modules/signalmodule.o
> Modules/posixmodule.o  Modules/errnomodule.o  Modules/_sre.o  Modules/
> _codecsmodule.o  Modules/zipimport.o  Modules/symtablemodule.o
> Modules/xxsubtype.o
> ranlib libpython2.5.a
> c++  -u _PyMac_Error -o python.exe \
>                  Modules/python.o \
>                  libpython2.5.a -ldl
> case $MAKEFLAGS in \
> *-s*)  CC='gcc' LDSHARED='gcc  -bundle -undefined dynamic_lookup'
> OPT='-DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python.exe -E ./
> setup.py -q build;; \
> *)  CC='gcc' LDSHARED='gcc  -bundle -undefined dynamic_lookup' OPT='-
> DNDEBUG -g -O3 -Wall -Wstrict-prototypes' ./python.exe -E ./setup.py
> build;; \
> esac
> make: *** [sharedmods] Error 139
> 
> This is a segmentation fault when trying to build extensions:
> 

I can verify the breakage.  I did a ``make distclean``, updated,
built, and got the same 139 error.

I am short on time today, so I don't think I will be able to dive into
this right away.

-Brett

From mwh at python.net  Sun Aug 14 21:09:21 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 14 Aug 2005 20:09:21 +0100
Subject: [Python-Dev] build problems on macosx (CVS HEAD)
In-Reply-To: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com> (Ronald
	Oussoren's message of "Sun, 14 Aug 2005 19:17:00 +0200")
References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com>
Message-ID: <2mslxcuyhq.fsf@starship.python.net>

Ronald Oussoren <ronaldoussoren at mac.com> writes:

> Hi,
>
> I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a  
> checkout that is less than two hours old. I'm building a standard  
> unix tree (no framework install):

It seems very likely that it was this change:

http://fisheye.cenqua.com/changelog/~br=MAIN/python?cs=MAIN:loewis:20050809145951

Refcounting, posixmodule.c, aiee!  You are in a maze of twisty
#ifdefs, all alike.  I'll probably find the problem fairly soon, we'll
see... :)

Cheers,
mwh

-- 
  All obscurity will buy you is time enough to contract venereal
  diseases.                                  -- Tim Peters, python-dev

From ndbecker2 at gmail.com  Sun Aug 14 20:14:22 2005
From: ndbecker2 at gmail.com (Neal Becker)
Date: Sun, 14 Aug 2005 14:14:22 -0400
Subject: [Python-Dev] Fwd: Distributed RCS
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
Message-ID: <ddo1ka$5vb$1@sea.gmane.org>

I encourage everyone to look at mercurial.  It is also written in Python.  I
am using it daily.


From mwh at python.net  Sun Aug 14 21:21:31 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 14 Aug 2005 20:21:31 +0100
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <6AA5C3D1-6AEC-4CE4-AEB9-84FBDA10EFA9@wsanchez.net> (
	=?iso-8859-1?q?Wilfredo_S=E1nchez_Vega's_message_of?= "Sat, 13 Aug 2005
	09:02:00 -0700")
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<6AA5C3D1-6AEC-4CE4-AEB9-84FBDA10EFA9@wsanchez.net>
Message-ID: <2moe80uxxg.fsf@starship.python.net>

Wilfredo Sánchez Vega <wsanchez at wsanchez.net> writes:

>    I'm curious about why Python lacks FileNotFoundError,  
> PermissionError and the like as subclasses of IOError.

Good question.  Lack of effort/inertia?

>    Catching IOError and looking at errno to figure out what went  
> wrong seems pretty unpythonic, and I've often wished for built-in  
> subclasses of IOError.

The py library does this (http://codespeak.net/py).

>    I sometimes subclass them myself, but a lot of the time, I'm  
> catching such exceptions as thrown by the standard library.

Well, indeed.  OTOH, functions like os.open aren't really *meant* to
be pythonic.

I don't think this is something I can get interested enough in to work
on myself.

Cheers,
mwh

-- 
  <spiv> As far as I'm concerned, the meat pie is the ultimate unit
         of currency.                           -- from Twisted.Quotes

From gvanrossum at gmail.com  Sun Aug 14 21:27:18 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 14 Aug 2005 12:27:18 -0700
Subject: [Python-Dev] Exception Reorg PEP checked in
In-Reply-To: <2moe80uxxg.fsf@starship.python.net>
References: <bbaeab1005080320431cfca77@mail.gmail.com>
	<6AA5C3D1-6AEC-4CE4-AEB9-84FBDA10EFA9@wsanchez.net>
	<2moe80uxxg.fsf@starship.python.net>
Message-ID: <ca471dc2050814122765bbe335@mail.gmail.com>

On 8/14/05, Michael Hudson <mwh at python.net> wrote:
> Wilfredo Sánchez Vega <wsanchez at wsanchez.net> writes:
> 
> >    I'm curious about why Python lacks FileNotFoundError,
> > PermissionError and the like as subclasses of IOError.
> 
> Good question.  Lack of effort/inertia?

Well, I wonder how often it's needed. My typical use is this:

try:
    f = open(filename)
except IOError, err:
    print "Can't open %s: %s" % (filename, err)
    return

and the error printed contains all the necessary details (in fact it
even repeats the filename, so I could probably just say "print err").

Why do you need to know the exact reason for the failure? If you
simply want to know whether the file exists, I'd use os.path.exists()
or isfile(). (Never mind that this is the sometimes-frowned-upon
look-before-you-leap; I think it's often fine.)

Also note that providing the right detail can be very OS specific.
Python doesn't just run on Unix and Windows.
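
For readers following the thread, here is a minimal sketch of the
"catch IOError and look at errno" pattern being discussed.  It is
written in present-day Python syntax rather than the 2005 syntax used
above, and the helper name describe_open_failure is hypothetical, made
up for illustration only.  Which errno values show up for a given
failure is OS specific, as noted above, so the three categories here
are an assumption, not a complete classification.

```python
import errno


def describe_open_failure(filename):
    """Hypothetical helper: open filename and classify any failure by errno."""
    try:
        f = open(filename)
    except OSError as err:  # IOError is an alias of OSError in Python 3
        if err.errno == errno.ENOENT:
            return "missing"
        if err.errno in (errno.EACCES, errno.EPERM):
            return "permission"
        # Anything else is platform dependent; report it generically.
        return "other"
    f.close()
    return "ok"
```

The branching on err.errno is exactly the step that dedicated
subclasses of IOError would make unnecessary.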

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mwh at python.net  Sun Aug 14 21:35:00 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 14 Aug 2005 20:35:00 +0100
Subject: [Python-Dev] Distributed RCS
In-Reply-To: <ddmci6$901$1@sea.gmane.org> (Terry Reedy's message of "Sat, 13
	Aug 2005 23:07:49 -0400")
References: <42FBA376.5030605@canonical.com> <42FD2A13.8000900@canonical.com>
	<ca471dc205081317119d41003@mail.gmail.com>
	<ddmci6$901$1@sea.gmane.org>
Message-ID: <2mk6iouxaz.fsf@starship.python.net>

"Terry Reedy" <tjreedy at udel.edu> writes:

> It seems to me that auto testing of the tentatively updated trunk before 
> final commitment would avoid the 'who checked in test-breaking code' 
> messages that appear here occasionally.  

I don't think there's any fundamental impossibility in setting up such
a system for CVS, and am pretty certain there's not for SVN.

> But it requires that the update + test-suite time be less than the
> average inter-update interval.

Indeed.

> The current bottleneck in Python development appears to be patch reviews. 

And acting on those reviews...

> So merely making submission and commitment easier will not help much. 

I'm not sure, I think it could help quite a bit.

> An alternative to more reviewers is more automation to make more
> effective use of existing reviewers.  (And this might also encourage
> more reviewers.)  The Launchpad group seems to be ahead in this
> regard, but I don't know how much this is due to using bazaar.  In
> any case, ease of improving the review process might be a criterion
> for choosing a source code system.  But I leave this to ML.
>
> *Other things being equal*, using a state-of-the-art development system 
> written in Python to develop Python would be a marketing plus.

I think the words "stable" and "reliable" should be in there somewhere
:)

I don't get the impression bazaar-ng is there yet.

Cheers,
mwh

-- 
  Unfortunately, nigh the whole world is now duped into thinking that 
  silly fill-in forms on web pages is the way to do user interfaces.  
                                        -- Erik Naggum, comp.lang.lisp

From gustavo at niemeyer.net  Sun Aug 14 23:21:40 2005
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Sun, 14 Aug 2005 18:21:40 -0300
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FF32AA.7040506@v.loewis.de>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
	<42FF32AA.7040506@v.loewis.de>
Message-ID: <20050814212140.GA11278@burma.localdomain>

> Like Skip, I tried experimenting with it. While that may be the right
> model, I don't think it is the right software. In bazaar-ng 0.0.5 (which
> is what Debian unstable currently has), bzr commit would not open
> a text editor, but require the commit message on the command line;
> selective commit of only some of the changed files is also not
> supported. bzr diff cannot show the changes between two revisions,

The development version has all of those features implemented already.

> and cannot show revisions across branches.

I'm not sure about this one, though.

-- 
Gustavo Niemeyer
http://niemeyer.net

From martin at v.loewis.de  Sun Aug 14 23:43:54 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Aug 2005 23:43:54 +0200
Subject: [Python-Dev] build problems on macosx (CVS HEAD)
In-Reply-To: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com>
References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com>
Message-ID: <42FFBB1A.8060206@v.loewis.de>

Ronald Oussoren wrote:
> I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a  
> checkout that is less than two hours old. I'm building a standard  
> unix tree (no framework install):

I just committed what I think is a bugfix for the recent st_gen support.
Unfortunately, I can't try the code, since I don't have access to
BSD/OSX at the moment.

So please report whether there is any change in behaviour.

Regards,
Martin

From martin at v.loewis.de  Sun Aug 14 23:47:27 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Aug 2005 23:47:27 +0200
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <ca471dc20508141111fb5fe3c@mail.gmail.com>
References: <42FBA376.5030605@canonical.com>	<ca471dc20508131427b6aa0d4@mail.gmail.com>	<42FF32AA.7040506@v.loewis.de>	<17151.15640.173982.961359@montanaro.dyndns.org>	<42FF6E4B.4000206@v.loewis.de>	<20050814171259.GA8200@mems-exchange.org>
	<ca471dc20508141111fb5fe3c@mail.gmail.com>
Message-ID: <42FFBBEF.5060202@v.loewis.de>

Guido van Rossum wrote:
> It sounds as if bazaar-NG can use a bit of its own medicine -- I hope
> everybody who found a bug in their tools submitted a patch! :-)

I had problems finding the place where the bazaar-NG source code
repository is stored - is there public access to the HEAD version?
There also doesn't appear to be a bug tracker - but I found a mention
that bug reports should go to the mailing list.

Regards,
Martin

From martin at v.loewis.de  Sun Aug 14 23:58:46 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Aug 2005 23:58:46 +0200
Subject: [Python-Dev] Fwd:  PEP: Migrating the Python CVS to Subversion
In-Reply-To: <ca471dc205081411131c89135c@mail.gmail.com>
References: <1123986783.21455.35.camel@linux.site>
	<ca471dc205081411131c89135c@mail.gmail.com>
Message-ID: <42FFBE96.7040000@v.loewis.de>

Guido van Rossum wrote:
> Here's another POV.

I think I agree with Daniel's view, in particular wrt. performance.
Whatever the replacement tool, it should perform as well or better
than CVS currently does; it also shouldn't perform much worse than
subversion.

I've been using git (or, rather, cogito) to keep up-to-date with the
Linux kernel. While performance of git is really good, storage
requirements are *quite* high, and initial "checkout" takes a long
time - even though the Linux kernel repository stores virtually no
history (there was a strict cut when converting the bitkeeper HEAD).
So these distributed tools would cause quite some disk consumption
on client machines. bazaar-ng apparently supports only-remote
repositories as well, so that might be no concern.

Regards,
Martin

From dberlin at dberlin.org  Mon Aug 15 00:07:39 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Sun, 14 Aug 2005 18:07:39 -0400
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <20050814171259.GA8200@mems-exchange.org>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
	<42FF32AA.7040506@v.loewis.de>
	<17151.15640.173982.961359@montanaro.dyndns.org>
	<42FF6E4B.4000206@v.loewis.de>
	<20050814171259.GA8200@mems-exchange.org>
Message-ID: <1124057259.25267.70.camel@linux.site>

On Sun, 2005-08-14 at 11:12 -0600, Neil Schemenauer wrote:
> On Sun, Aug 14, 2005 at 06:16:11PM +0200, "Martin v. Löwis" wrote:
> > It depends on what "a bit" is. Waiting a month would be fine; waiting
> > two years might be pointless.
> 
> It looks like the process of converting a CVS repository to
> Bazaar-NG does not yet work well (to be kind).  The path
> CVS->SVN->bzr would probably work better.  I suspect cvs2svn has
> been used on quite a few CVS repositories already.  I don't think
> going to SVN first would lose any information.

It doesn't.

As a data point, CVS2SVN can handle gcc's massive cvs repository, which
has merged rcs file information in it dating back to 1987, >1000 tags,
and > 300 branches.

Besides monotone's cvs_import, it's actually the only properly designed
cvs converter I've seen in a while (Properly designed in that it
actually uses the necessary and correct algorithms to get all the
weirdities of cvs branches and tags right).

I'm not sure how big python's repo is, but you probably want to use the
attached patch to speed up cvs2svn.  It changes it to reconstruct the
revisions on its own instead of calling cvs or rcs.  For GCC and KDE,
this makes a significant difference (17 hours for our 4 gig cvs repo
conversion instead of 52 hours), because it was spawning cvs/rcs 50
billion times, and the milliseconds add up :)
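
To illustrate what "reconstruct the revisions on its own" means, here
is a minimal sketch of applying an RCS edit script ("aN M" adds M lines
after line N, "dN M" deletes M lines starting at line N) in-process,
without spawning co or cvs.  It is loosely modelled on the applyDiff
method in the attached patch but rewritten in present-day Python; the
function name apply_rcs_diff is hypothetical and this is not the code
the patch actually installs.

```python
import re

# An edit command is "a" or "d", a line number, whitespace, and a count.
_COMMAND = re.compile(r"^([ad])(\d+)\s(\d+)")


def apply_rcs_diff(text_lines, diff_lines):
    """Apply an RCS edit script to a list of lines; return the new list.

    Line numbers in the script refer to the ORIGINAL text, so `adjust`
    tracks how far earlier commands have shifted the working copy.
    """
    result = list(text_lines)
    adjust = 0
    i = 0
    while i < len(diff_lines):
        match = _COMMAND.match(diff_lines[i])
        if match is None:
            raise ValueError("unparseable diff command: %r" % diff_lines[i])
        i += 1
        kind = match.group(1)
        start = int(match.group(2))
        count = int(match.group(3))
        if kind == "d":
            # Delete `count` lines beginning at original line `start`.
            pos = start - 1 + adjust
            del result[pos:pos + count]
            adjust -= count
        else:
            # Add the next `count` script lines after original line `start`.
            pos = start + adjust
            result[pos:pos] = diff_lines[i:i + count]
            adjust += count
            i += count
    return result
```

Each revision then costs a few list operations instead of a process
spawn, which is where the savings described above come from.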


> My vote is to continue with the migration to SVN.  We can
> re-evaluate Bazaar-NG at a later time.
GCC is moving to SVN (very soon now, within 2 months), and this has been
my viewpoint as well.

It's much easier to go from something that has changesets and global
revisions, to a distributed system, if you want to, than it is to try to
reconstruct that info from CVS on your own :).

Subversion also has excellent language bindings, including the python
bindings.  That's how I've hooked it up to gcc's bugzilla.  You could
easily write something to transform *from* subversion to another system
using the bindings.

Things like viewcvs use the python bindings to deal with the svn
repository entirely.  

--Dan

-------------- next part --------------
Index: cvs2svn
===================================================================
--- cvs2svn	(revision 1423)
+++ cvs2svn	(working copy)
@@ -166,6 +166,10 @@
 # grouping.  See design-notes.txt for details.
 DATAFILE = 'cvs2svn-data'
 
+REVISIONS_DB = 'cvs2svn-cvsrepo.db'
+
+CHECKOUT_DB = 'cvs2svn-cvsco.db'
+
 # This file contains a marshalled copy of all the statistics that we
 # gather throughout the various runs of cvs2svn.  The data stored as a
 # marshalled dictionary.
@@ -355,40 +359,7 @@
                    " cvsroot\n" % (error_prefix, cvsroot, fname))
   sys.exit(1)
 
-def get_co_pipe(c_rev, extra_arguments=None):
-  """Return a command string, and the pipe created using that string.
-  C_REV is a CVSRevision, and EXTRA_ARGUMENTS is used to add extra
-  arguments.  The pipe returns the text of that CVS Revision."""
-  ctx = Ctx()
-  if extra_arguments is None:
-    extra_arguments = []
-  if ctx.use_cvs:
-    pipe_cmd = [ 'cvs' ] + ctx.cvs_global_arguments + \
-               [ 'co', '-r' + c_rev.rev, '-p' ] + extra_arguments + \
-               [ ctx.cvs_module + c_rev.cvs_path ];
-  else:
-    pipe_cmd = [ 'co', '-q', '-x,v', '-p' + c_rev.rev ] + extra_arguments + \
-               [ c_rev.rcs_path() ]
-  pipe = SimplePopen(pipe_cmd, True)
-  pipe.stdin.close()
-  return pipe_cmd, pipe
-
-def generate_ignores(c_rev):
-  # Read in props
-  pipe_cmd, pipe = get_co_pipe(c_rev)
-  buf = pipe.stdout.read(PIPE_READ_SIZE)
-  raw_ignore_val = ""
-  while buf:
-    raw_ignore_val = raw_ignore_val + buf
-    buf = pipe.stdout.read(PIPE_READ_SIZE)
-  pipe.stdout.close()
-  error_output = pipe.stderr.read()
-  exit_status = pipe.wait()
-  if exit_status:
-    sys.exit("%s: The command '%s' failed with exit status: %s\n"
-             "and the following output:\n"
-             "%s" % (error_prefix, pipe_cmd, exit_status, error_output))
-
+def generate_ignores(raw_ignore_val):
   # Tweak props: First, convert any spaces to newlines...
   raw_ignore_val = '\n'.join(raw_ignore_val.split())
   raw_ignores = raw_ignore_val.split('\n')
@@ -614,9 +585,7 @@
 DB_OPEN_READ = 'r'
 DB_OPEN_NEW = 'n'
 
-# A wrapper for anydbm that uses the marshal module to store items as
-# strings.
-class Database:
+class SDatabase:
   def __init__(self, filename, mode):
     # pybsddb3 has a bug which prevents it from working with
     # Berkeley DB 4.2 if you open the db with 'n' ("new").  This
@@ -635,22 +604,24 @@
 
     self.db = anydbm.open(filename, mode)
 
-  def has_key(self, key):
-    return self.db.has_key(key)
+  def __getattr__(self, name):
+    return getattr(self.db, name)
 
+# A wrapper for anydbm that uses the marshal module to store items as
+# strings.
+class Database(SDatabase):
+
   def __getitem__(self, key):
     return marshal.loads(self.db[key])
 
   def __setitem__(self, key, value):
     self.db[key] = marshal.dumps(value)
 
-  def __delitem__(self, key):
-    del self.db[key]
-
   def get(self, key, default):
-    if self.has_key(key):
-      return self.__getitem__(key)
-    return default
+    try:
+      return marshal.loads(self.db[key])
+    except KeyError:
+      return default
 
 
 class StatsKeeper:
@@ -841,6 +812,192 @@
     Cleanup().register(temp(TAGS_DB), pass8)
 
 
+def msplit(stri):
+  re = [ i + "\n" for i in stri.split("\n") ]
+  re[-1] = re[-1][:-1]
+  if not re[-1]:
+    del re[-1]
+  return re
+
+
+class RCSStream:
+  ad_command = re.compile('^([ad])(\d+)\\s(\\d+)')
+  a_command = re.compile('^a(\d+)\\s(\\d+)')
+
+  def __init__(self):
+    self.texts = []
+
+  def copy(self):
+    ret = RCSStream()
+    ret.texts = self.texts[:]
+    return ret
+
+  def setText(self, text):
+    self.texts = msplit(text)
+
+  def getText(self):
+    return "".join(self.texts)
+
+  def applyDiff(self, diff):
+    diffs = msplit(diff)
+    adjust = 0
+    i = 0
+    while i < len(diffs):
+      admatch = self.ad_command.match(diffs[i])
+      i += 1
+      try:
+        cn = int(admatch.group(3))
+      except:
+        print diffs
+        raise RuntimeError, 'Error parsing diff commands'
+      if admatch.group(1) == 'd': # "d" - Delete command
+        sl = int(admatch.group(2)) - 1 + adjust
+        del self.texts[sl:sl + cn]
+        adjust -= cn
+      else: # "a" - Add command
+        sl = int(admatch.group(2)) + adjust
+        self.texts[sl:sl] = diffs[i:i + cn]
+        adjust += cn
+        i += cn
+
+  def invertDiff(self, diff):
+    diffs = msplit(diff)
+    ndiffs = []
+    adjust = 0
+    i = 0
+    while i < len(diffs):
+      admatch = self.ad_command.match(diffs[i])
+      i += 1
+      try:
+        cn = int(admatch.group(3))
+      except:
+        raise RuntimeError, 'Error parsing diff commands'
+      if admatch.group(1) == 'd': # "d" - Delete command
+        sl = int(admatch.group(2)) - 1 + adjust
+        # handle substitution explicitly, as add must come after del
+        # (last add may have incomplete line)
+        if i < len(diffs):
+          amatch = self.a_command.match(diffs[i])
+        else:
+          amatch = None
+        if amatch and int(amatch.group(1)) + adjust == sl + cn:
+          cn2 = int(amatch.group(2))
+          i += 1
+          ndiffs += ["d%d %d\na%d %d\n" % (sl + 1, cn2, sl + cn2, cn)] + \
+                    self.texts[sl:sl + cn]
+          self.texts[sl:sl + cn] = diffs[i:i + cn2]
+          adjust += cn2 - cn
+          i += cn2
+        else:
+          ndiffs += ["a%d %d\n" % (sl, cn)] + self.texts[sl:sl + cn]
+          del self.texts[sl:sl + cn]
+          adjust -= cn
+      else: # "a" - Add command
+        sl = int(admatch.group(2)) + adjust
+        ndiffs += ["d%d %d\n" % (sl + 1, cn)]
+        self.texts[sl:sl] = diffs[i:i + cn]
+        adjust += cn
+        i += cn
+    return "".join(ndiffs)
+
+  def zeroDiff(self):
+    if not self.texts:
+      return ""
+    return "a0 " + str(len(self.texts)) + "\n" + "".join(self.texts)
+
+
+class CVSCheckout:
+
+  class Rev: pass
+
+  __shared_state = { }
+  def __init__(self):
+    self.__dict__ = self.__shared_state
+
+  def init(self):
+    self.co_db = SDatabase(temp(CHECKOUT_DB), DB_OPEN_NEW)
+    Cleanup().register(temp(CHECKOUT_DB), pass8)
+    self.rev_db = SDatabase(temp(REVISIONS_DB), DB_OPEN_READ)
+    self.files = { }
+
+  def done(self):
+    print "leftover revisions:"
+    for file in self.files:
+      print file + ':',
+      for r in self.files[file]:
+        print r,
+      print
+    self.co_db.close()
+    self.rev_db.close()
+
+  def init_file(self, fname):
+    revs = { }
+    for line in self.rev_db[fname].split('\n'):
+      prv = None
+      for r in line.split():
+        try:
+          rev = revs[r]
+        except KeyError:
+          rev = CVSCheckout.Rev()
+          rev.ref = 0
+          rev.prev = None
+          revs[r] = rev
+        if prv:
+          revs[prv].prev = r
+          rev.ref += 1
+        prv = r
+    return revs
+
+  def checkout_i(self, fname, revs, r, co, ref):
+    rev = revs[r]
+    if rev.prev:
+      prev = revs[rev.prev]
+      try:
+        key = fname + '/' + rev.prev
+        co.setText(self.co_db[key])
+        prev.ref -= 1
+        if not prev.ref:
+#          print "used saved", fname, rev.prev, "- deleted"
+          del revs[rev.prev]
+          del self.co_db[key]
+#        else:
+#          print "used saved", fname, rev.prev, "- keeping. ref is now", prev.ref
+      except KeyError:
+        self.checkout_i(fname, revs, rev.prev, co, 1)
+    try:
+      co.applyDiff(self.rev_db[fname + '/' + r])
+    except KeyError:
+      pass
+    rev.ref -= ref
+    if rev.ref:
+#      print "checked out", fname, r, "- saving. ref is", rev.ref
+      self.co_db[fname + '/' + r] = co.getText()
+    else:
+#      print "checked out", fname, r, "- not saving"
+      del revs[r]
+
+  def checkout_ii(self, fname, revs, r, cvtnl=None):
+    co = RCSStream()
+    self.checkout_i(fname, revs, r, co, 0)
+    rv = co.getText()
+    if cvtnl:
+      rv = rv.replace('\r\n', '\n').replace('\r', '\n')
+    return rv
+
+  def checkout(self, c_rev, cvtnl=None):
+    try:
+      revs = self.files[c_rev.fname]
+      rv = self.checkout_ii(c_rev.fname, revs, c_rev.rev, cvtnl)
+      if not revs:
+        del self.files[c_rev.fname]
+    except KeyError:
+      revs = self.init_file(c_rev.fname)
+      rv = self.checkout_ii(c_rev.fname, revs, c_rev.rev, cvtnl)
+      if revs:
+        self.files[c_rev.fname] = revs
+    return rv
+
+
 class CVSRevision:
   def __init__(self, ctx, *args):
     """Initialize a new CVSRevision with Ctx object CTX, and ARGS.
@@ -848,7 +1005,6 @@
     If CTX is None, the following members and methods of the
     instantiated CVSRevision class object will be unavailable (or
     simply will not work correctly, if at all):
-       cvs_path
        svn_path
        svn_trunk_path
        is_default_branch_revision()
@@ -870,7 +1026,6 @@
        prev_rev        -->  (string or None) previous CVS rev, e.g., "1.2"
        rev             -->  (string) this CVS rev, e.g., "1.3"
        next_rev        -->  (string or None) next CVS rev, e.g., "1.4"
-       file_in_attic   -->  (char or None) true if RCS file is in Attic
        file_executable -->  (char or None) true if RCS file has exec bit set. 
        file_size       -->  (int) size of the RCS file
        deltatext_code  -->  (char) 'N' if non-empty deltatext, else 'E'
@@ -883,16 +1038,16 @@
     The two forms of initialization are equivalent."""
 
     self._ctx = ctx
-    if len(args) == 16:
+    if len(args) == 15:
       (self.timestamp, self.digest, self.prev_timestamp, self.op,
-       self.prev_rev, self.rev, self.next_rev, self.file_in_attic,
+       self.prev_rev, self.rev, self.next_rev,
        self.file_executable, self.file_size, self.deltatext_code,
        self.fname, 
        self.mode, self.branch_name, self.tags, self.branches) = args
     elif len(args) == 1:
-      data = args[0].split(' ', 14)
+      data = args[0].split(' ', 13)
       (self.timestamp, self.digest, self.prev_timestamp, self.op,
-       self.prev_rev, self.rev, self.next_rev, self.file_in_attic,
+       self.prev_rev, self.rev, self.next_rev,
        self.file_executable, self.file_size, self.deltatext_code,
        self.mode, self.branch_name, numtags, remainder) = data
       # Patch up data items which are not simple strings
@@ -905,8 +1060,6 @@
         self.prev_rev = None
       if self.next_rev == "*":
         self.next_rev = None
-      if self.file_in_attic == "*":
-        self.file_in_attic = None
       if self.file_executable == "*":
         self.file_executable = None
       self.file_size = int(self.file_size)
@@ -923,12 +1076,11 @@
       self.branches = branches_and_fname[:-1]
       self.fname = branches_and_fname[-1]
     else:
-      raise TypeError, 'CVSRevision() takes 2 or 16 arguments (%d given)' % \
+      raise TypeError, 'CVSRevision() takes 2 or 15 arguments (%d given)' % \
           (len(args) + 1)
-    if ctx is not None:
-      self.cvs_path = relative_name(self._ctx.cvsroot, self.fname[:-2])
-      self.svn_path = self._make_path(self.cvs_path, self.branch_name)
-      self.svn_trunk_path = self._make_path(self.cvs_path)
+    if ctx is not None: # strictly speaking this check is now superfluous
+      self.svn_path = self._make_path(self.fname, self.branch_name)
+      self.svn_trunk_path = self._make_path(self.fname)
 
   # The 'primary key' of a CVS Revision is the revision number + the
   # filename.  To provide a unique key (say, for a dict), we just glom
@@ -941,10 +1093,10 @@
     return revnum + "/" + self.fname
 
   def __str__(self):
-    return ('%08lx %s %s %s %s %s %s %s %s %d %s %s %s %d%s%s %d%s%s %s' % (
+    return ('%08lx %s %s %s %s %s %s %s %d %s %s %s %d%s%s %d%s%s %s' % (
       self.timestamp, self.digest, self.prev_timestamp or "*", self.op,
       (self.prev_rev or "*"), self.rev, (self.next_rev or "*"),
-      (self.file_in_attic or "*"), (self.file_executable or "*"),
+      (self.file_executable or "*"),
       self.file_size,
       self.deltatext_code, (self.mode or "*"), (self.branch_name or "*"),
       len(self.tags), self.tags and " " or "", " ".join(self.tags),
@@ -967,11 +1119,11 @@
     return 0
 
   def is_default_branch_revision(self):
-    """Return 1 if SELF.rev of SELF.cvs_path is a default branch
+    """Return 1 if SELF.rev of SELF.fname is a default branch
     revision according to DEFAULT_BRANCHES_DB (see the conditions
     documented there), else return None."""
-    if self._ctx._default_branches_db.has_key(self.cvs_path):
-      val = self._ctx._default_branches_db[self.cvs_path]
+    if self._ctx._default_branches_db.has_key(self.fname):
+      val = self._ctx._default_branches_db[self.fname]
       val_last_dot = val.rindex(".")
       our_last_dot = self.rev.rindex(".")
       default_branch = val[:val_last_dot]
@@ -1031,19 +1183,6 @@
     else:
       return self._ctx.trunk_base + '/' + path
 
-  def rcs_path(self):
-    """Returns the actual filesystem path to the RCS file of this
-    CVSRevision."""
-    if self.file_in_attic is None:
-      return self.fname
-    else:
-      basepath, filename = os.path.split(self.fname)
-      return os.path.join(basepath, 'Attic', filename)
-
-  def filename(self):
-    "Return the last path component of self.fname, minus the ',v'"
-    return os.path.split(self.fname)[-1][:-2]
-
 class SymbolDatabase:
   """This database records information on all symbols in the RCS
   files.  It is created in pass 1 and it is used in pass 2."""
@@ -1177,6 +1316,8 @@
   def __init__(self):
     self.revs = open(temp(DATAFILE + REVS_SUFFIX), 'w')
     Cleanup().register(temp(DATAFILE + REVS_SUFFIX), pass2)
+    self.revisions_db = SDatabase(temp(REVISIONS_DB), DB_OPEN_NEW)
+    Cleanup().register(temp(REVISIONS_DB), pass8)
     self.resync = open(temp(DATAFILE + RESYNC_SUFFIX), 'w')
     Cleanup().register(temp(DATAFILE + RESYNC_SUFFIX), pass2)
     self.default_branches_db = Database(temp(DEFAULT_BRANCHES_DB), DB_OPEN_NEW)
@@ -1211,6 +1352,8 @@
     if not canonical_name == filename:
       self.file_in_attic = 1
 
+    self.stream = RCSStream()
+
     file_stat = os.stat(filename)
     # The size of our file in bytes
     self.file_size = file_stat[stat.ST_SIZE]
@@ -1247,6 +1390,8 @@
     # distinguish between an add and a change.
     self.rev_state = { }
 
+    self.empty_1111 = None
+
     # Hash mapping branch numbers, like '1.7.2', to branch names,
     # like 'Release_1_0_dev'.
     self.branch_names = { }
@@ -1505,6 +1650,10 @@
         # finished the for-loop (no resyncing was performed)
         return
 
+  def writeout(self, r, tx):
+    if tx:
+      self.revisions_db[self.rel_name + '/' + r] = tx
+
   def set_revision_info(self, revision, log, text):
     timestamp, author, old_ts = self.rev_data[revision]
     digest = sha.new(log + '\0' + author).hexdigest()
@@ -1552,13 +1701,15 @@
       deltatext_code = DELTATEXT_NONEMPTY
     else:
       deltatext_code = DELTATEXT_EMPTY
+      if revision == '1.1.1.1':
+        self.empty_1111 = 1
 
     c_rev = CVSRevision(Ctx(), timestamp, digest, prev_timestamp, op,
                         self.prev_rev[revision], revision,
                         self.next_rev.get(revision),
-                        self.file_in_attic, self.file_executable,
+                        self.file_executable,
                         self.file_size,
-                        deltatext_code, self.fname,
+                        deltatext_code, self.rel_name,
                         self.mode, self.rev_to_branch_name(revision),
                         self.taglist.get(revision, []),
                         self.branchlist.get(revision, []))
@@ -1568,6 +1719,16 @@
     if not self.metadata_db.has_key(digest):
       self.metadata_db[digest] = (author, log)
 
+    if trunk_rev.match(revision):
+      if revision not in self.next_rev:
+        self.stream.setText(text)
+      else:
+        self.writeout(self.next_rev[revision], self.stream.invertDiff(text))
+      if not self.prev_rev[revision]:
+        self.writeout(revision, self.stream.zeroDiff())
+    else:
+      self.writeout(revision, text)
+
   def parse_completed(self):
     # Walk through all branches and tags and register them with
     # their parent branch in the symbol database.
@@ -1579,8 +1740,33 @@
 
     self.num_files = self.num_files + 1
 
+    tree = [ ]
+    for r in self.prev_rev:
+      if r not in self.next_rev and not (r == "1.1.1.1" and self.empty_1111):
+        while self.rev_state[r] == 'dead':
+          pr = self.prev_rev[r]
+          if not pr:
+            break
+          if self.next_rev.get(pr) != r:
+            break
+          r = pr
+        else:
+          rvs = [ ]
+          while 1:
+            rvs.append(r)
+            pr = self.prev_rev[r]
+            if not pr:
+              break
+            if self.next_rev.get(pr) != r:
+              rvs.append(pr)
+              break
+            r = pr
+          tree.append(" ".join(rvs))
+    self.revisions_db[self.rel_name] = "\n".join(tree)
+
   def write_symbol_db(self):
     self.symbol_db.write()
+    self.revisions_db.close()
 
 class SymbolingsLogger:
   """Manage the file that contains lines for symbol openings and
@@ -2038,7 +2224,7 @@
         if not c_rev.branches:
           continue
         cvs_generated_msg = ('file %s was initially added on branch %s.\n'
-                             % (c_rev.filename(),
+                             % (os.path.split(c_rev.fname)[-1],
                                 c_rev.branches[0]))
         author, log_msg = \
             Ctx()._persistence_manager.svn_commit_metadata[c_rev.digest]
@@ -3389,7 +3575,7 @@
     keywords = None
 
     if self.mime_mapper:
-      mime_type = self.mime_mapper.get_type_from_filename(c_rev.cvs_path)
+      mime_type = self.mime_mapper.get_type_from_filename(c_rev.fname)
 
     if not c_rev.mode == 'b':
       if not self.no_default_eol:
@@ -3684,10 +3870,12 @@
     if props_len:
       props_header = 'Prop-content-length: %d\n' % props_len
 
+    co = CVSCheckout().checkout(c_rev, s_item.needs_eol_filter)
+
     # treat .cvsignore as a directory property
     dir_path, basename = os.path.split(c_rev.svn_path)
     if basename == ".cvsignore":
-      ignore_vals = generate_ignores(c_rev)
+      ignore_vals = generate_ignores(co)
       ignore_contents = '\n'.join(ignore_vals)
       ignore_contents = ('K 10\nsvn:ignore\nV %d\n%s\n' % \
                          (len(ignore_contents), ignore_contents))
@@ -3705,73 +3893,24 @@
                           % (self._utf8_path(dir_path), ignore_len,
                              ignore_len, ignore_contents))
 
-    # If the file has keywords, we must use -kk to prevent CVS/RCS from
-    # expanding the keywords because they must be unexpanded in the
-    # repository, or Subversion will get confused.
-    if s_item.has_keywords:
-      pipe_cmd, pipe = get_co_pipe(c_rev, [ '-kk' ])
-    else:
-      pipe_cmd, pipe = get_co_pipe(c_rev)
+    checksum = md5.new()
+    checksum.update(co)
 
     self.dumpfile.write('Node-path: %s\n'
                         'Node-kind: file\n'
                         'Node-action: %s\n'
                         '%s'  # no property header if no props
-                        'Text-content-length: '
+                        'Text-content-length: %d\n'
+                        'Text-content-md5: %s\n'
+                        'Content-length: %d\n'
+                        '\n'
                         % (self._utf8_path(c_rev.svn_path),
-                           action, props_header))
-
-    pos = self.dumpfile.tell()
-
-    self.dumpfile.write('0000000000000000\n'
-                        'Text-content-md5: 00000000000000000000000000000000\n'
-                        'Content-length: 0000000000000000\n'
-                        '\n')
-
+                           action, props_header,
+                           len(co), checksum.hexdigest(),
+                           len(co) + props_len))
     if prop_contents:
       self.dumpfile.write(prop_contents)
-
-    # Insert a filter to convert all EOLs to LFs if neccessary
-    if s_item.needs_eol_filter:
-      data_reader = LF_EOL_Filter(pipe.stdout)
-    else:
-      data_reader = pipe.stdout
-
-    # Insert the rev contents, calculating length and checksum as we go.
-    checksum = md5.new()
-    length = 0
-    while True:
-      buf = data_reader.read(PIPE_READ_SIZE)
-      if buf == '':
-        break
-      checksum.update(buf)
-      length = length + len(buf)
-      self.dumpfile.write(buf)
-
-    pipe.stdout.close()
-    error_output = pipe.stderr.read()
-    exit_status = pipe.wait()
-    if exit_status:
-      sys.exit("%s: The command '%s' failed with exit status: %s\n"
-               "and the following output:\n"
-               "%s" % (error_prefix, pipe_cmd, exit_status, error_output))
-
-    # Go back to patch up the length and checksum headers:
-    self.dumpfile.seek(pos, 0)
-    # We left 16 zeros for the text length; replace them with the real
-    # length, padded on the left with spaces:
-    self.dumpfile.write('%16d' % length)
-    # 16... + 1 newline + len('Text-content-md5: ') == 35
-    self.dumpfile.seek(pos + 35, 0)
-    self.dumpfile.write(checksum.hexdigest())
-    # 35... + 32 bytes of checksum + 1 newline + len('Content-length: ') == 84
-    self.dumpfile.seek(pos + 84, 0)
-    # The content length is the length of property data, text data,
-    # and any metadata around/inside around them.
-    self.dumpfile.write('%16d' % (length + props_len))
-    # Jump back to the end of the stream
-    self.dumpfile.seek(0, 2)
-
+    self.dumpfile.write(co)
     # This record is done (write two newlines -- one to terminate
     # contents that weren't themselves newline-termination, one to
     # provide a blank line for readability.
@@ -4208,7 +4347,7 @@
                         warning_prefix)
 
         msg = "RESYNC: '%s' (%s): old time='%s' delta=%ds" \
-              % (c_rev.cvs_path, c_rev.rev, time.ctime(c_rev.timestamp),
+              % (c_rev.fname, c_rev.rev, time.ctime(c_rev.timestamp),
                  record[2] - c_rev.timestamp)
         Log().write(LOG_VERBOSE, msg)
 
@@ -4322,6 +4461,9 @@
   Log().write(LOG_QUIET, "Done.")
 
 def pass8():
+  checkout = CVSCheckout()
+  checkout.init()
+
   svncounter = 2 # Repository initialization is 1.
   repos = SVNRepositoryMirror()
   persistence_manager = PersistenceManager(DB_OPEN_READ)
@@ -4346,6 +4488,8 @@
 
   repos.finish()
 
+  checkout.done()
+
 _passes = [
   pass1,
   pass2,
@@ -4389,7 +4533,6 @@
     self.no_default_eol = 0
     self.eol_from_mime_type = 0
     self.keywords_off = 0
-    self.use_cvs = None
     self.svnadmin = "svnadmin"
     self.username = None
     self.print_help = 0
@@ -4492,8 +4635,6 @@
   print '  --profile            profile with \'hotshot\' (into file cvs2svn.hotshot)'
   print '  --dry-run            do not create a repository or a dumpfile;'
   print '                       just print what would happen.'
-  print '  --use-cvs            use CVS instead of RCS \'co\' to extract data'
-  print '                       (only use this if having problems with RCS)'
   print '  --svnadmin=PATH      path to the svnadmin program'
   print '  --trunk-only         convert only trunk commits, not tags nor branches'
   print '  --trunk=PATH         path for trunk (default: %s)'    \
@@ -4538,7 +4679,7 @@
                                  "username=", "existing-svnrepos",
                                  "branches=", "tags=", "encoding=",
                                  "force-branch=", "force-tag=", "exclude=",
-                                 "use-cvs", "mime-types=",
+                                 "mime-types=",
                                  "eol-from-mime-type", "no-default-eol",
                                  "trunk-only", "no-prune", "dry-run",
                                  "dump-only", "dumpfile=", "tmpdir=",
@@ -4588,8 +4729,6 @@
       ctx.dumpfile = value
     elif opt == '--tmpdir':
       ctx.tmpdir = value
-    elif opt == '--use-cvs':
-      ctx.use_cvs = 1
     elif opt == '--svnadmin':
       ctx.svnadmin = value
     elif opt == '--trunk-only':
@@ -4673,30 +4812,6 @@
                      "existing directory.\n" % ctx.cvsroot)
     sys.exit(1)
 
-  if ctx.use_cvs:
-    # Ascend above the specified root if necessary, to find the cvs_repository
-    # (a directory containing a CVSROOT directory) and the cvs_module (the
-    # path of the conversion root within the cvs repository)
-    # NB: cvs_module must be seperated by '/' *not* by os.sep .
-    ctx.cvs_repository = os.path.abspath(ctx.cvsroot)
-    prev_cvs_repository = None
-    ctx.cvs_module = ""
-    while prev_cvs_repository != ctx.cvs_repository:
-      if os.path.isdir(os.path.join(ctx.cvs_repository, 'CVSROOT')):
-        break
-      prev_cvs_repository = ctx.cvs_repository
-      ctx.cvs_repository, module_component = os.path.split(ctx.cvs_repository)
-      ctx.cvs_module = module_component + "/" + ctx.cvs_module
-    else:
-      # Hit the root (of the drive, on Windows) without finding a CVSROOT dir.
-      sys.stderr.write(error_prefix +
-                       ": the path '%s' is not a CVS repository, nor a path " \
-                       "within a CVS repository.  A CVS repository contains " \
-                       "a CVSROOT directory within its root directory.\n" \
-                       % ctx.cvsroot)
-      sys.exit(1)
-    os.environ['CVSROOT'] = ctx.cvs_repository
-
   if (not ctx.target) and (not ctx.dump_only) and (not ctx.dry_run):
     sys.stderr.write(error_prefix +
                      ": must pass one of '-s' or '--dump-only'.\n")
@@ -4772,28 +4887,6 @@
                      % ctx.tmpdir)
     sys.exit(1)
 
-  if ctx.use_cvs:
-    def cvs_ok():
-      pipe = SimplePopen([ 'cvs' ] + Ctx().cvs_global_arguments + \
-                         [ '--version' ], True)
-      pipe.stdin.close()
-      pipe.stdout.read()
-      errmsg = pipe.stderr.read()
-      status = pipe.wait()
-      ok = len(errmsg) == 0 and status == 0
-      return (ok, status, errmsg)
-
-    ctx.cvs_global_arguments = [ "-q", "-R" ]
-    ok, cvs_exitstatus, cvs_errmsg = cvs_ok()
-    if not ok:
-      ctx.cvs_global_arguments = [ "-q" ]
-      ok, cvs_exitstatus, cvs_errmsg = cvs_ok()
-
-    if not ok:
-      sys.stderr.write(error_prefix +
-                       ": error executing CVS: status %s, error output:\n" \
-                       % (cvs_exitstatus) + cvs_errmsg)
-  
   # But do lock the tmpdir, to avoid process clash.
   try:
     os.mkdir(os.path.join(ctx.tmpdir, 'cvs2svn.lock'))


From dberlin at dberlin.org  Mon Aug 15 00:09:04 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Sun, 14 Aug 2005 18:09:04 -0400
Subject: [Python-Dev] Fwd:  PEP: Migrating the Python CVS to Subversion
In-Reply-To: <ca471dc205081411131c89135c@mail.gmail.com>
References: <1123986783.21455.35.camel@linux.site>
	<ca471dc205081411131c89135c@mail.gmail.com>
Message-ID: <1124057345.25267.72.camel@linux.site>

On Sun, 2005-08-14 at 11:13 -0700, Guido van Rossum wrote:
> Here's another POV. (Why does everybody keep emailing me personally?)
> 

Because we love you, and I forgot to cc python-dev.


From gustavo at niemeyer.net  Mon Aug 15 00:14:39 2005
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Sun, 14 Aug 2005 19:14:39 -0300
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FFBBEF.5060202@v.loewis.de>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
	<42FF32AA.7040506@v.loewis.de>
	<17151.15640.173982.961359@montanaro.dyndns.org>
	<42FF6E4B.4000206@v.loewis.de>
	<20050814171259.GA8200@mems-exchange.org>
	<ca471dc20508141111fb5fe3c@mail.gmail.com>
	<42FFBBEF.5060202@v.loewis.de>
Message-ID: <20050814221439.GB11278@burma.localdomain>

> I had problems finding the place where the bazaar-NG source code
> repository is stored - is there a public access to the HEAD version?

You may use rsync:

  rsync -av --delete bazaar-ng.org::bazaar-ng/bzr/bzr.dev .

Or bzr itself:

  bzr branch http://bazaar-ng.org/bzr/bzr.dev

Regards,

-- 
Gustavo Niemeyer
http://niemeyer.net

From martin at v.loewis.de  Mon Aug 15 00:15:30 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 15 Aug 2005 00:15:30 +0200
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <1124057259.25267.70.camel@linux.site>
References: <42FBA376.5030605@canonical.com>	<ca471dc20508131427b6aa0d4@mail.gmail.com>	<42FF32AA.7040506@v.loewis.de>	<17151.15640.173982.961359@montanaro.dyndns.org>	<42FF6E4B.4000206@v.loewis.de>	<20050814171259.GA8200@mems-exchange.org>
	<1124057259.25267.70.camel@linux.site>
Message-ID: <42FFC282.9060307@v.loewis.de>

Daniel Berlin wrote:
> I'm not sure how big python's repo is, but you probably want to use the
> attached patch to speed up cvs2svn.  It changes it to reconstruct the
> revisions on its own instead of calling cvs or rcs.

Thanks for the patch, but cvs2svn works fairly well for us as is (in
the version that was released with Debian sarge); see

http://www.python.org/peps/pep-0347.html

for the conversion procedure. On the machine where I originally did
the conversion, the script required 7h; on my current machine, it is
done in 1:40 or so, which is acceptable.

Out of curiosity: do you use the --cvs-revnums parameter? Should we?

Regards,
Martin

From dberlin at dberlin.org  Mon Aug 15 00:25:02 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Sun, 14 Aug 2005 18:25:02 -0400
Subject: [Python-Dev] Fwd:  PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42FFBE96.7040000@v.loewis.de>
References: <1123986783.21455.35.camel@linux.site>
	<ca471dc205081411131c89135c@mail.gmail.com>
	<42FFBE96.7040000@v.loewis.de>
Message-ID: <1124058303.25267.88.camel@linux.site>

On Sun, 2005-08-14 at 23:58 +0200, "Martin v. Löwis" wrote:
> Guido van Rossum wrote:
> > Here's another POV.
> 
> I think I agree with Daniel's view, in particular wrt. performance.
> Whatever the replacement tool, it should perform as well or better
> than CVS currently does; it also shouldn't perform much worse than
> subversion.

Then, in fairness, I should note that annotate is slower on subversion
(and monotone, and anything using binary deltas) than CVS.

This is because you can't generate the line diffs that annotate wants from
binary copy + add diffs.  You have to reconstruct the actual revisions
and then line diff them.  Thus, CVS is O(N) here, and SVN and other
binary delta users are O(N^2).

You wouldn't really notice the speed difference when you are annotating
a file with 100 revisions.  You would if you annotate the 800k changelog
which has 30k trunk revisions.  CVS takes 4 seconds, svn takes ~5
minutes, the whole time being spent in doing diffs of those revisions.
I rewrote the blame algorithm recently so that it will only take about 2
minutes on changelog, but it cheats because it knows it can stop early
because it's blamed all the revisions (since our changelog rotates).
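
To make the cost concrete, here is a minimal sketch of annotate done the
reconstruct-and-line-diff way.  It is not Subversion's blame code: the
function name `annotate` and its input (a list of already reconstructed
full texts, oldest first) are assumptions made for illustration, and
Python's difflib stands in for the real line differ.  Every revision has
to be materialised and diffed against its predecessor, which is the extra
work a binary-delta store cannot avoid.

```python
import difflib

def annotate(revisions):
    """Blame each line of the newest text on a revision index.

    revisions: list of full texts, oldest first, each a list of lines.
    (Hypothetical input format; a real tool would first rebuild these
    texts from the stored deltas.)
    """
    # Every line of the first revision is blamed on revision 0.
    blame = [0] * len(revisions[0])
    for idx in range(1, len(revisions)):
        old, new = revisions[idx - 1], revisions[idx]
        # Assume every line is new, then copy blame for unchanged runs.
        new_blame = [idx] * len(new)
        matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
        for a, b, size in matcher.get_matching_blocks():
            new_blame[b:b + size] = blame[a:a + size]
        blame = new_blame
    return blame

history = [
    ["a\n", "b\n"],
    ["a\n", "x\n", "b\n"],
    ["a\n", "x\n", "c\n"],
]
result = annotate(history)
```

With a store that already holds line-based deltas, the diffing step inside
the loop is what you get for free.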

For those curious, you also can't directly generate "always-correct"
byte-level differences from the diffs, since their goal is to find the
most space efficient way to transform rev old into rev new, *not* record
actual byte-level changes that occurred between old and new.  It may
turn out that doing an add of 2 bytes is cheaper than specifying the
opcode for copy(start,len).  Actual diffs are produced by reproducing
the texts and line diffing them.  Such is the cost of efficient
storage :).
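
A toy version of such a delta shows why the instructions describe how to
rebuild the new text rather than what a person changed.  The instruction
format here, ("copy", start, length) and ("add", data) tuples, is made up
for the example and is not the actual svndiff encoding; `apply_delta` is
likewise a hypothetical helper.

```python
def apply_delta(source, ops):
    """Rebuild a target byte string from a source and a list of ops."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            # Reuse a run of bytes from the old text.
            _, start, length = op
            out += source[start:start + length]
        else:
            # Insert literal bytes carried in the delta itself.
            out += op[1]
    return bytes(out)

source = b"hello world"

# Two different deltas that produce the same new text.  The second one
# adds the 2 bytes b"he" literally instead of copying them, as an encoder
# might if that happened to be cheaper than a copy opcode.
delta_a = [("copy", 0, 6), ("add", b"there")]
delta_b = [("add", b"he"), ("copy", 2, 4), ("add", b"there")]

target_a = apply_delta(source, delta_a)
target_b = apply_delta(source, delta_b)
```

Both deltas are valid, yet only the first one happens to line up with the
actual edit ("world" replaced by "there"), so the delta alone cannot be
trusted as a record of the change.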

> 
> I've been using git (or, rather, cogito) to keep up-to-date with the
> Linux kernel. While performance of git is really good, storage
> requirements are *quite* high, and initial "checkout" takes a long
> time - even though the Linux kernel repository stores virtually no
> history (there was a strict cut when converting the bitkeeper HEAD).
> So these distributed tools would cause quite some disk consumption
> on client machines. bazaar-ng apparently supports only-remote
> repositories as well, so that might be no concern.

The argument "network and disk is cheap" doesn't work for us when you
are talking 5-10 gigabytes of initial transfer :).  However, I doubt
it's more than a hundred meg or so for python, if that.

You may run into these problems in 10 years :)


From mwh at python.net  Mon Aug 15 00:31:33 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 14 Aug 2005 23:31:33 +0100
Subject: [Python-Dev] build problems on macosx (CVS HEAD)
In-Reply-To: <42FFBB1A.8060206@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sun,
	14 Aug 2005 23:43:54 +0200")
References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com>
	<42FFBB1A.8060206@v.loewis.de>
Message-ID: <2mfytcup4q.fsf@starship.python.net>

"Martin v. L?wis" <martin at v.loewis.de> writes:

> Ronald Oussoren wrote:
>> I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a  
>> checkout that is less than two hours old. I'm building a standard  
>> unix tree (no framework install):
>
> I just committed what I think is a bugfix for the recent st_gen support.
> Unfortunately, I can't try the code, since I don't have access to
> BSD/OSX at the moment.
>
> So please report whether there is any change in behaviour.

Seems to have done the trick, thanks.

Cheers,
mwh

-- 
  <SteveA> I just had a very odd phone call
  <SteveA> from a researcher with the french TV station "TF1"
  <SteveA> asking about inflatable football referees
                                                -- from Twisted.Quotes

From dberlin at dberlin.org  Mon Aug 15 00:32:31 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Sun, 14 Aug 2005 18:32:31 -0400
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FFC282.9060307@v.loewis.de>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>	<42FF32AA.7040506@v.loewis.de>
	<17151.15640.173982.961359@montanaro.dyndns.org>
	<42FF6E4B.4000206@v.loewis.de>	<20050814171259.GA8200@mems-exchange.org>
	<1124057259.25267.70.camel@linux.site> <42FFC282.9060307@v.loewis.de>
Message-ID: <1124058751.25267.94.camel@linux.site>

On Mon, 2005-08-15 at 00:15 +0200, "Martin v. Löwis" wrote:
> Daniel Berlin wrote:
> > I'm not sure how big python's repo is, but you probably want to use the
> > attached patch to speed up cvs2svn.  It changes it to reconstruct the
> > revisions on its own instead of calling cvs or rcs.
> 
> Thanks for the patch, but cvs2svn works fairly well for us as is (in
> the version that was released with Debian sarge); see
> 
> http://www.python.org/peps/pep-0347.html
> 
> for the conversion procedure. On the machine where I originally did
> the conversion, the script required 7h; on my current machine, it is
> done in 1:40 or so, which is acceptable.
> 
> Out of curiosity: do you use the --cvs-revnums parameter? Should we?

No.  In our case, it doesn't buy us anything.

In the name of continuity, we have to make the old cvsweb urls work with
new viewcvs urls anyway (they appear in bug reports, etc). We also don't
want to destroy the ability for people to diff existing cvs working
copies.  I may have been able to hack something around with cvs-revnums,
but not easily.

Thus, we are just going to keep a readonly version of the repo around,
and a readonly cvsweb, with a warning at the top of the page that the
current source is stored in subversion.


> 
> Regards,
> Martin


From martin at v.loewis.de  Mon Aug 15 00:33:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 15 Aug 2005 00:33:04 +0200
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <20050814221439.GB11278@burma.localdomain>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>
	<42FF32AA.7040506@v.loewis.de>
	<17151.15640.173982.961359@montanaro.dyndns.org>
	<42FF6E4B.4000206@v.loewis.de>
	<20050814171259.GA8200@mems-exchange.org>
	<ca471dc20508141111fb5fe3c@mail.gmail.com>
	<42FFBBEF.5060202@v.loewis.de>
	<20050814221439.GB11278@burma.localdomain>
Message-ID: <42FFC6A0.9080509@v.loewis.de>

Gustavo Niemeyer wrote:
> You may use rsync:
> 
>   rsync -av --delete bazaar-ng.org::bazaar-ng/bzr/bzr.dev .
> 
> Or bzr itself:
> 
>   bzr branch http://bazaar-ng.org/bzr/bzr.dev

Ah, thanks. Fetching it with rsync is so much faster than fetching
it with bzr, though...

Regards,
Martin

From martin at v.loewis.de  Mon Aug 15 00:37:31 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 15 Aug 2005 00:37:31 +0200
Subject: [Python-Dev] Fwd:  PEP: Migrating the Python CVS to Subversion
In-Reply-To: <1124058303.25267.88.camel@linux.site>
References: <1123986783.21455.35.camel@linux.site>	
	<ca471dc205081411131c89135c@mail.gmail.com>
	<42FFBE96.7040000@v.loewis.de>
	<1124058303.25267.88.camel@linux.site>
Message-ID: <42FFC7AB.6000201@v.loewis.de>

Daniel Berlin wrote:
> The argument "network and disk is cheap" doesn't work for us when you
> are talking 5-10 gigabytes of initial transfer :).  However, I doubt
> it's more than a hundred meg or so for python, if that.
> 
> You may run into these problems in 10 years :)

I don't know how bazaar-ng would perform - but the converted fsfs svn
repository is 718MiB.

Of course, in 10 years, 5-10GiB of network transfer will be cheap :-)

Regards,
Martin

From lalo at exoweb.net  Mon Aug 15 03:58:58 2005
From: lalo at exoweb.net (Lalo Martins)
Date: Mon, 15 Aug 2005 09:58:58 +0800
Subject: [Python-Dev] cvs to bzr?
In-Reply-To: <17150.31637.180169.877441@montanaro.dyndns.org>
References: <17150.31637.180169.877441@montanaro.dyndns.org>
Message-ID: <ddostc$paf$1@sea.gmane.org>

And so says skip at pobox.com on 14/08/05 07:00...
> Based on the message Guido forwarded, I installed bazaar-ng.  From Mark's
> note it seems they convert cvs repositories to bzr repositories, but I
> didn't see any mention in the bzr docs of any sort of cvs2bzr tool.
> Likewise, Google didn't turn up anything obvious.  Anyone know of something?

Just for the sake of fairness - Mark's email states that they convert
cvs repositories to baz (Bazaar 1.x), not to bzr (Bazaar-NG, soon-to-be
Bazaar 2.x).  The tools to convert to bzr are not yet mature, as bzr
itself just recently started to solidify.  (The pace of development is
one of my favorite "features" about bzr; it's a testament to python and
to bzr itself.)

You can, however, convert from CVS to baz (arch), and from there to bzr.

best,
                                               Lalo Martins
--
      So many of our dreams at first seem impossible,
       then they seem improbable, and then, when we
       summon the will, they soon become inevitable.
--
http://www.exoweb.net/                  mailto:lalo at exoweb.net
GNU: never give up freedom                 http://www.gnu.org/


From skip at pobox.com  Mon Aug 15 04:39:41 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 14 Aug 2005 21:39:41 -0500
Subject: [Python-Dev] cvs to bzr?
In-Reply-To: <ddostc$paf$1@sea.gmane.org>
References: <17150.31637.180169.877441@montanaro.dyndns.org>
	<ddostc$paf$1@sea.gmane.org>
Message-ID: <17152.109.883835.190683@montanaro.dyndns.org>


    Lalo> You can, however, convert from CVS to baz (arch), and from there
    Lalo> to bzr.

Would this be with cscvs?  According to the cscvs wiki page at

    http://wiki.gnuarch.org/cscvs

cscvs is currently unmaintained and can't handle repositories with branches.
In addition, it appears that to do a one-time conversion from cvs to bzr I
will need to also install arch and baz as well as any other packages they
depend on.

Skip

From greg at electricrain.com  Mon Aug 15 05:34:49 2005
From: greg at electricrain.com (Gregory P. Smith)
Date: Sun, 14 Aug 2005 20:34:49 -0700
Subject: [Python-Dev] request for code review - hashlib - patch #1121611
Message-ID: <20050815033449.GW16043@zot.electricrain.com>

https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470

This is the hashlib module that speeds up python's md5 and sha1
support by using openssl (when available) as well as adding sha224/256
+ sha384/512 support (plus anything openssl provides).

I believe it is complete and ready to commit (hashlib-009.patch), any
objections?
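
For readers who have not looked at the patch, here is a minimal sketch of
the intended usage.  It assumes the interface is the one the standard
library's hashlib exposes (named constructors plus hashlib.new(name));
`hex_digests` is a hypothetical helper written for this example and is not
part of the patch.  md5 is reached the same way as the algorithms shown.

```python
import hashlib

def hex_digests(data,
                names=("sha1", "sha224", "sha256", "sha384", "sha512")):
    """Return {algorithm name: hex digest} for the given bytes."""
    return dict((name, hashlib.new(name, data).hexdigest())
                for name in names)

digests = hex_digests(b"abc")

# The named constructor and hashlib.new() are two routes to the same hash.
same = digests["sha256"] == hashlib.sha256(b"abc").hexdigest()
```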

compiled docs in html are here for easy perusal:

  http://electricrain.com/greg/hashlib-py25-doc/

thanks,
greg

From t-meyer at ihug.co.nz  Mon Aug 15 05:46:19 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Mon, 15 Aug 2005 15:46:19 +1200
Subject: [Python-Dev] python-dev Summary for 2005-07-16 through 2005-07-31
	[draft]
Message-ID: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB0479@its-xchg4.massey.ac.nz>

Here's July Part Two.  As usual, if anyone can spare the time to proofread
this (it's fairly short this fortnight!), that would be great!  Please send
any corrections or suggestions to Tim (tlesher at gmail.com), Steve
(steven.bethard at gmail.com) and/or me, rather than cluttering the list.
Ta!

=============
Announcements
=============

-------------------------------------------------
PyPy Sprint in Heidelberg 22nd - 29th August 2005
-------------------------------------------------

Heidelberg University in Germany will host a PyPy_ sprint from 22nd August
to 29th August. The sprint will push towards the 0.7 release of PyPy_, which
hopes to reach Python 2.4.1 compliance and to have full, direct translation
into a low level language, instead of reinterpretation through CPython.  If
you'd like to help out, this is a great place to start!

For more information, see PyPy's `Heidelberg sprint`_ page.

.. _PyPy: http://codespeak.net/pypy
.. _Heidelberg sprint:
http://codespeak.net/pypy/index.cgi?extradoc/sprintinfo/Heidelberg-sprint.html

Contributing thread:

- `Next PyPy sprint: Heidelberg (Germany), 22nd-29th of August
<http://mail.python.org/pipermail/python-dev/2005-July/055031.html>`__


--------------------------------
zlib 1.2.3 in Python 2.4 and 2.5
--------------------------------

Trent Mick supplied a patch for updating Python from zlib 1.2.1 to zlib
1.2.3, which eliminates some potential security vulnerabilities. Python
will move to this new version of zlib in both the maintenance 2.4 branch
and the main (2.5) branch.

Contributing thread:

- `zlib 1.2.3 is just out
<http://mail.python.org/pipermail/python-dev/2005-July/054926.html>`__

=========
Summaries
=========

-------------------------------
Moving Python CVS to Subversion
-------------------------------

Martin v. Löwis submitted `PEP 347`_, covering changing from CVS to SVN for
source code revision control of the Python repository, and moving from
hosting the repository on sourceforge.net to python.org.

Moving to SVN from CVS met with general favour from most people, although
most were undecided about moving from sourceforge.net to python.org.  The
additional administration requirements of the move were the primary
concern, and moving to an alternative host was suggested.  Martin is open to
including suggestions for alternative hosts in the PEP, but is not
interested in carrying out such research himself; as such, if alternative
hosts are to be included, someone needs to volunteer to collect all the
required information and submit it to Martin.

Discussion about the conversion and the move is continuing in August.

.. _PEP 347: http://www.python.org/peps/pep-0347.html

Contributing thread:

- `PEP: Migrating the Python CVS to Subversion
<http://mail.python.org/pipermail/python-dev/2005-July/054950.html>`__

---------------------------------
Exception Hierarchy in Python 3.0
---------------------------------

Brett Cannon posted the first draft of `PEP 348`_, covering reorganisation
of exceptions in Python 3.0.  The initial draft included major changes to
the hierarchy, requiring any object raised to inherit from a certain
superclass, and changing bare 'except' clauses to catch a specific
superclass.  The latter two proposals didn't generate much comment
(although Guido vacillated between removing bare 'except' clauses and not),
but the proposed hierarchy organisation and renaming was hotly discussed.

Nick Coghlan countered each revision of Brett's maximum-changes PEP with a
minimum-changes PEP, each evolving through python-dev discussion, and
gradually moving to an acceptable middle ground.  At present, it seems that
the changes will be much more minor than the original proposal.

The thread branched off into comments about `Python 3.0`_ changes in
general.  The consensus was generally that although backwards compatibility
isn't required in Python 3.0, it should only be broken when there is a
clear reason for it, and that, as much as possible, Python 3.0 should be
Python 2.9 without a lot of backwards compatibility code.  A number of
people indicated that they were reasonably content with the existing
exception hierarchy, and didn't feel that major changes were required.

Guido suggested that a good principle for determining the ideal exception
hierarchy is whether there's a use case for catching the common base class.
Marc-Andre Lemburg pointed out that when migrating code, changes in
exception names are reasonably easy to automate, but changes in the
inheritance tree are much more difficult.
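
As an illustration of that principle with a base class that already
exists: LookupError is the common base of KeyError and IndexError, and
there is a real use case for catching it.  The helper name `lookup` is
made up for this sketch and does not come from the PEP or the thread.

```python
def lookup(container, key, default=None):
    """Fetch container[key] from a dict or a sequence, with a fallback."""
    try:
        return container[key]
    except LookupError:
        # Catches KeyError (mappings) and IndexError (sequences) alike,
        # because both inherit from LookupError.
        return default

from_dict = lookup({"a": 1}, "b", default=0)
from_list = lookup([10, 20], 5, default=0)
```

A grouping that nobody would ever name in an 'except' clause fails this
test and, by the same reasoning, arguably does not need to exist.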

Many exceptions were discussed at length (e.g. WindowsError, RuntimeError),
with debate about whether they should continue to exist in Python 3.0, be
renamed, or be removed.  The PEP contains the current status for each of
these exceptions.

The PEP evolution and discussion are still continuing in August, and since
this is for Python 3.0, are likely to be considered open for some time yet.

.. _Python 3.0: http://www.python.org/peps/pep-3000.html
.. _PEP 348: http://www.python.org/peps/pep-0348.html

Contributing thread:

- `Pre-PEP: Exception Reorganization for Python 3.0
<http://mail.python.org/pipermail/python-dev/2005-July/055018.html>`__

-----------------------------------------
Docstrings and the Official Documentation
-----------------------------------------

A new `bug report`_ pointed out that the docstring help for cgi.escape was
not as detailed as that in the full documentation, prompting Skip Montanaro
to ask whether this should be the case or not.  Several reasons were
outlined why docstrings should be more of a "quick reference card" than a
"textbook" (i.e. maintain the status quo).  Tim Peters suggested that tools
to extract text from the full documentation would be a more sensible method
of making the "textbook" available from help()/pydoc; if anyone is
interested, then this would probably be the best way to start implementing
this.

.. _bug report: http://python.org/sf/1243553

Contributing thread:

- `should doc string content == documentation content?
<http://mail.python.org/pipermail/python-dev/2005-July/054928.html>`__

---------------------------
Syntax suggestion: "while:"
---------------------------

Martin Blais suggested "while:" as a syntactic shortcut for "while True:".
The suggestion was shot down pretty quickly; not only is "while:" less
explicit than "while True:", but it introduces readability problems for the
apparently large number of people who, when reading "while:", immediately
think "while what?"

Contributing thread:

- `while:
<http://mail.python.org/pipermail/python-dev/2005-July/054914.html>`__

------------------
Sets in Python 2.5
------------------

In Python 2.4, there is no C API for the built-in set type; you must use
PyObject_Call(), etc. as you would in accessing other Python objects.
However, in Python 2.5, Raymond Hettinger plans to introduce a C API along
with a new implementation of the set type that uses its own data structure
instead of forwarding everything to dicts.

Contributing thread:

- `C api for built-in type set?
<http://mail.python.org/pipermail/python-dev/2005-July/054940.html>`__


===============
Skipped Threads
===============

- `Some RFE for review
<http://mail.python.org/pipermail/python-dev/2005-July/054896.html>`__
- `python/dist/src/Doc/lib emailutil.tex,1.11,1.12
<http://mail.python.org/pipermail/python-dev/2005-July/054902.html>`__
- `read only files
<http://mail.python.org/pipermail/python-dev/2005-July/054907.html>`__
- `builtin filter function
<http://mail.python.org/pipermail/python-dev/2005-July/054909.html>`__
- `Weekly Python Patch/Bug Summary
<http://mail.python.org/pipermail/python-dev/2005-July/054921.html>`__
- `Information request; Keywords: compiler compiler, EBNF, python, ISO 14977
<http://mail.python.org/pipermail/python-dev/2005-July/054925.html>`__
- `installation of python on a Zaurus
<http://mail.python.org/pipermail/python-dev/2005-July/054937.html>`__
- `python-dev summary for 2005-07-01 to 2005-07-15 [draft]
<http://mail.python.org/pipermail/python-dev/2005-July/054948.html>`__
- `math.fabs redundant?
<http://mail.python.org/pipermail/python-dev/2005-July/054991.html>`__


=================================================
Skipped Threads (covered in the previous summary)
=================================================

- `'With' context documentation draft (was Re: Terminology for PEP 343
<http://mail.python.org/pipermail/python-dev/2005-July/054891.html>`__
- `Adding the 'path' module (was Re: Some RFE for review)
<http://mail.python.org/pipermail/python-dev/2005-July/054894.html>`__
- `[C++-sig] GCC version compatibility
<http://mail.python.org/pipermail/python-dev/2005-July/054895.html>`__


From bcannon at gmail.com  Mon Aug 15 06:35:02 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sun, 14 Aug 2005 21:35:02 -0700
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
Message-ID: <bbaeab10050814213575ce6ad7@mail.gmail.com>

I am sure people mainly care about the big changes introduced by
revision 1.8 of the PEP (http://www.python.org/peps/pep-0348.html).
So, first is that WindowsError is staying.  Enough people want it to
stay and have a legitimate use that I removed the proposal to ditch
it.

Second, I changed the bare 'except' proposal again to recommend its
removal.  I had been feeling they should just go for about a week, but
I solidified my thinking when I was talking with Alex and Anna
Martelli and managed to convince them bare 'except's should go after
Alex initially thought they should be changed to be ``except
Exception``.  This obviously goes against what Guido last said he
wanted, but I hope I can convince him to get rid of bare 'except's.

Minor stuff is fleshing out the arguments for TerminatingException (I
am sure Raymond loves that I am leaving this part in  =) and adding a
Roadmap for the transition.

-Brett

From miha at mpi-magdeburg.mpg.de  Mon Aug 15 08:22:10 2005
From: miha at mpi-magdeburg.mpg.de (Michael Krasnyk)
Date: Mon, 15 Aug 2005 08:22:10 +0200
Subject: [Python-Dev] SWIG and rlcompleter
Message-ID: <43003492.8060904@mpi-magdeburg.mpg.de>

Hello all,

Recently I've found that rlcompleter does not work correctly with SWIG
generated classes.
In some cases dir(object) contains not only strings, but also the type of
the object, something like <class 'mywrapper.IClassPtr'>.
And the condition "word[:n] == attr" throws an exception.
Is it possible to solve this problem with the following patch?

--- cut ---
--- rlcompleter.py.org  2005-08-14 13:02:02.000000000 +0200
+++ rlcompleter.py      2005-08-14 13:18:59.000000000 +0200
@@ -136,8 +136,11 @@
         matches = []
         n = len(attr)
         for word in words:
-            if word[:n] == attr and word != "__builtins__":
-                matches.append("%s.%s" % (expr, word))
+            try:
+                if word[:n] == attr and word != "__builtins__":
+                    matches.append("%s.%s" % (expr, word))
+            except:
+                pass
         return matches

  def get_class_members(klass):
--- cut ---

Thanks in advance,
Michael Krasnyk


From stephen.thorne at gmail.com  Mon Aug 15 08:40:22 2005
From: stephen.thorne at gmail.com (Stephen Thorne)
Date: Mon, 15 Aug 2005 16:40:22 +1000
Subject: [Python-Dev] string_join overrides TypeError exception thrown in
	generator
Message-ID: <3e8ca5c8050814234020735adf@mail.gmail.com>

Hi,

An interesting problem was pointed out to me, which I have distilled
to this testcase:
def gen():
     raise TypeError, "I am a TypeError"
     yield 1

def one(): return ''.join( x for x in gen() )
def two(): return ''.join([x for x in gen()])

for x in one, two:
    try:
         x()
    except TypeError, e:
         print e

Expected output is:
"""
I am a TypeError
I am a TypeError
"""

Actual output is:
"""
sequence expected, generator found
I am a TypeError
"""

Upon looking at the implementation of 'string_join' in
stringobject.c [1], it's quite obvious what's gone wrong: an exception
has been triggered in PySequence_Fast, and string_join overrides that
exception, assuming that the only TypeErrors thrown by PySequence_Fast
are caused by 'orig' being a value that was an invalid sequence type,
ignoring the possibility that a TypeError could be thrown by
exhausting a generator.

	seq = PySequence_Fast(orig, "");
	if (seq == NULL) {
		if (PyErr_ExceptionMatches(PyExc_TypeError))
			PyErr_Format(PyExc_TypeError,
				     "sequence expected, %.80s found",
				     orig->ob_type->tp_name);
		return NULL;
	}

I can't see an obvious solution, but perhaps generators should get
special treatment regardless. Reading over this code it looks like the
generator is exhausted all at once, instead of incrementally..
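
The masking can be reproduced in pure Python. This is only a rough sketch in current Python syntax: sequence_fast and join_masking are hypothetical stand-ins that imitate the control flow of the C code quoted above, not the real implementation.

```python
def sequence_fast(obj, message):
    """Hypothetical stand-in for PySequence_Fast: turn any iterable into a list."""
    if isinstance(obj, (list, tuple)):
        return obj
    try:
        iterator = iter(obj)
    except TypeError:
        # Not iterable at all: the case string_join means to report.
        raise TypeError(message)
    # Draining the iterator runs the generator body, which may raise its own
    # TypeError; that error leaves through the same exit as the one above.
    return list(iterator)


def join_masking(sep, orig):
    """Imitates the quoted error handling: every TypeError is rewritten."""
    try:
        seq = sequence_fast(orig, "")
    except TypeError:
        raise TypeError("sequence expected, %.80s found" % type(orig).__name__)
    return sep.join(seq)


def gen():
    raise TypeError("I am a TypeError")
    yield 1


def message_from(call):
    """Return the TypeError message produced by call(), or None."""
    try:
        call()
    except TypeError as exc:
        return str(exc)
    return None


# Generator expression: the generator's own error is replaced.
print(message_from(lambda: join_masking("", (x for x in gen()))))
# List comprehension: the error is raised before join_masking is even called.
print(message_from(lambda: join_masking("", [x for x in gen()])))
```

The caller of sequence_fast sees only "a TypeError happened" and has no way to tell which of the two raise sites produced it.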
-- 
Stephen Thorne
Development Engineer

[1] http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Objects/stringobject.c?rev=2.231&view=markup

From benji at benjiyork.com  Mon Aug 15 13:30:36 2005
From: benji at benjiyork.com (Benji York)
Date: Mon, 15 Aug 2005 07:30:36 -0400
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <42FF6E4B.4000206@v.loewis.de>
References: <42FBA376.5030605@canonical.com>	<ca471dc20508131427b6aa0d4@mail.gmail.com>	<42FF32AA.7040506@v.loewis.de>	<17151.15640.173982.961359@montanaro.dyndns.org>
	<42FF6E4B.4000206@v.loewis.de>
Message-ID: <43007CDC.6060200@benjiyork.com>

Martin v. Löwis wrote:
> skip at pobox.com wrote:
>>Granted.  What is the cost of waiting a bit longer to see if it (or
>>something else) gets more usable and would hit the mark better than svn?
> 
> It depends on what "a bit" is. Waiting a month would be fine; waiting
> two years might be pointless.

This might be too convoluted to consider, but I thought I might throw it 
out there.  We use svn for our repositories, but I've taken to also 
using bzr so I can do local commits and reversions (within a particular 
svn revision).  I can imagine expanding that usage to sharing branches 
and such via bzr (or mercurial, which looks great), but keeping the 
trunk in svn.
--
Benji York

From ncoghlan at gmail.com  Mon Aug 15 14:28:09 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 15 Aug 2005 22:28:09 +1000
Subject: [Python-Dev] string_join overrides TypeError exception thrown
 in	generator
In-Reply-To: <3e8ca5c8050814234020735adf@mail.gmail.com>
References: <3e8ca5c8050814234020735adf@mail.gmail.com>
Message-ID: <43008A59.2070306@gmail.com>

Stephen Thorne wrote:
> I can't see an obvious solution, but perhaps generators should get
> special treatment regardless. Reading over this code it looks like the
> generator is exhausted all at once, instead of incrementally..

Indeed - str.join uses a multipass approach to build the final string, so it 
needs to ensure it has a reiterable to play with. PySequence_Fast achieves 
that, at the cost of dumping a generator into a sequence rather than building 
a string from it directly.

Unicode.join uses PySequence_Fast too, and has the same problem with masking 
the TypeError from the generator.

The calling code simply can't tell if the NULL return was set directly by 
PySequence_Fast, or was relayed by PySequence_List (which got it from 
_PyList_Extend, which got it from listextend, which got it from iternext, etc).

This is the kind of problem that PEP 344 is designed to solve :)
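
As a rough illustration of what that would look like, here is a sketch in current Python syntax, assuming the __context__ attribute that PEP 344 describes is available (it is in Python 3). The name join_with_context is hypothetical:

```python
def gen():
    raise TypeError("I am a TypeError")
    yield 1


def join_with_context(sep, orig):
    try:
        seq = list(orig)
    except TypeError:
        # Raising inside the handler links the original error to the new one
        # instead of throwing it away.
        raise TypeError("sequence expected, %s found" % type(orig).__name__)
    return sep.join(seq)


try:
    join_with_context("", gen())
except TypeError as exc:
    outer = exc

# The replacement message is what the caller sees first...
print(outer)
# ...but the generator's own error is still reachable from it.
print(outer.__context__)
```

The outer message is still the misleading one, but the error that actually came from the generator is no longer lost.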

This also shows that argument validation is one of the cases where using an 
iterable instead of a generator is a good thing, since errors get raised where 
the generator is created, instead of where it is first used:

class gen(object):
   def __init__(self):
      raise TypeError, "I am a TypeError"
   def __iter__(self):
      yield 1

def one(): return ''.join( x for x in gen() )
def two(): return ''.join([x for x in gen()])

for x in one, two:
     try:
          x()
     except TypeError, e:
          print e

Hmm, makes me think of a neat little decorator:

def step_on_creation(gen):
     def start_gen(*args, **kwds):
         g = gen(*args, **kwds)
         g.next()
         return g
     start_gen.__name__ = gen.__name__
     start_gen.__doc__ = gen.__doc__
     start_gen.__dict__ = gen.__dict__
     return start_gen

@step_on_creation
def gen():
      # Setup executed at creation time
      raise TypeError, "I am a TypeError"
      yield None
      # The actual iteration steps
      yield 1


Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From raymond.hettinger at verizon.net  Mon Aug 15 15:16:47 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 15 Aug 2005 09:16:47 -0400
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <bbaeab10050814213575ce6ad7@mail.gmail.com>
Message-ID: <001601c5a19b$9b2abf80$af26c797@oemcomputer>

[Brett]
> This obviously goes against what Guido last said he
> wanted, but I hope I can convince him to get rid of bare 'except's.

-1 on eliminating bare excepts.  This unnecessarily breaks tons of code
without offering ANY compensating benefits.  There are valid use cases
for this construct.  It is completely Pythonic to have bare keywords
apply a useful default as an aid to readability and ease of coding.

+1 on the new BaseException

+1 on moving NotImplementedError, SystemExit, and KeyboardInterrupt.

-1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except
TerminatingException".  1) Grepping existing code bases shows that these
two are almost never caught together so it is a bit silly to introduce a
second way to do it.  2) Efforts to keep the builtin namespace compact
argue against adding a new builtin that will almost never be used.  3)
The change unnecessarily sacrifices flatness, making the language more
difficult to learn.  4) The "self-documenting" rationale is weak -- if
needed, a two-word comment would suffice.  Existing code almost never
has had to comment on catching multiple exceptions -- the exception
tuple itself has been sufficiently obvious and explicit.  This rationale
assumes that code readers aren't smart enough to infer that SystemExit
has something to do with termination.


Raymond


From phd at mail2.phd.pp.ru  Mon Aug 15 15:33:33 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Mon, 15 Aug 2005 17:33:33 +0400
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <001601c5a19b$9b2abf80$af26c797@oemcomputer>
References: <bbaeab10050814213575ce6ad7@mail.gmail.com>
	<001601c5a19b$9b2abf80$af26c797@oemcomputer>
Message-ID: <20050815133333.GA17966@phd.pp.ru>

On Mon, Aug 15, 2005 at 09:16:47AM -0400, Raymond Hettinger wrote:
> It is completely Pythonic to have bare keywords
> apply a useful default as an aid to readability and ease of coding.

   Bare "while:" was rejected because of "while WHAT?!". Bare "except:"
does not cause "except WHAT?!" reaction. Isn't it funny?! (-:

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From raymond.hettinger at verizon.net  Mon Aug 15 15:47:54 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 15 Aug 2005 09:47:54 -0400
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <20050815133333.GA17966@phd.pp.ru>
Message-ID: <001801c5a19f$f43ae6a0$af26c797@oemcomputer>

> > It is completely Pythonic to have bare keywords
> > apply a useful default as an aid to readability and ease of coding.

[Oleg]
>    Bare "while:" was rejected because of "while WHAT?!". Bare "except:"
> does not cause "except WHAT?!" reaction. Isn't it funny?! (-:

It's both funny and interesting.  It raises the question of what makes
the two different -- why is one instantly recognizable and why does the
other trigger a gag reflex.  My thought is that bare excepts occur in a
context that makes their meaning clear:

    try:
        block()
    except SpecificException:
        se_handler()
    except:
        handle_everything_else()

The pattern of use is similar to a "default" in a switch-case construct.
Viewed out-of-context, one would ask "default WHAT".  Viewed after a
series of case statements, the meaning is vividly clear.


Raymond


From tdickenson at devmail.geminidataloggers.co.uk  Mon Aug 15 16:48:11 2005
From: tdickenson at devmail.geminidataloggers.co.uk (Toby Dickenson)
Date: Mon, 15 Aug 2005 15:48:11 +0100
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <001601c5a19b$9b2abf80$af26c797@oemcomputer>
References: <001601c5a19b$9b2abf80$af26c797@oemcomputer>
Message-ID: <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk>

On Monday 15 August 2005 14:16, Raymond Hettinger wrote:

> -1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except
> TerminatingException".  

The rationale for including TerminatingException in the PEP would also be 
satisfied by having a TerminatingExceptions tuple (in the exceptions 
module?). It makes sense to express the classification of exceptions that are 
intended to terminate the interpreter, but we don't need to express that 
classification as inheritance.
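
A minimal sketch of the tuple approach, in current Python syntax. TERMINATING_EXCEPTIONS is a hypothetical name; no such tuple exists in the standard library, and run() and its helpers are only there for illustration:

```python
# Hypothetical module-level tuple; not part of the standard library.
TERMINATING_EXCEPTIONS = (KeyboardInterrupt, SystemExit)


def run(task):
    """Call task() and report how it ended."""
    try:
        task()
    except TERMINATING_EXCEPTIONS as exc:
        # An except clause accepts a tuple of classes, so no common
        # base class is needed to catch both.
        return "terminating: %s" % type(exc).__name__
    except Exception as exc:
        return "error: %s" % type(exc).__name__
    return "ok"


def ask_to_exit():
    raise SystemExit(0)


def fail():
    raise ValueError("bad value")


print(run(ask_to_exit))
print(run(fail))
print(run(lambda: None))
```

The classification lives in one named place, and the exception hierarchy itself stays flat.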

-- 
Toby Dickenson

From nick.bastin at gmail.com  Mon Aug 15 18:27:36 2005
From: nick.bastin at gmail.com (Nicholas Bastin)
Date: Mon, 15 Aug 2005 12:27:36 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <42F6F61B.1080505@v.loewis.de>
References: <42E93940.6080708@v.loewis.de>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<42F6F61B.1080505@v.loewis.de>
Message-ID: <66d0a6e1050815092760a2dab3@mail.gmail.com>

On 8/8/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Nicholas Bastin wrote:
> > It's a mature product.  I would hope that that would count for
> > something.
> 
> Sure. But so is subversion.

I will then assume that you and I have different ideas of what 'mature' means.

> So I should then remove your offer to host a perforce installation,
> as you never made such an offer, right?

Correct.

> Yes. That's what this PEP is for. So I guess you are -1 on the
> PEP.

Not completely.  More like -0 at the moment.  We need a better system,
but I think we shouldn't just pick a system because it's the one the
PEP writer preferred - there should be some sort of effort to test a
few systems (including bug trackers).  I know this is work, but this
isn't just something we can change easily again later.

--
Nick

From ronaldoussoren at mac.com  Mon Aug 15 09:15:59 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 15 Aug 2005 09:15:59 +0200
Subject: [Python-Dev] build problems on macosx (CVS HEAD)
In-Reply-To: <42FFBB1A.8060206@v.loewis.de>
References: <39DD497A-89DB-4DF0-A272-62BB66CEC090@mac.com>
	<42FFBB1A.8060206@v.loewis.de>
Message-ID: <1A6CE290-CD7F-4AEC-B649-39CB4ED57E8B@mac.com>


On 14-aug-2005, at 23:43, Martin v. Löwis wrote:

> Ronald Oussoren wrote:
>
>> I'm trying to build CVS HEAD on OSX 10.4.2 (Xcode 2.1), with a
>> checkout that is less than two hours old. I'm building a standard
>> unix tree (no framework install):
>>
>
> I just committed what I think is a bugfix for the recent st_gen
> support. Unfortunately, I can't try the code, since I don't have
> access to BSD/OSX at the moment.
>
> So please report whether there is any change in behaviour.

Your change has fixed this issue.

Thanks for the quick response,
     Ronald

>
> Regards,
> Martin
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ 
> ronaldoussoren%40mac.com
>


From gvanrossum at gmail.com  Mon Aug 15 19:08:02 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon, 15 Aug 2005 10:08:02 -0700
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <001601c5a19b$9b2abf80$af26c797@oemcomputer>
References: <bbaeab10050814213575ce6ad7@mail.gmail.com>
	<001601c5a19b$9b2abf80$af26c797@oemcomputer>
Message-ID: <ca471dc205081510082be5de3@mail.gmail.com>

I'm with Raymond here.

On 8/15/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> [Brett]
> > This obviously goes against what Guido last said he
> > wanted, but I hope I can convince him to get rid of bare 'except's.
> 
> -1 on eliminating bare excepts.  This unnecessarily breaks tons of code
> without offering ANY compensating benefits.  There are valid use cases
> for this construct.  It is completely Pythonic to have bare keywords
> apply a useful default as an aid to readability and ease of coding.
> 
> +1 on the new BaseException
> 
> +1 on moving NotImplementedError, SystemExit, and KeyboardInterrupt.
> 
> -1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except
> TerminatingException".  1) Grepping existing code bases shows that these
> two are almost never caught together so it is a bit silly to introduce a
> second way to do it.  2) Efforts to keep the builtin namespace compact
> argue against adding a new builtin that will almost never be used.  3)
> The change unnecessarily sacrifices flatness, making the language more
> difficult to learn.  4) The "self-documenting" rationale is weak -- if
> needed, a two-word comment would suffice.  Existing code almost never
> has had to comment on catching multiple exceptions -- the exception
> tuple itself has been sufficiently obvious and explicit.  This rationale
> assumes that code readers aren't smart enough to infer that SystemExit
> has something to do with termination.
> 
> 
> 
> Raymond
> 


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Mon Aug 15 19:11:24 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon, 15 Aug 2005 10:11:24 -0700
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <43003492.8060904@mpi-magdeburg.mpg.de>
References: <43003492.8060904@mpi-magdeburg.mpg.de>
Message-ID: <ca471dc205081510115c88ecc1@mail.gmail.com>

(1) Please use the SF patch manager.

(2) Please don't propose adding more bare "except:" clauses to the
standard library.

(3) I think a better patch is to use str(word)[:n] instead of word[:n].
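
A sketch of what (3) could look like, in current Python syntax. The function below is a simplified, hypothetical stand-in for the loop in the patch, not the actual rlcompleter code, and it uses the converted string for the whole comparison rather than only for the slice:

```python
def attr_matches(expr, attr, words):
    """Return completions for expr.attr, tolerating non-string entries in words."""
    matches = []
    n = len(attr)
    for word in words:
        name = str(word)  # assumption: dir() output may contain non-strings
        if name[:n] == attr and name != "__builtins__":
            matches.append("%s.%s" % (expr, name))
    return matches


class IClassPtr(object):
    """Hypothetical stand-in for a SWIG-generated class found in dir() output."""


words = ["__builtins__", "name", "next", IClassPtr, "other"]
print(attr_matches("obj", "n", words))
```

No exception handling is needed: a non-string entry is simply compared by its string form and fails to match.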

On 8/14/05, Michael Krasnyk <miha at mpi-magdeburg.mpg.de> wrote:
> Hello all,
> 
> Recently I've found that rlcompleter does not work correctly with SWIG
> generated classes.
> In some cases dir(object) contains not only strings, but also the type of
> the object, something like <class 'mywrapper.IClassPtr'>.
> And the condition "word[:n] == attr" throws an exception.
> Is it possible to solve this problem with the following patch?
> 
> --- cut ---
> --- rlcompleter.py.org  2005-08-14 13:02:02.000000000 +0200
> +++ rlcompleter.py      2005-08-14 13:18:59.000000000 +0200
> @@ -136,8 +136,11 @@
>          matches = []
>          n = len(attr)
>          for word in words:
> -            if word[:n] == attr and word != "__builtins__":
> -                matches.append("%s.%s" % (expr, word))
> +            try:
> +                if word[:n] == attr and word != "__builtins__":
> +                    matches.append("%s.%s" % (expr, word))
> +            except:
> +                pass
>          return matches
> 
>   def get_class_members(klass):
> --- cut ---
> 
> Thanks in advance,
> Michael Krasnyk
> 


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Mon Aug 15 19:36:46 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 15 Aug 2005 10:36:46 -0700
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <ca471dc205081510082be5de3@mail.gmail.com>
References: <bbaeab10050814213575ce6ad7@mail.gmail.com>
	<001601c5a19b$9b2abf80$af26c797@oemcomputer>
	<ca471dc205081510082be5de3@mail.gmail.com>
Message-ID: <bbaeab1005081510362bde2afa@mail.gmail.com>

OK, I will take this as BDFL pronouncement that ditching bare
'except's is just not going to happen.  Had to try.  =)

And I will strip out the TerminatingException proposal.

-Brett

On 8/15/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> I'm with Raymond here.
> 
> On 8/15/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > [Brett]
> > > This obviously goes against what Guido last said he
> > > wanted, but I hope I can convince him to get rid of bare 'except's.
> >
> > -1 on eliminating bare excepts.  This unnecessarily breaks tons of code
> > without offering ANY compensating benefits.  There are valid use cases
> > for this construct.  It is completely Pythonic to have bare keywords
> > apply a useful default as an aid to readability and ease of coding.
> >
> > +1 on the new BaseException
> >
> > +1 on moving NotImplementedError, SystemExit, and KeyboardInterrupt.
> >
> > -1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except
> > TerminatingException".  1) Grepping existing code bases shows that these
> > two are almost never caught together so it is a bit silly to introduce a
> > second way to do it.  2) Efforts to keep the builtin namespace compact
> > argue against adding a new builtin that will almost never be used.  3)
> > The change unnecessarily sacrifices flatness, making the language more
> > difficult to learn.  4) The "self-documenting" rationale is weak -- if
> > needed, a two-word comment would suffice.  Existing code almost never
> > has had to comment on catching multiple exceptions -- the exception
> > tuple itself has been sufficiently obvious and explicit.  This rationale
> > assumes that code readers aren't smart enough to infer that SystemExit
> > has something to do with termination.
> >
> >
> >
> > Raymond
> >
> 
> 
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Mon Aug 15 19:44:12 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 15 Aug 2005 10:44:12 -0700
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk>
References: <001601c5a19b$9b2abf80$af26c797@oemcomputer>
	<200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk>
Message-ID: <bbaeab1005081510443d32db06@mail.gmail.com>

On 8/15/05, Toby Dickenson <tdickenson at devmail.geminidataloggers.co.uk> wrote:
> On Monday 15 August 2005 14:16, Raymond Hettinger wrote:
> 
> > -1 on replacing "except (KeyboardInterrupt, SystemExit)" with "except
> > TerminatingException".
> 
> The rationale for including TerminatingException in the PEP would also be
> satisfied by having a TerminatingExceptions tuple (in the exceptions
> module?). It makes sense to express the classification of exceptions that are
> intended to terminate the interpreter, but we don't need to express that
> classification as inheritance.
> 

While the idea is fine, I just know that the point is going to be
brought up that the addition should not be done until experience with
the new hierarchy is had.  I will add a comment that tuples can be
added to the module after enough experience is had, but I am not going
to try pushing for this right now.

Of course I could be surprised and everyone could support the idea.  =)

-Brett

From dberlin at dberlin.org  Mon Aug 15 21:06:03 2005
From: dberlin at dberlin.org (Daniel Berlin)
Date: Mon, 15 Aug 2005 15:06:03 -0400
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <66d0a6e1050815092760a2dab3@mail.gmail.com>
References: <42E93940.6080708@v.loewis.de>
	<1122607673.9665.38.camel@geddy.wooz.org>
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>
	<1122918723.9680.33.camel@warna.corp.google.com>
	<m24qa9f5v8.wl%gnn@neville-neil.com> <42EF2794.1000209@v.loewis.de>
	<66d0a6e105080312181e25fa08@mail.gmail.com>
	<42F1AADE.50908@v.loewis.de>
	<66d0a6e105080718527939aa81@mail.gmail.com>
	<42F6F61B.1080505@v.loewis.de>
	<66d0a6e1050815092760a2dab3@mail.gmail.com>
Message-ID: <1124132763.3143.10.camel@MAMBA>

On Mon, 2005-08-15 at 12:27 -0400, Nicholas Bastin wrote:
> On 8/8/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > Nicholas Bastin wrote:
> > > It's a mature product.  I would hope that that would count for
> > > something.
> > 
> > Sure. But so is subversion.
> 
> I will then assume that you and I have different ideas of what 'mature' means.

Bigger projects than Python use it and consider it mature for real use
(All the Apache projects, all of KDE, GNOME is planning on switching
soon, etc).

I've never seen a corrupted FSFS repo, only corrupted BDB repos, and I
will happily grant that using BDB ended up being a big mistake for
Subversion.  Not one that could have easily been foreseen at the time,
but such is life.

But this is why FSFS is the default for 1.2+

I've never seen you post about a corrupted repository to svn-users or
svn-dev or file a bug, so I can't say why you see corrupted repositories
if they are FSFS ones.

--Dan


From Scott.Daniels at Acm.Org  Mon Aug 15 21:27:57 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Mon, 15 Aug 2005 12:27:57 -0700
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk>
References: <001601c5a19b$9b2abf80$af26c797@oemcomputer>
	<200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk>
Message-ID: <ddqqbt$a3e$1@sea.gmane.org>

Toby Dickenson wrote:
> On Monday 15 August 2005 14:16, Raymond Hettinger wrote:
> 
> The rationale for including TerminatingException in the PEP would also be 
> satisfied by having a TerminatingExceptions tuple (in the exceptions 
> module?). It makes sense to express the classification of exceptions that are 
> intended to terminate the interpreter, but we don't need to express that 
> classification as inheritance.
> 

An argument _for_ TerminatingException as a class is that I can
define my own subclasses of TerminatingException without forcing
it to being a subclass of KeyboardInterrupt or SystemExit.

-- Scott David Daniels
Scott.Daniels at Acm.Org


From gvanrossum at gmail.com  Mon Aug 15 21:36:29 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon, 15 Aug 2005 12:36:29 -0700
Subject: [Python-Dev] PEP 348 (exception reorg) revised again
In-Reply-To: <ddqqbt$a3e$1@sea.gmane.org>
References: <001601c5a19b$9b2abf80$af26c797@oemcomputer>
	<200508151548.11552.tdickenson@devmail.geminidataloggers.co.uk>
	<ddqqbt$a3e$1@sea.gmane.org>
Message-ID: <ca471dc205081512365004051f@mail.gmail.com>

On 8/15/05, Scott David Daniels <Scott.Daniels at acm.org> wrote:
> An argument _for_ TerminatingException as a class is that I can
> define my own subclasses of TerminatingException without forcing
> it to being a subclass of KeyboardInterrupt or SystemExit.

And how would that help you? Would your own exceptions be more like
SystemExit or more like KeyboardInterrupt, or neither? If you mean
them to be excluded by base "except:", you can always subclass
BaseException, which exists for this purpose.
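
A minimal sketch of that, in current Python syntax and assuming the generic handler is written as "except Exception:" (a bare "except:" as it behaves today would still catch everything). ShutdownRequested and guarded() are hypothetical names:

```python
class ShutdownRequested(BaseException):
    """Hypothetical application exception that generic handlers should not swallow."""


def guarded(task):
    """Run task() behind a catch-all handler for ordinary errors."""
    try:
        task()
    except Exception:
        return "handled"
    return "ok"


def fail():
    raise ValueError("ordinary error")


def request_shutdown():
    raise ShutdownRequested()


print(guarded(fail))

try:
    guarded(request_shutdown)
except ShutdownRequested:
    # The handler in guarded() did not match, so the exception got through.
    print("shutdown request propagated")
```

Nothing here depends on KeyboardInterrupt or SystemExit; inheriting from BaseException alone is enough to stay out of the "except Exception:" net.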

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Mon Aug 15 22:49:48 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 15 Aug 2005 22:49:48 +0200
Subject: [Python-Dev] PEP: Migrating the Python CVS to Subversion
In-Reply-To: <66d0a6e1050815092760a2dab3@mail.gmail.com>
References: <42E93940.6080708@v.loewis.de>	
	<1122607673.9665.38.camel@geddy.wooz.org>	
	<87fytu2lly.fsf@tleepslib.sk.tsukuba.ac.jp>	
	<1122918723.9680.33.camel@warna.corp.google.com>	
	<m24qa9f5v8.wl%gnn@neville-neil.com>
	<42EF2794.1000209@v.loewis.de>	
	<66d0a6e105080312181e25fa08@mail.gmail.com>	
	<42F1AADE.50908@v.loewis.de>	
	<66d0a6e105080718527939aa81@mail.gmail.com>	
	<42F6F61B.1080505@v.loewis.de>
	<66d0a6e1050815092760a2dab3@mail.gmail.com>
Message-ID: <4300FFEC.3090001@v.loewis.de>

Nicholas Bastin wrote:
> Not completely.  More like -0 at the moment.  We need a better system,
> but I think we shouldn't just pick a system because it's the one the
> PEP writer preferred - there should be some sort of effort to test a
> few systems (including bug trackers).

But that's how the PEP process works: the PEP author is supposed to
collect feedback from the community in a fair way, but he is not
required to implement every suggestion that the community makes.

People who strongly disagree that the entire approach should be taken
should write an alternative ("counter") PEP, proposing their strategy.

In the end, the BDFL will pronounce which approach (if any) should
be implemented.

In the specific case, I'm personally not willing to discuss every
SCM system out there. If somebody manages to make me curious (as
Guido did with the bazaar posts), I will try it out, if I can find
an easy way to do so. Your comments about (what was the name again)
did not make me curious.

As for bug trackers: this PEP is specifically *not* about bug
trackers at all. If you think the SourceForge bugtracker should
be replaced with something else, write a PEP. I really don't
see a reasonable alternative to the SF bugtracker.

> I know this is work, but this
> isn't just something we can change easily again later.

I don't bother asking who "we" is, here: apparently not you.

Regards,
Martin

From fperez.net at gmail.com  Mon Aug 15 22:48:48 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Mon, 15 Aug 2005 14:48:48 -0600
Subject: [Python-Dev] SWIG and rlcompleter
References: <43003492.8060904@mpi-magdeburg.mpg.de>
	<ca471dc205081510115c88ecc1@mail.gmail.com>
Message-ID: <ddqv3h$o9h$1@sea.gmane.org>

Guido van Rossum wrote:

> (1) Please use the SF patch manager.
> 
> (2) Please don't propose adding more bare "except:" clauses to the
> standard library.
> 
> (3) I think a better patch is to use str(word)[:n] instead of word[:n].

Sorry to jump in, but this same patch was proposed for ipython, and my reply
was that it appeared to me as a SWIG bug.  From:

http://www.python.org/doc/2.4.1/lib/built-in-funcs.html

the docs for dir() seem to suggest that dir() should only return strings (I am
inferring that from things like 'The resulting list is sorted
alphabetically').  The docs are not fully explicit on this, though.

Am I interpreting the docs correctly, in which case this should be considered a
SWIG bug?  Or is it OK for objects to stuff non-strings in __dict__, in which
case SWIG is OK and rlcompleter (and the corresponding system in
ipython) do need to protect against this situation?

I'd appreciate a clarification here, so I can close my own ipython bug report
as well.

Thanks,

f


From bos at serpentine.com  Mon Aug 15 23:04:53 2005
From: bos at serpentine.com (Bryan O'Sullivan)
Date: Mon, 15 Aug 2005 14:04:53 -0700
Subject: [Python-Dev] On distributed vs centralised SCM for Python
Message-ID: <1124139893.20124.29.camel@localhost.localdomain>

Pardon me for coming a little late to the SCM discussion, but I thought
I would throw a few comments in.

A little background: I've used Perforce, CVS, Subversion and BitKeeper
for a number of years.  Currently, I hack on Mercurial
<URL:http://www.selenic.com/mercurial>.

However, I'm not here to try and specifically push Mercurial, but rather
to bring up a few points that I haven't seen made in the earlier
discussions.

The biggest distinguishing factor between centralised and decentralised
SCMs is the kinds of interaction they permit between the core developer
community and outsiders.

The centralised SCM tools all create a wall between core developers
(i.e. people with commit access to the central repository) and people
who are on the fringes.  Outsiders may be able to get anonymous
read-only access, but they are left up to their own devices if they want
to make changes that they would like to contribute back to the project.

With centralised tools, any revision control that outsiders do must be
ad-hoc in nature, and they cannot share their changes in a natural way
(i.e. preserving revision history) with anyone else.

I do not follow Python development closely, so I have no idea how open
Python typically is to contributions from people outside the core CVS
committers.

However, it's worth pointing out that with a distributed SCM - it
doesn't really matter which one you use - it is simple to put together a
workflow that operates in the same way as a centralised SCM.  You lose
nothing in the translation.  What you gain is several-fold:

      * Outsiders get to work according to the same terms, and with the
        same tools, as core developers.
      * Everyone can perform whatever work they want (branch, commit,
        diff, undo, etc) without being connected to the main repository
        in any way.
      * Peer-level sharing of changes, for testing or evaluation, is
        easy and doesn't clutter up the central server with short-lived
        branches.
      * Speculative branching: it is cheap to create a local private
        branch that contains some half-baked changes.  If they work out,
        fold them back and commit them to the main repository.  If not,
        blow the branch away and forget about it.

Regardless of what you may think of the Linux development model, it is
telling that there have been about 80 people able to commit changes to
Python since 1990 (I just checked the cvsroot tarball), whereas my
estimate is that about ten times as many used BitKeeper to contribute
changes to the Linux kernel just since the 2.5 tree began in 2002.  (The
total number of users who contributed changes was about 1600, 1300 of
whom used BK, while the remainder emailed plain old patches that someone
applied.)

It is, of course, not possible for me to tell which CVS commits were
really patches that originated with someone else, but my intent is to
show how the choice of tools affects the ability of people to contribute
in "natural" ways.  How much of the difference in numbers is due to the
respective popularity or accessibility of the projects is anyone's
guess.

With any luck, there's some food for thought above.

Regards,

	<b

-- 
Bryan O'Sullivan <bos at serpentine.com>


From foom at fuhm.net  Mon Aug 15 23:20:15 2005
From: foom at fuhm.net (James Y Knight)
Date: Mon, 15 Aug 2005 17:20:15 -0400
Subject: [Python-Dev] On distributed vs centralised SCM for Python
In-Reply-To: <1124139893.20124.29.camel@localhost.localdomain>
References: <1124139893.20124.29.camel@localhost.localdomain>
Message-ID: <6208AA5C-3E27-40EE-BD7B-CB6E1CA3D764@fuhm.net>

On Aug 15, 2005, at 5:04 PM, Bryan O'Sullivan wrote:
> The centralised SCM tools all create a wall between core developers
> (i.e. people with commit access to the central repository) and people
> who are on the fringes.  Outsiders may be able to get anonymous
> read-only access, but they are left up to their own devices if they  
> want
> to make changes that they would like to contribute back to the  
> project.

But, if python is using svn, outside developers can seamlessly use  
svk (http://svk.elixus.org/) to do their own branches if they wish,  
no? Sure, that is "their own devices", but it seems a fairly workable  
solution to me as the two are so closely related.

Now, I've never tried this, so I'm just judging from the "marketing  
material" on the svk website.

James


From martin at v.loewis.de  Mon Aug 15 23:29:23 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 15 Aug 2005 23:29:23 +0200
Subject: [Python-Dev] On distributed vs centralised SCM for Python
In-Reply-To: <1124139893.20124.29.camel@localhost.localdomain>
References: <1124139893.20124.29.camel@localhost.localdomain>
Message-ID: <43010933.3050303@v.loewis.de>

Bryan O'Sullivan wrote:
> However, it's worth pointing out that with a distributed SCM - it
> doesn't really matter which one you use - it is simple to put together a
> workflow that operates in the same way as a centralised SCM.  You lose
> nothing in the translation.  What you gain is several-fold:

That may be off-topic for python-dev, but can you please explain how
this works?

>       * Outsiders get to work according to the same terms, and with the
>         same tools, as core developers.

I'm using git on the kernel level. In what way am I at the same level
as the core developers? They can write to the kernel.org repository,
I cannot. They use commit, I send diffs.

>       * Everyone can perform whatever work they want (branch, commit,
>         diff, undo, etc) without being connected to the main repository
>         in any way.

So what? If I want to branch, I create a new sandbox. I have to do that
anyway, since independent projects should not influence each other. I
can also easily diff, whether I have write access or not (in svn, even
simpler so than in CVS).
There is no easy way to undo parts of the changes, that's true.

>       * Peer-level sharing of changes, for testing or evaluation, is
>         easy and doesn't clutter up the central server with short-lived
>         branches.

So how does that work? If I commit the changes to my local version of
the repository, how do they get peer-level-shared? I turn off my machine
when I leave the house, and I don't have a permanent IP, anyway, to
host a web server or some such.

>       * Speculative branching: it is cheap to create a local private
>         branch that contains some half-baked changes.  If they work out,
>         fold them back and commit them to the main repository.  If not,
>         blow the branch away and forget about it.

I do that with separate sandboxes right now.

cp -a py2.5 py-64bit

gives me a new sandbox, in which I can do my speculative project.

> Regardless of what you may think of the Linux development model, it is
> telling that there have been about 80 people able to commit changes to
> Python since 1990 (I just checked the cvsroot tarball), whereas my
> estimate is that about ten times as many used BitKeeper to contribute
> changes to the Linux kernel just since the 2.5 tree began in 2002.  (The
> total number of users who contributed changes was about 1600, 1300 of
> whom used BK, while the remainder emailed plain old patches that someone
> applied.)

Hmm. The changes of these 800 people had to be approved by some core
developers, or perhaps even all approved by Linus Torvalds, right? This
is really the same for Python: A partial list of contributors is in
Misc/ACKS (663 lines at the moment), and this doesn't list all the
people who contributed trivial changes. So I guess Python has the
same number of contributors per line as the Linux kernel.

> It is, of course, not possible for me to tell which CVS commits were
> really patches that originated with someone else, but my intent is to
> show how the choice of tools affects the ability of people to contribute
> in "natural" ways.

I hear that, but I have a hard time believing it. People find the
"cvs diff -u, send diff file for discussion to patches tracker"
cycle quite natural.

Regards,
Martin

From bos at serpentine.com  Tue Aug 16 00:19:40 2005
From: bos at serpentine.com (Bryan O'Sullivan)
Date: Mon, 15 Aug 2005 15:19:40 -0700
Subject: [Python-Dev] On distributed vs centralised SCM for Python
In-Reply-To: <43010933.3050303@v.loewis.de>
References: <1124139893.20124.29.camel@localhost.localdomain>
	<43010933.3050303@v.loewis.de>
Message-ID: <1124144380.20124.44.camel@localhost.localdomain>

On Mon, 2005-08-15 at 23:29 +0200, "Martin v. Löwis" wrote:

> That may be off-topic for python-dev, but can you please explain how
> this works?

It's simple enough.  In place of a central server that hosts a set of
repositories and a number of branches, and to which only a few people
have access, you use a central server that hosts a number of
repositories, and you get the idea.

But the difference lies in the way you use it.  In the centralised
model, there's only one server, and only one repository, anywhere.  In
the distributed model, each developer has one or more repositories that
they keep in sync with the central ones they are interested in, pulling
and pushing changes as necessary.  The difference is that they get to
share changes horizontally if they wish, without going through the
central server.
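To make the "horizontal" part concrete, here is a toy model of it.  This is
only a sketch: the Repo class and the changeset ids are made up for
illustration, and no real SCM stores or exchanges history this way.

```python
class Repo:
    """Toy repository: just a named set of changeset ids (hypothetical)."""

    def __init__(self, name, changesets=()):
        self.name = name
        self.changesets = set(changesets)

    def commit(self, changeset_id):
        # A local commit: no network and no central server involved.
        self.changesets.add(changeset_id)

    def pull(self, other):
        # Copy over whatever changesets we are missing from `other`.
        missing = other.changesets - self.changesets
        self.changesets |= missing
        return missing


central = Repo("central", {"c1", "c2"})
alice = Repo("alice", central.changesets)
bob = Repo("bob", central.changesets)

alice.commit("a1")                 # private, possibly half-baked work
shared = bob.pull(alice)           # horizontal sharing: alice -> bob
untouched = "a1" not in central.changesets
```

In this model bob ends up with alice's change while the central repository
never sees it unless somebody later pushes it there.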

> I'm using git on the kernel level. In what way am I at the same level
> as the core developers?

You can use the same tools to do the same things they can.  You can
communicate with them in terms of commits.  You may each have access to
different sets of servers from which other people can pull changes, but
if they want to take changes from you, you have the option of giving
them complete history of all the edits and merges you've done, with no
information loss.

> So how does that work? If I commit the changes to my local version of
> the repository, how do they get peer-level-shared? 

You have to do something to share them, but it's a lot simpler than
sending diffs to a mailing list, or attaching them to a bug tracking
system note.

> Hmm. The changes of these 800 people had to be approved by some core
> developers, or perhaps even all approved by Linus Torvalds, right?

True.

> I hear that, but I have a hard time believing it. People find the
> "cvs diff -u, send diff file for discussion to patches tracker"
> cycle quite natural.

People will find doing the same of anything, over and over for fifteen
years, quite natural :-)

	<b

-- 
Bryan O'Sullivan <bos at serpentine.com>


From raymond.hettinger at verizon.net  Tue Aug 16 02:21:45 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 15 Aug 2005 20:21:45 -0400
Subject: [Python-Dev] On distributed vs centralised SCM for Python
In-Reply-To: <6208AA5C-3E27-40EE-BD7B-CB6E1CA3D764@fuhm.net>
Message-ID: <006201c5a1f8$7c9f4060$af26c797@oemcomputer>

 [Bryan O'Sullivan]
> > The centralised SCM tools all create a wall between core developers
> > (i.e. people with commit access to the central repository) and people
> > who are on the fringes.  Outsiders may be able to get anonymous
> > read-only access, but they are left up to their own devices if they
> > want
> > to make changes that they would like to contribute back to the
> > project.

[James Y Knight]
> But, if python is using svn, outside developers can seamlessly use
> svk (http://svk.elixus.org/) to do their own branches if they wish,
> no? Sure, that is "their own devices", but it seems a fairly workable
> solution to me as the two are so closely related.

+1 This seems to be the most flexible and sensible idea so far.  The svn
system has had many accolades; Martin knows how to convert it; and it
presents only a small learning curve to cvs users.  Optionally adding
svk to the mix allows us to get the benefits of a distributed system
without any additional migration or support issues.  Very nice.


Raymond


From tim.peters at gmail.com  Tue Aug 16 03:07:51 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 15 Aug 2005 21:07:51 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <42F61C03.6050703@v.loewis.de>
References: <42F61C03.6050703@v.loewis.de>
Message-ID: <1f7befae05081518073433da62@mail.gmail.com>

[Martin v. Löwis]
> I have placed a new version of the PEP on
> 
> http://www.python.org/peps/pep-0347.html

...

+1 from me.  But, I don't think my vote should count much, and (sorry)
Guido's even less:  what do the people who frequently check in want? 
That means people like you (Martin), Michael, Raymond, Walter, Fred.
... plus the release manager(s).

BTW, a stumbling block in Zope's conversion to SVN was that the
conversion script initially never set svn:eol-style on any file.  This
caused weeks of problems, as people on Windows got Linux line ends,
and people checking in from Windows forced Windows line ends on
Linuxheads (CVS defaults to assuming files are text; SVN binary).

The peculiar workaround at Zope is that we're all encouraged to add
something like this to our SVN config file:

"""
[auto-props]
# Setting eol-style to native on all files is a trick:  if svn
# believes a new file is binary, it won't honor the eol-style
# auto-prop.  However, svn considers the request to set eol-style
# to be an error then, and if adding multiple files with one
# svn "add" cmd, svn will stop adding files after the first
# such error.  A future release of svn will probably consider
# this to be a warning instead (and continue adding files).
* = svn:eol-style=native
"""

It would be best if svn:eol-style were set to native during initial
conversion from CVS, on all files not marked binary in CVS.
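As a rough sketch of that rule (not how the conversion script itself would
do it), the selection could look like the following.  It assumes we already
have a mapping from each path to whether CVS marked it binary; the function
name and the example paths are made up for illustration.  The only real
command used is "svn propset svn:eol-style native PATH".

```python
def eol_style_commands(files):
    """Build one `svn propset` command per file that CVS treated as text.

    `files` maps a path to True if the file was marked binary in CVS.
    """
    commands = []
    for path, is_binary in sorted(files.items()):
        if is_binary:
            continue  # binary in CVS: leave its line endings alone
        commands.append(["svn", "propset", "svn:eol-style", "native", path])
    return commands


# Illustrative input only; a real run would derive this from the CVS data.
example = {"Lib/copy.py": False, "PC/py.ico": True, "README": False}
commands = eol_style_commands(example)
```

Each resulting list could be handed to subprocess.run() inside a working
copy; nothing is executed here.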

From jepler at unpythonic.net  Tue Aug 16 04:16:18 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Mon, 15 Aug 2005 21:16:18 -0500
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <ddqv3h$o9h$1@sea.gmane.org>
References: <43003492.8060904@mpi-magdeburg.mpg.de>
	<ca471dc205081510115c88ecc1@mail.gmail.com>
	<ddqv3h$o9h$1@sea.gmane.org>
Message-ID: <20050816021614.GA23688@unpythonic.net>

You don't need something like a buggy SWIG to put non-strings in dir().

>>> class C: pass
...
>>> C.__dict__[3] = "bad wolf"
>>> dir(C)
[3, '__doc__', '__module__']

This is likely to happen "legitimately", for instance in a class that allows
x.y and x['y'] to mean the same thing. (if the user assigns to x[3])
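A sketch of such a class in present-day Python, together with the
str(word)[:n] guard suggested earlier in this thread.  The names Record and
prefix_matches are hypothetical; this is not the actual rlcompleter code.

```python
class Record:
    """Hypothetical class where x.y and x['y'] mean the same thing."""

    def __getitem__(self, key):
        return self.__dict__[key]

    def __setitem__(self, key, value):
        self.__dict__[key] = value


def prefix_matches(names, prefix):
    # Use str(word)[:n] instead of word[:n], so that a non-string key
    # cannot raise when it is sliced.
    n = len(prefix)
    return [str(word) for word in names if str(word)[:n] == prefix]


x = Record()
x["name"] = "rose"
x[3] = "bad wolf"              # a non-string key lands in x.__dict__
matches = prefix_matches(vars(x), "na")
```

The sketch reads the keys through vars(x) rather than dir(x), so it does not
depend on how dir() treats mixed key types.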

Jeff

From mbp at sourcefrog.net  Mon Aug 15 07:12:34 2005
From: mbp at sourcefrog.net (Martin Pool)
Date: Mon, 15 Aug 2005 15:12:34 +1000
Subject: [Python-Dev] cvs to bzr?
References: <17150.31637.180169.877441@montanaro.dyndns.org>
	<ddostc$paf$1@sea.gmane.org>
	<17152.109.883835.190683@montanaro.dyndns.org>
Message-ID: <pan.2005.08.15.05.12.31.465097@sourcefrog.net>

On Sun, 14 Aug 2005 21:39:41 -0500, skip wrote:

>     Lalo> You can, however, convert from CVS to baz (arch), and from there
>     Lalo> to bzr.
> 
> Would this be with cscvs?  According to the cscvs wiki page at
> 
>     http://wiki.gnuarch.org/cscvs
> 
> cscvs is currently unmaintained and can't handle repositories with branches.
> In addition, it appears that to do a one-time conversion from cvs to bzr I
> will need to also install arch and baz as well as any other packages they
> depend on.

Canonical has had an ongoing project to pull many cvs trees into baz, for
the benefit of our Ubuntu distribution people amongst other things.  There
are some people working on imports using (I think) a hacked version of
cscvs, and I have asked them to get Python in as a high priority. 
Apparently there is something in the cvs history which makes a precise
import hard.

The cvs->baz->bzr process is unfortunate.  As Mark said, we're going to be
moving away from the Arch-based code and so trying to make that process
simpler.

-- 
Martin


From abo at minkirri.apana.org.au  Tue Aug 16 06:51:37 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 15 Aug 2005 21:51:37 -0700
Subject: [Python-Dev] Fwd: Distributed RCS
In-Reply-To: <43007CDC.6060200@benjiyork.com>
References: <42FBA376.5030605@canonical.com>
	<ca471dc20508131427b6aa0d4@mail.gmail.com>	<42FF32AA.7040506@v.loewis.de>
	<17151.15640.173982.961359@montanaro.dyndns.org>
	<42FF6E4B.4000206@v.loewis.de>  <43007CDC.6060200@benjiyork.com>
Message-ID: <1124167897.351.50.camel@warna.corp.google.com>

On Mon, 2005-08-15 at 04:30, Benji York wrote:
> Martin v. Löwis wrote:
> > skip at pobox.com wrote:
> >>Granted.  What is the cost of waiting a bit longer to see if it (or
> >>something else) gets more usable and would hit the mark better than svn?
> > 
> > It depends on what "a bit" is. Waiting a month would be fine; waiting
> > two years might be pointless.
> 
> This might be too convoluted to consider, but I thought I might throw it 
> out there.  We use svn for our repositories, but I've taken to also 
> using bzr so I can do local commits and reversions (within a particular 
> svn revision).  I can imagine expanding that usage to sharing branches 
> and such via bzr (or mercurial, which looks great), but keeping the 
> trunk in svn.

Not too convoluted at all; I already do exactly this with many upstream
CVS and SVN repositories, using a local PRCS for my own branches. I'm
considering switching to a distributed RCS for my own branches because
it would make it easier for others to share them.

I think this probably is the best solution; it gives a reliable(?)
centralised RCS for the trunk, but allows distributed development.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From bcannon at gmail.com  Tue Aug 16 08:25:15 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 15 Aug 2005 23:25:15 -0700
Subject: [Python-Dev] rev. 1.9 of PEP 348: Raymond tested, Guido approved
Message-ID: <bbaeab1005081523253c548e27@mail.gmail.com>

OK, TerminatingException and the removal of bare 'except' clauses are now out.

I also stripped out the transition plan to basically just add
BaseException in Python 2.5, tweak docs to recommend future-proof
practices, and then change everything in Python 3.0 .  This will
prevent any nasty performance hit from what was being previously
suggested to try to make it all backwards-compatible.
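For illustration, one reading of "future-proof practices" is to catch
Exception instead of using a bare 'except'.  The sketch below assumes a
present-day Python, where KeyboardInterrupt and SystemExit derive from
BaseException rather than Exception; the helper names are hypothetical.

```python
def run_guarded(task):
    """Run task(), recovering from ordinary errors only."""
    try:
        return task()
    except Exception as exc:       # deliberately not a bare "except:"
        return "recovered from %s" % type(exc).__name__


def fails():
    raise ValueError("ordinary error")


def interrupted():
    raise KeyboardInterrupt()


result = run_guarded(fails)

try:
    run_guarded(interrupted)
    propagated = False
except KeyboardInterrupt:
    # The interrupt passes straight through the "except Exception" clause.
    propagated = True
```

Code written this way behaves the same whether or not bare 'except' clauses
stay in the language.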

-Brett

From martin at v.loewis.de  Tue Aug 16 08:52:43 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 16 Aug 2005 08:52:43 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <1f7befae05081518073433da62@mail.gmail.com>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
Message-ID: <43018D3B.9040404@v.loewis.de>

Tim Peters wrote:
> It would be best if svn:eol-style were set to native during initial
> conversion from CVS, on all files not marked binary in CVS.

Ok, I'll add that to the PEP. Not sure how to implement it, yet...

Regards,
Martin

From senko.rasic at gmail.com  Tue Aug 16 09:17:42 2005
From: senko.rasic at gmail.com (Senko Rasic)
Date: Tue, 16 Aug 2005 09:17:42 +0200
Subject: [Python-Dev] Extension to dl module to allow passing strings
	from native function
In-Reply-To: <42FDE000.9080508@v.loewis.de>
References: <48bbc5810508111640a6bd03e@mail.gmail.com>
	<42FDE000.9080508@v.loewis.de>
Message-ID: <48bbc58105081600174fe3570d@mail.gmail.com>

On 8/13/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Are you aware of the ctypes module?
> 
> http://starship.python.net/crew/theller/ctypes/

I didn't know about ctypes, thanks for the pointer. It definitely
has much more functionality (although it's more complex and a
whole new module) than my little hack ;-)

Regards,
Senko

-- 
Senko Rasic <senko at senko dot net>

From mwh at python.net  Tue Aug 16 13:35:43 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 16 Aug 2005 12:35:43 +0100
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <20050816021614.GA23688@unpythonic.net> (jepler@unpythonic.net's
	message of "Mon, 15 Aug 2005 21:16:18 -0500")
References: <43003492.8060904@mpi-magdeburg.mpg.de>
	<ca471dc205081510115c88ecc1@mail.gmail.com>
	<ddqv3h$o9h$1@sea.gmane.org> <20050816021614.GA23688@unpythonic.net>
Message-ID: <2m4q9qunao.fsf@starship.python.net>

jepler at unpythonic.net writes:

> You don't need something like a buggy SWIG to put non-strings in dir().
>
>>>> class C: pass
> ...
>>>> C.__dict__[3] = "bad wolf"
>>>> dir(C)
> [3, '__doc__', '__module__']
>
> This is likely to happen "legitimately", for instance in a class that allows
> x.y and x['y'] to mean the same thing. (if the user assigns to x[3])

I wonder if dir() should strip non-strings?
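Something along these lines, as a sketch only; string_names is a made-up
helper, not a proposed change to the builtin:

```python
def string_names(names):
    """Keep only the str entries of a list of names, sorted."""
    return sorted(name for name in names if isinstance(name, str))


cleaned = string_names([3, "__doc__", "__module__"])
```

That would silently hide the odd keys from completion, which may or may not
be what people expect.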

Cheers,
mwh

-- 
  <radix> A VoIP server "powered entirely by stabbing, that I made
          out of this gun I had"                -- from Twisted.Quotes

From mwh at python.net  Tue Aug 16 13:42:32 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 16 Aug 2005 12:42:32 +0100
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <1f7befae05081518073433da62@mail.gmail.com> (Tim Peters's
	message of "Mon, 15 Aug 2005 21:07:51 -0400")
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
Message-ID: <2mzmrit8ev.fsf@starship.python.net>

Tim Peters <tim.peters at gmail.com> writes:

> [Martin v. Löwis]
>> I have placed a new version of the PEP on
>> 
>> http://www.python.org/peps/pep-0347.html
>
> ...
>
> +1 from me.  But, I don't think my vote should count much, and (sorry)
> Guido's even less:  what do the people who frequently check in want? 
> That means people like you (Martin), Michael, Raymond, Walter, Fred.
> ... plus the release manager(s).

I want svn, I think.  I'm open to more sophisticated approaches but am
not sure that any of them are really mature enough yet.  Probably will
be soon, but not soon enough to void the effort of moving to svn
(IMHO).

I'm not really a release manager these days, but if I was, I'd want
svn for that reason too.

The third set of people who count are pydotorg admins.  I'm not really
one of those either at the moment.  While SF's CVS setup has its
problems (occasional outages; it's only CVS) it's hard to beat what it
costs us in sysadmin time: zero.

> It would be best if svn:eol-style were set to native during initial
> conversion from CVS, on all files not marked binary in CVS.

Yes.

Cheers,
mwh

-- 
  <Erwin> I recompiled XFree 4.2 with gcc 3.2-beta-from-cvs with -O42
          and -march-pentium4-800Mhz and I am sure that the MOUSE
          CURSOR is moving 5 % FASTER!
                                                -- from Twisted.Quotes

From anthony at interlink.com.au  Tue Aug 16 14:08:26 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue, 16 Aug 2005 22:08:26 +1000
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <2mzmrit8ev.fsf@starship.python.net>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
Message-ID: <200508162208.28862.anthony@interlink.com.au>

On Tuesday 16 August 2005 21:42, Michael Hudson wrote:
> I want svn, I think.  I'm open to more sophisticated approaches but am
> not sure that any of them are really mature enough yet.  Probably will
> be soon, but not soon enough to void the effort of moving to svn
> (IMHO).
>
> I'm not really a release manager these days, but if I was, I'd want
> svn for that reason too.

I _am_ a release manager these days, and I'm in favour of svn. I really
want to be off CVS, and I would love to be able to go with something
more sophisticated than svn. Unfortunately, I really don't think any of
the alternatives are appropriate. While Perforce is definitely capable,
the BitKeeper disaster strongly influences me against relying on the 
generosity of a commercial software vendor who could change their mind
at any time. 

The more radical (and powerful) tools such as baz/bzr, darcs, monotone
and the like really aren't there yet. I have no doubt that they will 
get there, but right now, I want something better than CVS, and I don't
want to have to fight bugs or limitations in the revision control system.

By the way - if you're intending on suggesting alternates to svn, please
don't just post a link saying "check out this system". Post an explanation
of _why_ we should look at this particular system. What are its strengths?
Why should we invest the time to download it and play with it? Speaking for
myself, I don't have the time or energy to spend trying the countless 
numbers of revision control systems that are out there.

Thanks,
Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From barry at python.org  Tue Aug 16 14:42:59 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 16 Aug 2005 08:42:59 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <2mzmrit8ev.fsf@starship.python.net>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
Message-ID: <1124196179.9673.12.camel@geddy.wooz.org>

On Tue, 2005-08-16 at 07:42, Michael Hudson wrote:

> The third set of people who count are pydotorg admins.  I'm not really
> one of those either at the moment.  While SF's CVS setup has its
> problems (occasional outages; it's only CVS) it's hard to beat what it
> costs us in sysadmin time: zero.

True, although because of the peculiarities of cvs, there have
definitely been times I wish we had direct access to the repository. 
svn should make most of those reasons moot.

As for sysadmin time with the changes proposed by the pep -- clearly
they won't be zero, but I think the overhead for svn itself will be
nearly so.  With the fsfs backend, there's almost no continuous care and
feeding needed, including for backups (which XS4ALL takes care of).  The
overhead for the admins will be in user management.  I really don't
think it will be that much more effort for new developers to badger the
admins into adding them to some config file than it currently is to get
one of us to click a few links to add you to the SF project. ;)  (Okay,
yeah we'll have to manage credentials now.)

The alternatives to svn all sound very enticing, however my own feeling
is that while the workflows they make possible might be good for Python
in the long run, it's not clear how all that will evolve.  We know that
we can treat svn as "a better cvs" and the current workflow seems to
serve us well enough.  I'd be happy to switch to svn now, while
continuing to experiment and follow the better scm systems for the
future.

-Barry


From mwh at python.net  Tue Aug 16 14:52:13 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 16 Aug 2005 13:52:13 +0100
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <1124196179.9673.12.camel@geddy.wooz.org> (Barry Warsaw's
	message of "Tue, 16 Aug 2005 08:42:59 -0400")
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
	<1124196179.9673.12.camel@geddy.wooz.org>
Message-ID: <2mvf26t56q.fsf@starship.python.net>

Barry Warsaw <barry at python.org> writes:

> On Tue, 2005-08-16 at 07:42, Michael Hudson wrote:
>
>> The third set of people who count are pydotorg admins.  I'm not really
>> one of those either at the moment.  While SF's CVS setup has its
>> problems (occasional outages; it's only CVS) it's hard to beat what it
>> costs us in sysadmin time: zero.
>
> True, although because of the peculiarities of cvs, there have
> definitely been times I wish we had direct access to the repository. 
> svn should make most of those reasons moot.
>
> As for sysadmin time with the changes proposed by the pep -- clearly
> they won't be zero, but I think the overhead for svn itself will be
> nearly so.

OK, that's more or less what I thought.

[...]

> I'd be happy to switch to svn now, while continuing to experiment
> and follow the better scm systems for the future.

I suppose another question is: when?  Between 2.4.2 and 2.5a1 seems
like a good opportunity.  I guess the biggest job is collection of
keys and associated admin?

Cheers,
mwh

-- 
  well, take it from an old hand: the only reason it would be easier
  to program in C is that you can't easily express complex problems
  in C, so you don't.                   -- Erik Naggum, comp.lang.lisp

From jack at performancedrivers.com  Tue Aug 16 15:00:46 2005
From: jack at performancedrivers.com (Jack Diederich)
Date: Tue, 16 Aug 2005 09:00:46 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <200508162208.28862.anthony@interlink.com.au>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
	<200508162208.28862.anthony@interlink.com.au>
Message-ID: <20050816130045.GA10364@performancedrivers.com>

On Tue, Aug 16, 2005 at 10:08:26PM +1000, Anthony Baxter wrote:
> On Tuesday 16 August 2005 21:42, Michael Hudson wrote:
> > I want svn, I think.  I'm open to more sophisticated approaches but am
> > not sure that any of them are really mature enough yet.  Probably will
> > be soon, but not soon enough to void the effort of moving to svn
> > (IMHO).
> >
> > I'm not really a release manager these days, but if I was, I'd want
> > svn for that reason too.
> 
> I _am_ a release manager these days, and I'm in favour of svn. I really
> want to be off CVS, and I would love to be able to go with something
> more sophisticated than svn. Unfortunately, I really don't think any of
> the alternatives are appropriate.

As a non-committer I can say _anything_ is preferable to the current
situation and svn is good enough.  bzr might make it even easier but svn
is familiar and it will work right now.  I haven't submitted a patch in
ages partly because using anonymous SF cvs plain doesn't work.

As an aside, at work we switched from cvs to svn and the transition was
easy for developers; svn lives up to its billing as a fixed cvs.

-jack


From e.a.m.brouwer at alumnus.utwente.nl  Mon Aug 15 00:48:00 2005
From: e.a.m.brouwer at alumnus.utwente.nl (Martijn Brouwer)
Date: Sun, 14 Aug 2005 22:48:00 +0000
Subject: [Python-Dev] implementation of copy standard lib
Message-ID: <1124059680.11612.9.camel@localhost.localdomain>

Hi,
After profiling a small python script I found that approximately 50% of
the runtime of my script was consumed by one line: "import copy".
Another 15% was the startup of the interpreter, but that is OK for an
interpreted language. The copy library is used by another library I am
using for my scripts. Importing copy takes 5-10 times more time than
import os, string and re together!
I noticed that this lib is implemented in python, not in C. As I can
imagine that *a lot* of libs/scripts use the copy library, I think it
worthwhile to implement this lib in C.
Unfortunately I cannot do this myself: I am relatively inexperienced
with python and do not know C.
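
One way to check figures like these is to time fresh interpreter
processes, so that already-cached modules do not hide the cost. The
following is only a minimal sketch, assuming a current Python 3;
best_run_seconds is a hypothetical helper name, and the numbers it
prints depend entirely on the machine and Python version.

```python
import subprocess
import sys
import time


def best_run_seconds(statement, repeats=3):
    """Best wall-clock time for a fresh interpreter to run `statement`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # A new process each time, so nothing is cached in sys.modules.
        subprocess.run([sys.executable, "-c", statement], check=True)
        best = min(best, time.perf_counter() - start)
    return best


startup = best_run_seconds("pass")                      # bare interpreter startup
with_copy = best_run_seconds("import copy")             # startup + import copy
with_others = best_run_seconds("import os, string, re")  # startup + the comparison set

print("startup only:          %.4f s" % startup)
print("import copy:           %.4f s" % with_copy)
print("import os, string, re: %.4f s" % with_others)
```

Subtracting the startup figure from the other two gives a rough
per-import cost; the differences are small, so several repeats are
needed before drawing any conclusion.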

What are your opinions?

Martijn Brouwer

-- 
__________________________________________________
I have a new e-mail address. If you are still using
e.a.m.brouwer at tnw.utwente.nl, please change to
e.a.m.brouwer at alumnus.utwente.nl
__________________________________________________


From simon.brunning at gmail.com  Tue Aug 16 16:28:53 2005
From: simon.brunning at gmail.com (Simon Brunning)
Date: Tue, 16 Aug 2005 15:28:53 +0100
Subject: [Python-Dev] implementation of copy standard lib
In-Reply-To: <1124059680.11612.9.camel@localhost.localdomain>
References: <1124059680.11612.9.camel@localhost.localdomain>
Message-ID: <8c7f10c605081607281f8c1e38@mail.gmail.com>

On 8/14/05, Martijn Brouwer <e.a.m.brouwer at alumnus.utwente.nl> wrote:
> I noticed that this lib is implemented in python, not in C. As I can
> imagine that *a lot* of libs/scripts use the copy library, I think it
> worthwhile to implement this lib in C.
> Unfortunately I cannot do this myself: I am relatively inexperienced
> with python and do not know C.
> 
> What are your opinions?

I'll reply to this over on c.l.py, where it belongs.

-- 
Cheers,
Simon B,
simon at brunningonline.net,
http://www.brunningonline.net/simon/blog/

From foom at fuhm.net  Tue Aug 16 16:34:31 2005
From: foom at fuhm.net (James Y Knight)
Date: Tue, 16 Aug 2005 10:34:31 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <43018D3B.9040404@v.loewis.de>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<43018D3B.9040404@v.loewis.de>
Message-ID: <4B72B610-80CA-40BC-9B9D-EB50F8077436@fuhm.net>

On Aug 16, 2005, at 2:52 AM, Martin v. Löwis wrote:
> Tim Peters wrote:
>
>> It would be best if svn:eol-style were set to native during initial
>> conversion from CVS, on all files not marked binary in CVS.
>>
>
> Ok, I'll add that to the PEP. Not sure how to implement it, yet...

cvs2svn does that by default (now).

James


From fdrake at acm.org  Tue Aug 16 18:41:10 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 16 Aug 2005 12:41:10 -0400
Subject: [Python-Dev] dev listinfo page (was: Re:  Python + Ping)
In-Reply-To: <42FC666A.90206@botanicus.net>
References: <2773CAC687FD5F4689F526998C7E4E5F05CC00@au3010avexu1.global.avaya.com>
	<42FC666A.90206@botanicus.net>
Message-ID: <200508161241.10908.fdrake@acm.org>

On Friday 12 August 2005 05:05, David Wilson wrote:
 > Would it perhaps be an idea, given the number of users posting to the
 > dev list, to put a rather obvious warning on the listinfo page:

Well, not exactly the style you suggested, but I've made it fairly close.  
It's certainly more noticeable now.  :-)


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From fperez.net at gmail.com  Tue Aug 16 19:08:59 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Tue, 16 Aug 2005 11:08:59 -0600
Subject: [Python-Dev] SWIG and rlcompleter
References: <43003492.8060904@mpi-magdeburg.mpg.de>
	<ca471dc205081510115c88ecc1@mail.gmail.com>
	<ddqv3h$o9h$1@sea.gmane.org>
	<20050816021614.GA23688@unpythonic.net>
	<2m4q9qunao.fsf@starship.python.net>
Message-ID: <ddt6js$39c$1@sea.gmane.org>

Michael Hudson wrote:

> jepler at unpythonic.net writes:
> 
>> You don't need something like a buggy SWIG to put non-strings in dir().
>>
>>>>> class C: pass
>> ...
>>>>> C.__dict__[3] = "bad wolf"
>>>>> dir(C)
>> [3, '__doc__', '__module__']
>>
>> This is likely to happen "legitimately", for instance in a class that allows
>> x.y and x['y'] to mean the same thing. (if the user assigns to x[3])
> 
> I wonder if dir() should strip non-strings?

Me too.  And it would be a good idea, I think, to specify this behavior
explicitly in the dir() docs.  Right now at least rlcompleter and ipython's
completer can break due to this, and there may be other tools out there with
similar problems.

If it's a stated design goal that dir() can return non-strings, that's fine.  I
can filter them out in my completion code.  I'd just like to know what the
official stance on dir()'s return values is.

Cheers,

f


From fperez.net at gmail.com  Tue Aug 16 19:17:04 2005
From: fperez.net at gmail.com (Fernando Perez)
Date: Tue, 16 Aug 2005 11:17:04 -0600
Subject: [Python-Dev] SWIG and rlcompleter
References: <43003492.8060904@mpi-magdeburg.mpg.de>
	<ca471dc205081510115c88ecc1@mail.gmail.com>
Message-ID: <ddt731$4of$1@sea.gmane.org>

Guido van Rossum wrote:

> (3) I think a better patch is to use str(word)[:n] instead of word[:n].

Mmh, I'm not so sure that's a good idea, as it leads to this:

In [1]: class f: pass
   ...:

In [2]: a=f()

In [3]: a.__dict__[1] = 8

In [4]: a.x = 0

In [5]: a.<TAB HIT HERE>
a.1  a.x

In [5]: a.1
------------------------------------------------------------
   File "<console>", line 1
     a.1
       ^
SyntaxError: invalid syntax


In general, foo.x named attribute access is only valid for strings to begin with
(what about unicode in there?).  Instead, this is what I've actually
implemented in ipython:

        words = [w for w in dir(object) if isinstance(w, basestring)]

That does allow unicode; I'm not sure if that's a good thing to do.
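
The difference between the two approaches can be sketched as follows.
This is an illustration only, written for a current Python 3 (where str
takes the place of basestring, so the unicode question goes away);
string_names and complete are hypothetical helper names, not the
rlcompleter or ipython code.

```python
def string_names(names):
    """Keep only the entries that could be typed after a dot."""
    return [name for name in names if isinstance(name, str)]


def complete(names, prefix):
    """Return the string names starting with `prefix`, sorted."""
    return sorted(name for name in string_names(names)
                  if name.startswith(prefix))


class F:
    pass


a = F()
a.__dict__[1] = 8   # a non-string key injected into the instance dict
a.x = 0

# Filtering drops the integer key, so only valid attribute names remain.
filtered = complete(vars(a), "")

# Coercing with str() keeps it and would offer "a.1", which is a SyntaxError.
coerced = sorted(str(name) for name in vars(a))

print(filtered)
print(coerced)
```

The sketch reads the names from vars(a) rather than dir(a) so that it
does not depend on how dir() itself treats the integer key.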

Cheers,

f


From martin at v.loewis.de  Tue Aug 16 20:19:33 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 16 Aug 2005 20:19:33 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <4B72B610-80CA-40BC-9B9D-EB50F8077436@fuhm.net>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<43018D3B.9040404@v.loewis.de>
	<4B72B610-80CA-40BC-9B9D-EB50F8077436@fuhm.net>
Message-ID: <43022E35.1070207@v.loewis.de>

James Y Knight wrote:
> cvs2svn does that by default (now).

Ah, ok.

Martin

From martin at v.loewis.de  Tue Aug 16 20:31:20 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 16 Aug 2005 20:31:20 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <2mvf26t56q.fsf@starship.python.net>
References: <42F61C03.6050703@v.loewis.de>	<1f7befae05081518073433da62@mail.gmail.com>	<2mzmrit8ev.fsf@starship.python.net>	<1124196179.9673.12.camel@geddy.wooz.org>
	<2mvf26t56q.fsf@starship.python.net>
Message-ID: <430230F8.3020405@v.loewis.de>

Michael Hudson wrote:
> I suppose another question is: when?  Between 2.4.2 and 2.5a1 seems
> like a good opportunity.  I guess the biggest job is collection of
> keys and associated admin?

I would agree. However, there still is the debate of hosting the
repository elsewhere. Some people (Anthony, Guido, Tim) would prefer
to pay for it, instead of hosting it on svn.python.org.

Regards,
Martin

From nas at arctrix.com  Tue Aug 16 21:18:35 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 16 Aug 2005 13:18:35 -0600
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <430230F8.3020405@v.loewis.de>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
	<1124196179.9673.12.camel@geddy.wooz.org>
	<2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de>
Message-ID: <20050816191835.GA18968@mems-exchange.org>

On Tue, Aug 16, 2005 at 08:31:20PM +0200, "Martin v. Löwis" wrote:
> I would agree. However, there still is the debate of hosting the
> repository elsewhere. Some people (Anthony, Guido, Tim) would prefer
> to pay for it, instead of hosting it on svn.python.org.

Another option would be to pay someone to maintain the SVN setup on
python.org.  Unfortunately, I guess that would require someone else
to first create a detailed description of the maintenance work
required and to process bids.

  Neil

From barry at python.org  Tue Aug 16 21:28:35 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 16 Aug 2005 15:28:35 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <20050816191835.GA18968@mems-exchange.org>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
	<1124196179.9673.12.camel@geddy.wooz.org>
	<2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de>
	<20050816191835.GA18968@mems-exchange.org>
Message-ID: <1124220515.5254.7.camel@geddy.wooz.org>

On Tue, 2005-08-16 at 15:18, Neil Schemenauer wrote:

> Another option would be to pay someone to maintain the SVN setup on
> python.org.  Unfortunately, I guess that would require someone else
> to first create a detailed description of the maintenance work
> required and to process bids.

Again, it's not clear to me that there's much more we need to have done
that we either don't want to do ourselves or that XS4ALL isn't doing for
us.  IOW, we get backups for free and mostly the repo just swims along
nicely.  We have to do user management, but I think we want to do that
ourselves anyway.  There may be occasional infrastructural work that
needs to happen (e.g. we still owe Martin a login for tunneling), but
those tasks seem to me to be better handled either by volunteers or by
short-term paid piece work.

-Barry


From tim.peters at gmail.com  Tue Aug 16 21:49:39 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 16 Aug 2005 15:49:39 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <430230F8.3020405@v.loewis.de>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
	<1124196179.9673.12.camel@geddy.wooz.org>
	<2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de>
Message-ID: <1f7befae050816124932f5c66@mail.gmail.com>

[Michael Hudson]
>> I suppose another question is: when?  Between 2.4.2 and 2.5a1 seems
>> like a good opportunity.  I guess the biggest job is collection of
>> keys and associated admin?

[Martin v. Löwis]
> I would agree. However, there still is the debate of hosting the
> repository elsewhere. Some people (Anthony, Guido, Tim) would prefer
> to pay for it, instead of hosting it on svn.python.org.

Not this Tim.  I _asked_ whether we had sufficient volunteer resource
to host it on python.org, because I didn't know.  Barry has since made
sufficiently reassuring gurgles on that point, in particular that
ongoing maintenance (after initial conversion) for filesystem-flavor
SVN is likely in-the-noise level work.

From martin at v.loewis.de  Tue Aug 16 22:25:38 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 16 Aug 2005 22:25:38 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <1f7befae050816124932f5c66@mail.gmail.com>
References: <42F61C03.6050703@v.loewis.de>	
	<1f7befae05081518073433da62@mail.gmail.com>	
	<2mzmrit8ev.fsf@starship.python.net>	
	<1124196179.9673.12.camel@geddy.wooz.org>	
	<2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de>
	<1f7befae050816124932f5c66@mail.gmail.com>
Message-ID: <43024BC2.2010505@v.loewis.de>

Tim Peters wrote:
> Not this Tim.  I _asked_ whether we had sufficient volunteer resource
> to host it on python.org, because I didn't know.  Barry has since made
> sufficiently reassuring gurgles on that point, in particular that
> ongoing maintenance (after initial conversion) for filesystem-flavor
> SVN is likely in-the-noise level work.

Ah, ok. Of course, Barry can only speak about the current availability
of volunteers, which is quite good (especially since amk took over
coordinating them); nobody can predict the future (the time machine
apparently only works one-way). So I guess the concern stays, and,
more objectively, this is a risk for the project (but so is any
specific commercial offering).

Regards,
Martin


From raymond.hettinger at verizon.net  Tue Aug 16 22:24:47 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 16 Aug 2005 16:24:47 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <1f7befae05081518073433da62@mail.gmail.com>
Message-ID: <011901c5a2a0$8c4f4700$af26c797@oemcomputer>

[Tim]
> +1 from me.  But, I don't think my vote should count much, and (sorry)
> Guido's even less:  what do the people who frequently check in want?
> That means people like you (Martin), Michael, Raymond, Walter, Fred.
> ... plus the release manager(s).

+1 from me.  CVS is meeting my needs but I would definitely benefit from
fast diffs and atomic commits.  My experiences with SVN to-date have all
been positive and it was easy to learn.  

Also, I think it is a nice plus that our choosing SVN means that others
can choose SVK and get the benefits of a distributed rcs without us
having to do anything extra to support it.  James Knight's thoughts on
the subject seem on target.


Raymond


From martin at v.loewis.de  Tue Aug 16 22:33:27 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 16 Aug 2005 22:33:27 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <20050816191835.GA18968@mems-exchange.org>
References: <42F61C03.6050703@v.loewis.de>	<1f7befae05081518073433da62@mail.gmail.com>	<2mzmrit8ev.fsf@starship.python.net>	<1124196179.9673.12.camel@geddy.wooz.org>	<2mvf26t56q.fsf@starship.python.net>
	<430230F8.3020405@v.loewis.de>
	<20050816191835.GA18968@mems-exchange.org>
Message-ID: <43024D97.5050306@v.loewis.de>

Neil Schemenauer wrote:
> Another option would be to pay someone to maintain the SVN setup on
> python.org.  Unfortunately, I guess that would require someone else
> to first create a detailed description of the maintenance work
> required and to process bids.

I think this would be difficult. I could imagine services like
tummy.com, where you can hire somebody on an hours-per-week
basis; these people maintain multiple servers, and just need to
do the proper accounting. However, they also (naturally) tend
to desire an organization that meets their needs, e.g.
by providing the machine and network (this is apparently how
tummy.com operates).

If you are suggesting that the PSF hires a specific individual
for that maintenance, the risk of getting somebody
inexperienced/uncooperative would be much higher: if we were
unhappy with the tummy.com guy looking after our hardware,
we could complain to his boss; if that is the boss, we would
just take our data and cancel the contract.

Also, hiring somebody would be somewhat unfair to people who
do similar tasks as volunteers, and I guess the board might
not agree to such expenses.

Regards,
Martin


From tim.peters at gmail.com  Tue Aug 16 22:52:09 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 16 Aug 2005 16:52:09 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <43024BC2.2010505@v.loewis.de>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
	<1124196179.9673.12.camel@geddy.wooz.org>
	<2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de>
	<1f7befae050816124932f5c66@mail.gmail.com>
	<43024BC2.2010505@v.loewis.de>
Message-ID: <1f7befae050816135244a6b72f@mail.gmail.com>

[Martin v. Löwis]
> Ah, ok. Of course, Barry can only speak about the current availability
> of volunteers, which is quite good (especially since amk took over
> coordinating them); nobody can predict the future (the time machine
> apparently only works one-way). So I guess the concern stays, and,
> more objectively, this is a risk for the project (but so is any
> specific commercial offering).

I'm not really worried about it.  Sounds like ongoing pain is pretty
much limited to keeping committer accounts/credentials up to date, and
that normal good backup procedures will deal with filesystem-SVN state
as a matter of course.

If there's one thing sysadmins love to do, it's fiddling with user
accounts and credentials -- if _anyone_ volunteers to work on
python.org, they'll be eager to lord this power over us <wink>.

If not, that's fine too.  The PSF has the funds and the mission to pay
for infrastructure support; I'd just _rather_ spend PSF funds on "more
glamorous" stuff (like grants and conferences).

From tim.peters at gmail.com  Tue Aug 16 23:00:43 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 16 Aug 2005 17:00:43 -0400
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <011901c5a2a0$8c4f4700$af26c797@oemcomputer>
References: <1f7befae05081518073433da62@mail.gmail.com>
	<011901c5a2a0$8c4f4700$af26c797@oemcomputer>
Message-ID: <1f7befae050816140063dcda4@mail.gmail.com>

[Raymond Hettinger]
> +1 from me.  CVS is meeting my needs but I would definitely benefit from
> fast diffs and atomic commits.  My experiences with SVN to-date have all
> been positive and it was easy to learn.

Good!  That was my experience too, BTW -- SVN was a genuine
improvement over CVS, and I was productive with it the first hour. 
There are "tricks" you'll learn too (or already have); for example, if
you make a bunch of changes in a local checkout, and have to drop it
for a while, it's easy and fast to create an SVN branch with those
changes despite that you didn't plan on it from the start (create a
new branch in the repository; `svn switch` to it locally, which leaves
your local changes alone; then commit).

> Also, I think it is a nice plus that our choosing SVN means that others
> can choose SVK and get the benefits of a distributed rcs without us
> having to do anything extra to support it.  James Knight's thoughts on
> the subject seem on target.

Too new-fashioned for me, although I can see how it might appeal to kids ;-)

From skip at pobox.com  Tue Aug 16 23:07:09 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 16 Aug 2005 16:07:09 -0500
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <43024BC2.2010505@v.loewis.de>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
	<2mzmrit8ev.fsf@starship.python.net>
	<1124196179.9673.12.camel@geddy.wooz.org>
	<2mvf26t56q.fsf@starship.python.net> <430230F8.3020405@v.loewis.de>
	<1f7befae050816124932f5c66@mail.gmail.com>
	<43024BC2.2010505@v.loewis.de>
Message-ID: <17154.21885.668810.977155@montanaro.dyndns.org>


    Martin> Of course, Barry can only speak about the current availability
    Martin> of volunteers, which is quite good (especially since amk took
    Martin> over coordinating them) ....

I don't know why, but the first image that popped into my mind was of amk
beating a bunch of Hunchback of Notre Dame types (maybe more the Marty
Feldman (*) hunchback types) into submission with a whip while one of them
cried, "We'll do anything you ask, master.  Just don't beat us again."

The-beatings-will-continue-until-morale-improves-ly, y'rs,

Skip

(*) http://en.wikipedia.org/wiki/Marty_Feldman

From walter at livinglogic.de  Tue Aug 16 23:06:37 2005
From: walter at livinglogic.de (Walter Dörwald)
Date: Tue, 16 Aug 2005 23:06:37 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <1f7befae05081518073433da62@mail.gmail.com>
References: <42F61C03.6050703@v.loewis.de>
	<1f7befae05081518073433da62@mail.gmail.com>
Message-ID: <1A846489-39FF-49D1-8AF0-5BA61F9277DF@livinglogic.de>

Tim Peters wrote:

> [Martin v. Löwis]
>
>> I have placed a new version of the PEP on
>>
>> http://www.python.org/peps/pep-0347.html
>>
>
> ...
>
> +1 from me.  But, I don't think my vote should count much, and (sorry)
> Guido's even less:  what do the people who frequently check in want?
> That means people like you (Martin), Michael, Raymond, Walter, Fred.
> ... plus the release manager(s).

+1 from me for various reasons:

* Subversion seems to be stable enough, and it's better than CVS,
which is enough for me.

* The python.org machines can probably handle the load of *one*
repository better than the SF machines can handle that of several thousand.

* Connectivity to python.org is much better than to cvs.sf.net (at
least from here).

* Our company repository might move to svn in the near future, so a  
Python svn repository would be a perfect playground to learn svn. ;)

Bye,
    Walter Dörwald


From tdelaney at avaya.com  Wed Aug 17 01:53:20 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 17 Aug 2005 09:53:20 +1000
Subject: [Python-Dev] PEP 347: Migration to Subversion
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com>

Tim Peters wrote:

> [Martin v. Löwis]
>> I would agree. However, there still is the debate of hosting the
>> repository elsewhere. Some people (Anthony, Guido, Tim) would prefer
>> to pay for it, instead of hosting it on svn.python.org.
> 
> Not this Tim.

Not this one either. I haven't actually used any of the various systems that much (work is ClearCase) so I have no opinions whatsoever. It's interesting reading though.

Tim Delaney

From gvanrossum at gmail.com  Wed Aug 17 01:58:01 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue, 16 Aug 2005 16:58:01 -0700
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com>
Message-ID: <ca471dc2050816165852cb3934@mail.gmail.com>

Nor this Guido, FWIW (I think we shouldn't rule it out as an option,
but I don't have any preferences).

On 8/16/05, Delaney, Timothy (Tim) <tdelaney at avaya.com> wrote:
> Tim Peters wrote:
> 
> > [Martin v. Löwis]
> >> I would agree. However, there still is the debate of hosting the
> >> repository elsewhere. Some people (Anthony, Guido, Tim) would prefer
> >> to pay for it, instead of hosting it on svn.python.org.
> >
> > Not this Tim.
> 
> Not this one either. I haven't actually used any of the various systems that much (work is ClearCase) so I have no opinions whatsoever. It's interesting reading though.
> 
> Tim Delaney
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
> 


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Wed Aug 17 02:55:26 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 16 Aug 2005 20:55:26 -0400
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <2m4q9qunao.fsf@starship.python.net>
Message-ID: <000801c5a2c6$5ca70080$af26c797@oemcomputer>

[Michael Hudson]
> I wonder if dir() should strip non-strings?

-0  The behavior of dir() is already a bit magical.  Python is much simpler
to comprehend if we have direct relationships like dir() and vars()
corresponding as closely as possible to the object's dictionary.  If
someone injects non-strings into an attribute dictionary, why should
dir() hide that fact?

Likewise, we would have been better-off if ceval.c didn't pre-process
data before handing it off to API functions (so that negative indices
get handled the same way in operator module functions and in user
defined methods, etc). 

Both Io and Lua have made a design principle out of keeping these
relationships as direct as possible (i.e. a[b] always corresponds to the
call a.__getitem__(b) with no intervening magic, etc.).

<begin side-topic-rant>
The auto-exposure on my camera takes in nine data points and guesses
whether the subject is backlit, whether there is a mix of light and
dark, whether it is more important to avoid blown highlights or missed
shadow detail, etc.  The good news is that it often makes a decent
guess.  The bad news is that I've completely lost the ability to predict
whether I've gotten a good shot based on the light conditions and camera
settings.  IOW, if you make the tools too smart, they become harder to
use.  Leica had it right all along.
<end side-topic-rant>


Raymond


From ilya at bluefir.net  Wed Aug 17 06:34:10 2005
From: ilya at bluefir.net (Ilya Sandler)
Date: Tue, 16 Aug 2005 21:34:10 -0700 (PDT)
Subject: [Python-Dev] remote debugging with pdb
In-Reply-To: <24EEDE5B-4511-40D4-9C16-8A33C4ACE1C8@redivi.com>
References: <Pine.LNX.4.58.0508071312290.695@bagira>
	<20050808154503.GB28005@panix.com>
	<Pine.LNX.4.58.0508081926010.2814@bagira>
	<200508111802.44357.anthony@interlink.com.au>
	<24EEDE5B-4511-40D4-9C16-8A33C4ACE1C8@redivi.com>
Message-ID: <Pine.LNX.4.58.0508162119100.2711@bagira>


> One thing PDB needs is a mode that runs as a background thread and
> opens up a socket so that another Python process can talk to it, for
> embedded/remote/GUI debugging.


There is a patch on SourceForge
python.org/sf/721464
which allows pdb to read/write from/to arbitrary file objects. Would it
answer some of your concerns (eg remote debugging)?
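
For comparison, current versions of pdb accept stdin and stdout
arguments on the Pdb constructor. The sketch below uses those arguments
with in-memory streams; it is not the patch itself, and a remote setup
would substitute file objects wrapping a socket for the StringIO
objects assumed here.

```python
import io
import pdb

# Debugger commands that would normally be typed at the (Pdb) prompt.
commands = io.StringIO("p 40 + 2\ncontinue\n")
output = io.StringIO()

# readrc=False skips any .pdbrc file; nosigint=True leaves signal
# handlers alone, which matters when not running in the main thread.
debugger = pdb.Pdb(stdin=commands, stdout=output,
                   readrc=False, nosigint=True)

# Run a trivial statement under the debugger; it stops at the first
# line, reads the two commands above, and then runs to completion.
debugger.run("answer = 1")

transcript = output.getvalue()
print(transcript)
```

Everything the debugger would have printed to the terminal ends up in
transcript, including the result of the p command.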

The patch probably will not apply to the current code, but I guess I
could revive it if anyone thinks that it's worthwhile...

What do you think?
Ilya

From kiko at async.com.br  Wed Aug 17 16:02:18 2005
From: kiko at async.com.br (Christian Robottom Reis)
Date: Wed, 17 Aug 2005 11:02:18 -0300
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
Message-ID: <20050817140217.GQ3389@www.async.com.br>


In Launchpad (mainly because SQLObject is used) we end up with quite a
few locals named id. Apart from the fact that naturally clobbering
builtins is a bad idea, we get quite a few warnings when linting
throughout the codebase. I've fixed these as I've found them, but today
Andrew pointed out to me that this is noted in:

    http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.ppt

I wonder: is moving id() to sys doable in the 2.5 cycle, with a
deprecation warning being raised for people using the builtin? We'd then
phase it out in one of the later 2.x versions.

I've done some searching through my code and id() isn't the most-used
builtin, so from my perspective the impact would be limited, but of
course others might think otherwise.

Is it worth writing a PEP for this, or is it crack?

Take care,
--
Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 3376 0125

From raymond.hettinger at verizon.net  Wed Aug 17 17:48:57 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 17 Aug 2005 11:48:57 -0400
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <20050817140217.GQ3389@www.async.com.br>
Message-ID: <000a01c5a343$2f4deea0$5e01a044@oemcomputer>

[Christian Robottom Reis]
> I've done some searching through my code and id() isn't the most-used
> builtin, so from my perspective the impact would be limited, but of
> course others might think otherwise.
> 
> Is it worth writing a PEP for this, or is it crack?

FWIW, I use id() all the time and like having it as a builtin.


Raymond


From firemoth at gmail.com  Wed Aug 17 18:32:42 2005
From: firemoth at gmail.com (Timothy Fitz)
Date: Wed, 17 Aug 2005 12:32:42 -0400
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <20050817140217.GQ3389@www.async.com.br>
References: <20050817140217.GQ3389@www.async.com.br>
Message-ID: <972ec5bd05081709327fed099@mail.gmail.com>

On 8/17/05, Christian Robottom Reis <kiko at async.com.br> wrote:
> I've done some searching through my code and id() isn't the most-used
> builtin, so from my perspective the impact would be limited, but of
> course others might think otherwise.

All of my primary uses of id would not show up in such a search. id is
handy when debugging, when using the interactive interpreter and
temporarily in scripts (print id(something), something for when
repr(something) doesn't show the id).

In my experience teaching python, id at the interactive interpreter is
invaluable, which is why any proposal to move it would get a -1. The
fundamental issue is that I want to explain reference semantics well
before I talk about packages and the associated import call.
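
The kind of interactive demonstration meant here might look like the
following sketch (written as a script for a current Python 3; the
variable names are only illustrative):

```python
a = [1, 2, 3]
b = a          # a second name bound to the same list object
c = list(a)    # a new list with equal contents

# Same object: identical ids, and the is operator agrees.
print(id(a) == id(b), a is b)

# Different objects that happen to compare equal.
print(id(a) == id(c), a is c, a == c)

# Mutating through one name is visible through the other,
# while the copy is unaffected.
b.append(4)
print(a)
print(c)
```

Nothing in it needs an import, which is the point being made about
teaching order.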

From jeremy at alum.mit.edu  Wed Aug 17 18:37:42 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 17 Aug 2005 12:37:42 -0400
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <972ec5bd05081709327fed099@mail.gmail.com>
References: <20050817140217.GQ3389@www.async.com.br>
	<972ec5bd05081709327fed099@mail.gmail.com>
Message-ID: <e8bf7a53050817093722e7b4bf@mail.gmail.com>

I'd like to see the builtin id() removed so that I can use it as a
local variable name without clashing with the builtin name.  I
certainly use the id() function, but not as often as I have a local
variable I'd like to name id.  The sys module seems like a natural
place to put id(), since it is exposing something about the
implementation of Python rather than something about the language; the
language offers the is operator to check ids.
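
As things stand, a local named id only shadows the builtin inside its
own scope, and the function stays reachable through the builtins
module. A small sketch, assuming a current Python 3 (where the module
is called builtins; in 2.x it is __builtin__); describe and row are
hypothetical names, and no sys.id is assumed to exist:

```python
import builtins


def describe(record):
    """Return the record's own 'id' field and its object identity."""
    id = record["id"]   # local variable; hides the builtin id() in this scope
    # The builtin function is still available under its module name.
    return id, builtins.id(record)


row = {"id": 42}
local_id, object_identity = describe(row)

print(local_id)
print(object_identity == id(row))   # at module level id() is still the builtin
```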

Jeremy

From reinhold-birkenfeld-nospam at wolke7.net  Wed Aug 17 18:37:11 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Wed, 17 Aug 2005 18:37:11 +0200
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <20050817140217.GQ3389@www.async.com.br>
References: <20050817140217.GQ3389@www.async.com.br>
Message-ID: <ddvp3n$oq6$1@sea.gmane.org>

Christian Robottom Reis wrote:
> In Launchpad (mainly because SQLObject is used) we end up with quite a
> few locals named id. Apart from the fact that naturally clobbering
> builtins is a bad idea, we get quite a few warnings when linting
> throughout the codebase. I've fixed these as I've found them, but today
> Andrew pointed out to me that this is noted in:
> 
>     http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.ppt
> 
> I wonder: is moving id() to sys doable in the 2.5 cycle, with a
> deprecation warning being raised for people using the builtin? We'd then
> phase it out in one of the later 2.x versions.
> 
> I've done some searching through my code and id() isn't the most-used
> builtin, so from my perspective the impact would be limited, but of
> course others might think otherwise.
> 
> Is it worth writing a PEP for this, or is it crack?

As far as I can see, this is not going to happen before Py3k, as it completely
breaks backwards compatibility. As such, a PEP would be unnecessary.

Reinhold

-- 
Mail address is perfectly valid!


From pedronis at strakt.com  Wed Aug 17 18:50:29 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Wed, 17 Aug 2005 18:50:29 +0200
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <e8bf7a53050817093722e7b4bf@mail.gmail.com>
References: <20050817140217.GQ3389@www.async.com.br>	<972ec5bd05081709327fed099@mail.gmail.com>
	<e8bf7a53050817093722e7b4bf@mail.gmail.com>
Message-ID: <43036AD5.6010902@strakt.com>

Jeremy Hylton wrote:
> I'd like to see the builtin id() removed so that I can use it as a
> local variable name without clashing with the builtin name.  I
> certainly use the id() function, but not as often as I have a local
> variable I'd like to name id.  The sys module seems like a natural
> place to put id(), since it is exposing something about the
> implementation of Python rather than something about the language; the
> language offers the is operator to check ids.
> 

It is worth remembering that id() functionality is not cheap for Python
implementations using moving GCs. Identity mappings would be less taxing.

> Jeremy


From barry at python.org  Wed Aug 17 19:17:59 2005
From: barry at python.org (Barry Warsaw)
Date: Wed, 17 Aug 2005 13:17:59 -0400
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <e8bf7a53050817093722e7b4bf@mail.gmail.com>
References: <20050817140217.GQ3389@www.async.com.br>
	<972ec5bd05081709327fed099@mail.gmail.com>
	<e8bf7a53050817093722e7b4bf@mail.gmail.com>
Message-ID: <1124299079.23024.17.camel@geddy.wooz.org>

On Wed, 2005-08-17 at 12:37, Jeremy Hylton wrote:
> I'd like to see the builtin id() removed so that I can use it as a
> local variable name without clashing with the builtin name.  I
> certainly use the id() function, but not as often as I have a local
> variable I'd like to name id.  

Same here.

> The sys module seems like a natural
> place to put id(), since it is exposing something about the
> implementation of Python rather than something about the language; the
> language offers the is operator to check ids.

+1
-Barry


From bcannon at gmail.com  Wed Aug 17 19:21:59 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 17 Aug 2005 10:21:59 -0700
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <1124299079.23024.17.camel@geddy.wooz.org>
References: <20050817140217.GQ3389@www.async.com.br>
	<972ec5bd05081709327fed099@mail.gmail.com>
	<e8bf7a53050817093722e7b4bf@mail.gmail.com>
	<1124299079.23024.17.camel@geddy.wooz.org>
Message-ID: <bbaeab100508171021433a45e@mail.gmail.com>

On 8/17/05, Barry Warsaw <barry at python.org> wrote:
> On Wed, 2005-08-17 at 12:37, Jeremy Hylton wrote:
> > I'd like to see the builtin id() removed so that I can use it as a
> > local variable name without clashing with the builtin name.  I
> > certainly use the id() function, but not as often as I have a local
> > variable I'd like to name id.
> 
> Same here.
> 
> > The sys module seems like a natural
> > place to put id(), since it is exposing something about the
> > implementation of Python rather than something about the language; the
> > language offers the is operator to check ids.
> 
> +1
> -Barry

+1

-Brett

From nas at arctrix.com  Wed Aug 17 19:40:32 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 17 Aug 2005 11:40:32 -0600
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <ddvp3n$oq6$1@sea.gmane.org>
References: <20050817140217.GQ3389@www.async.com.br>
	<ddvp3n$oq6$1@sea.gmane.org>
Message-ID: <20050817174031.GA22541@mems-exchange.org>

On Wed, Aug 17, 2005 at 06:37:11PM +0200, Reinhold Birkenfeld wrote:
> As far as I can see, this is not going to happen before Py3k, as it would
> completely break backwards compatibility. As such, a PEP would be unnecessary.

We could add sys.id for 2.5 and remove __builtin__.id at some later
time (e.g. for 3.0).

  Neil

From facundobatista at gmail.com  Wed Aug 17 19:49:25 2005
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 17 Aug 2005 14:49:25 -0300
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <20050817174031.GA22541@mems-exchange.org>
References: <20050817140217.GQ3389@www.async.com.br>
	<ddvp3n$oq6$1@sea.gmane.org>
	<20050817174031.GA22541@mems-exchange.org>
Message-ID: <e04bdf31050817104958ba651c@mail.gmail.com>

On 8/17/05, Neil Schemenauer <nas at arctrix.com> wrote:

> On Wed, Aug 17, 2005 at 06:37:11PM +0200, Reinhold Birkenfeld wrote:
> > As far as I can see, this is not going to happen before Py3k, as it would
> > completely break backwards compatibility. As such, a PEP would be unnecessary.
> 
> We could add sys.id for 2.5 and remove __builtin__.id at some later
> time (e.g. for 3.0).

+1 for adding it to sys in 2.5, removing the builtin one in 3.0.

.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From firemoth at gmail.com  Wed Aug 17 20:55:30 2005
From: firemoth at gmail.com (Timothy Fitz)
Date: Wed, 17 Aug 2005 14:55:30 -0400
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <000801c5a2c6$5ca70080$af26c797@oemcomputer>
References: <2m4q9qunao.fsf@starship.python.net>
	<000801c5a2c6$5ca70080$af26c797@oemcomputer>
Message-ID: <972ec5bd05081711555e9ad129@mail.gmail.com>

On 8/16/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> -0  The behavior of dir() is already a bit magical.  Python is much simpler
> to comprehend if we have direct relationships like dir() and vars()
> corresponding as closely as possible to the object's dictionary.  If
> someone injects non-strings into an attribute dictionary, why should
> dir() hide that fact?

Indeed, there seem to be two camps, those who want dir to reflect __dict__
and those who want dir to reflect attributes of an object. It seems to
me that those who want dir to reflect __dict__ should just use
__dict__ in the first place.

However, in the case of dir handling non-strings, should dir handle
non-valid identifiers as well, that is to say that while
foo.__dict__[2] = ... is an obvious case what about foo.__dict__["1"]
?

Right now the documentation says that it returns "attributes", and I
would not consider non-strings to be attributes, so either the
documentation or the implementation should rectify this disagreement.

From gvanrossum at gmail.com  Wed Aug 17 21:10:15 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 17 Aug 2005 12:10:15 -0700
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <972ec5bd05081711555e9ad129@mail.gmail.com>
References: <2m4q9qunao.fsf@starship.python.net>
	<000801c5a2c6$5ca70080$af26c797@oemcomputer>
	<972ec5bd05081711555e9ad129@mail.gmail.com>
Message-ID: <ca471dc2050817121047a7164@mail.gmail.com>

On 8/17/05, Timothy Fitz <firemoth at gmail.com> wrote:
> On 8/16/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > -0  The behavior of dir() is already a bit magical.  Python is much simpler
> > to comprehend if we have direct relationships like dir() and vars()
> > corresponding as closely as possible to the object's dictionary.  If
> > someone injects non-strings into an attribute dictionary, why should
> > dir() hide that fact?
> 
> Indeed, there seem to be two camps, those who want dir to reflect __dict__
> and those who want dir to reflect attributes of an object. It seems to
> me that those who want dir to reflect __dict__ should just use
> __dict__ in the first place.

Right.

> However, in the case of dir handling non-strings, should dir handle
> non-valid identifiers as well, that is to say that while
> foo.__dict__[2] = ... is an obvious case what about foo.__dict__["1"]
> ?

See below.

> Right now the documentation says that it returns "attributes", and I
> would not consider non-strings to be attributes, so either the
> documentation or the implementation should rectify this disagreement.

I think that dir() should hide non-strings; these aren't attributes if
you believe the definition that an attribute name is something
acceptable to getattr() or setattr().

Following this definition, the string "1" is a valid attribute name
(even though it's not a valid identifier), but the number 1 is not.

Try it. :-)
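
A minimal sketch of that experiment, assuming a current Python 3
interpreter rather than the 2.4 of this thread; the class name Foo is
made up for illustration:

```python
class Foo:
    pass

foo = Foo()

# A string that is not a valid identifier is still accepted as an
# attribute name by setattr() and getattr().
setattr(foo, "1", "one")
value = getattr(foo, "1")

# A non-string name is rejected with TypeError.
try:
    getattr(foo, 1)
except TypeError as exc:
    error = exc
else:
    error = None

print(value, type(error).__name__)
```

The attribute "1" cannot be reached with dotted syntax, only through
getattr()/setattr() or the instance __dict__.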

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Wed Aug 17 21:21:22 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 17 Aug 2005 15:21:22 -0400
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <972ec5bd05081711555e9ad129@mail.gmail.com>
Message-ID: <001101c5a360$db2d63a0$3031c797@oemcomputer>

[Timothy Fitz]
> It seems to
> me that those who want dir to reflect __dict__ should just use
> __dict__ in the first place.

The dir() builtin does quite a bit more than obj.__dict__.keys().


>>> class A(list):
	x = 1

>>> dir(A)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__',
'__delslice__', '__dict__', '__doc__', '__eq__', '__ge__',
'__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__',
'__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__',
'__lt__', '__module__', '__mul__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__',
'__setitem__', '__setslice__', '__str__', '__weakref__', 'append',
'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse',
'sort', 'x']

>>> A.__dict__.keys()
['__dict__', 'x', '__module__', '__weakref__', '__doc__']
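
The same comparison can be sketched with sets, assuming Python 3 (where
the exact name lists differ from the 2.4 output above, so no output is
reproduced here):

```python
class A(list):
    x = 1

own_names = set(A.__dict__)        # names defined directly on class A
all_names = set(dir(A))            # also walks the base classes list and object
inherited = all_names - own_names  # what dir() adds on top of A.__dict__

print(sorted(own_names))
print("append" in inherited)
```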


Raymond


From gvanrossum at gmail.com  Wed Aug 17 21:30:33 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 17 Aug 2005 12:30:33 -0700
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <001101c5a360$db2d63a0$3031c797@oemcomputer>
References: <972ec5bd05081711555e9ad129@mail.gmail.com>
	<001101c5a360$db2d63a0$3031c797@oemcomputer>
Message-ID: <ca471dc205081712307a346cf7@mail.gmail.com>

> [Timothy Fitz]
> > It seems to
> > me that those who want dir to reflect __dict__ should just use
> > __dict__ in the first place.

[Raymond]
> The dir() builtin does quite a bit more than obj.__dict__.keys().

Well that's the whole point, right? We shouldn't conflate the two. I
don't see this as an argument why it would be bad to delete
non-string-keys found in __dict__ from dir()'s return value. I don't
think that the equation

    set(x.__dict__) <= set(dir(x))

provides enough value to try and keep it. A more useful relationship is

    name in dir(x) <==> getattr(x, name) is valid

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Wed Aug 17 22:21:16 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 17 Aug 2005 16:21:16 -0400
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <ca471dc205081712307a346cf7@mail.gmail.com>
Message-ID: <001701c5a369$3911f960$3031c797@oemcomputer>

> > [Timothy Fitz]
> > > It seems to
> > > me that those who want dir to reflect __dict__ should just use
> > > __dict__ in the first place.
> 
> [Raymond]
> > The dir() builtin does quite a bit more than obj.__dict__.keys().
> 
> Well that's the whole point, right? 

Perhaps.  I wasn't taking a position.  Just noting that Timothy's
comment over-simplified the relationship.


> A more useful relationship is
> 
>     name in dir(x) <==> getattr(x, name) is valid


That would be a useful invariant.


Raymond


From gvanrossum at gmail.com  Wed Aug 17 22:46:27 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 17 Aug 2005 13:46:27 -0700
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <001701c5a369$3911f960$3031c797@oemcomputer>
References: <ca471dc205081712307a346cf7@mail.gmail.com>
	<001701c5a369$3911f960$3031c797@oemcomputer>
Message-ID: <ca471dc20508171346268e339f@mail.gmail.com>

[me]
> > A more useful relationship is
> >
> >     name in dir(x) <==> getattr(x, name) is valid

[Raymond]
> That would be a useful invariant.

Well, the <== part can't really be guaranteed due to the existence of
__getattr__  overriding (and all bets are off if __getattribute__ is
overridden!), but apart from those, stripping non-strings in dir()
would be a big help towards making the invariant true. So I'm +1 on
that.
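
A small sketch of why the <== direction cannot be guaranteed, assuming
Python 3; the class Dynamic and the "dyn_" prefix are invented for the
example:

```python
class Dynamic:
    def __getattr__(self, name):
        # Called only when normal lookup fails; answer any "dyn_" name.
        if name.startswith("dyn_"):
            return name[4:]
        raise AttributeError(name)

obj = Dynamic()

# getattr() succeeds although dir() knows nothing about the name.
works = getattr(obj, "dyn_spam")
listed = "dyn_spam" in dir(obj)

print(works, listed)
```

dir() has no way to enumerate the names that __getattr__ would accept,
so only the ==> direction is a realistic goal.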

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From caustin at spikesource.com  Thu Aug 18 00:57:33 2005
From: caustin at spikesource.com (Calvin Austin)
Date: Wed, 17 Aug 2005 15:57:33 -0700
Subject: [Python-Dev] A testing challenge
Message-ID: <4303C0DD.5090903@spikesource.com>


When was the last time someone thanked you for writing a test?  I tried 
to think of the last time it happened to me and I can't remember.  Well, 
at SpikeSource we want to thank you not just for helping the Python 
community but for your testing efforts too, so we are running a 
participatory testing contest.  This is a competition with no losers: 
every project gains if new tests are written.  For more details see 
below.  The contest is open worldwide; feel free to send questions to me.

thanks
calvin

Open Testing Contest with Over $20,000 in Prizes

Committers!  SpikeSource is sponsoring a contest to help increase the 
participatory testing of open source software.  Awards will be given to 
open source projects that have the greatest increase in code coverage 
from September 15 through December 31, 2005.  Project sign-up is due by 
August 31st and the contest begins on September 15th.  Visit 
http://www.spikesource.com/contest/ for complete details and to register 
your project.

From eric at enthought.com  Thu Aug 18 01:05:11 2005
From: eric at enthought.com (eric jones)
Date: Wed, 17 Aug 2005 18:05:11 -0500
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <1124299079.23024.17.camel@geddy.wooz.org>
References: <20050817140217.GQ3389@www.async.com.br>	<972ec5bd05081709327fed099@mail.gmail.com>	<e8bf7a53050817093722e7b4bf@mail.gmail.com>
	<1124299079.23024.17.camel@geddy.wooz.org>
Message-ID: <4303C2A7.2030608@enthought.com>

Barry Warsaw wrote:

> On Wed, 2005-08-17 at 12:37, Jeremy Hylton wrote:
> > I'd like to see the builtin id() removed so that I can use it as a
> > local variable name without clashing with the builtin name.  I
> > certainly use the id() function, but not as often as I have a local
> > variable I'd like to name id.
>
> Same here.
>
> > The sys module seems like a natural
> > place to put id(), since it is exposing something about the
> > implementation of Python rather than something about the language; the
> > language offers the is operator to check ids.
>
> +1
> -Barry

+1

eric


From foom at fuhm.net  Thu Aug 18 01:23:47 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed, 17 Aug 2005 19:23:47 -0400
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <972ec5bd05081711555e9ad129@mail.gmail.com>
References: <2m4q9qunao.fsf@starship.python.net>
	<000801c5a2c6$5ca70080$af26c797@oemcomputer>
	<972ec5bd05081711555e9ad129@mail.gmail.com>
Message-ID: <7F4E4259-CE92-4AED-823F-5E06BECAED6C@fuhm.net>


On Aug 17, 2005, at 2:55 PM, Timothy Fitz wrote:

> On 8/16/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
>
>> -0  The behavior of dir() is already a bit magical.  Python is much  
>> simpler
>> to comprehend if we have direct relationships like dir() and vars()
>> corresponding as closely as possible to the object's dictionary.  If
>> someone injects non-strings into an attribute dictionary, why should
>> dir() hide that fact?
>>
>
> Indeed, there seem to be two camps, those who want dir to reflect  
> __dict__
> and those who want dir to reflect attributes of an object. It seems to
> me that those who want dir to reflect __dict__ should just use
> __dict__ in the first place.
>
> However, in the case of dir handling non-strings, should dir handle
> non-valid identifiers as well, that is to say that while
> foo.__dict__[2] = ... is an obvious case what about foo.__dict__["1"]
> ?
>
> Right now the documentation says that it returns "attributes", and I
> would not consider non-strings to be attributes, so either the
> documentation or the implementation should rectify this disagreement.
>

I initially was going to say no, there's no reason to restrict your  
idea of "attributes" to be purely strings, because surely you could  
use non-strings as attributes if you wished to. But Python proves me  
wrong:
 >>> class X: pass
 >>> X.__dict__[1] = 5
 >>> dir(X)
[1, '__doc__', '__module__']
 >>> getattr(X, 1)
TypeError: getattr(): attribute name must be string

If dir() is supposed to return the list of attributes, it does seem  
logical that it should be possible to pass those names into getattr.  
I think I'd actually call that a defect in getattr() that it doesn't  
allow non-string attributes, not a defect in dir(). Ooh...even more  
annoying, it doesn't even allow unicode attributes that use  
characters outside the default encoding (ASCII).

But either way, there's absolutely no reason to worry about the  
attribute string being a valid identifier. That's pretty much only a  
concern for tab-completion in python shells.

James

From paul at pfdubois.com  Thu Aug 18 05:05:32 2005
From: paul at pfdubois.com (Paul F. Dubois)
Date: Wed, 17 Aug 2005 20:05:32 -0700
Subject: [Python-Dev] Deprecating builtin id (and moving it to,	sys())
Message-ID: <4303FAFC.3070204@pfdubois.com>

-1 for this proposal from me. I use id some and therefore the change 
would break some of my code. Breaking existing code without some 
overwhelming reason is a very bad idea, in my opinion. The reason cited 
here, that the name is so natural that one is tempted to use it, applies 
to many builtins. Ever written dict = {} and then said to yourself, gee, 
that isn't a very good idea? I have.

Besides that, the fact that an object has an identity, behaviors, and 
data is primary. For teaching beginners id() is important.

Paul


From anthony at interlink.com.au  Thu Aug 18 06:09:16 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu, 18 Aug 2005 14:09:16 +1000
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <20050817140217.GQ3389@www.async.com.br>
References: <20050817140217.GQ3389@www.async.com.br>
Message-ID: <200508181409.17431.anthony@interlink.com.au>

On Thursday 18 August 2005 00:02, Christian Robottom Reis wrote:
> I wonder: is moving id() to sys doable in the 2.5 cycle, with a
> deprecation warning being raised for people using the builtin? We'd then
> phase it out in one of the later 2.x versions.

I'm neutral on putting id() also into sys. I'm -1 on either issuing a
deprecation warning or, worse yet, removing the id() builtin. The warnings
system is expensive to call, and I know from a brief look at a bunch of code
that I use id() inside some tight inner loops. 
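
To make the cost concrete, here is a rough sketch of what a deprecation
shim could look like, assuming Python 3. The function deprecated_id and
its message are hypothetical; nothing like it was actually added:

```python
import warnings

def deprecated_id(obj):
    """Hypothetical stand-in for a builtin id() that warns on use."""
    # Every call goes through the warnings machinery before doing the
    # real work, which is the per-call overhead described above.
    warnings.warn("id() is deprecated (hypothetical message)",
                  DeprecationWarning, stacklevel=2)
    return id(obj)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    marker = object()
    result = deprecated_id(marker)

print(len(caught), caught[0].category.__name__)
```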

Removing it entirely is gratuitous breakage, for a not very high payoff. If
you _really_ want to call a local variable 'id' you can (but shouldn't). 
You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but I
don't see any movement to allow these...

Anthony

-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From mal at egenix.com  Thu Aug 18 09:36:14 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 18 Aug 2005 09:36:14 +0200
Subject: [Python-Dev] SWIG and rlcompleter
In-Reply-To: <7F4E4259-CE92-4AED-823F-5E06BECAED6C@fuhm.net>
References: <2m4q9qunao.fsf@starship.python.net>	<000801c5a2c6$5ca70080$af26c797@oemcomputer>	<972ec5bd05081711555e9ad129@mail.gmail.com>
	<7F4E4259-CE92-4AED-823F-5E06BECAED6C@fuhm.net>
Message-ID: <43043A6E.5020109@egenix.com>

James Y Knight wrote:
> On Aug 17, 2005, at 2:55 PM, Timothy Fitz wrote:
> 
> 
>>On 8/16/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
>>
>>
>>>-0  The behavior of dir() is already a bit magical.  Python is much  
>>>simpler
>>>to comprehend if we have direct relationships like dir() and vars()
>>>corresponding as closely as possible to the object's dictionary.  If
>>>someone injects non-strings into an attribute dictionary, why should
>>>dir() hide that fact?
>>>
>>
>>Indeed, there seem to be two camps, those who want dir to reflect  
>>__dict__
>>and those who want dir to reflect attributes of an object. It seems to
>>me that those who want dir to reflect __dict__ should just use
>>__dict__ in the first place.
>>
>>However, in the case of dir handling non-strings, should dir handle
>>non-valid identifiers as well, that is to say that while
>>foo.__dict__[2] = ... is an obvious case what about foo.__dict__["1"]
>>?
>>
>>Right now the documentation says that it returns "attributes", and I
>>would not consider non-strings to be attributes, so either the
>>documentation or the implementation should rectify this disagreement.
>>
> 
> 
> I initially was going to say no, there's no reason to restrict your  
> idea of "attributes" to be purely strings, because surely you could  
> use non-strings as attributes if you wished to. But Python proves me  
> wrong:
>  >>> class X: pass
>  >>> X.__dict__[1] = 5
>  >>> dir(X)
> [1, '__doc__', '__module__']
>  >>> getattr(X, 1)
> TypeError: getattr(): attribute name must be string
> 
> If dir() is supposed to return the list of attributes, it does seem  
> logical that it should be possible to pass those names into getattr.  
> I think I'd actually call that a defect in getattr() that it doesn't  
> allow non-string attributes, not a defect in dir(). Ooh...even more  
> annoying, it doesn't even allow unicode attributes that use  
> characters outside the default encoding (ASCII).

Which is quite natural: Python doesn't allow any non-ASCII
identifiers either :-)

> But either way, there's absolutely no reason to worry about the  
> attribute string being a valid identifier. That's pretty much only a  
> concern for tab-completion in python shells.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 18 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From martin at v.loewis.de  Thu Aug 18 11:39:37 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 18 Aug 2005 11:39:37 +0200
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <200508181409.17431.anthony@interlink.com.au>
References: <20050817140217.GQ3389@www.async.com.br>
	<200508181409.17431.anthony@interlink.com.au>
Message-ID: <43045759.80806@v.loewis.de>

Anthony Baxter wrote:
> Removing it entirely is gratuitous breakage, for a not very high payoff. If
> you _really_ want to call a local variable 'id' you can (but shouldn't). 
> You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but I
> don't see any movement to allow these...

This is getting off-topic, but... In C#, you can: you write @class,
@void, @return. Apparently, this is so that you can access arbitrary
COM objects (which may happen to use C# keywords as method names). Of
course, we would put an underscore after the name in that case.

Regards,
Martin

From ianb at colorstudy.com  Thu Aug 18 17:54:49 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 18 Aug 2005 10:54:49 -0500
Subject: [Python-Dev] PEP 309: Partial method application
Message-ID: <4304AF49.6030007@colorstudy.com>

I missed the discussion on this 
(http://www.python.org/peps/pep-0309.html), but then 2.5 isn't out yet.

I think partial() misses an important use case of method getting, for 
instance:

     lst = ['A', 'b', 'C']
     lst.sort(key=partialmethod('lower'))

Which sorts by lower-case.  Of course you can use str.lower, except 
you'll have unnecessarily enforced a type (and excluded Unicode).  So 
you are left with lambda x: x.lower().

Here's an implementation:

     def partialmethod(method, *args, **kw):
         def call(obj, *more_args, **more_kw):
             call_kw = kw.copy()
             call_kw.update(more_kw)
             return getattr(obj, method)(*(arg+more_args), **call_kw)
         return call

This is obviously related to partial().  Maybe this implementation 
should be a classmethod or function attribute, partial.method().
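
The helper as posted uses the name arg where args is meant. A corrected
sketch follows, assuming Python 3. The name partialmethod follows this
post and is unrelated to the functools.partialmethod descriptor of later
Python versions; operator.methodcaller, also a later addition, is shown
only for comparison:

```python
import operator

def partialmethod(method, *args, **kw):
    """Return a callable that invokes the named method on its first argument."""
    def call(obj, *more_args, **more_kw):
        call_kw = kw.copy()
        call_kw.update(more_kw)
        # 'args' (not 'arg'): the positional arguments frozen at creation
        # time come first, followed by those supplied at call time.
        return getattr(obj, method)(*(args + more_args), **call_kw)
    return call

words = ['b', 'C', 'A']
by_helper = sorted(words, key=partialmethod('lower'))
by_stdlib = sorted(words, key=operator.methodcaller('lower'))

print(by_helper, by_stdlib)
```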

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From raymond.hettinger at verizon.net  Thu Aug 18 18:00:06 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 18 Aug 2005 12:00:06 -0400
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <4304AF49.6030007@colorstudy.com>
Message-ID: <003401c5a40d$ea5694c0$8b24c797@oemcomputer>

[Ian Bicking]
> I think partial() misses an important use case of method getting, for
> instance:
> 
>      lst = ['A', 'b', 'C']
>      lst.sort(key=partialmethod('lower'))

We've already got one:

       lst.sort(key=operator.attrgetter('lower'))


Raymond


From gvanrossum at gmail.com  Thu Aug 18 18:22:01 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 18 Aug 2005 09:22:01 -0700
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <200508181409.17431.anthony@interlink.com.au>
References: <20050817140217.GQ3389@www.async.com.br>
	<200508181409.17431.anthony@interlink.com.au>
Message-ID: <ca471dc20508180922583f17d0@mail.gmail.com>

On 8/17/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> If you _really_ want to call a local variable 'id' you can (but shouldn't).

Disagreed. The built-in namespace is searched last for a reason -- the
design is such that if you don't care for a particular built-in you
don't need to know about it.
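
A short sketch of that lookup order, assuming Python 3 (where the
__builtin__ module is spelled builtins); the function describe is made
up for the example:

```python
import builtins

def describe(record):
    # The local name shadows the builtin inside this function only.
    id = record["id"]
    return "record %s" % id

label = describe({"id": 42})

# Outside the function, the name still resolves to the builtin.
still_builtin = id is builtins.id

print(label, still_builtin)
```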

> You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but I
> don't see any movement to allow these...

Please don't propagate the confusion between reserved keywords and
built-in names!
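
The distinction can be checked mechanically. A sketch assuming Python 3,
using the standard keyword and builtins modules:

```python
import builtins
import keyword

names = ["class", "def", "len", "id"]

# Reserved keywords can never be used as variable names.
reserved = [n for n in names if keyword.iskeyword(n)]

# Built-in names are ordinary names that happen to be found last.
builtin_names = [n for n in names if hasattr(builtins, n)]

print(reserved, builtin_names)
```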

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com  Thu Aug 18 18:34:00 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 18 Aug 2005 10:34:00 -0600
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <003401c5a40d$ea5694c0$8b24c797@oemcomputer>
References: <4304AF49.6030007@colorstudy.com>
	<003401c5a40d$ea5694c0$8b24c797@oemcomputer>
Message-ID: <d11dcfba050818093460bfb041@mail.gmail.com>

Raymond Hettinger wrote:
> [Ian Bicking]
> > I think partial() misses an important use case of method getting, for
> > instance:
> >
> >      lst = ['A', 'b', 'C']
> >      lst.sort(key=partialmethod('lower'))
>
> We've already got one:
>
>        lst.sort(key=operator.attrgetter('lower'))

Doesn't that just sort on the str.lower or unicode.lower method object?

py> sorted(['A', u'b', 'C'], key=operator.attrgetter('lower'))
[u'b', 'C', 'A']
py> sorted(['A', u'b', 'C'], key=partialmethod('lower')) # after fixing arg -> args bug
['A', u'b', 'C']

STeVe
--
You can wordify anything if you just verb it.
       --- Bucky Katt, Get Fuzzy

From raymond.hettinger at verizon.net  Thu Aug 18 18:43:07 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 18 Aug 2005 12:43:07 -0400
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <d11dcfba050818093460bfb041@mail.gmail.com>
Message-ID: <003901c5a413$e997c9e0$8b24c797@oemcomputer>

> > [Ian Bicking]
> > > I think partial() misses an important use case of method getting, for
> > > instance:
> > >
> > >      lst = ['A', 'b', 'C']
> > >      lst.sort(key=partialmethod('lower'))
> >
> > We've already got one:
> >
> >        lst.sort(key=operator.attrgetter('lower'))
> 
> Doesn't that just sort on the str.lower or unicode.lower method object?

My mistake.  It sorts on the bound method rather than the results of
applying that method.
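
A sketch of the difference, assuming Python 3. There the bound methods
cannot be ordered at all, so the attrgetter key raises TypeError instead
of producing the arbitrary order shown earlier in the thread:

```python
import operator

words = ['b', 'C', 'A']
get_lower = operator.attrgetter('lower')

key_obj = get_lower('A')        # the bound method 'A'.lower, not yet called
is_method = callable(key_obj)
called = key_obj()              # calling it gives the lower-cased string

# Sorting on the bound methods themselves fails in Python 3.
try:
    sorted(words, key=get_lower)
    sort_failed = False
except TypeError:
    sort_failed = True

# Calling the fetched method gives the intended key.
by_result = sorted(words, key=lambda s: get_lower(s)())

print(is_method, called, sort_failed, by_result)
```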


Raymond


From ianb at colorstudy.com  Thu Aug 18 20:05:54 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 18 Aug 2005 13:05:54 -0500
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <003901c5a413$e997c9e0$8b24c797@oemcomputer>
References: <003901c5a413$e997c9e0$8b24c797@oemcomputer>
Message-ID: <4304CE02.7080308@colorstudy.com>

Raymond Hettinger wrote:
>>>>instance:
>>>>
>>>>     lst = ['A', 'b', 'C']
>>>>     lst.sort(key=partialmethod('lower'))
>>>
>>>We've already got one:
>>>
>>>       lst.sort(key=operator.attrgetter('lower'))
>>
>>Doesn't that just sort on the str.lower or unicode.lower method
>> object?
> 
> My mistake.  It sorts on the bound method rather than the results of
> applying that method.

Then I thought it might be right to do 
partial(operator.attrgetter('lower')).  This, however, accomplishes 
exactly nothing.  I only decided this after actually trying it, though 
upon reflection partial(function) always accomplishes nothing.
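
A sketch of that observation, assuming Python 3, where partial lives in
functools rather than in the functional module proposed by PEP 309:

```python
from functools import partial
import operator

getter = operator.attrgetter('lower')
wrapped = partial(getter)       # no arguments are frozen

# The wrapper behaves like the function it wraps: it still returns
# the bound method, which has to be called separately.
direct = getter('A')()
through_partial = wrapped('A')()

print(direct, through_partial, wrapped.args)
```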

I don't have any conclusion from this, but only mention it to 
demonstrate that callables on top of callables are likely to confuse.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From martin at v.loewis.de  Thu Aug 18 21:40:19 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 18 Aug 2005 21:40:19 +0200
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <4304AF49.6030007@colorstudy.com>
References: <4304AF49.6030007@colorstudy.com>
Message-ID: <4304E423.9050005@v.loewis.de>

Ian Bicking wrote:
> 
>      lst = ['A', 'b', 'C']
>      lst.sort(key=partialmethod('lower'))
> 
> Which sorts by lower-case.  Of course you can use str.lower, except 
> you'll have unnecessarily enforced a type (and excluded Unicode).  So 
> you are left with lambda x: x.lower().

For this specific case, you can use string.lower (which is exactly
what the lambda function does).

As for the more general proposal: -1 on more places to pass strings to
denote method/function/class names. These are ugly to type.

What I think you want is not a partial method, instead, you want to
turn a method into a standard function, and in a 'virtual' way.

So I would propose the syntax

  lst.sort(key=virtual.lower) # where virtual is functional.virtual

As for extending PEP 309: This PEP deliberately abstained from other
ways of currying, and instead only introduced the functional module.
If you want to see "lazy functions" in the standard library, you should
write a new PEP (unless there is an easy agreement about a single right
way to do this, which I don't see).

Regards,
Martin

P.S. It's not even clear that this should be added to functional,
as attrgetter and itemgetter are already in operator. But, perhaps,
they should be in functional.


From shane at hathawaymix.org  Thu Aug 18 22:13:31 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Thu, 18 Aug 2005 14:13:31 -0600
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <4304E423.9050005@v.loewis.de>
References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de>
Message-ID: <4304EBEB.9000300@hathawaymix.org>

Martin v. Löwis wrote:
> So I would propose the syntax
> 
>   lst.sort(key=virtual.lower) # where virtual is functional.virtual

Ooh, may I say that idea is interesting!  It's easy to implement, too:

class virtual:
     def __getattr__(self, name):
         return lambda obj: getattr(obj, name)()
virtual = virtual()
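
A usage sketch of the same idea, assuming Python 3. The names
_MethodProxy and Name are invented for the example; the point is that
the lookup happens per object, so unrelated types can be mixed:

```python
class _MethodProxy:
    def __getattr__(self, name):
        # virtual.<name> becomes a function that calls obj.<name>().
        return lambda obj: getattr(obj, name)()

virtual = _MethodProxy()

class Name:
    """Unrelated to str, but also offers a lower() method."""
    def __init__(self, text):
        self.text = text
    def lower(self):
        return self.text.lower()

items = ['b', Name('C'), 'A']
keys = [virtual.lower(item) for item in items]
ordered = sorted(items, key=virtual.lower)

print(keys)
```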

Shane

From gvanrossum at gmail.com  Thu Aug 18 22:17:16 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 18 Aug 2005 13:17:16 -0700
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <4304E423.9050005@v.loewis.de>
References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de>
Message-ID: <ca471dc2050818131752d8d4c3@mail.gmail.com>

On 8/18/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> As for the more general proposal: -1 on more places to pass strings to
> denote method/function/class names. These are ugly to type.

Agreed.

> What I think you want is not a partial method, instead, you want to
> turn a method into a standard function, and in a 'virtual' way.
> 
> So I would propose the syntax
> 
>   lst.sort(key=virtual.lower) # where virtual is functional.virtual

I like this, but would hope for a different name -- the poor word
'virtual' has been abused enough by C++.

> P.S. It's not even clear that this should be added to functional,
> as attrgetter and itemgetter are already in operator. But, perhaps,
> they should be in functional.

They feel related to attrgetter more than to partial.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Thu Aug 18 22:46:00 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 18 Aug 2005 13:46:00 -0700
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <ca471dc2050818131752d8d4c3@mail.gmail.com>
References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de>
	<ca471dc2050818131752d8d4c3@mail.gmail.com>
Message-ID: <bbaeab1005081813461a09db77@mail.gmail.com>

On 8/18/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/18/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > As for the more general proposal: -1 on more places to pass strings to
> > denote method/function/class names. These are ugly to type.
> 
> Agreed.
> 
> > What I think you want is not a partial method, instead, you want to
> > turn a method into a standard function, and in a 'virtual' way.
> >
> > So I would propose the syntax
> >
> >   lst.sort(key=virtual.lower) # where virtual is functional.virtual
> 
> I like this, but would hope for a different name -- the poor word
> 'virtual' has been abused enough by C++.
> 

Yeah, me too.  Possible names are 'delayed', 'lazyattr', or just plain
'lazy' since it reminds me of Haskell.

> > P.S. It's not even clear that this should be added to functional,
> > as attrgetter and itemgetter are already in operator. But, perhaps,
> > they should be in functional.
> 
> They feel related to attrgetter more than to partial.
> 

True, but the idea of lazy evaluation, at least for me, reminds me
more of functional languages and thus the functional module.

Oh, when should we think of putting reduce into functional?  I
remember this was discussed when it was realized reduce was the only
functional built-in that is not covered by itertools or listcomps.

-Brett

From raymond.hettinger at verizon.net  Thu Aug 18 22:52:36 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 18 Aug 2005 16:52:36 -0400
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <ca471dc2050818131752d8d4c3@mail.gmail.com>
Message-ID: <000c01c5a436$e232daa0$8b24c797@oemcomputer>

[Guido]
> They feel related to attrgetter more than to partial.

That suggests operator.methodcall()


From ianb at colorstudy.com  Thu Aug 18 22:57:38 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 18 Aug 2005 15:57:38 -0500
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <bbaeab1005081813461a09db77@mail.gmail.com>
References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de>	
	<ca471dc2050818131752d8d4c3@mail.gmail.com>
	<bbaeab1005081813461a09db77@mail.gmail.com>
Message-ID: <4304F642.7040508@colorstudy.com>

Brett Cannon wrote:
>>>What I think you want is not a partial method, instead, you want to
>>>turn a method into a standard function, and in a 'virtual' way.
>>>
>>>So I would propose the syntax
>>>
>>>  lst.sort(key=virtual.lower) # where virtual is functional.virtual
>>
>>I like this, but would hope for a different name -- the poor word
>>'virtual' has been abused enough by C++.
>>
> 
> 
> Yeah, me too.  Possible names are 'delayed', 'lazyattr', or just plain
> 'lazy' since it reminds me of Haskell.

I don't think there's anything particularly lazy about it.  It's like a 
complement of attrgetter.  Where attrgetter is an inversion of getattr, 
partialmethod is an inversion of... well, of something that currently 
has no name.  There's kind of an implicit operation in obj.method() -- 
people will generally read that as a "method call", not as the retrieval 
of a bound method and later invocation of that method.  I think that is 
why it's so hard to figure out how to represent this in terms of 
something like attrgetter -- we try to invert something (a method call) 
that doesn't exist in the language.
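
The two implicit steps can be made visible with a small sketch
(Python 3 assumed; the variable names are illustrative only):

```python
from operator import attrgetter

s = "Hello"

bound = s.lower        # step 1: attribute lookup yields a bound method object
result = bound()       # step 2: the bound method is invoked later

# attrgetter only inverts step 1: it returns the bound method, not its result.
get_lower = attrgetter("lower")
also_bound = get_lower(s)
print(result, also_bound())
```

So attrgetter("lower") alone is not usable as a sort key; something
still has to perform the call.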


-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From steven.bethard at gmail.com  Thu Aug 18 23:20:09 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 18 Aug 2005 15:20:09 -0600
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <4304EBEB.9000300@hathawaymix.org>
References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de>
	<4304EBEB.9000300@hathawaymix.org>
Message-ID: <d11dcfba05081814207b74f7b6@mail.gmail.com>

Martin v. Löwis wrote:
> So I would propose the syntax
>
>   lst.sort(key=virtual.lower) # where virtual is functional.virtual
 
Shane Hathaway wrote:
> class virtual:
>      def __getattr__(self, name):
>          return lambda obj: getattr(obj, name)()
> virtual = virtual()

I think (perhaps because of the name) that this could be confusing.  I
don't have any intuition that "virtual.lower" would return a function
that calls the "lower" attribute instead of returning a function that
simply accesses that attribute.

If we're going to move away from the itemgetter() and attrgetter()
style, then we should be consistent about it and provide a solution
(or solutions) that answers all of these problems:
    obj.attr
    obj.attr(*args, **kwargs)
    obj[key]
I'm not sure that there is a clean/obvious way to do this.

STeVe
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From ncoghlan at gmail.com  Fri Aug 19 00:43:06 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 19 Aug 2005 08:43:06 +1000
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <bbaeab1005081813461a09db77@mail.gmail.com>
References: <4304AF49.6030007@colorstudy.com>
	<4304E423.9050005@v.loewis.de>	<ca471dc2050818131752d8d4c3@mail.gmail.com>
	<bbaeab1005081813461a09db77@mail.gmail.com>
Message-ID: <43050EFA.6070702@gmail.com>

Brett Cannon wrote:
>>>What I think you want is not a partial method, instead, you want to
>>>turn a method into a standard function, and in a 'virtual' way.
>>>
>>>So I would propose the syntax
>>>
>>>  lst.sort(key=virtual.lower) # where virtual is functional.virtual
>>
>>I like this, but would hope for a different name -- the poor word
>>'virtual' has been abused enough by C++.
> 
> Yeah, me too.  Possible names are 'delayed', 'lazyattr', or just plain
> 'lazy' since it reminds me of Haskell.

Hmm, "methodcall"?

As in:
   lst.sort(key=methodcall.lower)

Where "methodcall" is something like what Shane described:

   class methodcall:
       def __getattr__(self, name):
           def delayedcall(*args, **kwds):
               return getattr(args[0], name)(*args[1:], **kwds)
           return delayedcall
   methodcall = methodcall()
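
A usage sketch of this variant, assuming Python 3 (class renamed
MethodCall so the instance does not shadow it; the data is
hypothetical). Extra arguments are passed at call time, after the
object itself:

```python
class MethodCall:
    def __getattr__(self, name):
        def delayedcall(*args, **kwds):
            # args[0] is the object; the rest are forwarded to the method
            return getattr(args[0], name)(*args[1:], **kwds)
        return delayedcall


methodcall = MethodCall()

names = ["bob", "Alice", "carol"]          # hypothetical sample data
names.sort(key=methodcall.lower)           # one-argument use, as a sort key
padded = methodcall.rjust("7", 3, "0")     # same as "7".rjust(3, "0")
print(names, padded)
```

Because the arguments travel with the object, this form cannot
pre-bind arguments for use as a key= function.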

> 
> Oh, when should we think of putting reduce into functional?  I
> remember this was discussed when it was realized reduce was the only
> functional built-in that is not covered by itertools or listcomps.

I expected functional.map, functional.filter and functional.reduce to all 
exist in 2.5.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From bcannon at gmail.com  Fri Aug 19 01:05:17 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 18 Aug 2005 16:05:17 -0700
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <43050EFA.6070702@gmail.com>
References: <4304AF49.6030007@colorstudy.com> <4304E423.9050005@v.loewis.de>
	<ca471dc2050818131752d8d4c3@mail.gmail.com>
	<bbaeab1005081813461a09db77@mail.gmail.com>
	<43050EFA.6070702@gmail.com>
Message-ID: <bbaeab10050818160527bd001c@mail.gmail.com>

On 8/18/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote:

> > Oh, when should we think of putting reduce into functional?  I
> > remember this was discussed when it was realized reduce was the only
> > functional built-in that is not covered by itertools or listcomps.
> 
> I expected functional.map, functional.filter and functional.reduce to all
> exist in 2.5.
> 

Itertools covers map, filter is covered by genexps.  'reduce' is the
only one that does not have an equivalent anywhere.  I guess we could
cross-link itertools.map into functional.map, but I would rather just
mention in the docs of one that it is located in the other module. 
And filter is just not worth it; that can definitely be covered in the
docs of the module.
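
To illustrate the asymmetry (a sketch in Python 3, where reduce is
available as functools.reduce; that later placement is not something
decided in this message):

```python
from functools import reduce
import operator

numbers = [1, 2, 3, 4, 5]

squares = list(n * n for n in numbers)            # genexp in place of map()
evens = list(n for n in numbers if n % 2 == 0)    # genexp in place of filter()
total = reduce(operator.add, numbers, 0)          # a fold has no genexp spelling

print(squares, evens, total)
```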

-Brett

From jcarlson at uci.edu  Fri Aug 19 02:09:09 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 18 Aug 2005 17:09:09 -0700
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <d11dcfba05081814207b74f7b6@mail.gmail.com>
References: <4304EBEB.9000300@hathawaymix.org>
	<d11dcfba05081814207b74f7b6@mail.gmail.com>
Message-ID: <20050818162112.788A.JCARLSON@uci.edu>


Steven Bethard <steven.bethard at gmail.com> wrote:
> 
> Martin v. Löwis wrote:
> > So I would propose the syntax
> >
> >   lst.sort(key=virtual.lower) # where virtual is functional.virtual
>  
> Shane Hathaway wrote:
> > class virtual:
> >      def __getattr__(self, name):
> >          return lambda obj: getattr(obj, name)()
> > virtual = virtual()
> 
> I think (perhaps because of the name) that this could be confusing.  I
> don't have any intuition that "virtual.lower" would return a function
> that calls the "lower" attribute instead of returning a function that
> simply accesses that attribute.
> 
> If we're going to move away from the itemgetter() and attrgetter()
> style, then we should be consistent about it and provide a solution
> (or solutions) that answers all of these problems:
>     obj.attr
>     obj.attr(*args, **kwargs)
>     obj[key]
> I'm not sure that there is a clean/obvious way to do this.

I thought that:
  operator.attrgetter() was for obj.attr
  operator.itemgetter() was for obj[integer_index]

That's almost all the way there.  All that remains is to have something
that gets any key (not just integers) and which handles function calls.
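
For what it's worth, a quick check in current Python (a sketch with
hypothetical data) suggests itemgetter is not limited to integer
indices; it passes whatever key it was given to obj[key]:

```python
from operator import attrgetter, itemgetter
from types import SimpleNamespace

record = {"name": "Ada", "age": 36}            # hypothetical mapping
person = SimpleNamespace(name="Ada", age=36)   # hypothetical object

get_name_item = itemgetter("name")   # record["name"]
get_second = itemgetter(1)           # seq[1]
get_name_attr = attrgetter("name")   # person.name

print(get_name_item(record), get_second("abc"), get_name_attr(person))
```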

In terms of the function call semantics, what about:

   class methodcall:
       def __getattr__(self, name, *args, **kwds):
           def delayedcall(obj):
               return getattr(obj, name)(*args, **kwds)
           return delayedcall
   methodcall = methodcall()
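
One caveat with the sketch above: Python only ever calls __getattr__
with the attribute name, so *args and **kwds would always be empty.
A variant that captures the arguments through an extra call might look
like this (Python 3 assumed; the names capture and MethodCall are
illustrative, not part of the proposal):

```python
class MethodCall:
    def __getattr__(self, name):
        def capture(*args, **kwds):
            # Remember the arguments now, apply them to an object later.
            def delayedcall(obj):
                return getattr(obj, name)(*args, **kwds)
            return delayedcall
        return capture


methodcall = MethodCall()

split_on_comma = methodcall.split(",")
fields = split_on_comma("a,b,c")
words = sorted(["pear", "Apple"], key=methodcall.lower())
print(fields, words)
```

The cost is that the no-argument case needs explicit parentheses,
as in methodcall.lower().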

 - Josiah


From steven.bethard at gmail.com  Fri Aug 19 07:33:36 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 18 Aug 2005 23:33:36 -0600
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <20050818162112.788A.JCARLSON@uci.edu>
References: <4304EBEB.9000300@hathawaymix.org>
	<d11dcfba05081814207b74f7b6@mail.gmail.com>
	<20050818162112.788A.JCARLSON@uci.edu>
Message-ID: <d11dcfba05081822331c69d6ec@mail.gmail.com>

Josiah Carlson wrote:
> Steven Bethard <steven.bethard at gmail.com> wrote:
> > If we're going to move away from the itemgetter() and attrgetter()
> > style, then we should be consistent about it and provide a solution
> > (or solutions) that answers all of these problems:
> >     obj.attr
> >     obj.attr(*args, **kwargs)
> >     obj[key]
> > I'm not sure that there is a clean/obvious way to do this.
> 
> I thought that:
>   operator.attrgetter() was for obj.attr
>   operator.itemgetter() was for obj[integer_index]

My point exactly.  If we're sticking to the same style, I would expect that for
    obj.method(*args, **kwargs)
we would have something like:
    operator.methodcaller('method', *args, **kwargs)

The proposal by Martin v. Löwis is that this should instead look something like:
    methodcall.method(*args, **kwargs)
which is a departure from the current attrgetter() and itemgetter()
idiom.  I'm not objecting to this approach, by the way.  I think with
the right name, it would probably read well.  I just think that we
should try to be consistent one way or the other.  If we go with
Martin v. Löwis's suggestion, I would then expect that the correlates
to attrgetter() and itemgetter() would also be included, e.g.:
    attrget.attr   (for obj.attr)
    itemget[key]   (for obj[key])
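
For comparison, Python 2.6 and later do ship an operator.methodcaller
with the string-based calling convention; a short sketch in Python 3
(the sample strings are hypothetical):

```python
from operator import methodcaller

lower = methodcaller("lower")                        # obj.lower()
split_once = methodcaller("split", ",", maxsplit=1)  # obj.split(",", maxsplit=1)

ordered = sorted(["b", "A", "c"], key=lower)
parts = split_once("a,b,c")
print(ordered, parts)
```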


STeVe
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From martin at v.loewis.de  Fri Aug 19 07:59:38 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 19 Aug 2005 07:59:38 +0200
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <d11dcfba05081822331c69d6ec@mail.gmail.com>
References: <4304EBEB.9000300@hathawaymix.org>	<d11dcfba05081814207b74f7b6@mail.gmail.com>	<20050818162112.788A.JCARLSON@uci.edu>
	<d11dcfba05081822331c69d6ec@mail.gmail.com>
Message-ID: <4305754A.4040900@v.loewis.de>

Steven Bethard wrote:
>>I thought that:
>>  operator.attrgetter() was for obj.attr
>>  operator.itemgetter() was for obj[integer_index]
> 
> 
> My point exactly.  If we're sticking to the same style, I would expect that for
>     obj.method(*args, **kwargs)
> we would have something like:
>     operator.methodcaller('method', *args, **kwargs)

You might be missing one aspect of attrgetter, though. I can have

  f = operator.attrgetter('name', 'age')

and then f(person) gives me (person.name, person.age). Likewise for
itemgetter(1,2,3). Extending this to methodcaller is not natural;
you would have

  x=methodcaller(('open',['foo','r'],{}),('read',[100],{}),
                 ('close',[],{}))

and then

  x(somestorage)

(I know this is not the typical open/read/close pattern, where you
 would normally call read on what open returns)

It might be that there is no use case for a multi-call methodgetter;
I just point out that a single-call methodgetter would *not* be
in the same style as attrgetter and itemgetter.
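
The multi-argument behaviour referred to above can be sketched as
follows (Python 3; the person object is hypothetical):

```python
from operator import attrgetter, itemgetter
from types import SimpleNamespace

person = SimpleNamespace(name="Ada", age=36)   # hypothetical object

f = attrgetter("name", "age")   # several names -> a tuple of attributes
g = itemgetter(1, 2, 3)         # several keys  -> a tuple of items

print(f(person), g("abcde"))
```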

>     attrget.attr   (for obj.attr)
>     itemget[key]   (for obj[key])

I agree that would be consistent. These also wouldn't allow getting
multiple attributes or items at once. I don't know what the common use for
attrgetter is: one or more attributes?

Regards,
Martin

From steven.bethard at gmail.com  Fri Aug 19 09:14:17 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 19 Aug 2005 01:14:17 -0600
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <4305754A.4040900@v.loewis.de>
References: <4304EBEB.9000300@hathawaymix.org>
	<d11dcfba05081814207b74f7b6@mail.gmail.com>
	<20050818162112.788A.JCARLSON@uci.edu>
	<d11dcfba05081822331c69d6ec@mail.gmail.com>
	<4305754A.4040900@v.loewis.de>
Message-ID: <d11dcfba050819001453e064de@mail.gmail.com>

Martin v. Löwis wrote:
> Steven Bethard wrote:
> >>I thought that:
> >>  operator.attrgetter() was for obj.attr
> >>  operator.itemgetter() was for obj[integer_index]
> >
> >
> > My point exactly.  If we're sticking to the same style, I would expect that for
> >     obj.method(*args, **kwargs)
> > we would have something like:
> >     operator.methodcaller('method', *args, **kwargs)
> 
> You might be missing one aspect of attrgetter, though. I can have
> 
>   f = operator.attrgetter('name', 'age')
> 
> and then f(person) gives me (person.name, person.age). Likewise for
> itemgetter(1,2,3).
[snip]
> I don't know what the common use for
> attrgetter is: one or more attributes?

Well, in current Python code, I'd be willing to wager that it's one,
no more, since Python 2.4 only supports a single argument to
itemgetter and attrgetter.  Of course, when Python 2.5 comes out, it's
certainly possible that the multi-argument forms will become
commonplace.

I agree that an operator.methodcaller() shouldn't try to support
multiple methods.  OTOH, the syntax
    methodcall.method(*args, **kwargs)
doesn't really lend itself to multiple methods either.

STeVe
-- 
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From jcarlson at uci.edu  Fri Aug 19 09:37:43 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 19 Aug 2005 00:37:43 -0700
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <d11dcfba050819001453e064de@mail.gmail.com>
References: <4305754A.4040900@v.loewis.de>
	<d11dcfba050819001453e064de@mail.gmail.com>
Message-ID: <20050819003609.789B.JCARLSON@uci.edu>


Steven Bethard <steven.bethard at gmail.com> wrote:
> I agree that an operator.methodcaller() shouldn't try to support
> multiple methods.  OTOH, the syntax
>     methodcall.method(*args, **kwargs)
> doesn't really lend itself to multiple methods either.

But that's OK, we don't want to be calling multiple methods anyways, do
we?  I'd personally like to see an example where it makes sense, if someone
says that we do.

 - Josiah


From raymond.hettinger at verizon.net  Fri Aug 19 18:39:18 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 19 Aug 2005 12:39:18 -0400
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <20050819003609.789B.JCARLSON@uci.edu>
Message-ID: <002501c5a4dc$8b521560$3f25a044@oemcomputer>

[Steven Bethard]
> > I agree that an operator.methodcaller() shouldn't try to support
> > multiple methods.  OTOH, the syntax
> >     methodcall.method(*args, **kwargs)
> > doesn't really lend itself to multiple methods either.

[Josiah Carlson]
> But that's OK, we don't want to be calling multiple methods anyways, do
> we?  I'd personally like to see an example where it makes sense, if someone
> says that we do.

If an obvious syntax doesn't emerge, don't fret.  The most obvious
approach is to define a regular Python function and supply that function
to the key= argument for list.sort() or sorted().

A virtue of the key= argument was reducing O(n log n) calls to just
O(n).  Further speed-ups are a false economy.  So there's no need to
twist syntax into knots just to get a C based method calling function.

Likewise with map(), if a new function doesn't fit neatly, take that as
a cue to be writing a plain for-loop.
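
A sketch of the plain-function approach (Python 3; the data and the
call log are hypothetical, added only to show that the key function
runs once per element):

```python
calls = []   # records each key-function call, for illustration only


def lowercase_key(s):
    calls.append(s)
    return s.lower()


words = ["pear", "Apple", "banana"]
ordered = sorted(words, key=lowercase_key)   # key is computed once per item

# The for-loop alternative to map() with an awkward callable:
stripped = []
for w in ["  a ", "b  "]:
    stripped.append(w.strip())

print(ordered, len(calls), stripped)
```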


Raymond


From martin at v.loewis.de  Fri Aug 19 22:08:23 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 19 Aug 2005 22:08:23 +0200
Subject: [Python-Dev] PEP 309: Partial method application
In-Reply-To: <20050819003609.789B.JCARLSON@uci.edu>
References: <4305754A.4040900@v.loewis.de>	<d11dcfba050819001453e064de@mail.gmail.com>
	<20050819003609.789B.JCARLSON@uci.edu>
Message-ID: <43063C37.8040808@v.loewis.de>

Josiah Carlson wrote:
> Steven Bethard <steven.bethard at gmail.com> wrote:
> 
>>I agree that an operator.methodcaller() shouldn't try to support
>>multiple methods.  OTOH, the syntax
>>    methodcall.method(*args, **kwargs)
>>doesn't really lend itself to multiple methods either.
> 
> 
> But that's OK, we don't want to be calling multiple methods anyways, do
> we?  I'd personally like to see an example where it makes sense, if someone
> says that we do.

Several people argued that the version with a string method name
should be added "for consistency". I only pointed out that doing
so would not be completely consistent.

Regards,
Martin

From jeremy at alum.mit.edu  Fri Aug 19 23:15:15 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 19 Aug 2005 17:15:15 -0400
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <ca471dc20508180922583f17d0@mail.gmail.com>
References: <20050817140217.GQ3389@www.async.com.br>
	<200508181409.17431.anthony@interlink.com.au>
	<ca471dc20508180922583f17d0@mail.gmail.com>
Message-ID: <e8bf7a5305081914155e7ecb84@mail.gmail.com>

On 8/18/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/17/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> > If you _really_ want to call a local variable 'id' you can (but shouldn't).
> 
> Disagreed. The built-in namespace is searched last for a reason -- the
> design is such that if you don't care for a particular built-in you
> don't need to know about it.

In practice, it causes much confusion if you ever use a local variable
that has the same name as one in the built-in namespace.  If you intend to
use id as a variable, it leads to confusing messages when a typo or
editing error accidentally removes the definition, because the name
will still be defined for you.  It also leads to confusion when you
later want to use the builtin in the same module or function (or in
the debugger).  If Python defines the name, I don't want to provide a
redefinition.

Jeremy

From gvanrossum at gmail.com  Sat Aug 20 06:00:17 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 19 Aug 2005 21:00:17 -0700
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <e8bf7a5305081914155e7ecb84@mail.gmail.com>
References: <20050817140217.GQ3389@www.async.com.br>
	<200508181409.17431.anthony@interlink.com.au>
	<ca471dc20508180922583f17d0@mail.gmail.com>
	<e8bf7a5305081914155e7ecb84@mail.gmail.com>
Message-ID: <ca471dc205081921007f59272d@mail.gmail.com>

On 8/19/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> On 8/18/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> > On 8/17/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> > > If you _really_ want to call a local variable 'id' you can (but shouldn't).
> >
> > Disagreed. The built-in namespace is searched last for a reason -- the
> > design is such that if you don't care for a particular built-in you
> > don't need to know about it.
> 
> In practice, it causes much confusion if you ever use a local variable
> that has the same name as one in the built-in namespace.  If you intend to
> use id as a variable, it leads to confusing messages when a typo or
> editing error accidentally removes the definition, because the name
> will still be defined for you.  It also leads to confusion when you
> later want to use the builtin in the same module or function (or in
> the debugger).  If Python defines the name, I don't want to provide a
> redefinition.

This has startled me a few times, but never for more than 30 seconds.

In correct code there sure isn't any confusion.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From anthony at interlink.com.au  Sat Aug 20 10:48:11 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sat, 20 Aug 2005 18:48:11 +1000
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <ca471dc20508180922583f17d0@mail.gmail.com>
References: <20050817140217.GQ3389@www.async.com.br>
	<200508181409.17431.anthony@interlink.com.au>
	<ca471dc20508180922583f17d0@mail.gmail.com>
Message-ID: <200508201848.13750.anthony@interlink.com.au>

On Friday 19 August 2005 02:22, Guido van Rossum wrote:
> On 8/17/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> > If you _really_ want to call a local variable 'id' you can (but
> > shouldn't).
>
> Disagreed. The built-in namespace is searched last for a reason -- the
> design is such that if you don't care for a particular built-in you
> don't need to know about it.

I'm not sure what you're disagreeing with. Are you saying you _can't_ call
a variable 'id', or that it's OK to do this?

> > You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but
> > I don't see any movement to allow these...
>
> Please don't propagate the confusion between reserved keywords and
> built-in names!

It's not a matter of 'confusion', more that there are some names you can't
or shouldn't use in Python. When coding twisted, often the most obvious 
'short' name for a Deferred is 'def', but of course that doesn't work. 

Anthony

From gvanrossum at gmail.com  Sat Aug 20 18:02:25 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 20 Aug 2005 09:02:25 -0700
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <200508201848.13750.anthony@interlink.com.au>
References: <20050817140217.GQ3389@www.async.com.br>
	<200508181409.17431.anthony@interlink.com.au>
	<ca471dc20508180922583f17d0@mail.gmail.com>
	<200508201848.13750.anthony@interlink.com.au>
Message-ID: <ca471dc205082009027ea18d9a@mail.gmail.com>

On 8/20/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> On Friday 19 August 2005 02:22, Guido van Rossum wrote:
> > On 8/17/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> > > If you _really_ want to call a local variable 'id' you can (but
> > > shouldn't).
> >
> > Disagreed. The built-in namespace is searched last for a reason -- the
> > design is such that if you don't care for a particular built-in you
> > don't need to know about it.
> 
> I'm not sure what you're disagreeing with. Are you saying you _can't_ call
> a variable 'id', or that it's OK to do this?

That it's OK.

> > > You also can't/shouldn't call a variable 'class', 'def', or 'len' -- but
> > > I don't see any movement to allow these...
> >
> > Please don't propagate the confusion between reserved keywords and
> > built-in names!
> 
> It's not a matter of 'confusion', more that there are some names you can't
> or shouldn't use in Python. When coding twisted, often the most obvious
> 'short' name for a Deferred is 'def', but of course that doesn't work.

My point is that there are two reasons for not using such a name. With
'def', you *can't*. With 'len', you *could* (but it would be unwise).
With 'id', IMO it's okay.
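
The distinction can be checked mechanically; a small sketch in
Python 3 (the describe function and its record are hypothetical):

```python
import builtins
import keyword


def describe(record):
    id = record["id"]   # shadows the built-in only inside this function
    return "record %d" % id


# 'def' is a reserved keyword: assigning to it does not even compile.
try:
    compile("def = 1", "<example>", "exec")
    def_is_assignable = True
except SyntaxError:
    def_is_assignable = False

# 'len' and 'id' are ordinary names found last, in the built-in namespace.
print(keyword.iskeyword("def"), keyword.iskeyword("len"), keyword.iskeyword("id"))
print(describe({"id": 7}), def_is_assignable, id is builtins.id)
```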

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Sat Aug 20 18:14:51 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 20 Aug 2005 09:14:51 -0700
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <ca471dc2050816165852cb3934@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com>
	<ca471dc2050816165852cb3934@mail.gmail.com>
Message-ID: <ca471dc2050820091458b1cb23@mail.gmail.com>

I'm ready to accept the general idea of moving to subversion and away
from SourceForge.

On the hosting issue, I'm still neutral -- I expect we'll be able to
support the current developer crowd easily on svn.python.org, but if
we ever find there are resource problems (either people or bandwidth
etc.) I just received a recommendation for wush.net which specializes
in svn hosting. $90/month for 5 Gb of disk space sounds like a good
deal and easily within the PSF budget.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bob at redivi.com  Sat Aug 20 18:45:05 2005
From: bob at redivi.com (Bob Ippolito)
Date: Sat, 20 Aug 2005 06:45:05 -1000
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <ca471dc2050820091458b1cb23@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com>
	<ca471dc2050816165852cb3934@mail.gmail.com>
	<ca471dc2050820091458b1cb23@mail.gmail.com>
Message-ID: <D966EDCF-758F-4E96-9C72-ECAA08189902@redivi.com>


On Aug 20, 2005, at 6:14 AM, Guido van Rossum wrote:

> I'm ready to accept the general idea of moving to subversion and away
> from SourceForge.
>
> On the hosting issue, I'm still neutral -- I expect we'll be able to
> support the current developer crowd easily on svn.python.org, but if
> we ever find there are resource problems (either people or bandwidth
> etc.) I just received a recommendation for wush.net which specializes
> in svn hosting. $90/month for 5 Gb of disk space sounds like a good
> deal and easily within the PSF budget.

We were using wush.net's subversion and trac service for a  
(commercial) project from February until a little over a week ago.   
Their servers dropped off the internet for about three days straight  
earlier this month and we were unable to contact anyone.  I still  
don't think we've received an explanation as to what happened.  When  
it did come up, our data was OK.  Previous to that experience, it  
worked out OK.  The subversion repository got wedged once, but that  
was fixed in a matter of hours after filing a ticket.

We host our own subversion and trac now.  We just can't afford that  
kind of downtime again.  Setting up subversion and trac isn't a very  
big deal, and they don't really require any real maintenance as far  
as I can tell (.. and I have been dealing with subversion over apache  
via mod_dav_svn since pre-1.0 days).

Another thing to note is that the trac installation at wush.net is a  
branch off the latest stable version, and the database can't be  
downgraded or upgraded correctly by the trac-admin tool.  However,  
the SQL to downgrade the schema to the latest stable is trivial and I  
still have it lying around if anyone is interested in moving their  
trac repositories off of wush ;)

-bob


From barry at python.org  Sat Aug 20 20:37:02 2005
From: barry at python.org (Barry Warsaw)
Date: Sat, 20 Aug 2005 14:37:02 -0400
Subject: [Python-Dev] A testing challenge
In-Reply-To: <4303C0DD.5090903@spikesource.com>
References: <4303C0DD.5090903@spikesource.com>
Message-ID: <1124563022.24297.35.camel@presto.wooz.org>

On Wed, 2005-08-17 at 18:57, Calvin Austin wrote:
> When was the last time someone thanked you for writing a test?  I tried 
> to think of the last time it happened to me and I can't remember. Well 
> at Spikesource we want to thank you not just for helping the Python 
> community but for your testing efforts too and we are running a 
> participatory testing contest. This is a competition where there are no 
> losers, every project gains if new tests are written.  For more details 
> see below, it is open worldwide. feel free to send questions to me.

Since you posted to python-dev, you might think about adding Python to
the list of languages "in which [...] the project [is] written" on the
registration form.  Currently, the only choices are C/C++, Java, and
php.

-Barry


From paolo_veronelli at libero.it  Sun Aug 21 11:35:37 2005
From: paolo_veronelli at libero.it (Paolino)
Date: Sun, 21 Aug 2005 11:35:37 +0200
Subject: [Python-Dev] On decorators implementation
Message-ID: <43084AE9.20900@libero.it>

I noticed (by using them) that decorations are applied to methods
before they become methods.

This choice flattens the implementation so that it does not differentiate
methods from functions.


1)
I have to apply heuristics on the wrapped function type when I use the
function as an index key.

         if type(observed) is types.MethodType:
           observed=observed.im_func

things like this are inside my decorators.

2)
The behavior of decorations is not definable.
I imagine that a method implementation of them inside the type metaclass
could be better specified by people.
This probably ends up in metamethods or something I can't grasp


Thanks

Paolino


From martin at v.loewis.de  Sun Aug 21 13:18:58 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 21 Aug 2005 13:18:58 +0200
Subject: [Python-Dev] On decorators implementation
In-Reply-To: <43084AE9.20900@libero.it>
References: <43084AE9.20900@libero.it>
Message-ID: <43086322.2030706@v.loewis.de>

Paolino wrote:
> I imagine that a method implementation of them inside the type metaclass
> could be better specified by people.

What you ask for is unimplementable. Method objects are created only
when the method is accessed, not (even) when the class is created.
Watch this:

>>> class X:
...   def foo(self):
...     pass
...
>>> x=X()
>>> type(x.foo)
<type 'instancemethod'>
>>> type(X.__dict__['foo'])
<type 'function'>

So even though the class has long been defined, inside X's dictionary,
foo is still a function. Only when you *access* x.foo is a method object
created on the fly:

>>> x.foo is x.foo
False

Therefore, a decorator function cannot possibly get access to the method
object - it simply doesn't exist.
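
The same point can be seen from inside a decorator. A minimal sketch,
assuming a current Python 3 interpreter (where the method type is
spelled types.MethodType rather than instancemethod); the decorator
name record is made up for this example:

```python
import types

seen = []


def record(func):
    # Runs while the class body is still executing, so all it can
    # ever receive is the plain function.
    seen.append(type(func))
    return func


class X:
    @record
    def foo(self):
        pass


x = X()
first, second = x.foo, x.foo  # two accesses, two method objects
```

Here seen holds only the function type, while first and second are
distinct method objects wrapping the same function.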

Regards,
Martin

From martin at v.loewis.de  Sun Aug 21 15:12:00 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 21 Aug 2005 15:12:00 +0200
Subject: [Python-Dev] Admin access using svn+ssh
Message-ID: <43087DA0.702@v.loewis.de>

It turns out that svn+ssh with a single account has limitations:
you can only set the tunnel user when you are using a restricted
key. In PEP 347, the plan is that the current SF project admins
get shell access to the pythondev account, which has just been
created.

To resolve this, project admins need two different SSH keys:
one for accessing the shell, and one for regular commit activities.

I would suggest that the default key is used for regular commits,
and a separate key is created for shell access. I described this
a bit in the PEP, essentially, in .ssh/config, I have

Host pythondev
  Hostname dinsdale.python.org
  User pythondev
  IdentityFile ~/.ssh/pythondev

So when I do "ssh pythondev", I get the shell account; when I do
"svn co svn+ssh://pythondev at svn.python.org/python/trunk/Modules",
I use my default identity, which gets tunneled as "Martin v. Loewis".

Regards,
Martin

From martin at v.loewis.de  Sun Aug 21 15:34:57 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 21 Aug 2005 15:34:57 +0200
Subject: [Python-Dev] wush.net details
Message-ID: <43088301.5010104@v.loewis.de>

I made a service request at wush.net, asking for more details
about their service. There was a first response within 6 hours,
asking for more time to prepare an answer. I said I don't need
one urgently, and, with apologies, got a response one week
later.

I added the essence to the PEP; namely:
- The machine would be a Virtuozzo Virtual Private Server (VPS),
  hosted at PowerVPS.

- The default repository URL would be
  http://python.wush.net/svn/projectname/,
  but anything else could be arranged

- we would get SSH login to the machine, with sudo capabilities.

- They have a Web interface for management of the various SVN
  repositories that we want to host, and to manage user accounts.
  While svn+ssh would be supported, the user interface does not
  yet support it (although he said they might have something
  in September)

- For offsite mirroring/backup, they suggest using rsync
  instead of downloading repository tarballs.

So it seems that the "regular" administrative overhead would
be roughly the same on wush.net and python.org: we would
have to maintain account information ourselves; the initial
setup might be easier due to the UI wizard help.

I understand that the hope when using a commercial service
is that its availability is higher, due to us paying somebody
for the availability. Of course, Bob Ippolito's report is
discouraging here.

Regards,
Martin

From martin at v.loewis.de  Sun Aug 21 15:43:59 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 21 Aug 2005 15:43:59 +0200
Subject: [Python-Dev] PEP 347: Migration to Subversion
In-Reply-To: <ca471dc2050820091458b1cb23@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F05CC2E@au3010avexu1.global.avaya.com>	<ca471dc2050816165852cb3934@mail.gmail.com>
	<ca471dc2050820091458b1cb23@mail.gmail.com>
Message-ID: <4308851F.9090800@v.loewis.de>

Guido van Rossum wrote:
> On the hosting issue, I'm still neutral -- I expect we'll be able to
> support the current developer crowd easily on svn.python.org, but if
> we ever find there are resource problems (either people or bandwidth
> etc.) I just received a recommendation for wush.net which specializes
> in svn hosting. $90/month for 5 Gb of disk space sounds like a good
> deal and easily within the PSF budget.

I also have wush.net in the PEP, see my separate message. I'm not sure
what it really is that we get over what we get from XS4ALL for free.
In terms of day-to-day maintenance, they seem comparable: they do backup
for us, and we have to maintain accounts ourselves. Of course,
wush.net has a Web GUI for maintenance activities (create repositories,
create accounts, manage access control).

I left out bandwidth details so far: we get 200GB/mo; after this, it
is $50/200GB. Another issue might be server load. I don't know how
many VPS they host on a single machine, or what their hardware is,
but in either case, pythondev developer svn would be shared with
something else (other VPSs for wush.net, regular pydotorg activities
on python.org). Only day-to-day experience will tell whether this
is acceptable.

The critical issue seems to be availability: if the service goes
down, when will it come back? Bob's experience is discouraging,
but then, there also was a python.org outage from time to time
(e.g. when MoinMoin consumed all CPU).

As for the money itself: $90/month certainly is not an issue at
all.

So far, I haven't received any other specific referrals for
SVN hosters.

Regards,
Martin

From martin at v.loewis.de  Sun Aug 21 15:53:37 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 21 Aug 2005 15:53:37 +0200
Subject: [Python-Dev] Collecting SSH keys
Message-ID: <43088761.7010905@v.loewis.de>

I have set up a test installation on svn.python.org, so that
developers can see how this would work.

So if you are currently a sf.net/projects/python developer,
please send me your SSH key before August 27 or after
September 12. We will use real names for commit messages,
so if you have specific preferences about the spelling
of your name, please indicate them.

The repository will be discarded after the testing, so
feel free to make any changes you want.

It's not decided yet whether the repository will eventually
run on python.org, but it seems clear to me that we likely
will use svn+ssh for developer access, unless testing
reveals disadvantages of doing so.

Please also look at the result of the conversion; if you
find any issues, please report them.

There is currently no anonymous WebDAV access to the
repository.

Regards,
Martin

From sjoerd at acm.org  Sun Aug 21 17:28:07 2005
From: sjoerd at acm.org (Sjoerd Mullender)
Date: Sun, 21 Aug 2005 17:28:07 +0200
Subject: [Python-Dev] Collecting SSH keys
In-Reply-To: <43088761.7010905@v.loewis.de>
References: <43088761.7010905@v.loewis.de>
Message-ID: <43089D87.2060302@acm.org>

Martin v. Löwis wrote:
> I have set up a test installation on svn.python.org, so that
> developers can see how this would work.
> 
> So if you are currently a sf.net/projects/python developer,
> please send me your SSH key before August 27 or after
> September 12. We will use real names for commit messages,
> so if you have specific preferences about the spelling
> of your name, please indicate them.

What about people with a whole host of ssh keys?  I have a different key
for each system I use (currently at least 6).  Will this be supported?
Will the different keys identify the same person?

> The repository will be discarded after the testing, so
> feel free to make any changes you want.
> 
> It's not decided yet whether the repository will eventually
> run on python.org, but it seems clear to me that we likely
> will use svn+ssh for developer access, unless testing
> reveals disadvantages of doing so.
> 
> Please also look at the result of the conversion; if you
> find any issues, please report them.
> 
> There is currently no anonymous WebDAV access to the
> repository.
> 
> Regards,
> Martin


-- 
Sjoerd Mullender


From martin at v.loewis.de  Sun Aug 21 18:19:45 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 21 Aug 2005 18:19:45 +0200
Subject: [Python-Dev] Collecting SSH keys
In-Reply-To: <43089D87.2060302@acm.org>
References: <43088761.7010905@v.loewis.de> <43089D87.2060302@acm.org>
Message-ID: <4308A9A1.1040700@v.loewis.de>

Sjoerd Mullender wrote:
> What about people with a whole host of ssh keys?  I have a different key
> for each system I use (currently at least 6).  Will this be supported?
> Will the different keys identify the same person?

That would be possible, yes. You should send a single file containing
all of them, and, each time something changes, resend the entire
file. All of your keys would identify "Sjoerd Mullender".

I don't know how well OpenSSH scales with an authorized_keys
file holding a hundred or more keys. On the wire, this seems safe, as
it apparently is the client which offers various keys, and the
server which then accepts or rejects them.

Regards,
Martin

From skip at pobox.com  Sat Aug 20 04:32:12 2005
From: skip at pobox.com (skip@pobox.com)
Date: Fri, 19 Aug 2005 21:32:12 -0500
Subject: [Python-Dev] Deprecating builtin id (and moving it to sys())
In-Reply-To: <ca471dc20508180922583f17d0@mail.gmail.com>
References: <20050817140217.GQ3389@www.async.com.br>
	<200508181409.17431.anthony@interlink.com.au>
	<ca471dc20508180922583f17d0@mail.gmail.com>
Message-ID: <17158.38444.778226.955186@montanaro.dyndns.org>


    Guido> The built-in namespace is searched last for a reason -- the
    Guido> design is such that if you don't care for a particular built-in
    Guido> you don't need to know about it.

In my mind there are three classes of builtins from the standpoint of
overriding.  Pychecker complains if you override any of them, but I think
that many times it does so unnecessarily.  The first class includes those
builtins that you will likely find in many code samples and should just
never be overridden.  For me these include "abs", "map", "list", "int",
"range", "zip", the various exceptions, etc.  The second class of builtins
consists of objects or functions that are fairly special-purpose.  You might
not really care if they are overridden, depending on context.  For me this
class includes "compile", "id", "reload", "execfile", "ord", etc.  Finally,
there is the subset of builtins that is included almost solely as a
convenience for use at the interpreter prompt.  They include "quit", "exit"
and "copyright".  I couldn't care less if I override them in my code, and
don't think pychecker should either.
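
As a small illustration of why overriding a second-class builtin such as
"id" is harmless, here is a sketch written for a current Python 3
interpreter (where the module is spelled builtins); the function name
describe is made up for the example:

```python
import builtins


def describe(record):
    # The local name shadows the builtin inside this function only,
    # because locals are searched before the built-in namespace.
    id = record["id"]
    return "record %d" % id


label = describe({"id": 7})

# At module level nothing was overridden, so lookup still falls
# through to the built-in namespace.
still_builtin = id is builtins.id
```

The shadowing never escapes the function, which is why a blanket
warning seems unnecessary to me for this kind of name.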

Skip

From barry at python.org  Mon Aug 22 01:01:22 2005
From: barry at python.org (Barry Warsaw)
Date: Sun, 21 Aug 2005 19:01:22 -0400
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <43087DA0.702@v.loewis.de>
References: <43087DA0.702@v.loewis.de>
Message-ID: <1124665281.31664.28.camel@geddy.wooz.org>

On Sun, 2005-08-21 at 09:12, "Martin v. Löwis" wrote:
> It turns out that svn+ssh with a single account has limitations:
> you can only set the tunnel user when you are using a restricted
> key. In PEP 347, the plan is that the current SF project admins
> get shell access to the pythondev account, which just has been
> created.
> 
> To resolve this, project admins need two different SSH keys:
> one for accessing the shell, and one for regular commit activities.

I may be totally misunderstanding, but to get shell access wouldn't I
avoid using the pythondev account and just use my own account?  I'd only
need the pythondev account to access the svn repository, right?  (And
actually, it might be possible to set up group permissions and
membership so that I could access the repo with either).

The number of people who need shell access should be pretty small.

I'm also a little confused about the pep.  What does "admin access to
the pythondev account" mean?  Do you mean the people who are going to be
managing users that can access svn?  In that case, I think the system
admins (i.e. those who already have shell access to dinsdale) would be
the people managing user access to svn.

> I would suggest that the default key is used for regular commits,
> and a separate key is created for shell access. I described this
> a bit in the PEP, essentially, in .ssh/config, I have
> 
> Host pythondev
>   Hostname dinsdale.python.org
>   User pythondev
>   IdentityFile ~/.ssh/pythondev
> 
> So when I do "ssh pythondev", I get the shell account; when I do
> "svn co svn+ssh://pythondev at svn.python.org/python/trunk/Modules",
> I use my default identity, which gets tunneled as "Martin v. Loewis".

I'm confused again; are you saying that we should have a host named
pythondev.python.org?  I'm not sure that's necessary.

-Barry


From aahz at pythoncraft.com  Mon Aug 22 01:55:23 2005
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 21 Aug 2005 16:55:23 -0700
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <1124665281.31664.28.camel@geddy.wooz.org>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
Message-ID: <20050821235522.GA16606@panix.com>

On Sun, Aug 21, 2005, Barry Warsaw wrote:
> On Sun, 2005-08-21 at 09:12, "Martin v. Löwis" wrote:
>>
>> I would suggest that the default key is used for regular commits,
>> and a separate key is created for shell access. I described this
>> a bit in the PEP, essentially, in .ssh/config, I have
>> 
>> Host pythondev
>>   Hostname dinsdale.python.org
>>   User pythondev
>>   IdentityFile ~/.ssh/pythondev
>> 
>> So when I do "ssh pythondev", I get the shell account; when I do
>> "svn co svn+ssh://pythondev at svn.python.org/python/trunk/Modules",
>> I use my default identity, which gets tunneled as "Martin v. Loewis".
> 
> I'm confused again; are you saying that we should have a host named
> pythondev.python.org?  I'm not sure that's necessary.

No, pythondev is simply an SSH alias for dinsdale -- the server knows
nothing about it.  I don't quite understand the "User pythondev" line,
though -- I think that's a mistake.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From martin at v.loewis.de  Mon Aug 22 08:18:31 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Aug 2005 08:18:31 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <1124665281.31664.28.camel@geddy.wooz.org>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
Message-ID: <43096E37.8070708@v.loewis.de>

Barry Warsaw wrote:
> I may be totally misunderstanding, but to get shell access wouldn't I
> avoid using the pythondev account and just use my own account?

You could do that (or use the root account); I can't: I don't have
an ssh account on dinsdale. And even if I had, I couldn't write to
pythondev's authorized_keys2.

> I'm also a little confused about the pep.  What does "admin access to
> the pythondev account" mean?  Do you mean the people who are going to be
> managing users that can access svn?  

Correct.

> In that case, I think the system
> admins (i.e. those who already have shell access to dinsdale) would be
> the people managing user access to svn.

OK: to whom should I then forward the ssh keys that I'm currently
collecting?

>>Host pythondev
>>  Hostname dinsdale.python.org
>>  User pythondev
>>  IdentityFile ~/.ssh/pythondev
>>
>>So when I do "ssh pythondev", I get the shell account; when I do
>>"svn co svn+ssh://pythondev at svn.python.org/python/trunk/Modules",
>>I use my default identity, which gets tunneled as "Martin v. Loewis".
> 
> 
> I'm confused again; are you saying that we should have a host named
> pythondev.python.org?  I'm not sure that's necessary.

Not at all. This is rather an OpenSSH convenience mechanism to avoid
typing hostname and user name all the time. I introduce a local alias
pythondev, which means I want to access pythondev at dinsdale.python.org,
using the key in ~/.ssh/pythondev (public half: pythondev.pub).

Regards,
Martin


From martin at v.loewis.de  Mon Aug 22 08:31:30 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Aug 2005 08:31:30 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <20050821235522.GA16606@panix.com>
References: <43087DA0.702@v.loewis.de>	<1124665281.31664.28.camel@geddy.wooz.org>
	<20050821235522.GA16606@panix.com>
Message-ID: <43097142.9050905@v.loewis.de>

Aahz wrote:
>>>Host pythondev
>>>  Hostname dinsdale.python.org
>>>  User pythondev
>>>  IdentityFile ~/.ssh/pythondev
>>>
>>I'm confused again; are you saying that we should have a host named
>>pythondev.python.org?  I'm not sure that's necessary.
> 
> 
> No, pythondev is simply an SSH alias for dinsdale -- the server knows
> nothing about it.  I don't quite understand the "User pythondev" line,
> though -- I think that's a mistake.

That's intentional. "ssh pythondev" now becomes equivalent to

ssh -l pythondev -i ~/.ssh/pythondev dinsdale.python.org

IOW, the User option is equivalent to specifying the -l option.
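
To make the mapping concrete, here is a rough sketch of how such a Host
block expands into a command line. It is only an illustration, not how
OpenSSH is implemented: the function name ssh_argv is made up, and it
handles just literal Host names and the three keywords used above (no
wildcards, no global options before the first Host line).

```python
def ssh_argv(config_text, alias):
    """Expand a Host alias into an equivalent ssh argument list (sketch)."""
    options = {}
    in_block = False
    for raw in config_text.splitlines():
        parts = raw.strip().split(None, 1)
        if len(parts) != 2 or parts[0].startswith("#"):
            continue
        keyword, value = parts[0].lower(), parts[1].strip()
        if keyword == "host":
            # A new Host line opens a block; it applies if it names the alias.
            in_block = alias in value.split()
        elif in_block:
            # For each keyword, the first value obtained is the one used.
            options.setdefault(keyword, value)
    argv = ["ssh"]
    if "user" in options:
        argv += ["-l", options["user"]]
    if "identityfile" in options:
        argv += ["-i", options["identityfile"]]
    argv.append(options.get("hostname", alias))
    return argv


CONFIG = """\
Host pythondev
  Hostname dinsdale.python.org
  User pythondev
  IdentityFile ~/.ssh/pythondev
"""

argv = ssh_argv(CONFIG, "pythondev")
```

For the block above, argv comes out as the command line shown earlier;
an alias with no matching Host block is passed through unchanged as the
host name.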

Regards,
Martin

From paolo_veronelli at libero.it  Mon Aug 22 10:12:46 2005
From: paolo_veronelli at libero.it (Paolino)
Date: Mon, 22 Aug 2005 10:12:46 +0200
Subject: [Python-Dev] On decorators implementation
In-Reply-To: <43084AE9.20900@libero.it>
References: <43084AE9.20900@libero.it>
Message-ID: <430988FE.2020603@libero.it>

Paolino wrote:
> I noticed (via using them) that decorations are applied to methods
> before  they become methods.
> 
> This choice flattens down the implementation to no differentiating
> methods from functions.
> 
> 
> 
> 1)
> I have to apply euristics on the wrapped function type when I use the
> function as an index key.
> 
>          if type(observed) is types.MethodType:
>            observed=observed.im_func
> 
> things like this are inside my decorators.
> 
> 2)
> The behavior of decorations are not definable.
> I imagine that a method implementation of them inside the type metaclass
> could be better specified by people.
> This probably ends up in metamethods or something I can't grasp
> 
A downside of decorating at function level is that it's virtually
impossible to check from the decorator that the first call parameter
(aka self) is an instance of the method's class. This check must be done
inside the decorated function.
This can really happen in normal use, as decorators are useful to
register the decorated function as a 'callback'. Whoever fires it can do
so with no regard for the class the function/method belongs to, and the
error raised will not be coherent with 'calling method on an incompatible
instance'.

Maybe it's possible to let the decorator know the method's class even if
the class is still undefined. (Just like recursive functions?)
This would also allow decorators to call super with the right class.
A @callSuper decoration is something I really miss.
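
To show the kind of thing I mean, here is a sketch of a decorator that
does learn its class. It assumes a much newer Python than the one
discussed here (3.6 or later, where the __set_name__ hook exists), and
the names checked, A and B are made up for the example.

```python
import functools


class checked:
    """Hypothetical decorator that learns the owning class afterwards."""

    def __init__(self, func):
        self.func = func
        self.owner = None  # unknown while the class body is executing

    def __set_name__(self, owner, name):
        # Called once the class object has been created.
        self.owner = owner

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return functools.partial(self, instance)

    def __call__(self, instance, *args, **kwargs):
        if not isinstance(instance, self.owner):
            raise TypeError(
                "%s() called on an incompatible instance of %s"
                % (self.func.__name__, type(instance).__name__)
            )
        return self.func(instance, *args, **kwargs)


class A:
    @checked
    def ping(self):
        return "pong"


class B:
    pass
```

With this, A().ping() works as usual, while firing A.ping as a callback
on a B instance raises a TypeError that names the real problem.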

Thanks
Paolino

From stephen at xemacs.org  Mon Aug 22 09:39:03 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 22 Aug 2005 16:39:03 +0900
Subject: [Python-Dev] Collecting SSH keys
In-Reply-To: <4308A9A1.1040700@v.loewis.de> (
	=?iso-8859-1?q?Martin_v=2E_L=F6wis's_message_of?= "Sun, 21 Aug 2005
	18:19:45 +0200")
References: <43088761.7010905@v.loewis.de> <43089D87.2060302@acm.org>
	<4308A9A1.1040700@v.loewis.de>
Message-ID: <87y86uqv3c.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Martin" == Martin v Löwis <martin at v.loewis.de> writes:

    Martin> I don't know how this scales in OpenSSH having an
    Martin> authorized_keys file with hundred or more keys.

On cvs.xemacs.org (aka SunSITE.dk) ssh+cvs access with cvs access
control being handled by a Perl script scales to approximately 85
users.  I don't handle key management directly, but I believe several
users use multiple keys (I don't personally).  I've never heard any
complaints from the guys who actually do key management; they just
keep authorized_keys in alphabetical order by comment (= user's real
name).  Nor do I notice any authorization overhead vs. a simple ssh
login when accessing the cvs server.[1]  Evidently the "what keys do
you  have?" negotiation with the agent takes very little time (in
terms of what a human can notice).

If you want time(1) timings or something like that, I'd be happy to
get an exact count of the number of keys and do them (but it will have
to wait until I get back from travel August 28).


Footnotes: 
[1]  For testing whether keys are properly installed, the sequence
"ssh xemacs at cvs.xemacs.org", then asking the server for "version" and
sending EOF (^D), is what we use.  So there is no overhead from a
local CVS or anything like that, although of course you do have to
start the remote cvs server process (via the command= option in the
server's authorized_keys file).  How that compares to starting a shell
I'm not sure.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From raymond.hettinger at verizon.net  Mon Aug 22 14:46:27 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 22 Aug 2005 08:46:27 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219,
	1.220
In-Reply-To: <20050821184639.EF8711E4006@bag.python.org>
Message-ID: <003101c5a717$83be4b60$3c23a044@oemcomputer>

> A new hashlib module to replace the md5 and sha modules.  It adds
> support for additional secure hashes such as SHA-256 and SHA-512.  The
> hashlib module uses OpenSSL for fast platform optimized
> implementations of algorithms when available.  The old md5 and sha
> modules still exist as wrappers around hashlib to preserve backwards
> compatibility.

I'm getting compilation errors:

C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
suffix on number'
C:\py25\Modules\sha512module.c(146) : error C2146: syntax error :
missing ')' before identifier 'L'
C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
suffix on number'
C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
suffix on number'
C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
suffix on number'
C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
suffix on number'
C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
suffix on number'
C:\py25\Modules\sha512module.c(146) : fatal error C1013: compiler limit
: too many open parentheses


Also, there should be updating entries to Misc/NEWS,
PC/VC6/pythoncore.dsp, and PC/config.c.


Raymond


From martin at v.loewis.de  Mon Aug 22 16:11:31 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Aug 2005 16:11:31 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <1124711379.31664.213.camel@geddy.wooz.org>
References: <43087DA0.702@v.loewis.de>	
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<1124711379.31664.213.camel@geddy.wooz.org>
Message-ID: <4309DD13.4040902@v.loewis.de>

Barry Warsaw wrote:
>>You could do that (or use the root account); I can't: I don't have
>>an ssh account on dinsdale. And even if I had, I couldn't write to
>>pythondev's authorized_keys2.
> 
> 
> That's easily rectified! :)  We should give you an account and sudo
> access.  Should I just use your keys from creosote?

Please do!

>>Ok: to whom should I forward the ssh keys then which I'm currently
>>collecting?
> 
> 
> Probably here, unless once you have the above, you still want to do it
> yourself.

I would be worried that you are a single point of failure here:
for sf.net/projects/python, multiple people can add new users, and
I think we should continue that tradition.

I would be happy with *different* people being able to manage
that, but the group should be larger than two, IMO.

Regards,
Martin

From martin at v.loewis.de  Mon Aug 22 16:20:37 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Aug 2005 16:20:37 +0200
Subject: [Python-Dev] Collecting SSH keys
In-Reply-To: <87y86uqv3c.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <43088761.7010905@v.loewis.de>
	<43089D87.2060302@acm.org>	<4308A9A1.1040700@v.loewis.de>
	<87y86uqv3c.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <4309DF35.5000902@v.loewis.de>

Stephen J. Turnbull wrote:
> On cvs.xemacs.org (aka SunSITE.dk) ssh+cvs access with cvs access
> control being handled by a Perl script scales to approximately 85
> users.  I don't handle key management directly, but I believe several
> users use multiple keys (I don't personally).  I've never heard any
> complaints from the guys who actually do key management; they just
> keep authorized_keys in alphabetical order by comment (= user's real
> name).  Nor do I notice any authorization overhead vs. a simple ssh
> login when accessing the cvs server.[1]  Evidently the "what keys do
> you  have?" negotiation with the agent takes very little time (in
> terms of what a human can notice).

That's encouraging; I'm willing to proceed with that approach then.
As for key management: I just designed an infrastructure where
~pythondev/keys is a directory containing files named, say
"Martin v. Loewis" (with spaces, ASCII only); the contents of
the files are just the public keys. I then run make_authorized_keys,
which regenerates the authorized_keys2 file, adding all the
command= lines. This avoids editing authorized_keys2 in a text
editor.
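
Roughly, the generation step looks like the following sketch. This is
not the actual make_authorized_keys script: the function name, the
restriction options and the exact svnserve command line are assumptions
for illustration, and a name containing a single quote would need extra
escaping.

```python
import tempfile
from pathlib import Path

# Assumed restrictions for commit-only keys.
RESTRICTIONS = "no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty"
LINE = 'command="svnserve -t --tunnel-user=\'{name}\'",{opts} {key}'


def authorized_keys_lines(keys_dir):
    """Build authorized_keys2 lines from a directory of per-person key files."""
    lines = []
    for path in sorted(Path(keys_dir).iterdir()):
        if not path.is_file():
            continue
        # The file name is the real name used for commit messages.
        for key in path.read_text().splitlines():
            key = key.strip()
            if key and not key.startswith("#"):
                lines.append(LINE.format(name=path.name, opts=RESTRICTIONS, key=key))
    return lines


# Demonstration with a throwaway directory and a dummy key.
with tempfile.TemporaryDirectory() as keys_dir:
    Path(keys_dir, "Martin v. Loewis").write_text("ssh-rsa AAAAexample martin@home\n")
    demo_lines = authorized_keys_lines(keys_dir)
```

Each person may have several keys in one file; every key gets its own
line, all tunneled under the same name.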

Regards,
Martin

From skip at pobox.com  Mon Aug 22 17:18:33 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 22 Aug 2005 10:18:33 -0500
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <43096E37.8070708@v.loewis.de>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
Message-ID: <17161.60617.950268.641009@montanaro.dyndns.org>


Martin,

I'm completely confused about what, if anything, I need to send to you.  I
can already access the python.org website repository via svn.  Will I
automatically get access to the new Python source repository or do I need to
send you pub key(s)?  Are dinsdale.python.org and svn.python.org the same
machine with different IP addresses?  If they are different machines, why
would we want to host svn repositories on multiple machines?

Skip

From aahz at pythoncraft.com  Mon Aug 22 17:25:33 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 22 Aug 2005 08:25:33 -0700
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <43097142.9050905@v.loewis.de>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<20050821235522.GA16606@panix.com> <43097142.9050905@v.loewis.de>
Message-ID: <20050822152533.GB12281@panix.com>

On Mon, Aug 22, 2005, "Martin v. Löwis" wrote:
> Aahz wrote:
>>Barry:
>>>Martin:
>>>>
>>>>Host pythondev
>>>>  Hostname dinsdale.python.org
>>>>  User pythondev
>>>>  IdentityFile ~/.ssh/pythondev
>>>>
>>>I'm confused again; are you saying that we should have a host named
>>>pythondev.python.org?  I'm not sure that's necessary.
>> 
>> No, pythondev is simply an SSH alias for dinsdale -- the server knows
>> nothing about it.  I don't quite understand the "User pythondev" line,
>> though -- I think that's a mistake.
> 
> That's intentional. "ssh pythondev" now becomes equivalent to
> 
> ssh -l pythondev -i ~/.ssh/pythondev dinsdale.python.org
> 
> IOW, the User option is equivalent to specifying the -l option.

Yes, I know -- but it looks like a mistake to me.  Are you saying that
all shell access will be done through a single account?  Isn't that a
huge security risk?  My understanding was that it was SVN access that
would be going through a single account, not shell access.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From barry at python.org  Mon Aug 22 17:32:10 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 22 Aug 2005 11:32:10 -0400
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <17161.60617.950268.641009@montanaro.dyndns.org>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
Message-ID: <1124724730.17082.8.camel@geddy.wooz.org>

On Mon, 2005-08-22 at 11:18, skip at pobox.com wrote:

> I'm completely confused about what, if anything, I need to send to you.  I
> can already access the python.org website repository via svn.  Will I
> automatically get access to the new Python source repository or do I need to
> send you pub key(s)?  

I think technically, the answer to that is "yes", you will automatically
get access to the source repo.  The question I have is whether you
/should/ access the source repo that way, or use the shared pythondev
account.  Two unknowns for me are 1) will there be permission problems
that either prevent you from doing this, or once you've committed a
change, will screw pythondev-access?; 2) when we finally get email
notifications worked in, will it still look like your commit is coming
from the right place.  I think the answer to #2 is yes, but I'm not sure
about #1.

> Are dinsdale.python.org and svn.python.org the same
> machine with different IP addresses?  If they are different machines, why

They are the same machine, with different IP addresses.  Anonymous
webdav will require two Apache processes, since different user/groups
are needed and to support different certs for svn.python.org and
(eventually) www.python.org.

-Barry


From skip at pobox.com  Mon Aug 22 17:45:29 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 22 Aug 2005 10:45:29 -0500
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <1124724730.17082.8.camel@geddy.wooz.org>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
Message-ID: <17161.62233.743239.89277@montanaro.dyndns.org>


    >> Will I automatically get access to the new Python source repository
    >> or do I need to send you pub key(s)?

    Barry> I think technically, the answer to that is "yes", you will
    Barry> automatically get access to the source repo.

Okay...

    Barry> The question I have is whether you /should/ access the source
    Barry> repo that way, or use the shared pythondev account.  

More confusion here.  If I use some sort of shared access how will the
system ascribe changes I make to me and not, for example, Martin?

I think until this experiment is over and we have really and truly migrated
to svn I will simply let other people fuss with things.

Skip

From foom at fuhm.net  Mon Aug 22 17:57:54 2005
From: foom at fuhm.net (James Y Knight)
Date: Mon, 22 Aug 2005 11:57:54 -0400
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <1124724730.17082.8.camel@geddy.wooz.org>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
Message-ID: <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net>

On Aug 22, 2005, at 11:32 AM, Barry Warsaw wrote:

> They are the same machine, with different IP addresses.  Anonymous
> webdav will require two Apache processes, since different user/groups
> are needed and to support different certs for svn.python.org and
> (eventually) www.python.org.
>

It seems a waste to use SVN's webdav support just for anon access.  
The svnserve method works well for anon access. The only reason to  
use svn webdav IMO is if you want to use that for authenticated  
access. But since you're talking about using svn+ssh for that..

James

From martin at v.loewis.de  Mon Aug 22 18:07:50 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 22 Aug 2005 18:07:50 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <17161.60617.950268.641009@montanaro.dyndns.org>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
Message-ID: <4309F856.40506@v.loewis.de>

skip at pobox.com wrote:
> I'm completely confused about what, if anything, I need to send to you.  I
> can already access the python.org website repository via svn.

Yes, but you do so using username/password, right? pythondev will be
using svn+ssh.

> Will I
> automatically get access to the new Python source repository or do I need to
> send you pub key(s)?

You need to send me pubkeys. Actually, I just copied the ones from
creosote (see below). You should now be able to checkout

svn+ssh://pythondev@svn.python.org/python/trunk

> Are dinsdale.python.org and svn.python.org the same
> machine with different IP addresses?

Correct.

> If they are different machines, why
> would we want to host svn repositories on multiple machines?

We don't. However, we use different access methods. Actually, we
*might* use different access methods. If this turns out to be
too confusing to users, we are probably back to username/password.

Regards,
Martin

P.S. The keys I installed are

ssh-dss
AAAAB3NzaC1kc3MAAACBAJAPN3ngdjih7H1wqkmbkaJDpfoW3fRrk9phtuuO+js43qU06BiqInbGZ/zjVZRrM7yzRbo2PGu1+ox8H/vkMlSk6IxmgMtNrrQ9SEoTRo7eyg5ku+JiC44h3RWT2IuiIALB8axHQSBsF6Oe4O9z/lgsLMO08M2l1TzRnjSjyOEZAAAAFQDGffqFFm+IoSH6cRfxnY+BiXxZ5QAAAIATuQmlscDd/QNSlk4Oy7ZMUdHplx76zQtyUHXvhRVkIu6QrduhnnCkGIFjSHQsnJOoroF4tVaJYY7oka17Ambd0LiWcSlNK+IHMdbvZ91wbVpeo9x/HBCJtCMxDX8PxG3TADuqiZjeC8nOpCdJ+cK7emQv+G4WIw3gC3IuPRINWAAAAIA5+OO9ApbKrcClwHXZ9DqtDJBe2fSox1mnei3VAajbOU/o3+j+G+5iLerOqLTCoOyIs7umvuUulIAXvhDzCzusw3mfBtt3UODQn0L3R47OFHzOiCEbihStxd36lVgCJgRBAW7UKf+2k3BzxJ5DVpp4+AZ7fS4FUVkZ8DYAog/68g==
skip at montanaro.dyndns.org
ssh-rsa
AAAAB3NzaC1yc2EAAAABIwAAAIEAq83rRGWRR4SdvvBUMJ/gDmMG7U7LdiC50kqUTbw+Kogum5JT7kexi1XYKgyKJ8FbRwMx1Xj9zjQERgDhYtFCJg72kSkD2muN3DkyU7vIoZQM/aNpspPNNDWRqj8pzHPzhWDUfL+tjZl78JD51mTOlGHaZUGdKnPeUOQF2XTadis=
skip at montanaro.dyndns.org

From martin at v.loewis.de  Mon Aug 22 18:10:37 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 22 Aug 2005 18:10:37 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <20050822152533.GB12281@panix.com>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<20050821235522.GA16606@panix.com> <43097142.9050905@v.loewis.de>
	<20050822152533.GB12281@panix.com>
Message-ID: <4309F8FD.6080505@v.loewis.de>

Aahz wrote:
> Yes, I know -- but it looks like a mistake to me.  Are you saying that
> all shell access will be done through a single account?  Isn't that a
> huge security risk?  My understanding was that it was SVN access that
> would be going through a single account, not shell access.

Only a few selected people would have shell access; I don't see that
as a huge risk. Anyway, Barry didn't like it either, so we removed
shell access to the pythondev account; user keys now need to be
added by the pydotorg admins.

Regards,
Martin

From martin at v.loewis.de  Mon Aug 22 18:16:24 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 22 Aug 2005 18:16:24 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <1124724730.17082.8.camel@geddy.wooz.org>
References: <43087DA0.702@v.loewis.de>	
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>	
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
Message-ID: <4309FA58.4080103@v.loewis.de>

Barry Warsaw wrote:
> I think technically, the answer to that is "yes", you will automatically
> get access to the source repo.

At the moment, the answer actually is "no". For the projects repository,
there is no group write permission - you must be pythondev in order to
write.

> The question I have is whether you
> /should/ access the source repo that way, or use the shared pythondev
> account.  Two unknowns for me are 1) will there be permission problems
> that either prevent you from doing this, or once you've committed a
> change, will screw pythondev-access?;

Yes to the former. The webserver has only read access to the (projects)
repository.

> 2) when we finally get email
> notifications worked in, will it still look like your commit is coming
> from the right place.

Not sure what "the right place" would be: pythondev at python.org?
I think the email could look any way we want it to look.

> They are the same machine, with different IP addresses.  Anonymous
> webdav will require two Apache processes, since different user/groups
> are needed

Not necessarily. The repository could be world-readable, in which case
"nobody" could access it.

> and to support different certs for svn.python.org and
> (eventually) www.python.org.

Ah. I think anonymous read access should be on port 80.

Regards,
Martin


From martin at v.loewis.de  Mon Aug 22 18:20:42 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 22 Aug 2005 18:20:42 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <17161.62233.743239.89277@montanaro.dyndns.org>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
	<17161.62233.743239.89277@montanaro.dyndns.org>
Message-ID: <4309FB5A.1040201@v.loewis.de>

skip at pobox.com wrote:
> More confusion here.  If I use some sort of shared access how will the
> system ascribe changes I make to me and not, for example, Martin?

In pythondev's authorized_keys2, we have a line

command="/usr/bin/svnserve --root=/data/repos/projects -t
--tunnel-user 'Skip Montanaro'",no-port-forwarding,no-X11-forwarding,
no-agent-forwarding,no-pty ssh-dss <your key>

So the *only* command you are allowed to invoke is svnserve (actually,
sshd will invoke that no matter what the ssh client requests). This
will tell subversion that changes should be logged as 'Skip Montanaro'.
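
As an illustration only, here is a minimal Python sketch of how such an
entry could be assembled. The helper name authorized_keys_entry and the
example key are hypothetical; the paths and options are the ones in the
line above. In the real authorized_keys2 file the whole entry has to sit
on a single line; it is wrapped above only for mail.

```python
def authorized_keys_entry(real_name, public_key,
                          repo_root="/data/repos/projects",
                          svnserve="/usr/bin/svnserve"):
    """Build one authorized_keys line that forces svnserve in tunnel mode.

    real_name is what svnserve records as the commit author
    (--tunnel-user); public_key is the "<type> <base64> [comment]" text.
    """
    if "'" in real_name or '"' in real_name:
        # Quotes would break the quoting of the forced command string.
        raise ValueError("quotes are not allowed in the tunnel user name")
    command = "%s --root=%s -t --tunnel-user '%s'" % (
        svnserve, repo_root, real_name)
    options = [
        'command="%s"' % command,
        "no-port-forwarding",
        "no-X11-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ]
    # Options are comma-separated with no spaces, then the key itself.
    return "%s %s" % (",".join(options), public_key.strip())


# Hypothetical key text, for illustration only.
print(authorized_keys_entry("Skip Montanaro", "ssh-dss AAAAexample"))
```

Because sshd runs the forced command regardless of what the client asks
for, the committer name comes from this line and not from anything the
client sends.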

> I think until this experiment is over and we have really and truly migrated
> to svn I will simply let other people fuss with things.

Well, you are not required to understand it, but you should try to use
it. Just check out svn+ssh://pythondev@svn.python.org/python/trunk/Misc,
and see whether this works.

Regards,
Martin

From martin at v.loewis.de  Mon Aug 22 18:23:01 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 22 Aug 2005 18:23:01 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net>
References: <43087DA0.702@v.loewis.de>	<1124665281.31664.28.camel@geddy.wooz.org>	<43096E37.8070708@v.loewis.de>	<17161.60617.950268.641009@montanaro.dyndns.org>	<1124724730.17082.8.camel@geddy.wooz.org>
	<3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net>
Message-ID: <4309FBE5.40204@v.loewis.de>

James Y Knight wrote:
> It seems a waste to use SVN's webdav support just for anon access.  
> The svnserve method works well for anon access. The only reason to  
> use svn webdav IMO is if you want to use that for authenticated  
> access. But since you're talking about using svn+ssh for that..

It has the advantage that we can easily point people to files
with a web browser; they don't need an svn client.

Regards,
Martin

From barry at python.org  Mon Aug 22 18:41:11 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 22 Aug 2005 12:41:11 -0400
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <4309FA58.4080103@v.loewis.de>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
	<4309FA58.4080103@v.loewis.de>
Message-ID: <1124728871.17084.13.camel@geddy.wooz.org>

On Mon, 2005-08-22 at 12:16, "Martin v. Löwis" wrote:
> Barry Warsaw wrote:
> > I think technically, the answer to that is "yes", you will automatically
> > get access to the source repo.
> 
> At the moment, the answer actually is "no". For the projects repository,
> there is no group write permission - you must be pythondev in order to
> write.

Good!  I think that's a feature. :)  I have a vague discomfort with
allowing both types of access.  I.e. I'd rather all source committers
use the same mechanism.

> > 2) when we finally get email
> > notifications worked in, will it still look like your commit is coming
> > from the right place.
> 
> Not sure what "the right place" would be: pythondev at python.org?
> I think the email could look any way we want it to look.

I think it should be <username>@python.org where <username> is the
firstname.lastname (with some exceptions) scheme that we've agreed on. 
I actually /don't/ want all commits to look like they're coming from
pythondev at python.org

> > and to support different certs for svn.python.org and
> > (eventually) www.python.org.
> 
> Ah. I think anonymous read access should be on port 80.

Maybe we want to put websvn (or whatever it's called these days) on port
80 of svn.python.org?

-Barry


From pje at telecommunity.com  Mon Aug 22 18:42:57 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 22 Aug 2005 12:42:57 -0400
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <4309FBE5.40204@v.loewis.de>
References: <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net>
	<43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
	<3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net>
Message-ID: <5.1.1.6.0.20050822124128.01b0bd10@mail.telecommunity.com>

At 06:23 PM 8/22/2005 +0200, Martin v. Löwis wrote:
>James Y Knight wrote:
> > It seems a waste to use SVN's webdav support just for anon access.
> > The svnserve method works well for anon access. The only reason to
> > use svn webdav IMO is if you want to use that for authenticated
> > access. But since you're talking about using svn+ssh for that..
>
>It has the advantage that we can easily point people to files
>with a web browser; they don't need an svn client.

You can do that with viewcvs, too.  Viewcvs can also create tarballs for 
easy downloading, and has a lot of browsing and viewing options that the 
SVN webdav mode doesn't.


From skip at pobox.com  Mon Aug 22 18:43:06 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 22 Aug 2005 11:43:06 -0500
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <4309FB5A.1040201@v.loewis.de>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
	<17161.62233.743239.89277@montanaro.dyndns.org>
	<4309FB5A.1040201@v.loewis.de>
Message-ID: <17162.155.1546.1991@montanaro.dyndns.org>


    >> I think until this experiment is over and we have really and truly
    >> migrated to svn I will simply let other people fuss with things.

    Martin> Well, you are not required to understand it, but you should try
    Martin> to use it. 

Good point.

    Martin> Just check out
    Martin> svn+ssh://pythondev@svn.python.org/python/trunk/Misc, and see
    Martin> whether this works.

It worked.  I made a trivial change to Misc/NEWS and checked it in.  I then
ran "svn blame NEWS" to see what it showed.  This took approximately
forever.  Can I assume this is one thing svn is always going to be pretty
slow at?  I use cvs annotate frequently.  Is there a faster alternative in
svn to identify who did what?

I notice that you use my real name (including spaces).  I doubt we have any
code that munches on annotated listings, but it seems that for the sake of
script writers' sanity it would be better to elide spaces or replace them
with underscores so the annotated user is a single "word":

     40555 Skip Montanaro ++++++++++++
     28675  montanaro Python News
     40555 Skip Montanaro ++++++++++++
     28675  montanaro
     37655 anthonybaxter (editors: check NEWS.help for information about editing NEWS using ReST.)
     37654  montanaro
     37838 rhettinger What's New in Python 2.5 alpha 1?
     37838 rhettinger =================================
     37838 rhettinger
     38611 anthonybaxter *Release date: XX-XXX-2006*
     38611 anthonybaxter
     37838 rhettinger Core and builtins
     37838 rhettinger -----------------
     37838 rhettinger
     ...

That way column 2 would always be the contributor.
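
To make the point concrete, here is a rough Python sketch. It assumes the
name is normalised before it is handed to svn, and that a script splits
each blame line on whitespace; both function names are made up for
illustration. It also throws away the leading indentation of the source
line, which a real tool would want to keep.

```python
def single_word_author(real_name, separator="_"):
    """Collapse a real name such as 'Skip Montanaro' into one token."""
    return separator.join(real_name.split())


def parse_blame_line(line):
    """Split one blame line into (revision, author, text).

    Assumes the author is a single whitespace-free word, so the first
    two fields can be peeled off with a plain split.
    """
    parts = line.rstrip("\n").split(None, 2)
    revision = int(parts[0])
    author = parts[1]
    text = parts[2] if len(parts) > 2 else ""
    return revision, author, text


# With a single-word author, column 2 is the contributor:
print(parse_blame_line("     40555 Skip_Montanaro ++++++++++++"))
# With a space in the name, the surname leaks into the text column:
print(parse_blame_line("     40555 Skip Montanaro ++++++++++++"))
```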

Skip

From gvanrossum at gmail.com  Mon Aug 22 19:43:24 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Mon, 22 Aug 2005 10:43:24 -0700
Subject: [Python-Dev] On decorators implementation
In-Reply-To: <430988FE.2020603@libero.it>
References: <43084AE9.20900@libero.it> <430988FE.2020603@libero.it>
Message-ID: <ca471dc20508221043324ba380@mail.gmail.com>

> Maybe it's possible to let the decorator know the method class even if
> the class is still undefined.(Just like recursive functions?)
> This would allow decorators to call super with the right class also.
> @callSuper decoration is something I really miss.

You're thinking about it all wrong.

Remember that decorators can also be used to declare that something is
a static method or class method etc.

Try to learn Python, not to write some other language using Python syntax.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From kbk at shore.net  Mon Aug 22 20:37:54 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Mon, 22 Aug 2005 14:37:54 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200508221837.j7MIbsHG031701@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  352 open ( +0) /  2898 closed ( +2) /  3250 total ( +2)
Bugs    :  926 open (+13) /  5177 closed (+15) /  6103 total (+28)
RFE     :  190 open ( -1) /   179 closed ( +1) /   369 total ( +0)

New / Reopened Patches
______________________

fix smtplib when local host isn't resolvable in dns  (2005-08-12)
       http://python.org/sf/1257988  opened by  Arkadiusz Miskiewicz

tarfile: fix for bug #1257255  (2005-08-17)
       http://python.org/sf/1262036  opened by  Lars Gustäbel

Patches Closed
______________

sha256 module  (2004-04-14)
       http://python.org/sf/935454  closed by  greg

sha and md5 modules should use OpenSSL when possible  (2005-02-12)
       http://python.org/sf/1121611  closed by  greg

New / Reopened Bugs
___________________

Significant memory leak with PyImport_ReloadModule  (2005-08-11)
       http://python.org/sf/1256669  opened by  Ben Held

slice object uses -1 as exclusive end-bound  (2005-08-11)
       http://python.org/sf/1256786  opened by  Bryan G. Olson

tarfile local name is local, should be abspath  (2005-08-12)
       http://python.org/sf/1257255  opened by  Martin Blais

Encodings iso8859_1 and latin_1 are redundant  (2005-08-12)
       http://python.org/sf/1257525  opened by  liturgist

Solaris 8 declares gethostname().  (2005-08-12)
       http://python.org/sf/1257687  opened by  Hans Deragon

error message incorrectly claims Visual C++ is required  (2005-08-12)
       http://python.org/sf/1257728  opened by  Zooko O'Whielacronx

Make set.remove() behave more like Set.remove()  (2005-08-12)
CLOSED http://python.org/sf/1257731  opened by  Raymond Hettinger

tkapp read-only attributes  (2005-08-12)
       http://python.org/sf/1257772  opened by  peeb

gen_send_ex: Assertion `f->f_back !  (2005-08-12)
CLOSED http://python.org/sf/1257960  opened by  Neil Schemenauer

http auth documentation/implementation conflict  (2005-08-13)
       http://python.org/sf/1258485  opened by  Matthias Klose

"it's" vs. "its" typo in Language Reference  (2005-08-14)
CLOSED http://python.org/sf/1258922  opened by  Wolfgang Petzold

Makefile ignores $CPPFLAGS  (2005-08-14)
       http://python.org/sf/1258986  opened by  Dirk Pirschel

Tix CheckList 'radio' option cannot be changed  (2005-08-14)
       http://python.org/sf/1259434  opened by  Raymond Maple

subprocess: more general (non-buffering) communication  (2005-08-15)
       http://python.org/sf/1260171  opened by  Ian Bicking

__new__ is class method  (2005-08-16)
       http://python.org/sf/1261229  opened by  Mike Orr

import dynamic library bug?  (2005-08-16)
       http://python.org/sf/1261390  opened by  broadwin

Tutorial doesn't cover * and ** function calls  (2005-08-16)
       http://python.org/sf/1261659  opened by  Brett Cannon

precompiled code and nameError.  (2005-08-17)
       http://python.org/sf/1261714  opened by  Vladimir Menshakov

minidom.py alternate newl support is broken  (2005-08-17)
       http://python.org/sf/1262320  opened by  John Whitley

fcntl.ioctl have a bit problem.  (2005-08-18)
       http://python.org/sf/1262856  opened by  Raise L. Sail

typo on "SimpleXMLRPCServer Objects"  (2005-08-18)
CLOSED http://python.org/sf/1263086  opened by  Chad Whitacre

type() and isinstance() do not call __getattribute__  (2005-08-19)
       http://python.org/sf/1263635  opened by  Per Vognsen

IDLE on Mac  (2005-08-18)
       http://python.org/sf/1263656  opened by  Bruce Sherwood

PyArg_ParseTupleAndKeywords doesn't handle I format correctl  (2005-08-19)
CLOSED http://python.org/sf/1264168  opened by  John Finlay

PEP 8 uses wrong raise syntax  (2005-08-20)
CLOSED http://python.org/sf/1264666  opened by  Steven Bethard

sequence slicing documentation incomplete  (2005-08-20)
       http://python.org/sf/1265100  opened by  Steven Bethard

lexists() is not exported from os.path  (2005-08-22)
CLOSED http://python.org/sf/1266283  opened by  Martin Blais

Mistakes in decimal.Context.subtract documentation  (2005-08-22)
       http://python.org/sf/1266296  opened by  Jim Sizelove

Bugs Closed
___________

smtplib and email.py  (2005-08-03)
       http://python.org/sf/1251528  closed by  rhettinger

float('-inf')  (2005-08-09)
       http://python.org/sf/1255395  closed by  tjreedy

Make set.remove() behave more like Set.remove()  (2005-08-12)
       http://python.org/sf/1257731  closed by  rhettinger

gen_send_ex: Assertion `f->f_back !  (2005-08-12)
       http://python.org/sf/1257960  closed by  pje

IOError after normal write  (2005-08-04)
       http://python.org/sf/1252149  closed by  tim_one

IOError after normal write  (2005-08-04)
       http://python.org/sf/1252149  deleted by  patrick_gerken

"it's" vs. "its" typo in Language Reference  (2005-08-14)
       http://python.org/sf/1258922  closed by  birkenfeld

hotshot.stats.load  (2004-02-19)
       http://python.org/sf/900092  closed by  bwarsaw

typo on "SimpleXMLRPCServer Objects"  (2005-08-18)
       http://python.org/sf/1263086  closed by  doerwalter

PyArg_ParseTupleAndKeywords doesn't handle I format correctl  (2005-08-19)
       http://python.org/sf/1264168  closed by  birkenfeld

PEP 8 uses wrong raise syntax  (2005-08-20)
       http://python.org/sf/1264666  closed by  goodger

list(obj) can swallow KeyboardInterrupt  (2005-07-21)
       http://python.org/sf/1242657  closed by  rhettinger

container methods raise KeyError not IndexError  (2005-08-01)
       http://python.org/sf/1249837  closed by  rhettinger

zip incorrectly and incompletely documented  (2005-02-12)
       http://python.org/sf/1121416  closed by  rhettinger

bz2 RuntimeError when decompressing file  (2005-04-27)
       http://python.org/sf/1191043  closed by  birkenfeld

lexists() is not exported from os.path  (2005-08-22)
       http://python.org/sf/1266283  closed by  birkenfeld

RFE Closed
__________

md5 and sha1 modules should use openssl implementation  (2004-06-30)
       http://python.org/sf/983069  closed by  greg


From nas at arctrix.com  Mon Aug 22 23:31:42 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Aug 2005 15:31:42 -0600
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode strings
Message-ID: <20050822213142.GA5702@mems-exchange.org>

[Please mail followups to python-dev at python.org.]

The PEP has been rewritten based on a suggestion by Guido to change
str() rather than adding a new built-in function.  Based on my
testing, I believe the idea is feasible.  It would be helpful if
people could test the patched Python with their own applications and
report any incompatibilities.


PEP: 349
Title: Allow str() to return unicode strings
Version: $Revision: 1.3 $
Last-Modified: $Date: 2005/08/22 21:12:08 $
Author: Neil Schemenauer <nas at arctrix.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 02-Aug-2005
Post-History: 06-Aug-2005
Python-Version: 2.5


Abstract

    This PEP proposes to change the str() built-in function so that it
    can return unicode strings.  This change would make it easier to
    write code that works with either string type and would also make
    some existing code handle unicode strings.  The C function
    PyObject_Str() would remain unchanged and the function
    PyString_New() would be added instead.


Rationale

    Python has had a Unicode string type for some time now but use of
    it is not yet widespread.  There is a large amount of Python code
    that assumes that string data is represented as str instances.
    The long term plan for Python is to phase out the str type and use
    unicode for all string data.  Clearly, a smooth migration path
    must be provided.

    We need to upgrade existing libraries, written for str instances,
    to be made capable of operating in an all-unicode string world.
    We can't change to an all-unicode world until all essential
    libraries are made capable for it.  Upgrading the libraries in one
    shot does not seem feasible.  A more realistic strategy is to
    individually make the libraries capable of operating on unicode
    strings while preserving their current all-str environment
    behaviour.

    First, we need to be able to write code that can accept unicode
    instances without attempting to coerce them to str instances.  Let
    us label such code as Unicode-safe.  Unicode-safe libraries can be
    used in an all-unicode world.

    Second, we need to be able to write code that, when provided only
    str instances, will not create unicode results.  Let us label such
    code as str-stable.  Libraries that are str-stable can be used by
    libraries and applications that are not yet Unicode-safe.
    
    Sometimes it is simple to write code that is both str-stable and
    Unicode-safe.  For example, the following function just works:

        def appendx(s):
            return s + 'x'

    That's not too surprising since the unicode type is designed to
    make the task easier.  The principle is that when str and unicode
    instances meet, the result is a unicode instance.  One notable
    difficulty arises when code requires a string representation of an
    object; an operation traditionally accomplished by using the str()
    built-in function.
    
    Using the current str() function makes the code not Unicode-safe.
    Replacing a str() call with a unicode() call makes the code not
    str-stable.  Changing str() so that it could return unicode
    instances would solve this problem.  As a further benefit, some code
    that is currently not Unicode-safe because it uses str() would
    become Unicode-safe.


Specification

    A Python implementation of the str() built-in follows:

        def str(s):
            """Return a nice string representation of the object.  The
            return value is a str or unicode instance.
            """
            if type(s) is str or type(s) is unicode:
                return s
            r = s.__str__()
            if not isinstance(r, (str, unicode)):
                raise TypeError('__str__ returned non-string')
            return r
            
    The following function would be added to the C API and would be
    the equivalent of the str() built-in (ideally it would be called
    PyObject_Str, but changing that function could cause a massive
    number of compatibility problems):

        PyObject *PyString_New(PyObject *);

    A reference implementation is available on Sourceforge [1] as a
    patch.
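
    The Python-level behaviour specified above can be tried out without
    patching the interpreter.  The following is only a sketch: the
    names proposed_str, Label and describe are hypothetical, and the
    try/except shim exists solely so that the same file also runs on
    interpreters that have no separate unicode type.

```python
try:
    unicode_type = unicode  # interpreters with a separate unicode type
except NameError:
    unicode_type = str      # interpreters where str is already unicode


def proposed_str(obj):
    """Stand-in for the proposed str(): may return str or unicode."""
    if type(obj) is str or type(obj) is unicode_type:
        return obj
    result = obj.__str__()
    if not isinstance(result, (str, unicode_type)):
        raise TypeError('__str__ returned non-string')
    return result


class Label(object):
    """Hypothetical object whose __str__ returns whatever text it holds."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


def describe(obj):
    # str-stable: str input gives a str result.
    # Unicode-safe: unicode text is passed through, never coerced to str.
    return proposed_str(obj) + ' (label)'


print(repr(describe(Label('plain'))))
print(repr(describe(Label(u'Zo\xeb'))))
```

    A function like describe() is both str-stable and Unicode-safe
    under the proposed semantics, which is the combination the
    Rationale asks for.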

                
Backwards Compatibility

    Some code may require that str() returns a str instance.  In the
    standard library, only one such case has been found so far.  The
    function email.header_decode() requires a str instance and the
    email.Header.decode_header() function tries to ensure this by
    calling str() on its argument.  The code was fixed by changing
    the line "header = str(header)" to:

        if isinstance(header, unicode):
            header = header.encode('ascii')

    Whether this is truly a bug is questionable since decode_header()
    really operates on byte strings, not character strings.  Code that
    passes it a unicode instance could itself be considered buggy.
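
    The same guard can be written as a small helper.  This is a sketch
    only: as_byte_header is a hypothetical name, not a function in the
    email package, and the shim at the top is there so the example
    also runs where no separate unicode type exists.

```python
try:
    text_type = unicode  # interpreters with a separate unicode type
except NameError:
    text_type = str      # interpreters where str is already unicode


def as_byte_header(header):
    """Return header as a byte string, encoding unicode input as ASCII.

    Byte strings are returned unchanged.  Non-ASCII unicode input
    raises UnicodeEncodeError instead of being silently mangled.
    """
    if isinstance(header, text_type):
        return header.encode('ascii')
    return header


print(repr(as_byte_header(u'Subject: hello')))
```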


Alternative Solutions

    A new built-in function could be added instead of changing str().
    Doing so would introduce virtually no backwards compatibility
    problems.  However, since the compatibility problems are expected
    to be rare, changing str() seems preferable to adding a new
    built-in.

    The basestring type could be changed to have the proposed behaviour,
    rather than changing str().  However, that would be confusing
    behaviour for an abstract base type.


References

    [1] http://www.python.org/sf/1266570


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

From martin at v.loewis.de  Mon Aug 22 23:47:02 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Aug 2005 23:47:02 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <5.1.1.6.0.20050822124128.01b0bd10@mail.telecommunity.com>
References: <3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net>
	<43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
	<3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net>
	<5.1.1.6.0.20050822124128.01b0bd10@mail.telecommunity.com>
Message-ID: <430A47D6.70704@v.loewis.de>

Phillip J. Eby wrote:
> You can do that with viewcvs, too.  Viewcvs can also create tarballs for
> easy downloading, and has a lot of browsing and viewing options that the
> SVN webdav mode doesn't.

True. I had some issues with viewcvs, though: you cannot provide access
control easily, as you cannot force it to slash-separated mode; it also
couldn't fetch the history across renames. These may have been fixed
meanwhile, of course.

Regards,
Martin

From martin at v.loewis.de  Mon Aug 22 23:57:11 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 22 Aug 2005 23:57:11 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <17162.155.1546.1991@montanaro.dyndns.org>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
	<17161.62233.743239.89277@montanaro.dyndns.org>
	<4309FB5A.1040201@v.loewis.de>
	<17162.155.1546.1991@montanaro.dyndns.org>
Message-ID: <430A4A37.1060808@v.loewis.de>

skip at pobox.com wrote:
> It worked.  I made a trivial change to Misc/NEWS and checked it in.  I then
> ran "svn blame NEWS" to see what it showed.  This took approximately
> forever.  Can I assume this is one thing svn is always going to be pretty
> slow at? 

Yes. Somebody commented that this is quadratic in svn with the number of
revisions, whereas it is linear in CVS. Please try it on some other
file; Misc/NEWS is probably the worst case in the Python repository.

I don't know whether there is any better way; we should perhaps ask
on the svn users list.

> I notice that you use my real name (including spaces).  I doubt we have any
> code that munches on annotated listings, but it seems that for the sake of
> script writers' sanity it would be better to elide spaces or replace them
> with underscores so the annotated user is a single "word":

That would be easy to do. For consistency, should we use
<lower case first name>.<lower case last name> (with the usual
exceptions 'aahz', 'guido.van.rossum', 'martin.v.loewis')?

As for parsing these things: they also show up in 'svn log'.

Regards,
Martin

From db3l at fitlinxx.com  Tue Aug 23 03:06:35 2005
From: db3l at fitlinxx.com (David Bolen)
Date: 22 Aug 2005 21:06:35 -0400
Subject: [Python-Dev] Admin access using svn+ssh
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
	<17161.62233.743239.89277@montanaro.dyndns.org>
	<4309FB5A.1040201@v.loewis.de>
	<17162.155.1546.1991@montanaro.dyndns.org>
	<430A4A37.1060808@v.loewis.de>
Message-ID: <upss55un8.fsf@fitlinxx.com>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> skip at pobox.com wrote:
> > It worked.  I made a trivial change to Misc/NEWS and checked it in.  I then
> > ran "svn blame NEWS" to see what it showed.  This took approximately
> > forever.  Can I assume this is one thing svn is always going to be pretty
> > slow at? 
> 
> Yes. Somebody commented that this is quadratic in svn with the number of
> revisions, whereas it is linear in CVS. Please try it on some other
> file; Misc/NEWS is probably the worst case in the Python repository.
> 
> I don't know whether there is any better way; we should perhaps ask
> on the svn users list.

One improvement, if you're looking for a fairly recent change, is to
bound the blame command with a revision range (I find a date up to
HEAD easiest).  You'll miss annotations on lines which were last
touched prior to the selected range, but it can definitely speed
things up.

On a file like News, even if you're generous (say take the last year)
it would probably be noticeably faster than letting svn go back to
revision 1.
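
As a rough illustration of the bounded blame suggested above, here is a
small sketch that builds the command line.  The helper name
bounded_blame_command and the example date are made up for illustration;
the only thing taken from Subversion itself is the {DATE}:HEAD form of
the -r revision range.

```python
def bounded_blame_command(path, since_date):
    """Build an 'svn blame' argument list limited to since_date..HEAD.

    Hypothetical helper: it only constructs the command, it does not run it.
    """
    # {YYYY-MM-DD} is Subversion's date-based revision specifier.
    revision_range = "{%s}:HEAD" % since_date
    return ["svn", "blame", "-r", revision_range, path]


# Example: annotate Misc/NEWS using roughly the last year of history.
print(" ".join(bounded_blame_command("Misc/NEWS", "2004-08-23")))
```

Lines last touched before the chosen date will not be attributed to
their real author, which is the trade-off mentioned above.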

-- David


From greg.ewing at canterbury.ac.nz  Tue Aug 23 05:48:27 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Aug 2005 15:48:27 +1200
Subject: [Python-Dev] On decorators implementation
In-Reply-To: <430988FE.2020603@libero.it>
References: <43084AE9.20900@libero.it> <430988FE.2020603@libero.it>
Message-ID: <430A9C8B.9010704@canterbury.ac.nz>

Paolino wrote:

> Maybe it's possible to let the decorator know the method class even if 
> the class is still undefined.(Just like recursive functions?)

No, it's not possible. The situation is not the same. With
recursive functions, both functions are defined before
either of them is called. But decorators in a class body
are executed before the surrounding class even exists.
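
A minimal sketch of that ordering, written for a current Python
interpreter; the names demo, record and Widget are invented for the
example.  It simply logs when each step happens:

```python
def demo():
    events = []

    def record(func):
        # This runs while the class body is still being executed,
        # so no Widget class object exists yet at this point.
        events.append("decorating " + func.__name__)
        return func

    class Widget(object):
        events.append("class body starts")

        @record
        def method(self):
            return 1

        events.append("class body ends")

    # Only now has the class statement finished and bound the name.
    events.append("class exists: " + Widget.__name__)
    return events


for line in demo():
    print(line)
```

The decorator call is logged between the start and end of the class
body, before the class object is created.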

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From paragate at gmx.net  Tue Aug 23 10:46:36 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 23 Aug 2005 10:46:36 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
In-Reply-To: <20050822213142.GA5702@mems-exchange.org>
References: <20050822213142.GA5702@mems-exchange.org>
Message-ID: <op.svydbyh20gn541@theta>

neil,

i just intended to worry that returning a unicode object from ``str()``
would break assumptions about the way that 'type definers' like
``str()``, ``int()``, ``float()`` and so on work, but i quickly
realized that e.g. ``int()`` does return a long where appropriate!
since the principle works there one may surmise it will also work for
``str()`` in the long run.

one point i don't seem to understand right now is why it says in the
function definition::

     if type(s) is str or type(s) is unicode:
         ...

instead of using ``isinstance()``.

Testing for ``type()`` means that instances of derived classes (that
may or may not change nothing or almost nothing to the underlying
class) when passed to a function that uses ``str()`` will behave in a
different way!

isn't it more realistic and commonplace to assume that derivatives of a
class do fulfill the requirements of the underlying class? -- which may
turn out to be wrong! but still...

the code as it stands means i have to remember that *in this special
case only* (when deriving from ``unicode``), i have to add a
``__str__()`` method myself that simply returns ``self``.

then of course, one could change ``unicode.__str__()`` to return
``self``, itself, which should work. but then, why so complicated?

i suggest to change said line to::

     if isinstance( s, ( str, unicode ) ):
         ...

any objections?

_wolf
-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

From martin at v.loewis.de  Tue Aug 23 12:03:06 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 Aug 2005 12:03:06 +0200
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <1124728871.17084.13.camel@geddy.wooz.org>
References: <43087DA0.702@v.loewis.de>	<1124665281.31664.28.camel@geddy.wooz.org>	<43096E37.8070708@v.loewis.de>	<17161.60617.950268.641009@montanaro.dyndns.org>	<1124724730.17082.8.camel@geddy.wooz.org>	<4309FA58.4080103@v.loewis.de>
	<1124728871.17084.13.camel@geddy.wooz.org>
Message-ID: <430AF45A.1090506@v.loewis.de>

Barry Warsaw wrote:
>>Not sure what "the right place" would be: pythondev at python.org?
>>I think the email could look any way we want it to look.
> 
> 
> I think it should be <username>@python.org where <username> is the
> firstname.lastname (with some exceptions) scheme that we've agreed on. 
> I actually /don't/ want all commits to look like they're coming from
> pythondev at python.org

Ok, I have now changed all user names for the python repository to
firstname.lastname. That should allow them to be used in the From:
fields of commit email.

Regards,
Martin


From theller at python.net  Tue Aug 23 12:19:11 2005
From: theller at python.net (Thomas Heller)
Date: Tue, 23 Aug 2005 12:19:11 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
References: <20050822213142.GA5702@mems-exchange.org>
Message-ID: <wtmdot0g.fsf@python.net>

Neil Schemenauer <nas at arctrix.com> writes:

> [Please mail followups to python-dev at python.org.]
>
> The PEP has been rewritten based on a suggestion by Guido to change
> str() rather than adding a new built-in function.  Based on my
> testing, I believe the idea is feasible.  It would be helpful if
> people could test the patched Python with their own applications and
> report any incompatibilities.
>

I like the fact that currently unicode(x) is guaranteed to return a
unicode instance, or raises a UnicodeDecodeError.  Same for str(x),
which is guaranteed to return a (byte) string instance or raise an
error.

Wouldn't also a new function make the intent clearer?

So I think I'm +1 on the text() built-in, and -0 on changing str.

Thomas


From martin at v.loewis.de  Tue Aug 23 12:38:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 Aug 2005 12:38:05 +0200
Subject: [Python-Dev] Subversion instructions
Message-ID: <430AFC8D.6020000@v.loewis.de>

As some people have been struggling with svn+ssh, I wrote
a few instructions at

http://www.python.org/dev/svn.html

The main issues people have been struggling with are:

- you really should use an agent, or else you have to
  type the private key passphrase three times on checkout

- on windows, putty works fine, but you really should use
  the agent (pageant), or else plink might not find your
  key. Also, if you use Putty profiles, make sure to add
  the user name (pythondev) into the profile

- we need SSH2 keys; SSH1 is disabled on svn.python.org.
  Some of you had been using SSH1 keys on sf.net all these
  years; you will need to generate SSH2 keys.

Regards,
Martin

From mal at egenix.com  Tue Aug 23 12:39:03 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 23 Aug 2005 12:39:03 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return
	unicode	strings
In-Reply-To: <wtmdot0g.fsf@python.net>
References: <20050822213142.GA5702@mems-exchange.org> <wtmdot0g.fsf@python.net>
Message-ID: <430AFCC7.9030402@egenix.com>

Thomas Heller wrote:
> Neil Schemenauer <nas at arctrix.com> writes:
> 
> 
>>[Please mail followups to python-dev at python.org.]
>>
>>The PEP has been rewritten based on a suggestion by Guido to change
>>str() rather than adding a new built-in function.  Based on my
>>testing, I believe the idea is feasible.  It would be helpful if
>>people could test the patched Python with their own applications and
>>report any incompatibilities.
>>
> 
> 
> I like the fact that currently unicode(x) is guaranteed to return a
> unicode instance, or raises a UnicodeDecodeError.  Same for str(x),
> which is guaranteed to return a (byte) string instance or raise an
> error.
> 
> Wouldn't also a new function make the intent clearer?
> 
> So I think I'm +1 on the text() built-in, and -0 on changing str.

Same here.

A new API would also help make the transition easier from the
current mixed data/text type (strings) to data-only (bytes)
and text-only (text, renamed from unicode) in Py3.0.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 23 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From p.f.moore at gmail.com  Tue Aug 23 12:41:11 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 23 Aug 2005 11:41:11 +0100
Subject: [Python-Dev] Admin access using svn+ssh
In-Reply-To: <4309FBE5.40204@v.loewis.de>
References: <43087DA0.702@v.loewis.de>
	<1124665281.31664.28.camel@geddy.wooz.org>
	<43096E37.8070708@v.loewis.de>
	<17161.60617.950268.641009@montanaro.dyndns.org>
	<1124724730.17082.8.camel@geddy.wooz.org>
	<3F040472-CADF-4F47-8EFD-9B1267C8D0C9@fuhm.net>
	<4309FBE5.40204@v.loewis.de>
Message-ID: <79990c6b05082303415114abaf@mail.gmail.com>

On 8/22/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> James Y Knight wrote:
> > It seems a waste to use SVN's webdav support just for anon access.
> > The svnserve method works well for anon access. The only reason to
> > use svn webdav IMO is if you want to use that for authenticated
> > access. But since you're talking about using svn+ssh for that..
> 
> It has the advantage that we can easily point people to files
> with a web browser; they don't need an svn client.

It also allows anonymous svn checkouts for people behind firewalls
that only allow HTTP through.

Paul.

From paragate at gmx.net  Tue Aug 23 14:59:28 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 23 Aug 2005 14:59:28 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return
	unicode	strings
In-Reply-To: <430AFCC7.9030402@egenix.com>
References: <20050822213142.GA5702@mems-exchange.org>
	<wtmdot0g.fsf@python.net> <430AFCC7.9030402@egenix.com>
Message-ID: <op.svyo1ero0gn541@theta>


just tested the proposed implementation on a unicode-naive module
basically using

import sys	
import __builtin__
reload( sys ); sys.setdefaultencoding( 'utf-8' )
__builtin__.__dict__[ 'str' ] = new_str_function

et voilà, str() calls in the module are rewritten, and
print u'düsseldorf' does work as expected(*) (even on
systems where i have no access to sitecustomize, like
at my python-friendly isp's servers).

---
* my expectation is that unicode strings do print out
   as utf-8, as i can't see any better solution.

i suggest to make this option available e.g. via a module in
the standard lib to ease transition for people in case the pep
doesn't make it. it may be applied where deemed necessary and
left ignored otherwise.

if nobody thinks the reload hack is too awful and this solution
stands testing, i guess i'll post it to the aspn cookbook. after
all these countless hours of hunting down ordinal not in range,
finally i'm starting to see some light in the issue.

_wolf


On Tue, 23 Aug 2005 12:39:03 +0200, M.-A. Lemburg <mal at egenix.com> wrote:

> Thomas Heller wrote:
>> Neil Schemenauer <nas at arctrix.com> writes:
>>
>>
>>> [Please mail followups to python-dev at python.org.]
>>>
>>> The PEP has been rewritten based on a suggestion by Guido to change
>>> str() rather than adding a new built-in function.  Based on my
>>> testing, I believe the idea is feasible.  It would be helpful if
>>> people could test the patched Python with their own applications and
>>> report any incompatibilities.
>>>
>>
>>
>> I like the fact that currently unicode(x) is guaranteed to return a
>> unicode instance, or raises a UnicodeDecodeError.  Same for str(x),
>> which is guaranteed to return a (byte) string instance or raise an
>> error.
>>
>> Wouldn't also a new function make the intent clearer?
>>
>> So I think I'm +1 on the text() built-in, and -0 on changing str.
>
> Same here.
>
> A new API would also help make the transition easier from the
> current mixed data/text type (strings) to data-only (bytes)
> and text-only (text, renamed from unicode) in Py3.0.
>


-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

From raymond.hettinger at verizon.net  Tue Aug 23 16:11:56 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 10:11:56 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
 _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <20050821184613.A45C11E4288@bag.python.org>
Message-ID: <000601c5a7ec$9f614680$8901a044@oemcomputer>

This patch should be reverted or fixed so that the Py2.5 build works
again.

It contains a disastrous search and replace error that prevents it from
compiling.  Hence, it couldn't have passed the test suite before being
checked in.  

Also, all of the project and config files need to be updated for the new
modules.


> -----Original Message-----
> From: python-checkins-bounces at python.org [mailto:python-checkins-
> bounces at python.org] On Behalf Of greg at users.sourceforge.net
> Sent: Sunday, August 21, 2005 2:46 PM
> To: python-checkins at python.org
> Subject: [Python-checkins] python/dist/src/Modules _hashopenssl.c,
> NONE,2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,2.1
> md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
> 
> Update of /cvsroot/python/python/dist/src/Modules
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32064/Modules
> 
> Modified Files:
> 	md5module.c shamodule.c
> Added Files:
> 	_hashopenssl.c sha256module.c sha512module.c
> Log Message:
> [ sf.net patch # 1121611 ]
> 
> A new hashlib module to replace the md5 and sha modules.  It adds
> support for additional secure hashes such as SHA-256 and SHA-512.  The
> hashlib module uses OpenSSL for fast platform optimized
> implementations of algorithms when available.  The old md5 and sha
> modules still exist as wrappers around hashlib to preserve backwards
> compatibility.


From mwh at python.net  Tue Aug 23 16:31:55 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 23 Aug 2005 15:31:55 +0100
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
 _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <000601c5a7ec$9f614680$8901a044@oemcomputer> (Raymond
	Hettinger's message of "Tue, 23 Aug 2005 10:11:56 -0400")
References: <000601c5a7ec$9f614680$8901a044@oemcomputer>
Message-ID: <2mvf1wsp0k.fsf@starship.python.net>

"Raymond Hettinger" <raymond.hettinger at verizon.net> writes:

> This patch should be reverted or fixed so that the Py2.5 build works
> again.
>
> It contains a disastrous search and replace error that prevents it from
> compiling.  Hence, it couldn't have passed the test suite before being
> checked in.  

It works for me, on OS X.  Passes the test suite, even.  I presume
you're on Windows of some kind?

> Also, all of the project and config files need to be updated for the new
> modules.

Well, yes.  But if Greg is on some unix-a-like, he can only update the
unix build files (which he has done; it's in setup.py).

Cheers,
mwh

-- 
 <cube> If you are anal, and you love to be right all the time, C++
   gives you a multitude of mostly untimportant details to fret about
   so you can feel good about yourself for getting them "right", 
   while missing the big picture entirely       -- from Twisted.Quotes

From mwh at python.net  Tue Aug 23 16:33:04 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 23 Aug 2005 15:33:04 +0100
Subject: [Python-Dev] PEP 342 Implementation
In-Reply-To: <000001c5991d$e40bb140$12b62c81@oemcomputer> (Raymond
	Hettinger's message of "Thu, 04 Aug 2005 13:56:50 -0400")
References: <000001c5991d$e40bb140$12b62c81@oemcomputer>
Message-ID: <2mr7cksoyn.fsf@starship.python.net>

"Raymond Hettinger" <raymond.hettinger at verizon.net> writes:

> Could someone please make an independent check to verify an issue with
> the 342 checkin.  The test suite passes but when I run IDLE and open a
> new window (using Control-N), it crashes and burns.
>
> The problem does not occur just before the checkin:
>     cvs up -D "2005-08-01 18:00"
> But emerges immediately after:
>     cvs up -D "2005-08-01 21:00"

Is this still happening?  I'm not seeing any unusual flakiness, but
then I can't run IDLE (OS X, no Tk).

It's not exactly a minimal test case :)

Cheers,
mwh

-- 
  A difference which makes no difference is no difference at all.
                        -- William James (I think.  Reference anyone?)

From raymond.hettinger at verizon.net  Tue Aug 23 17:03:12 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 11:03:12 -0400
Subject: [Python-Dev] PEP 342 Implementation
In-Reply-To: <2mr7cksoyn.fsf@starship.python.net>
Message-ID: <000001c5a7f3$c8e0e2c0$8901a044@oemcomputer>

[Raymond Hettinger]
> 
> > Could someone please make an independent check to verify an issue
> > with the 342 checkin.  The test suite passes but when I run IDLE and
> > open a new window (using Control-N), it crashes and burns.
> >
> > The problem does not occur just before the checkin:
> >     cvs up -D "2005-08-01 18:00"
> > But emerges immediately after:
> >     cvs up -D "2005-08-01 21:00"
> 
> Is this still happening?  I'm not seeing any unusual flakiness, but
> then I can't run IDLE (OS X, no Tk).


Yes, it is still happening.
No one has yet offered an independent confirmation.


> It's not exactly a minimal test case :)

Right ;-)

Once narrowed down, the problem and solution will likely be obvious.


Raymond


From fredrik at pythonware.com  Tue Aug 23 16:58:53 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 23 Aug 2005 16:58:53 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
References: <20050822213142.GA5702@mems-exchange.org>
Message-ID: <defdj9$m3q$1@sea.gmane.org>

Neil Schemenauer wrote:

> The PEP has been rewritten based on a suggestion by Guido to change
> str() rather than adding a new built-in function.  Based on my testing, I
> believe the idea is feasible.

note that this breaks chapter 3 of the tutorial:

http://docs.python.org/tut/node5.html#SECTION005130000000000000000

where str() is first introduced.

</F> 


From raymond.hettinger at verizon.net  Tue Aug 23 17:16:11 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 11:16:11 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
 _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <2mvf1wsp0k.fsf@starship.python.net>
Message-ID: <000101c5a7f5$989d5100$8901a044@oemcomputer>

[Raymond Hettinger] 
> > This patch should be reverted or fixed so that the Py2.5 build works
> > again.
> >
> > It contains a disastrous search and replace error that prevents it
> > from compiling.  Hence, it couldn't have passed the test suite
> > before being checked in.

[Michael Hudson]
> It works for me, on OS X.  Passes the test suite, even.  I presume
> you're on Windows of some kind?


Here's an excerpt from the check-in note for sha512module.c:

RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98d728ae22ULL);
RND(S[7],S[0],S[1],S[2],S[3],S[4],S[5],S[6],1,0x7137449123ef65cdULL);
RND(S[6],S[7],S[0],S[1],S[2],S[3],S[4],S[5],2,0xb5c0fbcfec4d3b2fULL);
RND(S[5],S[6],S[7],S[0],S[1],S[2],S[3],S[4],3,0xe9b5dba58189dbbcULL);
RND(S[4],S[5],S[6],S[7],S[0],S[1],S[2],S[3],4,0x3956c25bf348b538ULL);

Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL"
that Mr. Gates and the ANSI C folks have yet to accept ;-)  

If it works for you, then it probably means that sha512module.c was left
out of the build.  Maybe sha512module.c wasn't supposed to be checked
in?


> > Also, all of the project and config files need to be updated for
> > the new modules.
> 
> Well, yes.  But if Greg is on some unix-a-like, he can only update the
> unix build files (which he has done; it's in setup.py).

The project files are just text files and can be updated simply and
directly.  But yes, that is no big deal and I'll just do it for him once
the code gets to a compilable state.

Aside from the project files, there is still config.c and whatnot.  We
should put together a checklist of all the things that need to be
updated when a new module is added.


Raymond


From nas at arctrix.com  Tue Aug 23 17:21:57 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 23 Aug 2005 09:21:57 -0600
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
In-Reply-To: <op.svydbyh20gn541@theta>
References: <20050822213142.GA5702@mems-exchange.org> <op.svydbyh20gn541@theta>
Message-ID: <20050823152156.GA7839@mems-exchange.org>

On Tue, Aug 23, 2005 at 10:46:36AM +0200, Wolfgang Lipp wrote:
> one point i don't seem to understand right now is why it says in the
> function definition::
> 
>      if type(s) is str or type(s) is unicode:
>          ...
> 
> instead of using ``isinstance()``.

I don't think isinstance() would be okay.  That test is meant as an
optimization to avoid calling __str__ on str and unicode instances.
Subclasses should still have their __str__ method called otherwise
they cannot override it.
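
To make the difference concrete, here is a small sketch.  It is written
for a current Python with a single str type standing in for the 2.x
str/unicode pair, and the names Shout, convert_exact and
convert_isinstance are invented for the example:

```python
class Shout(str):
    """A str subclass that overrides __str__."""

    def __str__(self):
        return self.upper()


def convert_exact(s):
    # Fast path only for exact str instances; subclasses fall
    # through, so their __str__ override is honoured.
    if type(s) is str:
        return s
    return s.__str__()


def convert_isinstance(s):
    # The isinstance() variant short-circuits for subclasses too,
    # so an overridden __str__ is never called.
    if isinstance(s, str):
        return s
    return s.__str__()


print(convert_exact(Shout("hi")))
print(convert_isinstance(Shout("hi")))
```

With the type() test the subclass gets its say; with isinstance() the
object is handed back untouched and the override is silently skipped.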

> the code as it stands means i have to remember that *in this special
> case only* (when deriving from ``unicode``), i have to add a
> ``__str__()`` method myself that simply returns ``self``.

Ah, I see that unicode.__str__ returns a str instance.

> then of course, one could change ``unicode.__str__()`` to return
> ``self``, itself, which should work. but then, why so complicated?

I think that may be the right fix.

  Neil

From gmccaughan at synaptics-uk.com  Tue Aug 23 17:32:29 2005
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Tue, 23 Aug 2005 16:32:29 +0100
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
	_hashopenssl.c, NONE, 2.1 sha256module.c, NONE,
	2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35,
	2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <000101c5a7f5$989d5100$8901a044@oemcomputer>
References: <000101c5a7f5$989d5100$8901a044@oemcomputer>
Message-ID: <200508231632.30175.gmccaughan@synaptics-uk.com>

> Here's an excerpt from the check-in note for sha512module.c:
> 
>  
> RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98d728ae22ULL);
> RND(S[7],S[0],S[1],S[2],S[3],S[4],S[5],S[6],1,0x7137449123ef65cdULL);
> RND(S[6],S[7],S[0],S[1],S[2],S[3],S[4],S[5],2,0xb5c0fbcfec4d3b2fULL);
> RND(S[5],S[6],S[7],S[0],S[1],S[2],S[3],S[4],3,0xe9b5dba58189dbbcULL);
> RND(S[4],S[5],S[6],S[7],S[0],S[1],S[2],S[3],4,0x3956c25bf348b538ULL);
> 
> Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL"
> that Mr. Gates and the ANSI C folks have yet to accept ;-)  

It's valid C99, meaning "this is an unsigned long long".
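
A quick way to see why a 64-bit literal is needed at all is to check the
size of the constants from the excerpt.  This is only a sketch in Python
(the helper fits_unsigned is made up here), not a statement about what
any particular compiler accepts:

```python
# The five round constants quoted from sha512module.c above.
ROUND_CONSTANTS = [
    0x428a2f98d728ae22,
    0x7137449123ef65cd,
    0xb5c0fbcfec4d3b2f,
    0xe9b5dba58189dbbc,
    0x3956c25bf348b538,
]


def fits_unsigned(value, bits):
    """Return True if value is representable as an unsigned integer of this width."""
    return 0 <= value < (1 << bits)


for constant in ROUND_CONSTANTS:
    print(hex(constant), fits_unsigned(constant, 32), fits_unsigned(constant, 64))
```

None of them fits in 32 bits, and all of them fit in 64, so the code
depends on some 64-bit integer type and a way to spell its literals.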

-- 
g


From pje at telecommunity.com  Tue Aug 23 17:43:02 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 11:43:02 -0400
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
 strings
In-Reply-To: <20050823152156.GA7839@mems-exchange.org>
References: <op.svydbyh20gn541@theta> <20050822213142.GA5702@mems-exchange.org>
	<op.svydbyh20gn541@theta>
Message-ID: <5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com>

At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote:
> > then of course, one could change ``unicode.__str__()`` to return
> > ``self``, itself, which should work. but then, why so complicated?
>
>I think that may be the right fix.

No, it isn't.  Right now str(u"x") coerces the unicode object to a string, 
so changing this will be backwards-incompatible with any existing programs.

I think the new builtin is actually the right way to go for both 2.x and 
3.x Pythons.  i.e., text() would be a builtin in 2.x, along with a new 
bytes() type, and in 3.x text() could replace the basestring, str and 
unicode types.

I also think that the text() constructor should have a signature of 
'text(ob,encoding="ascii")'.  In the default case, strings can be returned 
by text() as long as they are pure ASCII (making the code str-stable *and* 
unicode-safe).  In the non-default case, a unicode object should always be 
returned, making the code unicode-safe but not str-stable.  Allowing text() 
to return 8-bit strings would be an obvious violation of its name: it's for 
text, not bytes.
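
One possible reading of that signature, sketched for a current Python
where bytes stands in for the 2.x 8-bit str and str stands in for
unicode.  The handling of non-ASCII 8-bit input in the default case
(raising an error) and the fallback for other objects are assumptions,
not something spelled out above:

```python
def text(ob, encoding="ascii"):
    """Sketch of the proposed text() built-in (interpretation, not a spec)."""
    if isinstance(ob, str):
        # Already a unicode object: return it unchanged.
        return ob
    if isinstance(ob, bytes):
        # Decoding validates the data; it raises UnicodeDecodeError
        # if the bytes are not valid in the given encoding.
        decoded = ob.decode(encoding)
        if encoding == "ascii":
            # Default case: pure-ASCII 8-bit strings pass through as-is.
            return ob
        # Non-default case: always hand back a unicode object.
        return decoded
    # Assumption: any other object goes through its normal conversion.
    return str(ob)


print(repr(text(b"abc")))
print(repr(text(b"caf\xc3\xa9", "utf-8")))
```

In this reading, text() never returns an 8-bit string that contains
non-ASCII bytes.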


From paragate at gmx.net  Tue Aug 23 17:45:27 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 23 Aug 2005 17:45:27 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
In-Reply-To: <20050823152156.GA7839@mems-exchange.org>
References: <20050822213142.GA5702@mems-exchange.org> <op.svydbyh20gn541@theta>
	<20050823152156.GA7839@mems-exchange.org>
Message-ID: <op.svywp11f0gn541@theta>


i have to revise my last posting -- exporting the new ``str``
pure-python implementation breaks -- of course! -- as soon
as ``isinstance(x,str)`` [sic] is used. right now it breaks
because you can't have a function as the second argument of
``isinstance()``, but even if that could be avoided by canny
programming, the fact remains that any object derived from
e.g. a string literal will still be constructed from the
underlying implementation and can't therefore be an instance
of the old ``str``. also, ``str.__bases__`` is not extendable
(it's a tuple) and not replaceable (it's a built-in), so there
seems to be no way to get near a truly working solution except
with C-level patches.
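
The first failure is easy to reproduce.  A minimal sketch on a current
Python, where new_str is a hypothetical stand-in for the pure-Python
replacement and isinstance_accepts is a helper made up for the example:

```python
def new_str(obj):
    """Hypothetical pure-Python replacement for the built-in str."""
    return obj.__str__()


def isinstance_accepts(candidate):
    """Return True if candidate is usable as isinstance()'s second argument."""
    try:
        isinstance("abc", candidate)
    except TypeError:
        # A plain function is not a class, so isinstance() rejects it.
        return False
    return True


print(isinstance_accepts(str))
print(isinstance_accepts(new_str))
```

Any module that does isinstance(x, str) after the name has been rebound
to a function will therefore fail, regardless of what the function
returns.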


On Tue, 23 Aug 2005 17:21:57 +0200, Neil Schemenauer <nas at arctrix.com>  
wrote:

> I don't think isinstance() would be okay.  That test is meant as an
> optimization to avoid calling __str__ on str and unicode instances.
> Subclasses should still have their __str__ method called otherwise
> they cannot override it.

makes perfect sense, i'll change the line back.

_wolf

From mwh at python.net  Tue Aug 23 17:44:56 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 23 Aug 2005 16:44:56 +0100
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
 _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <000101c5a7f5$989d5100$8901a044@oemcomputer> (Raymond
	Hettinger's message of "Tue, 23 Aug 2005 11:16:11 -0400")
References: <000101c5a7f5$989d5100$8901a044@oemcomputer>
Message-ID: <2mmzn8slmv.fsf@starship.python.net>

"Raymond Hettinger" <raymond.hettinger at verizon.net> writes:

> [Raymond Hettinger] 
>> > This patch should be reverted or fixed so that the Py2.5 build works
>> > again.
>> >
>> > It contains a disastrous search and replace error that prevents it
>> > from compiling.  Hence, it couldn't have passed the test suite
>> > before being checked in.
>
> [Michael Hudson]
>> It works for me, on OS X.  Passes the test suite, even.  I presume
>> you're on Windows of some kind?
>
>
> Here's an excerpt from the check-in note for sha512module.c:
>
> RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98d728ae22ULL);
> RND(S[7],S[0],S[1],S[2],S[3],S[4],S[5],S[6],1,0x7137449123ef65cdULL);
> RND(S[6],S[7],S[0],S[1],S[2],S[3],S[4],S[5],2,0xb5c0fbcfec4d3b2fULL);
> RND(S[5],S[6],S[7],S[0],S[1],S[2],S[3],S[4],3,0xe9b5dba58189dbbcULL);
> RND(S[4],S[5],S[6],S[7],S[0],S[1],S[2],S[3],4,0x3956c25bf348b538ULL);
>
> Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL"
> that Mr. Gates and the ANSI C folks have yet to accept ;-)  

It's a C99 unsigned long long literal, AFAICT (p70 of the PDF I found
lying around somewhere...), so I think it's just Bill who's behind.
However, Python doesn't require C99, so it's pretty dodgy code by our
standards.

Hmm.  You have PY_LONG_LONG #define-d, right?  Does VC++ 6 (that's
what you use, right?) support any kind of long long literal?

> If it works for you, then it probably means that sha512module.c was left
> out of the build.

Nope: 

[mwh at 82-33-185-193 build-debug]$ ./python.exe 
Python 2.5a0 (#1, Aug 23 2005, 13:24:32) 
[GCC 3.3 20030304 (Apple Computer, Inc. build 1671)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import _sha512
[44297 refs]

> Maybe sha512module.c wasn't supposed to be checked in?

I think if you have a sufficiently modern openssl it's unnecessary.

> The project files are just text files and can be updated simply and
> directly.  But yes, that is no big deal and I'll just do it for him once
> the code gets to a compilable state.
>
> Aside from the project files, there is still config.c and whatnot.

Does anything need to be done there?  Oh, PC/config.c, right?

> We should put together a checklist of all the things that need to be
> updated when a new module is added.

Sounds like it! :)

Cheers,
mwh

-- 
  This makes it possible to pass complex object hierarchies to
  a C coder who thinks computer science has made no worthwhile
  advancements since the invention of the pointer.
                                       -- Gordon McMillan, 30 Jul 1998

From fredrik at pythonware.com  Tue Aug 23 17:51:34 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 23 Aug 2005 17:51:34 +0200
Subject: [Python-Dev] [Python-checkins]
	python/dist/src/Modules_hashopenssl.c, NONE,
	2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
	2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
References: <000101c5a7f5$989d5100$8901a044@oemcomputer>
	<200508231632.30175.gmccaughan@synaptics-uk.com>
Message-ID: <defgm2$19k$1@sea.gmane.org>

Gareth McCaughan wrote:

> It's valid C99, meaning "this is an unsigned long long".

since when does Python require C99 compilers?

</F> 


From mcherm at mcherm.com  Tue Aug 23 18:11:02 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue, 23 Aug 2005 09:11:02 -0700
Subject: [Python-Dev] Revised PEP 349: Allow str() to
	return	unicodestrings
Message-ID: <20050823091102.ay5mcm8r2gco4488@login.werra.lunarpages.com>

Neil Schemenauer wrote:
> The PEP has been rewritten based on a suggestion by Guido to change
> str() rather than adding a new built-in function.  Based on my testing, I
> believe the idea is feasible.

Fredrik Lundh replies:
> note that this breaks chapter 3 of the tutorial:
>
> http://docs.python.org/tut/node5.html#SECTION005130000000000000000
>
> where str() is first introduced.

It's hardly "introduced"... the only bit I found reads:

   ... When a Unicode string is printed, written to a file, or converted
   with str(), conversion takes place using this default encoding.

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   UnicodeEncodeError: 'ascii' codec can't encode characters in position
   0-2: ordinal not in range(128)

   To convert a Unicode string into an 8-bit string using a specific encoding,
   Unicode objects provide an encode() method that takes one argument, the
   name of the encoding. Lowercase names for encodings are preferred.

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

I think that if we just took out the example of str() usage and replaced
it with a sentence or two that DID introduce the (revised) str() function,
it ought to work. In particular, it could mention that you can call str()
on any object, which isn't stated here at all.

-- Michael Chermside


From theller at python.net  Tue Aug 23 18:08:42 2005
From: theller at python.net (Thomas Heller)
Date: Tue, 23 Aug 2005 18:08:42 +0200
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
 _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
References: <000101c5a7f5$989d5100$8901a044@oemcomputer>
	<2mmzn8slmv.fsf@starship.python.net>
Message-ID: <y86siqk5.fsf@python.net>

Michael Hudson <mwh at python.net> writes:

> "Raymond Hettinger" <raymond.hettinger at verizon.net> writes:
>
>> [Raymond Hettinger] 
>>> > This patch should be reverted or fixed so that the Py2.5 build works
>>> > again.
>>> >
>>> > It contains a disastrous search and replace error that prevents it
>>> > from compiling.  Hence, it couldn't have passed the test suite before
>>> > being checked in.
>>
>> [Michael Hudson]
>>> It works for me, on OS X.  Passes the test suite, even.  I presume
>>> you're on Windows of some kind?
>>
>>
>> Here's an excerpt from the check-in note for sha512module.c:
>>
>>  
>> RND(S[0],S[1],S[2],S[3],S[4],S[5],S[6],S[7],0,0x428a2f98d728ae22ULL);
>>  
>> RND(S[7],S[0],S[1],S[2],S[3],S[4],S[5],S[6],1,0x7137449123ef65cdULL);
>>  
>> RND(S[6],S[7],S[0],S[1],S[2],S[3],S[4],S[5],2,0xb5c0fbcfec4d3b2fULL);
>>  
>> RND(S[5],S[6],S[7],S[0],S[1],S[2],S[3],S[4],3,0xe9b5dba58189dbbcULL);
>>  
>> RND(S[4],S[5],S[6],S[7],S[0],S[1],S[2],S[3],4,0x3956c25bf348b538ULL);
>>
>> Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL"
>> that Mr. Gates and the ANSI C folks have yet to accept ;-)  
>
> It's a C99 unsigned long long literal, AFAICT (p70 of the PDF I found
> lying around somewhere...), so I think it's just Bill who's behind.
> However, Python doesn't require C99, so it's pretty dodgy code by our
> standards.
>
> Hmm.  You have PY_LONG_LONG #define-d, right?  Does VC++ 6 (that's
> what you use, right?) support any kind of long long literal?

The suffix seems to be 'ui64'.  From vc6 limits.h:

#if     _INTEGRAL_MAX_BITS >= 64
/* minimum signed 64 bit value */
#define _I64_MIN    (-9223372036854775807i64 - 1)
/* maximum signed 64 bit value */
#define _I64_MAX      9223372036854775807i64
/* maximum unsigned 64 bit value */
#define _UI64_MAX     0xffffffffffffffffui64
#endif


Thomas


From abkhd at hotmail.com  Tue Aug 23 18:23:33 2005
From: abkhd at hotmail.com (A.B., Khalid)
Date: Tue, 23 Aug 2005 16:23:33 +0000
Subject: [Python-Dev] Modules _hashopenssl, sha256, sha512 compile in MinGW,
	test_hmac.py passes
Message-ID: <BAY23-F61408A2BB2A3A7F428EADABA90@phx.gbl>

Hello,


I can also report that MinGW can compile the said modules and (after 
updating config.c, etc.) the resulting code passes as follows:

$ python -i ../Lib/test/test_hmac.py
test_md5_vectors (__main__.TestVectorsTestCase) ... ok
test_sha_vectors (__main__.TestVectorsTestCase) ... ok
test_normal (__main__.ConstructorTestCase) ... ok
test_withmodule (__main__.ConstructorTestCase) ... ok
test_withtext (__main__.ConstructorTestCase) ... ok
test_default_is_md5 (__main__.SanityTestCase) ... ok
test_exercise_all_methods (__main__.SanityTestCase) ... ok
test_attributes (__main__.CopyTestCase) ... ok
test_equality (__main__.CopyTestCase) ... ok
test_realcopy (__main__.CopyTestCase) ... ok

----------------------------------------------------------------------
Ran 10 tests in 0.050s

OK
>>>


Are these modules going to be built into the core?

Regards
Khalid



From gmccaughan at synaptics-uk.com  Tue Aug 23 18:38:20 2005
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Tue, 23 Aug 2005 17:38:20 +0100
Subject: [Python-Dev] [Python-checkins]
	python/dist/src/Modules_hashopenssl.c, NONE,
	2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
	2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <defgm2$19k$1@sea.gmane.org>
References: <000101c5a7f5$989d5100$8901a044@oemcomputer>
	<200508231632.30175.gmccaughan@synaptics-uk.com>
	<defgm2$19k$1@sea.gmane.org>
Message-ID: <200508231738.20961.gmccaughan@synaptics-uk.com>

On Tuesday 2005-08-23 16:51, Fredrik Lundh wrote:
> Gareth McCaughan wrote:
> 
> > It's valid C99, meaning "this is an unsigned long long".
> 
> since when does Python require C99 compilers?
> 
> </F> 

It doesn't, of course, and I hope it won't for a good while.
I was just responding to this:

  | Perhaps OS X has some sort of Steve Jobs special constant suffix "ULL"
  | that Mr. Gates and the ANSI C folks have yet to accept

since in fact Mr Gates and the ANSI C folks (and the gcc folks,
and probably plenty of others I can't check so easily) *have*
accepted it.

-- 
g


From raymond.hettinger at verizon.net  Tue Aug 23 18:46:58 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 12:46:58 -0400
Subject: [Python-Dev]
 [Python-checkins]python/dist/src/Modules_hashopenssl.c, NONE,
 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35,
 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <defgm2$19k$1@sea.gmane.org>
Message-ID: <000401c5a802$478c38a0$8901a044@oemcomputer>

[Gareth]
> > It's valid C99, meaning "this is an unsigned long long".

</F>
> since when does Python require C99 compilers?


Excerpt from PEP 7:

  "Use ANSI/ISO standard C (the 1989 version of the standard)."


From mwh at python.net  Tue Aug 23 18:51:05 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 23 Aug 2005 17:51:05 +0100
Subject: [Python-Dev] [Python-checkins]
 python/dist/src/Modules_hashopenssl.c, NONE, 2.1 sha256module.c, NONE,
 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22,
 2.23
In-Reply-To: <defgm2$19k$1@sea.gmane.org> (Fredrik Lundh's message of "Tue,
	23 Aug 2005 17:51:34 +0200")
References: <000101c5a7f5$989d5100$8901a044@oemcomputer>
	<200508231632.30175.gmccaughan@synaptics-uk.com>
	<defgm2$19k$1@sea.gmane.org>
Message-ID: <2mirxwsikm.fsf@starship.python.net>

"Fredrik Lundh" <fredrik at pythonware.com> writes:

> Gareth McCaughan wrote:
>
>> It's valid C99, meaning "this is an unsigned long long".
>
> since when does Python require C99 compilers?

Well, it doesn't, but Raymond was suggesting the code was GCC
specific, or something.

Cheers,
mwh

-- 
  Check out the comments in this source file that start with:
  # Oh, lord help us.
            -- Mark Hammond gets to play with the Outlook object model

From raymond.hettinger at verizon.net  Tue Aug 23 17:47:14 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 11:47:14 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
 _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <2mmzn8slmv.fsf@starship.python.net>
Message-ID: <000101c5a7f9$efb77480$8901a044@oemcomputer>

[Michael Hudson]
> It's a C99 unsigned long long literal, AFAICT (p70 of the PDF I found
> lying around somewhere...), so I think it's just Bill who's behind.
> However, Python doesn't require C99, so it's pretty dodgy code by our
> standards.

More than just dodgy.  
Excerpt from PEP 7:

  "Use ANSI/ISO standard C (the 1989 version of the standard)."


Raymond


From nas at arctrix.com  Tue Aug 23 18:54:09 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 23 Aug 2005 10:54:09 -0600
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
In-Reply-To: <5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com>
References: <op.svydbyh20gn541@theta> <20050822213142.GA5702@mems-exchange.org>
	<op.svydbyh20gn541@theta>
	<5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com>
Message-ID: <20050823165409.GA8026@mems-exchange.org>

On Tue, Aug 23, 2005 at 11:43:02AM -0400, Phillip J. Eby wrote:
> At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote:
> >> then of course, one could change ``unicode.__str__()`` to return
> >> ``self``, itself, which should work. but then, why so complicated?
> >
> >I think that may be the right fix.
> 
> No, it isn't.  Right now str(u"x") coerces the unicode object to a
> string, so changing this will be backwards-incompatible with any
> existing programs.

I meant that for the implementation of the PEP, changing
unicode.__str__ to return self seems to be the right fix.  Whether
you believe that str() should be allowed to return unicode instances
is a different question.

> I think the new builtin is actually the right way to go for both 2.x and 
> 3.x Pythons.  i.e., text() would be a builtin in 2.x, along with a new 
> bytes() type, and in 3.x text() could replace the basestring, str and 
> unicode types.

Perhaps the critical question is what will the string type in P3k be
called?  If it will be 'str' then I think the PEP makes sense.  If
it will be something else, then there should be a corresponding type
slot (e.g. __text__).  What method does your proposed text()
built-in call?

> I also think that the text() constructor should have a signature of 
> 'text(ob,encoding="ascii")'.

I think that's a bad idea.  We want to get away from ASCII and use
Unicode instead.

> In the default case, strings can be returned by text() as long as
> they are pure ASCII (making the code str-stable *and*
> unicode-safe).

I think you misunderstand the PEP.  Your proposed function is
neither Unicode-safe nor str-stable, the worst of both worlds.
Passing it a unicode string that contains non-ASCII characters would
result in an exception (not Unicode-safe).  Passing it a str results
in a unicode return value (not str-stable).

  Neil

From nas at arctrix.com  Tue Aug 23 19:00:06 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 23 Aug 2005 11:00:06 -0600
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
In-Reply-To: <op.svywp11f0gn541@theta>
References: <20050822213142.GA5702@mems-exchange.org> <op.svydbyh20gn541@theta>
	<20050823152156.GA7839@mems-exchange.org> <op.svywp11f0gn541@theta>
Message-ID: <20050823170003.GB8026@mems-exchange.org>

On Tue, Aug 23, 2005 at 05:45:27PM +0200, Wolfgang Lipp wrote:
> i have to revise my last posting -- exporting the new ``str``
> pure-python implementation breaks -- of course! -- as soon
> as ``isinstance(x,str)`` [sic] is used

Right.  I tried to come up with a pure Python version so people
could test their code.  This was my latest attempt before giving
up (from memory):

    # inside site.py
    _old_str_new = str.__new__
    def _str_new(self, s):
        if type(self) not in (str, unicode):
            return _old_str_new(self, s)
        if type(s) not in (str, unicode):
            return s
        r = s.__str__()
        if not isinstance(r, (str, unicode)):
            raise TypeError('__str__ returned non-string')
        return r
    str.__new__ = _str_new
    
It doesn't work though:

    TypeError: can't set attributes of built-in/extension type 'str'

Maybe someone else has a clever solution.

  Neil
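
A small sketch of the two failure modes discussed above, written for
Python 3 rather than the 2.x interpreter this thread is about.  The names
_str_new and fake_str are made up for illustration.  It only shows that
built-in types reject attribute assignment and that a plain function
cannot stand in for str in isinstance(); it is not a fix.

```python
def _str_new(cls, s=""):
    # Hypothetical replacement constructor; it never actually gets installed.
    return s

# 1. Built-in types refuse attribute assignment, so str.__new__ cannot be swapped.
try:
    str.__new__ = _str_new
    patch_error = None
except TypeError as exc:
    patch_error = exc

# 2. A pure-Python stand-in for str is a function, not a type, so any
#    isinstance(x, <stand-in>) check fails as soon as it is used.
def fake_str(obj):
    # Hypothetical stand-in: pass strings through, convert everything else.
    return obj if isinstance(obj, str) else str(obj)

try:
    isinstance("abc", fake_str)
    isinstance_error = None
except TypeError as exc:
    isinstance_error = exc

print(type(patch_error).__name__, type(isinstance_error).__name__)
```

The exact wording of the first TypeError differs between interpreter
versions, so the sketch checks only the exception type.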

From pje at telecommunity.com  Tue Aug 23 19:14:24 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 13:14:24 -0400
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
 strings
In-Reply-To: <20050823165409.GA8026@mems-exchange.org>
References: <5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com>
	<op.svydbyh20gn541@theta> <20050822213142.GA5702@mems-exchange.org>
	<op.svydbyh20gn541@theta>
	<5.1.1.6.0.20050823112823.01b22d18@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050823130921.02a43da8@mail.telecommunity.com>

At 10:54 AM 8/23/2005 -0600, Neil Schemenauer wrote:
>On Tue, Aug 23, 2005 at 11:43:02AM -0400, Phillip J. Eby wrote:
> > At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote:
> > >> then of course, one could change ``unicode.__str__()`` to return
> > >> ``self``, itself, which should work. but then, why so complicated?
> > >
> > >I think that may be the right fix.
> >
> > No, it isn't.  Right now str(u"x") coerces the unicode object to a
> > string, so changing this will be backwards-incompatible with any
> > existing programs.
>
>I meant that for the implementation of the PEP, changing
>unicode.__str__ to return self seems to be the right fix.  Whether
>you believe that str() should be allowed to return unicode instances
>is a different question.
>
> > I think the new builtin is actually the right way to go for both 2.x and
> > 3.x Pythons.  i.e., text() would be a builtin in 2.x, along with a new
> > bytes() type, and in 3.x text() could replace the basestring, str and
> > unicode types.
>
>Perhaps the critical question is what will the string type in P3k be
>called?  If it will be 'str' then I think the PEP makes sense.  If
>it will be something else, then there should be a corresponding type
>slot (e.g. __text__).  What method does your proposed text()
>built-in call?

Heck if I know.  :)  I think the P3k string type should just be called 
'text', though, so we can leave the whole unicode/str mess behind.


> > I also think that the text() constructor should have a signature of
> > 'text(ob,encoding="ascii")'.
>
>I think that's a bad idea.  We want to get away from ASCII and use
>Unicode instead.

It's not str-stable if it returns unicode for a string input.


> > In the default case, strings can be returned by text() as long as
> > they are pure ASCII (making the code str-stable *and*
> > unicode-safe).
>
>I think you misunderstand the PEP.  Your proposed function is
>neither Unicode-safe nor str-stable, the worst of both worlds.
>Passing it a unicode string that contains non-ASCII characters would
>result in an exception (not Unicode-safe).  Passing it a str results
>in a unicode return value (not str-stable).

I think you misunderstand my proposal.  :)  I'm proposing rough semantics of:

     def text(ob, encoding='ascii'):

         if isinstance(ob,unicode):
             return ob

         ob = str(ob)  # or ob.__text__, then fallback to __unicode__/__str__

         if encoding=='ascii' and isinstance(ob,str):
             unicode(ob,encoding)  # check for purity
             return ob  # return the string if it's pure

         return unicode(ob, encoding)

This is str-stable *and* unicode-safe.


From reinhold-birkenfeld-nospam at wolke7.net  Tue Aug 23 19:23:25 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Tue, 23 Aug 2005 19:23:25 +0200
Subject: [Python-Dev] python/dist/src/Doc/tut tut.tex,1.276,1.277
In-Reply-To: <20050823150057.057C91E400B@bag.python.org>
References: <20050823150057.057C91E400B@bag.python.org>
Message-ID: <defm2e$lln$1@sea.gmane.org>

rhettinger at users.sourceforge.net wrote:

I'm not a native speaker, but...

> @@ -114,7 +114,7 @@
>  programs, or to test functions during bottom-up program development.
>  It is also a handy desk calculator.
>
> -Python allows writing very compact and readable programs.  Programs
> +Python enables programs to written compactly and readably.  Programs
>  written in Python are typically much shorter than equivalent C or
>  \Cpp{} programs, for several reasons:
>  \begin{itemize}

...shouldn't it be "programs to be written compactly"?

> @@ -1753,8 +1753,8 @@
>
>  \begin{methoddesc}[list]{pop}{\optional{i}}
>  Remove the item at the given position in the list, and return it.  If
> -no index is specified, \code{a.pop()} returns the last item in the
> -list.  The item is also removed from the list.  (The square brackets
> +no index is specified, \code{a.pop()} removes and returns the last item
> +in the list.  The item is also removed from the list.  (The square brackets
>  around the \var{i} in the method signature denote that the parameter
>  is optional, not that you should type square brackets at that
>  position.  You will see this notation frequently in the

That's twice the same thing (removal from list).

> @@ -1985,7 +1987,9 @@
>  \section{The \keyword{del} statement \label{del}}
>
>  There is a way to remove an item from a list given its index instead
> -of its value: the \keyword{del} statement.  This can also be used to
> +of its value: the \keyword{del} statement.  Unlike the \method{pop()})
> +method which returns a value, the \keyword{del} keyword is a statement
> +and can also be used to
>  remove slices from a list (which we did earlier by assignment of an
>  empty list to the slice).  For example:

The del keyword is a statement?

> @@ -2133,8 +2137,8 @@
>  keys.  Tuples can be used as keys if they contain only strings,
>  numbers, or tuples; if a tuple contains any mutable object either
>  directly or indirectly, it cannot be used as a key.  You can't use
> -lists as keys, since lists can be modified in place using their
> -\method{append()} and \method{extend()} methods, as well as slice and
> +lists as keys, since lists can be modified in place using methods like
> +\method{append()} and \method{extend()} or modified with slice and
>  indexed assignments.

Is the second "modified" necessary?

> @@ -5595,8 +5603,8 @@
>  to round it again can't make it better:  it was already as good as it
>  gets.
>
> -Another consequence is that since 0.1 is not exactly 1/10, adding 0.1
> -to itself 10 times may not yield exactly 1.0, either:
> +Another consequence is that since 0.1 is not exactly 1/10,
> +summing ten values of 0.1 may not yield exactly 1.0, either:
>
>  \begin{verbatim}
>  >>> sum = 0.0

Is it clear from context that the "0.1 is not exactly 1/10" refers to
floating point only?
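
For what it's worth, the distinction can be shown in a few lines.  This
is a sketch in Python 3 (which postdates this thread); fractions.Fraction
is used here only as an exact reference and is not part of the tutorial
text under review.

```python
from fractions import Fraction

# Binary floating point: 0.1 is stored as the nearest representable double.
float_total = 0.0
for _ in range(10):
    float_total += 0.1

# Exact rational arithmetic: 1/10 really is one tenth.
exact_total = sum(Fraction(1, 10) for _ in range(10))

print(float_total == 1.0)                 # the float sum misses 1.0
print(exact_total == 1)                   # the exact sum hits it
print(Fraction(0.1) == Fraction(1, 10))   # the stored double is not 1/10
```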

> @@ -5637,7 +5645,7 @@
>  you can perform an exact analysis of cases like this yourself.  Basic
>  familiarity with binary floating-point representation is assumed.
>
> -\dfn{Representation error} refers to that some (most, actually)
> +\dfn{Representation error} refers to fact that some (most, actually)
>  decimal fractions cannot be represented exactly as binary (base 2)
>  fractions.  This is the chief reason why Python (or Perl, C, \Cpp,
>  Java, Fortran, and many others) often won't display the exact decimal

"...refers to the fact..."?


Reinhold

-- 
Mail address is perfectly valid!


From pje at telecommunity.com  Tue Aug 23 19:57:27 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 Aug 2005 13:57:27 -0400
Subject: [Python-Dev] python/dist/src/Doc/tut tut.tex,1.276,1.277
In-Reply-To: <defm2e$lln$1@sea.gmane.org>
References: <20050823150057.057C91E400B@bag.python.org>
	<20050823150057.057C91E400B@bag.python.org>
Message-ID: <5.1.1.6.0.20050823135140.032c5ea0@mail.telecommunity.com>

At 07:23 PM 8/23/2005 +0200, Reinhold Birkenfeld wrote:
>rhettinger at users.sourceforge.net wrote:
>
>I'm not a native speaker, but...
>
> > @@ -114,7 +114,7 @@
> >  programs, or to test functions during bottom-up program development.
> >  It is also a handy desk calculator.
> >
> > -Python allows writing very compact and readable programs.  Programs
> > +Python enables programs to written compactly and readably.  Programs
> >  written in Python are typically much shorter than equivalent C or
> >  \Cpp{} programs, for several reasons:
> >  \begin{itemize}
>
>...shouldn't it be "programs to be written compactly"?

It looks to me like the original text here should stand; Python doesn't 
"enable programs to be written"; it enables people to write them.  That is, 
the passive voice should be avoided if possible.  ;-)


> > @@ -1985,7 +1987,9 @@
> >  \section{The \keyword{del} statement \label{del}}
> >
> >  There is a way to remove an item from a list given its index instead
> > -of its value: the \keyword{del} statement.  This can also be used to
> > +of its value: the \keyword{del} statement.  Unlike the \method{pop()})
> > +method which returns a value, the \keyword{del} keyword is a statement
> > +and can also be used to
> >  remove slices from a list (which we did earlier by assignment of an
> >  empty list to the slice).  For example:
>
>The del keyword is a statement?

The keyword certainly isn't.  This section also looks like it should stand 
the way it was, or else say that "unlike the pop() method, the del 
statement can also be used to remove slices...".
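
A short illustration of the distinction, as a sketch in Python 3 (the
list behaviour shown is the same in 2.x).  The variable names are
arbitrary.

```python
letters = ["a", "b", "c", "d", "e"]

last = letters.pop()       # method call: removes the last item and returns it
second = letters.pop(1)    # removes the item at index 1 and returns it
# letters is now ['a', 'c', 'd']

del letters[0]             # statement: removes by index, yields no value
# letters is now ['c', 'd']

numbers = [0, 1, 2, 3, 4, 5]
del numbers[1:4]           # del also accepts a slice; pop() does not

same_effect = [0, 1, 2, 3, 4, 5]
same_effect[1:4] = []      # the slice-assignment spelling the tutorial mentions

print(last, second, letters, numbers, same_effect)
```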


From greg at electricrain.com  Tue Aug 23 20:59:29 2005
From: greg at electricrain.com (Gregory P. Smith)
Date: Tue, 23 Aug 2005 11:59:29 -0700
Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219,
	1.220
In-Reply-To: <003101c5a717$83be4b60$3c23a044@oemcomputer>
References: <20050821184639.EF8711E4006@bag.python.org>
	<003101c5a717$83be4b60$3c23a044@oemcomputer>
Message-ID: <20050823185929.GI16043@zot.electricrain.com>

On Mon, Aug 22, 2005 at 08:46:27AM -0400, Raymond Hettinger wrote:
> > A new hashlib module to replace the md5 and sha modules.  It adds
> > support for additional secure hashes such as SHA-256 and SHA-512.  The
> > hashlib module uses OpenSSL for fast platform optimized
> > implementations of algorithms when available.  The old md5 and sha
> > modules still exist as wrappers around hashlib to preserve backwards
> > compatibility.
> 
> I'm getting compilation errors:
> 
> C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
> suffix on number'
> C:\py25\Modules\sha512module.c(146) : error C2146: syntax error :
> missing ')' before identifier 'L'
> C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
> suffix on number'
> C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
> suffix on number'
> C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
> suffix on number'
> C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
> suffix on number'
> C:\py25\Modules\sha512module.c(146) : error C2059: syntax error : 'bad
> suffix on number'
> C:\py25\Modules\sha512module.c(146) : fatal error C1013: compiler limit
> : too many open parentheses
> 
> 
> Also, there should be updating entries to Misc/NEWS,
> PC/VC6/pythoncore.dsp, and PC/config.c.
> 
> 
> Raymond

I don't have a win32 dev environment at the moment so i didn't see
that.  Sorry.

If you remove the 'ULL' suffix from all of the 64bit constants in that
file what happens?

I added the ULLs to quell the mass of warnings about constants being
too large for the datatype that gcc 3.3 was spewing.

-greg
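
A quick check of why these constants need a 64-bit type at all, which is
presumably what the gcc warnings were about.  This is a sketch in
Python 3; the five values are the ones quoted from sha512module.c earlier
in the thread, and the 32-bit comparison is an assumption about the width
of a plain long on the platforms involved.

```python
# SHA-512 round constants as quoted from the check-in (suffix dropped).
ROUND_CONSTANTS = [
    0x428a2f98d728ae22,
    0x7137449123ef65cd,
    0xb5c0fbcfec4d3b2f,
    0xe9b5dba58189dbbc,
    0x3956c25bf348b538,
]

# Number of bits each constant needs when treated as an unsigned value.
bit_lengths = [k.bit_length() for k in ROUND_CONSTANTS]

# Every one overflows 32 bits but fits in an unsigned 64-bit integer.
needs_64_bits = all(32 < n <= 64 for n in bit_lengths)

for k, n in zip(ROUND_CONSTANTS, bit_lengths):
    print(hex(k), n)
print(needs_64_bits)
```

Two of the five use all 64 bits, so they would not fit in a signed
64-bit type either.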


From greg at electricrain.com  Tue Aug 23 21:04:30 2005
From: greg at electricrain.com (Gregory P. Smith)
Date: Tue, 23 Aug 2005 12:04:30 -0700
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
	_hashopenssl.c, NONE, 2.1 sha256module.c, NONE,
	2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35,
	2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <000601c5a7ec$9f614680$8901a044@oemcomputer>
References: <20050821184613.A45C11E4288@bag.python.org>
	<000601c5a7ec$9f614680$8901a044@oemcomputer>
Message-ID: <20050823190430.GJ16043@zot.electricrain.com>

> This patch should be reverted or fixed so that the Py2.5 build works
> again.
> 
> It contains a disastrous search and replace error that prevents it from
> compiling.  Hence, it couldn't have passed the test suite before being
> checked in.  
> 
> Also, all of the project and config files need to be updated for the new
> modules.

It passes fine on linux.  I don't have a windows dev environment.

regardless, the quick way to work around the sha512 on windows issue
is to comment it out in setup.py and comment out the sha384 and sha512
tests in test_hashlib.py and commit that until the compilation issues
are worked out.

-g

> > -----Original Message-----
> > From: python-checkins-bounces at python.org [mailto:python-checkins-
> > bounces at python.org] On Behalf Of greg at users.sourceforge.net
> > Sent: Sunday, August 21, 2005 2:46 PM
> > To: python-checkins at python.org
> > Subject: [Python-checkins] python/dist/src/Modules _hashopenssl.c,
> > NONE,2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,2.1
> > md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
> > 
> > Update of /cvsroot/python/python/dist/src/Modules
> > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32064/Modules
> > 
> > Modified Files:
> > 	md5module.c shamodule.c
> > Added Files:
> > 	_hashopenssl.c sha256module.c sha512module.c
> > Log Message:
> > [ sf.net patch # 1121611 ]
> > 
> > A new hashlib module to replace the md5 and sha modules.  It adds
> > support for additional secure hashes such as SHA-256 and SHA-512.  The
> > hashlib module uses OpenSSL for fast platform optimized
> > implementations of algorithms when available.  The old md5 and sha
> > modules still exist as wrappers around hashlib to preserve backwards
> > compatibility.

From raymond.hettinger at verizon.net  Tue Aug 23 21:09:50 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 15:09:50 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219,
 1.220
In-Reply-To: <20050823185929.GI16043@zot.electricrain.com>
Message-ID: <000201c5a816$3caacaa0$8901a044@oemcomputer>

[Gregory P. Smith]
> I don't have a win32 dev environment at the moment so i didn't see
> that.  Sorry.

No big deal.
But we still have to get the code back to ANSI compliance.
Do you have an ANSI-strict option with your compiler?


Raymond


From barry at python.org  Tue Aug 23 21:27:01 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 23 Aug 2005 15:27:01 -0400
Subject: [Python-Dev]
	[Python-checkins]	python/dist/src/Modules_hashopenssl.c, NONE,
	2.1 sha256module.c, NONE, 2.1	sha512module.c, NONE,
	2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <defgm2$19k$1@sea.gmane.org>
References: <000101c5a7f5$989d5100$8901a044@oemcomputer>
	<200508231632.30175.gmccaughan@synaptics-uk.com>
	<defgm2$19k$1@sea.gmane.org>
Message-ID: <1124825221.16679.4.camel@geddy.wooz.org>

On Tue, 2005-08-23 at 11:51, Fredrik Lundh wrote:
> Gareth McCaughan wrote:
> 
> > It's valid C99, meaning "this is an unsigned long long".
> 
> since when does Python require C99 compilers?

Why, since Python 3.0 of course!  <wink>

-Barry


From keir at cs.toronto.edu  Tue Aug 23 22:10:21 2005
From: keir at cs.toronto.edu (Keir Mierle)
Date: Tue, 23 Aug 2005 16:10:21 -0400
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)
Message-ID: <20050823201021.GE32195@cs.toronto.edu>

Hi, I'm working on Argon (http://www.third-bit.com/trac/argon) with Greg
Wilson this summer.

We're having a very strange problem with Python's unicode parsing of source
files. Basically, our CGI script was running extremely slowly on our production
box (a pokey dual-Xeon 3GHz w/ 4GB RAM and 15K SCSI drives). Slow to the tune
of 6-10 seconds per request. I eventually tracked this down to imports of our
source tree; the actual request was completing in 300ms, the rest of the time
was spent in __import__.

After doing some gprof profiling, I discovered _PyUnicodeUCS2_IsLinebreak was
getting called 51 million times. Our code is 1.2 million characters, so I
hardly think it makes sense to call IsLinebreak 50 times for each character;
and we're not even importing our entire source tree on every invocation.

Our code is a fork of Trac, and originally had these lines at the top:

# -*- coding: iso8859-1 -*-  

This made me suspicious, so I removed all of them. The CGI execution time
immediately dropped to ~1 second. gprof revealed that
_PyUnicodeUCS2_IsLinebreak is not called at all anymore.

Now that our code works fast enough, I don't really care about this, but I
thought python-dev might want to know something weird is going on with unicode
splitlines.

I documented my investigation of this problem; if anyone wants further details,
just email me. (I'm not on python-dev)
http://www.third-bit.com/trac/argon/ticket/525

Thanks in advance,
Keir

From martin at v.loewis.de  Tue Aug 23 23:13:42 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 Aug 2005 23:13:42 +0200
Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219,
 1.220
In-Reply-To: <000201c5a816$3caacaa0$8901a044@oemcomputer>
References: <000201c5a816$3caacaa0$8901a044@oemcomputer>
Message-ID: <430B9186.3010106@v.loewis.de>

Raymond Hettinger wrote:
> But we still have to get the code back to ANSI compliance.
> Do you have an ANSI-strict option with your compiler?

Please don't call this "ANSI compliant". ANSI does many more
things than writing C standards, and, in the specific case,
the code *is* ANSI compliant as it stands - it just doesn't
comply with C89. It complies with ISO C 99, which (I believe)
is also a U.S. American national (ANSI) standard.

gcc does have an option to force C89 compliance, but there
is a good chance that Python stops compiling with that option:
on many systems, essential system headers fail to comply
with C89 (in addition, activating that mode also makes many
extensions unavailable).

Regards,
Martin

From tjreedy at udel.edu  Tue Aug 23 23:23:50 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 23 Aug 2005 17:23:50 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
	_hashopenssl.c, NONE, 2.1 sha256module.c, NONE,
	2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35,
	2.36 shamodule.c, 2.22, 2.23
References: <2mmzn8slmv.fsf@starship.python.net>
	<000101c5a7f9$efb77480$8901a044@oemcomputer>
Message-ID: <deg455$li1$1@sea.gmane.org>


"Raymond Hettinger" <raymond.hettinger at verizon.net> wrote in message 
news:000101c5a7f9$efb77480$8901a044 at oemcomputer...
> Excerpt from PEP 7:
>
>  "Use ANSI/ISO standard C (the 1989 version of the standard)."

Just checked (P&B, Standard C): only one L allowed, not two.  But with C99 
compilers becoming more common, accidental usages of C99-isms in submitted 
code will likely become more common, especially when there is not a 
graceful C89 alternative.  While the current policy should be followed 
while it remains the policy,  it may need revision someday.

Terry J. Reedy


From greg at electricrain.com  Tue Aug 23 23:32:22 2005
From: greg at electricrain.com (Gregory P. Smith)
Date: Tue, 23 Aug 2005 14:32:22 -0700
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
	_hashopenssl.c, NONE, 2.1 sha256module.c, NONE,
	2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35,
	2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <000101c5a7f5$989d5100$8901a044@oemcomputer>
References: <2mvf1wsp0k.fsf@starship.python.net>
	<000101c5a7f5$989d5100$8901a044@oemcomputer>
Message-ID: <20050823213222.GK16043@zot.electricrain.com>

> The project files are just text files and can be updated simply and
> directly.  But yes, that is no big deal and I'll just do it for him once
> the code gets to a compilable state.

I just checked in an update removing all of the ULLs.  Could you check
that it compiles on windows and passes test_hashlib.py now?

It does leave gcc 3.x users with a big mess of compiler warnings to
deal with but those can be worked around once the build is actually
working everywhere.

thanks.
Greg

> Aside from the project files, there is still config.c and whatnot.  We
> should put together a checklist of all the things that need to be
> updated when a new module is added.

that'd be helpful. :)


From martin at v.loewis.de  Tue Aug 23 23:43:34 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 Aug 2005 23:43:34 +0200
Subject: [Python-Dev] [Python-checkins]
 python/dist/src/Modules	_hashopenssl.c, NONE, 2.1 sha256module.c, NONE,
 2.1 sha512module.c, NONE, 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22,
 2.23
In-Reply-To: <deg455$li1$1@sea.gmane.org>
References: <2mmzn8slmv.fsf@starship.python.net>	<000101c5a7f9$efb77480$8901a044@oemcomputer>
	<deg455$li1$1@sea.gmane.org>
Message-ID: <430B9886.4060004@v.loewis.de>

Terry Reedy wrote:
> Just checked (P&B, Standard C): only one L allowed, not two.  But with C99 
> compilers becoming more common, accidental usages of C99-isms in submitted 
> code will likely become more common, especially when there is not a 
> graceful C89 alternative.  While the current policy should be followed 
> while it remains the policy,  it may need revision someday.

I think Python switched to C89 in 1999 (shortly before C99 was
published, IIRC). So the canonical time for switching to C99 would
be in 2009, provided all interesting compilers have implemented it
by then, at least to the degree that Python would typically need.

Regards,
Martin

From raymond.hettinger at verizon.net  Wed Aug 24 02:29:37 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 20:29:37 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Modules
 _hashopenssl.c, NONE, 2.1 sha256module.c, NONE, 2.1 sha512module.c, NONE,
 2.1 md5module.c, 2.35, 2.36 shamodule.c, 2.22, 2.23
In-Reply-To: <20050823213222.GK16043@zot.electricrain.com>
Message-ID: <001001c5a842$e9ac0580$ab12c797@oemcomputer>

[Gregory P. Smith]
> I just checked in an update removing all of the ULLs.  Could you check
> that it compiles on windows and passes test_hashlib.py now?

Okay, all is well.


Raymond


From raymond.hettinger at verizon.net  Wed Aug 24 03:23:32 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 21:23:32 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
Message-ID: <001601c5a84a$715b9200$ab12c797@oemcomputer>

The latest version of PEP 348 still proposes that a bare except clause
will default to Exception instead of BaseException.  Initially, I had
thought that might be a good idea but now think it is doomed and needs
to be removed from the PEP.

A bare except belongs at the end of a try suite, not in the middle.
This is obvious when compared to:

  if a:    ...
  elif b:  ...
  elif c:  ...
  else:    ...     # The bare else goes at the end 
                   # and serves as a catchall

or 

  switch c
  case a:  ...
  case b:  ...
  default: ...     # The bare default goes at the end
                   # and serves as a catchall

In contrast, Brett's 8/9 note revealed that the following would be
allowable and common if the PEP is accepted in its current form:

  try:     ...
  except:  ...     # A bare except in the middle.  WTF?
  except (KeyboardInterrupt, SystemExit): ...

The right way is, of course:

  try:     ...
  except (KeyboardInterrupt, SystemExit): ...
  except:          # Implicit or explicit match to BaseException
                   # that serves as a catchall

For those not needing a terminating exception handler, the rest of the
PEP appropriately allows and encourages a simple and explicit solution
that meets most needs:

  try:               ...
  except Exception:  ...

The core issue is that the most obvious meaning of a bare except is
"catchall", not "catchmost".  When the latter is intended, the simple
and explicit form shown in the last example is the way to go.  If the
former is intended, then either a bare except clause or explicit mention
of BaseException will do nicely.
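
A minimal sketch of the catchall/catchmost distinction, assuming the
hierarchy the PEP aims for (as in current Python 3, where
KeyboardInterrupt and SystemExit derive from BaseException but not from
Exception).  The function names catchmost and catchall are illustrative
only and appear nowhere in the PEP:

```python
def catchmost(exc):
    """Raise exc under 'except Exception'; report whether it was caught."""
    try:
        try:
            raise exc
        except Exception:
            return True
    except BaseException:
        # Reached only for exceptions outside the Exception subtree.
        return False


def catchall(exc):
    """Raise exc under a bare except; report whether it was caught."""
    try:
        raise exc
    except:  # bare except: matches every exception
        return True


for exc in (ValueError("x"), KeyboardInterrupt(), SystemExit(0)):
    print(type(exc).__name__, catchmost(exc), catchall(exc))
```

With the current semantics the bare form catches all three, while the
explicit Exception form lets KeyboardInterrupt and SystemExit through.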

However, under the PEP proposal, both new and existing code will suffer
from having bare except clauses that look like they catch everything,
are intended to catch everything, but, in fact, do not.  That kind of
optical illusion error must be avoided.  There is no getting around our
mind's propensity to interpret the bare form as defaulting to the top of
the tree rather than the middle as proposed by the PEP.  

Likewise, there is no getting around the mental confusion caused by a
bare except clause in the middle of a try-suite rather than at the end.  We
have to avoid code that looks like it does one thing but actually does
something else.


Raymond


From gvanrossum at gmail.com  Wed Aug 24 03:30:20 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue, 23 Aug 2005 18:30:20 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <001601c5a84a$715b9200$ab12c797@oemcomputer>
References: <001601c5a84a$715b9200$ab12c797@oemcomputer>
Message-ID: <ca471dc205082318302ea36ab7@mail.gmail.com>

On 8/23/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> The latest version of PEP 348 still proposes that a bare except clause
> will default to Exception instead of BaseException.  Initially, I had
> thought that might be a good idea but now think it is doomed and needs
> to be removed from the PEP.

If we syntactically enforce that the bare except, if present, must be
last, would that remove your objection? I agree that a bare except in
the middle is an anomaly, but that doesn't mean we can't keep bare
except: as a shorthand for except Exception:.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Wed Aug 24 04:41:01 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 23 Aug 2005 22:41:01 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc205082318302ea36ab7@mail.gmail.com>
Message-ID: <001d01c5a855$447a9d20$ab12c797@oemcomputer>

[Guido van Rossum]
> If we syntactically enforce that the bare except, if present, must be
> last, would that remove your objection? I agree that a bare except in
> the middle is an anomaly, but that doesn't mean we can't keep bare
> except: as a shorthand for except Exception:.

Hmm.  Prohibiting mid-suite bare excepts is progress and eliminates the
case that causes immediate indigestion.

As for the rest, I'm not as sure and it would be helpful to get thoughts
from others on this one.  My sense is that blocking the clause from
appearing in the middle is treating the symptom and not the disease.

The motivating case for most of the PEP was that folks were writing
bare except clauses and trapping more than they should.  Much of the
concern was dealt with just by giving a choice between writing Exception
and BaseException depending on the intended result.

That leaves the question of the default value for a bare except, with
Exception being the most useful and BaseException being the most
obvious.

While I don't doubt that Exception is the more useful, we have already
introduced a new builtin and moved two other exceptions to meet this
need.  Going further and altering the meaning of bare except seems like
overkill for a relatively small issue.

My remaining concern is about obviousness.  How much code has been
written or will be written that intends a bare except to mean
BaseException instead of Exception?  Would such code erroneously pass a
code review or inspection?  I suspect it would.  The code looks like it
does one thing but actually does something else.  This may or may not be
a big deal.


Raymond


From niko at alum.mit.edu  Wed Aug 24 09:07:58 2005
From: niko at alum.mit.edu (Niko Matsakis)
Date: Wed, 24 Aug 2005 09:07:58 +0200
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <001d01c5a855$447a9d20$ab12c797@oemcomputer>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer>
Message-ID: <00236367-938E-4D75-866E-2F1A5DEABEC0@alum.mit.edu>

> As for the rest, I'm not as sure and it would be helpful to get  
> thoughts
> from others on this one.  My sense is that blocking the clause from
> appearing in the middle is treating the symptom and not the disease.

+1

It would be better to prohibit bare except entirely (well, presumably  
at some point in the future with appropriate warnings at the moment)  
than change its semantics.  I agree that its intuitive meaning is "if  
anything is thrown", not, "if a non-programmer-error exception is  
thrown," but I'm not sure if that's even important.  The point is  
that it has existing well defined semantics; changing them just seems  
unnecessary to the aims of the rewrite and confusing to existing  
Python programmers.

I've written plenty of code with bare excepts and they all intended  
to catch *any* exception, usually in a user interface where I wanted  
to return to the main loop on programmer error not abort the entire  
program.  I don't relish the thought of going back and changing  
existing code, and I imagine there are few who do.


My 2 cents,
Niko

From ncoghlan at gmail.com  Wed Aug 24 11:26:02 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 24 Aug 2005 19:26:02 +1000
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <001601c5a84a$715b9200$ab12c797@oemcomputer>
References: <001601c5a84a$715b9200$ab12c797@oemcomputer>
Message-ID: <430C3D2A.3070103@gmail.com>

Raymond Hettinger wrote:
> The latest version of PEP 348 still proposes that a bare except clause
> will default to Exception instead of BaseException.  Initially, I had
> thought that might be a good idea but now think it is doomed and needs
> to be removed from the PEP.

One thing I assumed was that _if_ bare excepts were kept, they would still 
only be allowed as the last except clause.

That is, this example:
>   try:     ...
>   except:  ...     # A bare except in the middle.  WTF?
>   except (KeyboardInterrupt, SystemExit): ...

would still be a syntax error, even if bare excepts were allowed.

I still have some qualms about the idea of a bare except that doesn't catch 
everything (I'd prefer to see them gone altogether), but I don't mind quite as 
much if the above code stays as a syntax error.
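
As a quick check of the "stays a syntax error" point, here is a small
sketch that feeds the quoted example to the built-in compile().  It
assumes a current CPython; the helper name is_rejected is made up for
this illustration:

```python
SOURCE = """
try:
    pass
except:
    pass
except (KeyboardInterrupt, SystemExit):
    pass
"""


def is_rejected(source):
    """Return True if the compiler refuses the source with a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False


print(is_rejected(SOURCE))
```

The same helper returns False once the bare except is moved to the end
of the try statement.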

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From walter at livinglogic.de  Wed Aug 24 11:45:33 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 24 Aug 2005 11:45:33 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <20050823201021.GE32195@cs.toronto.edu>
References: <20050823201021.GE32195@cs.toronto.edu>
Message-ID: <430C41BD.8010602@livinglogic.de>

Keir Mierle wrote:

> Hi, I'm working on Argon (http://www.third-bit.com/trac/argon) with Greg
> Wilson this summer
> 
> We're having a very strange problem with Python's unicode parsing of source
> files. Basically, our CGI script was running extremely slowly on our production
> box (a pokey dual-Xeon 3GHz w/ 4GB RAM and 15K SCSI drives). Slow to the tune
> of 6-10 seconds per request. I eventually tracked this down to imports of our
> source tree; the actual request was completing in 300ms, the rest of the time
> was spent in __import__.

This is caused by the changes to the codecs in 2.4. Basically the codecs 
no longer rely on C's readline() to do line splitting (which can't work 
for UTF-16), but do it themselves (via unicode.splitlines()).

> After doing some gprof profiling, I discovered _PyUnicodeUCS2_IsLinebreak was
> getting called 51 million times. Our code is 1.2 million characters, so I
> hardly think it makes sense to call IsLinebreak 50 times for each character;
> and we're not even importing our entire source tree on every invocation.

But if you're using CGI, you're importing your source on every 
invocation. Switching to a different server side technology might help. 
Nevertheless 50 million calls seems to be a bit much.

> Our code is a fork of Trac, and originally had these lines at the top:
> 
> # -*- coding: iso8859-1 -*-  
> 
> This made me suspicious, so I removed all of them. The CGI execution time
> immediately dropped to ~1 second. gprof revealed that
> _PyUnicodeUCS2_IsLinebreak is not called at all anymore.
> 
> Now that our code works fast enough, I don't really care about this, but I
> thought python-dev might want to know something weird is going on with unicode
> splitlines.

I wonder if we should switch back to a simple readline() implementation 
for those codecs that don't require the current implementation 
(basically every charmap codec). AFAIK source files are opened in 
universal newline mode, so at least we'd get proper treatment of "\n", 
"\r" and "\r\n" line ends, but we'd lose u"\x1c", u"\x1d", u"\x1e", 
u"\x85", u"\u2028" and u"\u2029" (which are line terminators according 
to unicode.splitlines()).
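
To make the difference concrete, a small sketch in Python 3 terms,
where str plays the role of the unicode type discussed here and bytes
the role of the old byte string (an assumption about how the
discussion maps onto a current interpreter):

```python
# str.splitlines() honours the extra Unicode line terminators,
# bytes.splitlines() only "\n", "\r" and "\r\n".
text = "a\nb\x1cc\x85d"

unicode_lines = text.splitlines()
byte_lines = text.encode("latin-1").splitlines()

print(unicode_lines)   # ['a', 'b', 'c', 'd']
print(byte_lines)      # [b'a', b'b\x1cc\x85d']
```

So a byte-level readline() would keep FILE SEPARATOR and NEXT LINE
inside the line, which is the behaviour that would be "lost".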

> I documented my investigation of this problem; if anyone wants further details,
> just email me. (I'm not on python-dev)
> http://www.third-bit.com/trac/argon/ticket/525

Bye,
    Walter Dörwald

From martin at v.loewis.de  Wed Aug 24 12:16:25 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 12:16:25 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C41BD.8010602@livinglogic.de>
References: <20050823201021.GE32195@cs.toronto.edu>
	<430C41BD.8010602@livinglogic.de>
Message-ID: <430C48F9.8060801@v.loewis.de>

Walter Dörwald wrote:
> This is caused by the changes to the codecs in 2.4. Basically the codecs 
> no longer rely on C's readline() to do line splitting (which can't work 
> for UTF-16), but do it themselves (via unicode.splitlines()).

That explains why you get any calls to IsLineBreak; it doesn't explain
why you get so many of them.

I investigated this a bit, and one issue seems to be that
StreamReader.readline performs splitlines on the entire input, only to
fetch the first line. It then joins the rest for later processing.
In addition, it also performs splitlines on a single line, just to
strip any trailing line breaks.

The net effect is that, for a file with N lines, IsLineBreak is invoked
up to N times per character (at least for the last character), i.e.
roughly N/2 times per character on average.

So I think it would be best if Unicode characters exposed a .islinebreak
method (or, failing that, codecs just knew what the line break
characters are in Unicode 3.2), and then codecs would split off
the first line of input itself.
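
The pattern described above can be simulated in a few lines.  This is
only a sketch of the split-everything-then-rejoin behaviour, not the
actual codecs.py code; naive_readlines is a made-up name, and the
counter assumes splitlines() examines each character of its input
once:

```python
def naive_readlines(text):
    """Split the whole buffer, keep the first line, rejoin the rest."""
    scanned = 0   # characters examined by splitlines() so far
    lines = []
    while text:
        scanned += len(text)
        parts = text.splitlines(True)
        lines.append(parts[0])
        text = "".join(parts[1:])
    return lines, scanned


text = "x\n" * 1000
lines, scanned = naive_readlines(text)
print(len(lines), len(text), scanned)   # 1000 2000 1001000
```

For 1000 two-character lines, the 2000 input characters are examined
about a million times in total, i.e. roughly 500 times each.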

>>After doing some gprof profiling, I discovered _PyUnicodeUCS2_IsLinebreak was
>>getting called 51 million times. Our code is 1.2 million characters, so I
>>hardly think it makes sense to call IsLinebreak 50 times for each character;
>>and we're not even importing our entire source tree on every invocation.
> 
> 
> But if you're using CGI, you're importing your source on every 
> invocation.

Well, no. Only the CGI script needs to be parsed every time; all modules
could load off bytecode files.

Which suggests that Keir Mierle doesn't use bytecode files; I think he
should.
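
One possible way to make sure bytecode files exist is to compile the
modules ahead of time, for instance when the web server's user cannot
write next to the sources.  A minimal sketch using the standard
py_compile module; the module name example_module.py and the helper
precompile are placeholders, and the API shown is that of a current
Python 3:

```python
import os
import py_compile
import tempfile


def precompile(path):
    """Byte-compile one source file and return the bytecode file's path."""
    return py_compile.compile(path, doraise=True)


with tempfile.TemporaryDirectory() as tmp:
    module = os.path.join(tmp, "example_module.py")
    with open(module, "w", encoding="latin-1") as f:
        f.write("# -*- coding: iso8859-1 -*-\nVALUE = 42\n")
    cached = precompile(module)
    compiled_ok = os.path.exists(cached)

print(compiled_ok)
```

Once the bytecode is in place, importing the module no longer needs to
decode and tokenize the source text at all.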

Regards,
Martin

From mal at egenix.com  Wed Aug 24 12:27:45 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 24 Aug 2005 12:27:45 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C41BD.8010602@livinglogic.de>
References: <20050823201021.GE32195@cs.toronto.edu>
	<430C41BD.8010602@livinglogic.de>
Message-ID: <430C4BA1.5030503@egenix.com>

Walter Dörwald wrote:
> I wonder if we should switch back to a simple readline() implementation 
> for those codecs that don't require the current implementation 
> (basically every charmap codec). 

That would be my preference as well. The 2.4 .readline() approach
is really only needed for codecs that have to deal with encodings
that:

a) use multi-byte formats, or
b) support more line-end formats than just CR, CRLF, LF, or
c) are stateful.

This can easily be had by using a mix-in class for
codecs which do need the buffered .readline() approach.

> AFAIK source files are opened in 
> universal newline mode, so at least we'd get proper treatment of "\n", 
> "\r" and "\r\n" line ends, but we'd lose u"\x1c", u"\x1d", u"\x1e", 
> u"\x85", u"\u2028" and u"\u2029" (which are line terminators according 
> to unicode.splitlines()).

While the Unicode standard defines these characters as line
end code points, I think their definition does not necessarily
apply to data that is converted from a certain encoding to
Unicode, so that's not a big loss.

E.g. in ASCII or Latin-1, FILE, GROUP and RECORD
SEPARATOR and NEXT LINE characters (0x1c, 0x1d, 0x1e, 0x85)
are not interpreted as line end characters.

Furthermore, we had no reports of anyone complaining in
Python 1.6, 2.0 - 2.3 that line endings were not detected
properly. All these Python versions relied on the stream's
.readline() method to get the next line. The only bug reports
we had were for UTF-16 which falls into the above
category a) and did not support .readline() until Python 2.4.

A note on the performance of _PyUnicode_IsLinebreak():
in Python 2.0 Fredrik changed this to use the two step
lookup (reducing the size of the lookup tables considerably).

I think it's worthwhile reconsidering this approach for
character type queries that do not involve a huge number
of code points.

In Python 1.6 the function looked like this (and was
inlined by the compiler using its own fast lookup
table):

int _PyUnicode_IsLinebreak(register const Py_UNICODE ch)
{
    switch (ch) {
    case 0x000A: /* LINE FEED */
    case 0x000D: /* CARRIAGE RETURN */
    case 0x001C: /* FILE SEPARATOR */
    case 0x001D: /* GROUP SEPARATOR */
    case 0x001E: /* RECORD SEPARATOR */
    case 0x0085: /* NEXT LINE */
    case 0x2028: /* LINE SEPARATOR */
    case 0x2029: /* PARAGRAPH SEPARATOR */
	return 1;
    default:
	return 0;
    }
}

another candidate to convert back is:

int _PyUnicode_IsWhitespace(register const Py_UNICODE ch)
{
    switch (ch) {
    case 0x0009: /* HORIZONTAL TABULATION */
    case 0x000A: /* LINE FEED */
    case 0x000B: /* VERTICAL TABULATION */
    case 0x000C: /* FORM FEED */
    case 0x000D: /* CARRIAGE RETURN */
    case 0x001C: /* FILE SEPARATOR */
    case 0x001D: /* GROUP SEPARATOR */
    case 0x001E: /* RECORD SEPARATOR */
    case 0x001F: /* UNIT SEPARATOR */
    case 0x0020: /* SPACE */
    case 0x0085: /* NEXT LINE */
    case 0x00A0: /* NO-BREAK SPACE */
    case 0x1680: /* OGHAM SPACE MARK */
    case 0x2000: /* EN QUAD */
    case 0x2001: /* EM QUAD */
    case 0x2002: /* EN SPACE */
    case 0x2003: /* EM SPACE */
    case 0x2004: /* THREE-PER-EM SPACE */
    case 0x2005: /* FOUR-PER-EM SPACE */
    case 0x2006: /* SIX-PER-EM SPACE */
    case 0x2007: /* FIGURE SPACE */
    case 0x2008: /* PUNCTUATION SPACE */
    case 0x2009: /* THIN SPACE */
    case 0x200A: /* HAIR SPACE */
    case 0x200B: /* ZERO WIDTH SPACE */
    case 0x2028: /* LINE SEPARATOR */
    case 0x2029: /* PARAGRAPH SEPARATOR */
    case 0x202F: /* NARROW NO-BREAK SPACE */
    case 0x3000: /* IDEOGRAPHIC SPACE */
	return 1;
    default:
	return 0;
    }
}

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 23 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From martin at v.loewis.de  Wed Aug 24 12:56:58 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 12:56:58 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C4BA1.5030503@egenix.com>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C4BA1.5030503@egenix.com>
Message-ID: <430C527A.8090302@v.loewis.de>

M.-A. Lemburg wrote:
> I think it's worthwhile reconsidering this approach for
> character type queries that do not involve a huge number
> of code points.

I would advise against that. I measure both versions
(your version called PyUnicode_IsLinebreak2) with the
following code

volatile int result;
void unibench()
{
#define REPS 10000000000LL
  long long i;
  clock_t s1,s2,s3,s4,s5;
  s1 = clock();
  for(i=0;i<REPS;i++)
    result = _PyUnicode_IsLinebreak('(');
  s2 = clock();
  for(i=0;i<REPS;i++)
    result = PyUnicode_IsLinebreak2('(');
  s3 = clock();
  for(i=0;i<REPS;i++)
    result = _PyUnicode_IsLinebreak('\n');
  s4 = clock();
  for(i=0;i<REPS;i++)
    result = PyUnicode_IsLinebreak2('\n');
  s5 = clock();
  printf("f1, (: %d\nf2, (: %d\nf1, CR: %d\n, f2, CR: %d\n",
	 (int)(s2-s1),(int)(s3-s2),(int)(s4-s3),(int)(s5-s4));
}

and got those numbers

f1, (: 13210000
f2, (: 13300000
f1, CR: 13220000
, f2, CR: 13250000

What can be seen is that performance of the two versions is nearly
identical, with the code currently used being slightly better.
What can also be seen is that, on my machine, 1e10 calls to
IsLinebreak take 13.2 seconds. So 51 million calls take about 70ms.
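
The extrapolation is plain proportionality; spelled out with the
figures from this message (the variable names are just for
illustration):

```python
calls_measured = 1e10       # calls per benchmark loop
seconds_measured = 13.2     # time for one loop on the machine above
calls_reported = 51e6       # calls seen in the gprof profile

estimate = seconds_measured / calls_measured * calls_reported
print(f"{estimate * 1000:.1f} ms")   # 67.3 ms
```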

The reported performance problem is more likely in the allocation
of all these splitlines results, and the copying of the same
strings over and over again.

Regards,
Martin

From walter at livinglogic.de  Wed Aug 24 13:47:58 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 24 Aug 2005 13:47:58 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C48F9.8060801@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu>
	<430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de>
Message-ID: <430C5E6E.2040405@livinglogic.de>

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>>This is caused by the changes to the codecs in 2.4. Basically the codecs 
>>no longer rely on C's readline() to do line splitting (which can't work 
>>for UTF-16), but do it themselves (via unicode.splitlines()).
> 
> That explains why you get any calls to IsLineBreak; it doesn't explain
> why you get so many of them.
> 
> I investigated this a bit, and one issue seems to be that
> StreamReader.readline performs splitlines on the entire input, only to
> fetch the first line. It then joins the rest for later processing.
> In addition, it also performs splitlines on a single line, just to
> strip any trailing line breaks.

This is because unicode.splitlines() is the only API available to Python 
that knows about unicode line feeds.

> The net effect is that, for a file with N lines, IsLineBreak is invoked
> up to N times per character (at least for the last character), i.e.
> roughly N/2 times per character on average.
>
> So I think it would be best if Unicode characters exposed a .islinebreak
> method (or, failing that, codecs just knew what the line break
> characters are in Unicode 3.2), and then codecs would split off
> the first line of input itself.

I think a maxsplit argument (just as for unicode.split()) would help too.
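
Until splitlines() grows such an argument, the effect can be sketched
with a helper that stops at the first line break.  This is a
hypothetical function, not an existing API; its LINE_BREAKS set is the
list of characters from the Python 1.6 IsLinebreak function quoted in
this thread:

```python
LINE_BREAKS = "\n\r\x1c\x1d\x1e\x85\u2028\u2029"


def split_first_line(text, keepends=True):
    """Return (first_line, rest), scanning only up to the first break."""
    for i, ch in enumerate(text):
        if ch in LINE_BREAKS:
            end = i + 1
            # Treat "\r\n" as a single line ending.
            if ch == "\r" and text[end:end + 1] == "\n":
                end += 1
            line = text[:end] if keepends else text[:i]
            return line, text[end:]
    # No line break found: the whole text is one (possibly partial) line.
    return text, ""


print(split_first_line("ab\r\ncd\n"))
```

A real reader would still have to handle a "\r" at the very end of a
chunk, since the matching "\n" may only arrive with the next read.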

> [...]

Bye,
    Walter Dörwald

From mal at egenix.com  Wed Aug 24 14:24:42 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 24 Aug 2005 14:24:42 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C527A.8090302@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C4BA1.5030503@egenix.com> <430C527A.8090302@v.loewis.de>
Message-ID: <430C670A.3090408@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>I think it's worthwhile reconsidering this approach for
>>character type queries that do not involve a huge number
>>of code points.
> 
> 
> I would advise against that. I measure both versions
> (your version called PyUnicode_IsLinebreak2) with the
> following code
> 
> volatile int result;
> void unibench()
> {
> #define REPS 10000000000LL
>   long long i;
>   clock_t s1,s2,s3,s4,s5;
>   s1 = clock();
>   for(i=0;i<REPS;i++)
>     result = _PyUnicode_IsLinebreak('(');
>   s2 = clock();
>   for(i=0;i<REPS;i++)
>     result = PyUnicode_IsLinebreak2('(');
>   s3 = clock();
>   for(i=0;i<REPS;i++)
>     result = _PyUnicode_IsLinebreak('\n');
>   s4 = clock();
>   for(i=0;i<REPS;i++)
>     result = PyUnicode_IsLinebreak2('\n');
>   s5 = clock();
>   printf("f1, (: %d\nf2, (: %d\nf1, CR: %d\n, f2, CR: %d\n",
> 	 (int)(s2-s1),(int)(s3-s2),(int)(s4-s3),(int)(s5-s4));
> }
> 
> and got those numbers
> 
> f1, (: 13210000
> f2, (: 13300000
> f1, CR: 13220000
> , f2, CR: 13250000
> 
> What can be seen is that performance of the two versions is nearly
> identical, with the code currently used being slightly better.
> What can also be seen is that, on my machine, 1e10 calls to
> IsLinebreak take 13.2 seconds. So 51 million calls take about 70ms.

Your test is somewhat biased: the current solution
works using type records, so it has to swap in a new
record for each character you test. In your benchmark,
the same character is tested over and over again
and the type record is likely already stored in the
CPU cache.

The .splitlines() routine itself calls the above
function for each and every character in the string,
so quite a few of these type records have to be
looked up.

Here's a version that uses os.py as basis:

#include <stdlib.h>
#include <time.h>
#include "Python.h"

int _PyUnicode_IsLinebreak16(register const Py_UNICODE ch)
{
    switch (ch) {
    case 0x000A: /* LINE FEED */
    case 0x000D: /* CARRIAGE RETURN */
    case 0x001C: /* FILE SEPARATOR */
    case 0x001D: /* GROUP SEPARATOR */
    case 0x001E: /* RECORD SEPARATOR */
    case 0x0085: /* NEXT LINE */
    case 0x2028: /* LINE SEPARATOR */
    case 0x2029: /* PARAGRAPH SEPARATOR */
	return 1;
    default:
	return 0;
    }
}

#define REPS 10000
#define BUFFERSIZE 30000

int main(void)
{
    long i, j;
    clock_t s1,s2,s3;
    char *buffer;
    FILE *datafile;
    long filelen;
    int result;

    datafile = fopen("os.py", "rb");
    if (datafile == NULL) {
	printf("could not find os.py\n");
	return -1;
    }
    buffer = (char *)malloc(BUFFERSIZE);
    filelen = fread(buffer, 1, BUFFERSIZE, datafile);
    printf("filelen=%li bytes\n", filelen);

    s1 = clock();

    /* Python 2.4 */
    for(i = 0; i < REPS; i++)
	for (j = 0; j < filelen; j++)
	    result = _PyUnicode_IsLinebreak((Py_UNICODE)buffer[j]);
    s2 = clock();

    /* Python 1.6 */
    for(i = 0; i < REPS; i++)
	for (j = 0; j < filelen; j++)
	    result = _PyUnicode_IsLinebreak16((Py_UNICODE)buffer[j]);
    s3 = clock();

    printf("2.4: %d\n"
	   "1.6: %d\n",
	   (int)(s2-s1),
	   (int)(s3-s2));
    return 0;
}

Output, compiled with -O3:

filelen=23147 bytes
2.4: 2570000
1.6: 1230000

That's a factor of 2.

> The reported performance problem is more likely in the allocation
> of all these splitlines results, and the copying of the same
> strings over and over again.

True.

-- 
Marc-Andre Lemburg
eGenix.com


From martin at v.loewis.de  Wed Aug 24 14:56:30 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 14:56:30 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C5E6E.2040405@livinglogic.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
Message-ID: <430C6E7E.7070106@v.loewis.de>

Walter Dörwald wrote:
> I think a maxsplit argument (just as for unicode.split()) would help too.

Correct - that would allow us to get rid of the quadratic part.
We should also strive to avoid the second copy of the line,
if the user requested keepends.

I wonder whether it would be worthwhile to cache the .splitlines result.
An application that has just invoked .readline() will likely invoke
.readline() again. If there is more than one line left, we could return
the first line right away (potentially trimming the line ending if
necessary). Only when a single line is left, we would attempt to
read more data. In a plain .read(), we would first join the lines
back.
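
A rough sketch of that caching idea, written against a generic text
stream rather than the real StreamReader; the class name
CachingLineReader, the chunk_size parameter and the use of
io.StringIO as the underlying stream are all assumptions made for the
example:

```python
import io
from collections import deque


class CachingLineReader:
    """Hypothetical reader that keeps split lines between readline() calls."""

    def __init__(self, stream, chunk_size=8192):
        self.stream = stream          # any object with a text .read(n)
        self.chunk_size = chunk_size
        self.lines = deque()          # cached lines, line endings kept

    def readline(self):
        # Only the last cached line can be incomplete, so read more data
        # only when fewer than two lines are left.
        while len(self.lines) < 2:
            data = self.stream.read(self.chunk_size)
            if not data:
                break
            pending = self.lines.pop() if self.lines else ""
            self.lines = deque((pending + data).splitlines(True))
        return self.lines.popleft() if self.lines else ""

    def read(self):
        # Join the cached lines back before a plain read.
        rest = "".join(self.lines) + self.stream.read()
        self.lines.clear()
        return rest


reader = CachingLineReader(io.StringIO("a\r\nb\nc"), chunk_size=2)
first = reader.readline()
second = reader.readline()
remainder = reader.read()
print(repr(first), repr(second), repr(remainder))
```

Each chunk is split once, and a line is handed out only when a later
line (or end of input) shows that it is complete.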

Regards,
Martin

From mcherm at mcherm.com  Wed Aug 24 15:08:56 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 24 Aug 2005 06:08:56 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
Message-ID: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>

Raymond Hettinger writes:
> The latest version of PEP 348 still proposes that a bare except clause
> will default to Exception instead of BaseException.  Initially, I had
> thought that might be a good idea but now think it is doomed and needs
> to be removed from the PEP.

Guido writes:
> If we syntactically enforce that the bare except, if present, must be
> last, would that remove your objection? I agree that a bare except in
> the middle is an anomaly, but that doesn't mean we can't keep bare
> except: as a shorthand for except Exception:.

Explicit is better than Implicit. I think that in newly written code
"except Exception:" is better (more explicit and easier to understand)
than "except:" Legacy code that uses "except:" can remain unchanged *IF*
the meaning of "except:" is unchanged... but I think we all agree that
this is unwise because the existing meaning is a tempting trap for the
unwary. So I don't see any advantage to keeping bare "except:" in the
long run. What we do to ease the transition is a different question,
but one more easily resolved.

-- Michael Chermside


From walter at livinglogic.de  Wed Aug 24 16:08:12 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 24 Aug 2005 16:08:12 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C6E7E.7070106@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
	<430C6E7E.7070106@v.loewis.de>
Message-ID: <430C7F4C.9010703@livinglogic.de>

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>>I think a maxsplit argument (just as for unicode.split()) would help too.
> 
> Correct - that would allow to get rid of the quadratic part.

OK, such a patch should be rather simple. I'll give it a try.

> We should also strive for avoiding the second copy of the line,
> if the user requested keepends.

Your suggested unicode method islinebreak() would help with that. Then 
we could add the following to the string module:

unicodelinebreaks = u"".join(unichr(c) for c in xrange(0, 
sys.maxunicode) if unichr(c).islinebreak())

Then

     if line and not keepends:
         line = line.splitlines(False)[0]

could be

     if line and not keepends:
         line = line.rstrip(string.unicodelinebreaks)

> I wonder whether it would be worthwhile to cache the .splitlines result.
> An application that has just invoked .readline() will likely invoke
> .readline() again. If there is more than one line left, we could return
> the first line right away (potentially trimming the line ending if
> necessary). Only when a single line is left, we would attempt to
> read more data. In a plain .read(), we would first join the lines
> back.

OK, this would mean we'd have to distinguish between a direct call to 
read() and one done by readline() (which we do anyway through the 
firstline argument).

Bye,
    Walter Dörwald

From martin at v.loewis.de  Wed Aug 24 16:33:50 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 16:33:50 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C7F4C.9010703@livinglogic.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
	<430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de>
Message-ID: <430C854E.1080200@v.loewis.de>

Walter Dörwald wrote:
> Martin v. Löwis wrote:
> 
>> Walter Dörwald wrote:
>>
>>> I think a maxsplit argument (just as for unicode.split()) would help
>>> too.
>>
>>
>> Correct - that would allow to get rid of the quadratic part.
> 
> 
> OK, such a patch should be rather simple. I'll give it a try.

Actually, on a second thought - it would not remove the quadratic
aspect. You would still copy the rest string completely on each
split. So on the first split, you copy N lines (one result line,
and N-1 lines into the rest string), on the second split, N-2
lines, and so on, totalling N*N/2 line copies again. The only
thing you save is the join (as the rest is already joined), and
the IsLineBreak calls (which are necessary only for the first
line).
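To make the N*N/2 estimate concrete, here is a minimal sketch in Python 3 that only counts characters copied. The function names are hypothetical, and str.partition() stands in for a splitlines() call with a maxsplit of 1, which does not exist.

```python
def chars_copied_by_peeling(text):
    """Split off one line at a time, the way a maxsplit=1 readline would."""
    copied = 0
    rest = text
    while rest:
        # Both the line and the new rest string are fresh copies.
        line, _sep, rest = rest.partition("\n")
        copied += len(line) + len(rest)
    return copied


def chars_copied_by_one_split(text):
    """Split once and keep the result; each character is copied once."""
    return sum(len(line) for line in text.split("\n"))
```

Doubling the number of lines roughly quadruples the first count, while the second grows linearly.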

Please see python.org/sf/1268314; it solves the problem by
keeping the splitlines result. It only invokes IsLineBreak
once per character, and also copies each character only once,
and allocates each line only once, totalling O(N) for
these operations. It still does contain a quadratic operation:
the lines are stored in a list, and the result line is
removed from the list with del lines[0]. This copies N-1
pointers, resulting in N*N/2 pointer copies. That should still
be much faster than the current code.
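The idea of keeping the splitlines result can be sketched as follows. This is not the code from the patch; it is a simplified Python 3 illustration, and the class name, the chunk_size parameter and the plain-list cache are assumptions made for the example.

```python
import io


class CachedLineReader:
    """Keep the splitlines() result between readline() calls."""

    def __init__(self, stream, chunk_size=64):
        self.stream = stream          # any object with a read(n) method
        self.chunk_size = chunk_size
        self.lines = []               # cached lines, line ends kept

    def readline(self):
        # With two or more cached lines, the first one is known to be
        # complete. With fewer, the last line may be partial (or end in a
        # lone "\r"), so join it with fresh data and split again.
        while len(self.lines) < 2:
            data = self.stream.read(self.chunk_size)
            if not data:
                break
            pending = self.lines.pop() if self.lines else ""
            self.lines = (pending + data).splitlines(True)
        if not self.lines:
            return ""
        return self.lines.pop(0)      # the del lines[0] step: O(len(lines))


reader = CachedLineReader(io.StringIO("a\r\nbb\nccc"), chunk_size=2)
```

Only the last cached line is ever re-split, so a "\r\n" pair that straddles two reads is still returned as one line end.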

> unicodelinebreaks = u"".join(unichr(c) for c in xrange(0,
> sys.maxunicode) if unichr(c).islinebreak())

That is very inefficient. I would rather add a static list
to the string module, and have a test that says

assert str.unicodelinebreaks == u"".join(ch for ch in (unichr(c) for c
in xrange(0, sys.maxunicode)) if unicodedata.bidirectional(ch)=='B' or
unicodedata.category(ch)=='Zl')

unicodelinebreaks could then be defined as

# u"\r\n\x1c\x1d\x1e\x85\u2028\u2029"
'\n\r\x1c\x1d\x1e\xc2\x85\xe2\x80\xa8\xe2\x80\xa9'.decode("utf-8")
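A sketch of the static definition and the cross-check, written for Python 3 (where unichr/xrange become chr/range). The names UNICODE_LINEBREAKS, computed_linebreaks and strip_line_end are hypothetical. Note that the range runs to sys.maxunicode + 1 so that the last code point is included.

```python
import sys
import unicodedata

# Static definition, spelled out as a Python 3 str literal.
UNICODE_LINEBREAKS = "\n\r\x1c\x1d\x1e\x85\u2028\u2029"


def computed_linebreaks():
    """Derive the set from the Unicode database (slow: scans every code point)."""
    return "".join(
        ch
        for ch in map(chr, range(sys.maxunicode + 1))
        if unicodedata.bidirectional(ch) == "B"
        or unicodedata.category(ch) == "Zl"
    )


def strip_line_end(line):
    """Drop trailing line terminators without calling splitlines()."""
    return line.rstrip(UNICODE_LINEBREAKS)
```

One difference from splitlines(False)[0]: rstrip() removes every trailing terminator, not just one, which is harmless for a line that readline() has already cut at its first line end.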

> OK, this would mean we'd have to distinguish between a direct call to
> read() and one done by readline() (which we do anyway through the
> firstline argument).

See my patch. If we have cached lines, we don't need to call .read
at all.

Regards,
Martin

From foom at fuhm.net  Wed Aug 24 16:34:53 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed, 24 Aug 2005 10:34:53 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <001d01c5a855$447a9d20$ab12c797@oemcomputer>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer>
Message-ID: <B10822D6-45B9-4F35-BE15-F0B09958D5C1@fuhm.net>


On Aug 23, 2005, at 10:41 PM, Raymond Hettinger wrote:

> [Guido van Rossum]
>
>> If we syntactically enforce that the bare except, if present, must be
>> last, would that remove your objection? I agree that a bare except in
>> the middle is an anomaly, but that doesn't mean we can't keep bare
>> except: as a shorthand for except Exception:.
>>
>
> Hmm.  Prohibiting mid-suite bare excepts is progress and eliminates  
> the
> case that causes immediate indigestion.
>
> As for the rest, I'm not as sure and it would be helpful to get  
> thoughts
> from others on this one.  My sense is that blocking the clause from
> appearing in the middle is treating the symptom and not the disease.
>

I would rather see "except:" be deprecated eventually, and force the  
user to say either except Exception, except BaseException, or even  
better, except ActualExceptionIWantToCatch.

James

From barry at python.org  Wed Aug 24 17:03:52 2005
From: barry at python.org (Barry Warsaw)
Date: Wed, 24 Aug 2005 11:03:52 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <B10822D6-45B9-4F35-BE15-F0B09958D5C1@fuhm.net>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer>
	<B10822D6-45B9-4F35-BE15-F0B09958D5C1@fuhm.net>
Message-ID: <1124895832.19291.10.camel@geddy.wooz.org>

On Wed, 2005-08-24 at 10:34, James Y Knight wrote:

> I would rather see "except:" be deprecated eventually, and force the  
> user to say either except Exception, except BaseException, or even  
> better, except ActualExceptionIWantToCatch.

I agree about bare except, but there is a very valid use case for an
except clause that catches every possible exception.  We need to make
sure we don't overlook this use case.  As an example, say I'm building a
transaction-aware system, I'm going to want to write code like this:

txn = new_transaction()
try:
    txn.begin()
    rtn = do_work()
except AllPossibleExceptions:
    txn.abort()
    raise
else:
    txn.commit()
    return rtn

I'm fine with not spelling that except statement as "except:" but I
don't want there to be any exceptions that can sneak past that middle
suite, including non-errors like SystemExit or KeyboardInterrupt.
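For illustration, a runnable version of this pattern, assuming a Python in which BaseException is the root of the exception hierarchy and plays the role of AllPossibleExceptions. RecordingTransaction and run_in_transaction are hypothetical stand-ins, not an existing API.

```python
class RecordingTransaction:
    """Hypothetical stand-in that just records what was called."""

    def __init__(self):
        self.events = []

    def begin(self):
        self.events.append("begin")

    def commit(self):
        self.events.append("commit")

    def abort(self):
        self.events.append("abort")


def run_in_transaction(txn, do_work):
    try:
        txn.begin()
        rtn = do_work()
    except BaseException:   # also catches SystemExit and KeyboardInterrupt
        txn.abort()
        raise               # bare raise: the original exception propagates
    else:
        txn.commit()
        return rtn
```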

I can't remember ever writing a bare except with a suite that didn't
contain (end in?) a bare raise.  Maybe we can allow bare except, but
constrain things so that the only way out of its suite is via a bare
raise.

-Barry

From gvanrossum at gmail.com  Wed Aug 24 17:10:37 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 08:10:37 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
Message-ID: <ca471dc205082408103f0adc81@mail.gmail.com>

On 8/24/05, Michael Chermside <mcherm at mcherm.com> wrote:
> Explicit is better than Implicit. I think that in newly written code
> "except Exception:" is better (more explicit and easier to understand)
> than "except:" Legacy code that uses "except:" can remain unchanged *IF*
> the meaning of "except:" is unchanged... but I think we all agree that
> this is unwise because the existing meaning is a tempting trap for the
> unwary. So I don't see any advantage to keeping bare "except:" in the
> long run. What we do to ease the transition is a different question,
> but one more easily resolved.

OK, I'm convinced. Let's drop bare except for Python 3.0, and
deprecate them until then, without changing the meaning.

The deprecation message (to be generated by the compiler!) should
steer people in the direction of specifying one particular exception
(e.g. KeyError etc.) rather than Exception.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From foom at fuhm.net  Wed Aug 24 17:23:52 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed, 24 Aug 2005 11:23:52 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc205082408103f0adc81@mail.gmail.com>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
Message-ID: <D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>


On Aug 24, 2005, at 11:10 AM, Guido van Rossum wrote:

> On 8/24/05, Michael Chermside <mcherm at mcherm.com> wrote:
>
>> Explicit is better than Implicit. I think that in newly written code
>> "except Exception:" is better (more explicit and easier to  
>> understand)
>> than "except:" Legacy code that uses "except:" can remain  
>> unchanged *IF*
>> the meaning of "except:" is unchanged... but I think we all agree  
>> that
>> this is unwise because the existing meaning is a tempting trap for  
>> the
>> unwary. So I don't see any advantage to keeping bare "except:" in the
>> long run. What we do to ease the transition is a different question,
>> but one more easily resolved.
>>
>
> OK, I'm convinced. Let's drop bare except for Python 3.0, and
> deprecate them until then, without changing the meaning.
>
> The deprecation message (to be generated by the compiler!) should
> steer people in the direction of specifying one particular exception
> (e.g. KeyError etc.) rather than Exception.

I agree but there's the minor nit of non-Exception exceptions.

I think it must be the case that raising an object which does not  
derive from an exception class must be deprecated as well in order  
for "except:" to be deprecated. Otherwise, there is nothing you can  
change "except:" to in order not to get a deprecation warning and  
still have your code be correct in the face of documented features of  
python.

James


From raymond.hettinger at verizon.net  Wed Aug 24 17:27:19 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 24 Aug 2005 11:27:19 -0400
Subject: [Python-Dev] FW:  Bare except clauses in PEP 348
Message-ID: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer>

Hey guys, don't give up your bare except clauses so easily.

They are useful.  And, if given the natural meaning of "catch
everything" and put in a natural position at the end of a suite, their
meaning is plain and obvious.  Remember beauty counts.  I don't think
there would be similar temptation to eliminate a dangling else clause
and replace it with "else Everything".  Nor would a final default case
in a switch statement benefit from being written as "default
Everything".

The thought is that it is okay to have useful defaults.  My whole issue
was that the PEP was choosing the wrong default.  If we leave it alone,
all is well.  An empty except will continue to mean "catch everything",
it will always appear at the end, its meaning will be obvious, and
existing working code won't break :-)

On the occasions where you really intended to catch everything, do you
really want to go on an editing binge just to uglify the code to
something like:

   try:                   
       ...
   except SomeException: 
       ...
   except BaseException:  
       ...


It is more beautiful and clear as:

   try:                   
       ...
   except SomeException: 
       ...
   except:  
       ...

To me, the latter is more attractive and is more obviously a catchall,
just like an else-clause or a default-statement.  It is a strong visual
cue that at least one of the except clauses will always be triggered.
In contrast, the first example makes you think twice about whether the
final case really does get everything (sometimes implicit IS better than
explicit).


Raymond


From shane at hathawaymix.org  Wed Aug 24 18:17:20 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Wed, 24 Aug 2005 10:17:20 -0600
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <1124895832.19291.10.camel@geddy.wooz.org>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer>	<B10822D6-45B9-4F35-BE15-F0B09958D5C1@fuhm.net>
	<1124895832.19291.10.camel@geddy.wooz.org>
Message-ID: <430C9D90.6090102@hathawaymix.org>

Barry Warsaw wrote:
> I agree about bare except, but there is a very valid use case for an
> except clause that catches every possible exception.  We need to make
> sure we don't overlook this use case.  As an example, say I'm building a
> transaction-aware system, I'm going to want to write code like this:
> 
> txn = new_transaction()
> try:
>     txn.begin()
>     rtn = do_work()
> except AllPossibleExceptions:
>     txn.abort()
>     raise
> else:
>     txn.commit()
>     return rtn
> 
> I'm fine with not spelling that except statement as "except:" but I
> don't want there to be any exceptions that can sneak past that middle
> suite, including non-errors like SystemExit or KeyboardInterrupt.
> 
> I can't remember ever writing a bare except with a suite that didn't
> contain (end in?) a bare raise.  Maybe we can allow bare except, but
> constrain things so that the only way out of its suite is via a bare
> raise.

I also use this idiom quite frequently, but I wonder if a finally clause 
would be a better way to write it:

txn = new_transaction()
try:
     txn.begin()
     rtn = do_work()
finally:
     if exception_occurred():
         txn.abort()
     else:
         txn.commit()
         return rtn

Since this doesn't use try/except/else, it's not affected by changes to 
the meaning of except clauses.  However, it forces more indentation and 
expects a new builtin, and the name "exception_occurred" is probably too 
long for a builtin.

Now for a weird idea.

txn = new_transaction()
try:
     txn.begin()
     rtn = do_work()
finally except:
     txn.abort()
finally else:
     txn.commit()
     return rtn

This is what I would call qualified finally clauses.  The interpreter 
chooses exactly one of the finally clauses.  If a "finally except" 
clause is chosen, the exception is re-raised before execution continues. 
  Most code that currently uses bare raise inside bare except could just 
prefix the "except" and "else" keywords with "finally".
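The "exactly one of abort or commit" behaviour does not strictly need new syntax. As a sketch only, assuming a later Python that has the with statement and contextlib.contextmanager, a generator-based context manager gives the same shape; the name transaction and the txn methods are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def transaction(txn):
    """Run the with-block inside txn; abort on any exception, else commit."""
    try:
        txn.begin()
        yield txn
    except BaseException:
        txn.abort()
        raise           # re-raise, as the "finally except" clause would
    else:
        txn.commit()
```

Usage would read "with transaction(new_transaction()): rtn = do_work()", with exactly one of the two outcomes chosen when the block exits.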

Shane

From niko at alum.mit.edu  Wed Aug 24 18:29:36 2005
From: niko at alum.mit.edu (Niko Matsakis)
Date: Wed, 24 Aug 2005 18:29:36 +0200
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <430C9D90.6090102@hathawaymix.org>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer>	<B10822D6-45B9-4F35-BE15-F0B09958D5C1@fuhm.net>
	<1124895832.19291.10.camel@geddy.wooz.org>
	<430C9D90.6090102@hathawaymix.org>
Message-ID: <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu>

>
> txn = new_transaction()
> try:
>      txn.begin()
>      rtn = do_work()
> finally:
>      if exception_occurred():
>          txn.abort()
>      else:
>          txn.commit()
>          return rtn
>

Couldn't you just do:

txn = new_transaction ()
try:
     complete = 0
     txn.begin ()
     rtn = do_work ()
     complete = 1
finally:
     if not complete: txn.abort ()
     else: txn.commit ()

and then not need new builtins or anything fancy?


Niko


From mcherm at mcherm.com  Wed Aug 24 18:33:00 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 24 Aug 2005 09:33:00 -0700
Subject: [Python-Dev] FW:  Bare except clauses in PEP 348
Message-ID: <20050824093300.uuj0o52cj9s0wksk@login.werra.lunarpages.com>

Raymond writes:
> Hey guys, don't give up your bare except clauses so easily.
      [...]

Raymond:

I agree that when comparing:

   # Code snippet A
   try:
       ...
   except SomeException:
       ...
   except BaseException:
       ...

with

   # Code snippet B
   try:
       ...
   except SomeException:
       ...
   except:
       ...

that B is nicer than A. Slightly nicer. It's a minor esthetic point. But
consider these:

    # Code snippet C
    try:
        ...
    except Exception:
        ...

    # Code snippet D
    try:
        ...
    except:
        ...

Esthetically I'd say that D is nicer than C for the same reasons. It's a minor
esthetic point. But you see, this case is different. You and I would likely
never bother to compare C and D because they do different things! (D is
equivalent to catching BaseException, not Exception). But we know that people
who are not so careful or not so knowledgeable WILL make this mistake... they
make it all the time today!
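The difference between C and D is easy to demonstrate. This sketch assumes a Python in which KeyboardInterrupt and SystemExit derive from BaseException but not from Exception; the helper name is hypothetical.

```python
def caught_by_except_exception(exc):
    """Report whether 'except Exception:' (snippet C) would catch exc."""
    try:
        raise exc
    except Exception:
        return True
    except BaseException:
        # Only a catch-all such as snippet D gets here.
        return False
```

An ordinary error such as KeyError is caught by the first clause; KeyboardInterrupt and SystemExit fall through to the second.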

Since situation C (catching an exception) is hundreds of times more common than
situation A (needing default processing for exceptions that don't get caught,
but doing it with try-except instead of try-finally because the
nothing-was-thrown case is different), I would FAR rather protect beginners
from erroneously confusing C and D than I would provide a marginally more
elegant syntax for the experts using A or B. And that elegance is arguable...
there's something to be said for simplicity, and having only one kind of
"except" clause for try statements is clearly simpler than having both "except
<some-exception-type>:" and also bare "except:".

-- Michael Chermside


From martin at v.loewis.de  Wed Aug 24 18:38:17 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 18:38:17 +0200
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer>	<B10822D6-45B9-4F35-BE15-F0B09958D5C1@fuhm.net>	<1124895832.19291.10.camel@geddy.wooz.org>	<430C9D90.6090102@hathawaymix.org>
	<934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu>
Message-ID: <430CA279.3080909@v.loewis.de>

Niko Matsakis wrote:
> Couldn't you just do:
> 
> txn = new_transaction ()
> try:
>      complete = 0
>      txn.begin ()
>      rtn = do_work ()
>      complete = 1
> finally:
>      if not complete: txn.abort ()
>      else: txn.commit ()
> 
> and then not need new builtins or anything fancy?

I personally dislike recording the execution path in
local variables. This is like setting a flag in a loop
before the break, and testing the flag afterwards.
You can do this, but the else: clause of the loop is
just more readable.

This specific fragment has also the bug that a
KeyboardInterrupt before the assignment to complete
will cause a NameError/UnboundLocalError; this
can easily be fixed by moving the assignment before
the try block.
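With that fix applied, the flag-based variant would look roughly like this. It is a sketch: run_with_flag is a hypothetical wrapper, and txn is any object with begin(), commit() and abort().

```python
def run_with_flag(txn, do_work):
    complete = False        # bound before the try, so finally always sees it
    try:
        txn.begin()
        rtn = do_work()
        complete = True
    finally:
        if complete:
            txn.commit()
        else:
            txn.abort()     # the pending exception continues afterwards
    return rtn
```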

Regards,
Martin

From shane at hathawaymix.org  Wed Aug 24 18:42:21 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Wed, 24 Aug 2005 10:42:21 -0600
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer>	<B10822D6-45B9-4F35-BE15-F0B09958D5C1@fuhm.net>
	<1124895832.19291.10.camel@geddy.wooz.org>
	<430C9D90.6090102@hathawaymix.org>
	<934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu>
Message-ID: <430CA36D.8000004@hathawaymix.org>

Niko Matsakis wrote:
>>
>> txn = new_transaction()
>> try:
>>      txn.begin()
>>      rtn = do_work()
>> finally:
>>      if exception_occurred():
>>          txn.abort()
>>      else:
>>          txn.commit()
>>          return rtn
>>
> 
> Couldn't you just do:
> 
> txn = new_transaction ()
> try:
>     complete = 0
>     txn.begin ()
>     rtn = do_work ()
>     complete = 1
> finally:
>     if not complete: txn.abort ()
>     else: txn.commit ()
> 
> and then not need new builtins or anything fancy?

That would work, though it's less readable.  If I were looking over code 
like that written by someone else, I'd have to verify that the "complete" 
variable is handled correctly in all cases.  (As Martin noted, your code 
already has a bug.)  The nice try/except/else idiom we have today, with 
a bare except and bare raise, is much easier to verify.

Shane

From walter at livinglogic.de  Wed Aug 24 18:59:11 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 24 Aug 2005 18:59:11 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C854E.1080200@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
	<430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de>
	<430C854E.1080200@v.loewis.de>
Message-ID: <430CA75F.7090900@livinglogic.de>

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>>Martin v. Löwis wrote:
>>
>>>Walter Dörwald wrote:
>>>
>>>>I think a maxsplit argument (just as for unicode.split()) would help
>>>>too.
>>>
>>>Correct - that would allow to get rid of the quadratic part.
>>
>>OK, such a patch should be rather simple. I'll give it a try.
> 
> Actually, on a second thought - it would not remove the quadratic
> aspect.

At least it would remove the quadratic number of calls to 
_PyUnicodeUCS2_IsLinebreak(). For each character it would be called only 
once.

> You would still copy the rest string completely on each
> split. So on the first split, you copy N lines (one result line,
> and N-1 lines into the rest string), on the second split, N-2
> lines, and so on, totalling N*N/2 line copies again.

OK, that's true.

We could prevent string copying if we kept the unsplit string and the 
position of the current line terminator, but this would require a "first 
position after a line terminator" method.
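A pure-Python sketch of such a method, to show the intent; a real implementation would presumably live in C. The names LINE_ENDS, next_line_start and iter_lines are hypothetical, and the terminator set is the one splitlines() is described as using in this thread.

```python
LINE_ENDS = "\n\r\x1c\x1d\x1e\x85\u2028\u2029"


def next_line_start(text, pos):
    """Return the first position after the next line terminator at or after pos."""
    n = len(text)
    while pos < n and text[pos] not in LINE_ENDS:
        pos += 1
    if pos == n:
        return n                      # no terminator: line runs to the end
    if text[pos] == "\r" and pos + 1 < n and text[pos + 1] == "\n":
        return pos + 2                # treat "\r\n" as one terminator
    return pos + 1


def iter_lines(text):
    """Yield lines by advancing a position; the rest string is never copied."""
    pos = 0
    while pos < len(text):
        end = next_line_start(text, pos)
        yield text[pos:end]
        pos = end
```

Each character is examined once and copied once (into its own line), so the whole pass is linear.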

> The only
> thing you save is the join (as the rest is already joined), and
> the IsLineBreak calls (which are necessary only for the first
> line).
> 
> Please see python.org/sf/1268314;

The last part of the patch seems to be more related to bug #1235646.

With the patch test_pep263 and test_codecs fail (and test_parser, but 
this might be unrelated):

python Lib/test/test_pep263.py gives the following output:

File "Lib/test/test_pep263.py", line 22
SyntaxError: list index out of range

test_codecs.py has the following two complaints:

File "/var/home/walter/Achtung/Python-linecache/dist/src/Lib/codecs.py", 
line 366, in readline
     self.charbuffer = lines[1] + self.charbuffer
IndexError: list index out of range

and

File "/var/home/walter/Achtung/Python-linecache/dist/src/Lib/codecs.py", 
line 336, in readline
     line = result.splitlines(False)[0]
NameError: global name 'result' is not defined

> it solves the problem by
> keeping the splitlines result. It only invokes IsLineBreak
> once per character, and also copies each character only once,
> and allocates each line only once, totalling in O(N) for
> these operations. It still does contain a quadratic operation:
> the lines are stored in a list, and the result list is
> removed from the list with del lines[0]. This copies N-1
> pointers, result in N*N/2 pointer copies. That should still
> be much faster than the current code.

Using collections.deque() should get rid of this problem.
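A minimal sketch of that change, with a hypothetical helper name: deque.popleft() removes the first line in constant time, where del lines[0] on a list shifts all remaining pointers.

```python
from collections import deque


def drain_lines(text):
    """Yield cached lines one by one without shifting the remainder."""
    lines = deque(text.splitlines(True))
    while lines:
        yield lines.popleft()
```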

>>unicodelinebreaks = u"".join(unichr(c) for c in xrange(0,
>>sys.maxunicode) if unichr(c).islinebreak())
> 
> That is very inefficient. I would rather add a static list
> to the string module, and have a test that says
> 
> assert str.unicodelinebreaks == u"".join(ch for ch in (unichr(c) for c
> in xrange(0, sys.maxunicode)) if unicodedata.bidirectional(ch)=='B' or
> unicodedata.category(ch)=='Zl')

You mean, in the test suite?

> unicodelinebreaks could then be defined as
> 
> # u"\r\n\x1c\x1d\x1e\x85\u2028\u2029
> '\n\r\x1c\x1d\x1e\xc2\x85\xe2\x80\xa8\xe2\x80\xa9'.decode("utf-8")

That might be better, as this definition won't change very often.

BTW, why the decode() call? For a Python without unicode?

>>OK, this would mean we'd have to distinguish between a direct call to
>>read() and one done by readline() (which we do anyway through the
>>firstline argument).
> 
> See my patch. If we have cached lines, we don't need to call .read
> at all.

I wonder what happens, if calls to read() and readline() are mixed (e.g. 
if I'm reading Fortran source or anything with a fixed line header): 
read() would be used to read the first n characters (which joins the line 
buffer) and readline() reads the rest (which would split it again) etc.
(Of course this could be done via a single readline() call).

But, I think a maxsplit argument for splitlines() would make sense 
independent of this problem.

Bye,
    Walter Dörwald

From rrr at ronadam.com  Wed Aug 24 19:03:13 2005
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 24 Aug 2005 13:03:13 -0400
Subject: [Python-Dev] FW:  Bare except clauses in PEP 348
In-Reply-To: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer>
References: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer>
Message-ID: <430CA851.6060406@ronadam.com>

Raymond Hettinger wrote:
 > Hey guys, don't give up your bare except clauses so easily.

Yes, don't give up.  I often write code starting with a bare except, 
then after it works, stick a raise in it to determine exactly what 
exception I'm catching. Then use that to rewrite a more explicit except 
statement.

Your comment earlier about treating the symptom is also accurate.  This 
isn't just an issue with bare excepts not being allowed in the middle, 
it also comes up whenever we catch exceptions out of order in the tree. 
  I.e., catching an exception closer to the base will block a following 
except clause that tries to catch an exception on the same branch.
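A small illustration of the ordering problem, using two built-in exceptions from the same branch (KeyError derives from LookupError); the function name is hypothetical.

```python
def which_clause(exc):
    """Return the name of the except clause that handles exc."""
    try:
        raise exc
    except LookupError:
        return "LookupError"
    except KeyError:
        # Never reached: the broader clause above already matches KeyError.
        return "KeyError"
```

The interpreter accepts this without complaint, which is why an orderliness check would have to be a separate compile-time or lint-style test.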

So should except clauses be checked for orderliness?

Regards, Ron


From gvanrossum at gmail.com  Wed Aug 24 19:15:47 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 10:15:47 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
	<D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>
Message-ID: <ca471dc2050824101579f8304@mail.gmail.com>

On 8/24/05, James Y Knight <foom at fuhm.net> wrote:
> I think it must be the case that raising an object which does not
> derive from an exception class must be deprecated as well in order
> for "except:" to be deprecated. Otherwise, there is nothing you can
> change "except:" to in order not to get a deprecation warning and
> still have your code be correct in the face of documented features of
> python.

I agree; isn't that already in the PEP? This surely has been the
thinking all along.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Wed Aug 24 19:17:56 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 10:17:56 -0700
Subject: [Python-Dev] FW: Bare except clauses in PEP 348
In-Reply-To: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer>
References: <003901c5a8c0$51d79fc0$b729cb97@oemcomputer>
Message-ID: <ca471dc205082410174c1c6060@mail.gmail.com>

On 8/24/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> Hey guys, don't give up your bare except clauses so easily.

They are an attractive nuisance by being so much shorter to type than
the "right thing to do".

Especially if they default to something whose use cases are rather
esoteric (i.e. BaseException).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From abo at minkirri.apana.org.au  Wed Aug 24 19:26:34 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Wed, 24 Aug 2005 10:26:34 -0700
Subject: [Python-Dev] 51 Million calls to
	_PyUnicodeUCS2_IsLinebreak()	(???)
In-Reply-To: <430C854E.1080200@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu>
	<430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de>
	<430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de>
	<430C7F4C.9010703@livinglogic.de>  <430C854E.1080200@v.loewis.de>
Message-ID: <1124904393.9380.29.camel@warna.corp.google.com>

On Wed, 2005-08-24 at 07:33, "Martin v. Löwis" wrote:
> Walter Dörwald wrote:
> > Martin v. Löwis wrote:
> > 
> >> Walter Dörwald wrote:
[...]
> Actually, on a second thought - it would not remove the quadratic
> aspect. You would still copy the rest string completely on each
> split. So on the first split, you copy N lines (one result line,
> and N-1 lines into the rest string), on the second split, N-2
> lines, and so on, totalling N*N/2 line copies again. The only
> thing you save is the join (as the rest is already joined), and
> the IsLineBreak calls (which are necessary only for the first
> line).
[...]

In the past, I've avoided the string copy overhead inherent in split()
by using buffers...

I've always wondered why Python didn't use buffer type tricks internally
for split-type operations. I haven't looked at Python's string
implementation, but the fact that strings are immutable surely means
that you can safely and efficiently reference an implementation level
"data" object for all strings... ie all strings are "buffers".

The only problem I can see with this is huge "data" objects might hang
around just because some small fragment of it is still referenced by a
string. Surely a simple heuristic or two like "if len(string) <
len(data)/8: copy data; else: reference data" would go a long way
towards avoiding that.
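As a rough sketch of that heuristic, using Python 3's memoryview over bytes as the "buffer": slicing a memoryview references the underlying data instead of copying it. The function name and the copy_ratio parameter are hypothetical, and this says nothing about how the string type itself is implemented.

```python
def split_lines_as_views(data, copy_ratio=8):
    """Split bytes into lines, copying only fragments that are small
    relative to the whole; larger ones stay as views onto data."""
    view = memoryview(data)
    pieces = []
    start = 0
    while start < len(data):
        end = data.find(b"\n", start)
        end = len(data) if end == -1 else end + 1
        piece = view[start:end]           # no copy yet
        if len(piece) * copy_ratio < len(data):
            piece = bytes(piece)          # small fragment: copy it so it
                                          # does not keep data alive
        pieces.append(piece)
        start = end
    return pieces
```

The trade-off is the one described above: views avoid the copy but pin the whole underlying object for as long as any view survives.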

In my limited playing around with manipulating strings and
benchmarking stuff, the biggest overhead is nearly always the copies.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From walter at livinglogic.de  Wed Aug 24 19:35:11 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 24 Aug 2005 19:35:11 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430C4BA1.5030503@egenix.com>
References: <20050823201021.GE32195@cs.toronto.edu>
	<430C41BD.8010602@livinglogic.de> <430C4BA1.5030503@egenix.com>
Message-ID: <430CAFCF.3040109@livinglogic.de>

M.-A. Lemburg wrote:

> Walter Dörwald wrote:
> 
>>I wonder if we should switch back to a simple readline() implementation 
>>for those codecs that don't require the current implementation 
>>(basically every charmap codec). 
> 
> That would be my preference as well. The 2.4 .readline() approach
> is really only needed for codecs that have to deal with encodings
> that:
> 
> a) use multi-byte formats, or
> b) support more line-end formats than just CR, CRLF, LF, or
> c) are stateful.
> 
> This can easily be had by using a mix-in class for
> codecs which do need the buffered .readline() approach.

Should this be a mix-in or should we simply have two base classes? Which 
of those bases/mix-ins should be the default?
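
To make the mix-in variant concrete, here is a rough sketch in modern
Python. All class names (SimpleLineReader, BufferedReadlineMixin,
UTF8LineReader) are hypothetical and this is not the codecs.py code;
it only shows how the two readline() strategies could be combined.

```python
import codecs
import io


class SimpleLineReader:
    """Hypothetical base class: lets the byte stream find the line ends.

    Only suitable for single-byte, stateless encodings such as the
    charmap codecs.
    """

    encoding = "latin-1"

    def __init__(self, stream):
        self.stream = stream

    def readline(self):
        return self.stream.readline().decode(self.encoding)


class BufferedReadlineMixin:
    """Hypothetical mix-in: decode in chunks, then split the decoded text."""

    chunk_size = 16

    def readline(self):
        if not hasattr(self, "_decoder"):
            self._decoder = codecs.getincrementaldecoder(self.encoding)()
            self._pending = ""
            self._eof = False
        while True:
            lines = self._pending.splitlines(True)
            # The first line is complete once a second line has started,
            # so a trailing "\r" can no longer turn into "\r\n".
            if len(lines) > 1 or (lines and self._eof):
                self._pending = self._pending[len(lines[0]):]
                return lines[0]
            if self._eof:
                return ""
            data = self.stream.read(self.chunk_size)
            self._eof = not data
            self._pending += self._decoder.decode(data, final=self._eof)


class UTF8LineReader(BufferedReadlineMixin, SimpleLineReader):
    encoding = "utf-8"


simple = SimpleLineReader(io.BytesIO(b"a\x85b\n"))
buffered = UTF8LineReader(io.BytesIO("x\u2028y\n".encode("utf-8")))
```

With two separate base classes the readline() bodies would be the
same; the only difference is whether a codec opts in by inheritance
order or by choosing a base.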

>>AFAIK source files are opened in
>>universal newline mode, so at least we'd get proper treatment of "\n",
>>"\r" and "\r\n" line ends, but we'd lose u"\x1c", u"\x1d", u"\x1e",
>>u"\x85", u"\u2028" and u"\u2029" (which are line terminators according
>>to unicode.splitlines()).
> 
> While the Unicode standard defines these characters as line
> end code points, I think their definition does not necessarily
> apply to data that is converted from a certain encoding to
> Unicode, so that's not a big loss.
> 
> E.g. in ASCII or Latin-1, FILE, GROUP and RECORD
> SEPARATOR and NEXT LINE characters (0x1c, 0x1d, 0x1e, 0x85)
> are not interpreted as line end characters.
> 
> Furthermore, we had no reports of anyone complaining in
> Python 1.6, 2.0 - 2.3 that line endings were not detected
> properly.  All these Python versions relied on the stream's
> .readline() method to get the next line. The only bug reports
> we had were for UTF-16 which falls into the above
> category a) and did not support .readline() until Python 2.4.

True.
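
For illustration, the difference between the two behaviours can be
seen by splitting the same data once as decoded text and once as raw
bytes. This uses modern Python 3 str/bytes as a stand-in for the
unicode/str pair discussed here, so take it as a sketch of the idea
rather than the exact 2.4 behaviour.

```python
# FILE SEPARATOR, NEXT LINE and LINE SEPARATOR inside one "\n"-terminated line.
text = "a\x1cb\x85c\u2028d\n"

# Splitting the decoded text honours all Unicode line terminators.
unicode_lines = text.splitlines()

# Splitting the encoded bytes only looks for CR, LF and CRLF.
byte_lines = text.encode("utf-8").splitlines()

print(len(unicode_lines), len(byte_lines))
```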

Bye,
    Walter Dörwald

From martin at v.loewis.de  Wed Aug 24 19:38:54 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 19:38:54 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430CA75F.7090900@livinglogic.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
	<430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de>
	<430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de>
Message-ID: <430CB0AE.1040201@v.loewis.de>

Walter Dörwald wrote:
> At least it would remove the quadratic number of calls to
> _PyUnicodeUCS2_IsLinebreak(). For each character it would be called only
> once.

Correct. However, I very much doubt that this is the cause of the
slowdown.

> The last part of the patch seems to be more related to bug #1235646.

You mean the last chunk (linebuffer=None)? This is just the extension
to reset.

> With the patch test_pep263 and test_codecs fail (and test_parser, but
> this might be unrelated):

Oops, I thought I ran the test suite, but apparently with the patch
removed. New version uploaded.

> Using collections.deque() should get rid of this problem.

Alright. There are so many types in Python I've never heard of :-)

> You mean, in the test suite?

Right.

> BTW, why the decode() call? For a Python without unicode?

Right. Not sure what people think whether this should still be
supported, but I keep supporting it whenever I think of it.

> I wonder what happens, if calls to read() and readline() are mixed (e.g.
> if I'm reading Fortran source or anything with a fixed line header):
> read() would be used to read the first n character (which joins the line
> buffer) and readline() reads the rest (which would split it again) etc.
> (Of course this could be done via a single readline() call).

Then performance would drop again - it should still be correct, though.

If this becomes a frequent problem, we could satisfy read requests
from the split lines as well (i.e. join as many lines as you need).
However, I would rather expect that callers of read() typically want
the entire file, or want to read in large chunks (with no line
orientation at all).

> But, I think a maxsplit argument for splitlines() would make sense
> independent of this problem.

I'm not so sure anymore. It is good for consistency, but I doubt there
are actual use cases: how often do you want only the first n lines
of some string? Reading the first n lines of a file might be an
application, but then, you would rather use .readline() directly.

For readline, I don't think there is a clear case for splitting off
only the first line (unless you want to return an index instead of
the rest string): if the application eventually wants all of the
data, we better split it right away into individual strings, instead
of dealing with a gradually decreasing trailer.
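
A small sketch of why the "gradually decreasing trailer" is costly. It
only counts the characters that would be copied by each strategy; the
function names are made up, and "\n" is used as the only terminator to
keep it short.

```python
def chars_copied_incremental(text):
    """Split off one line at a time and keep the rest as a new string."""
    copied = 0
    rest = text
    while rest:
        line, _, rest = rest.partition("\n")
        copied += len(line) + len(rest)   # both pieces are fresh strings
    return copied


def chars_copied_split_once(text):
    """Split everything up front; each character is copied once."""
    return sum(len(line) for line in text.split("\n"))


sample = "x\n" * 100
incremental = chars_copied_incremental(sample)
split_once = chars_copied_split_once(sample)
print(incremental, split_once)
```

For 100 one-character lines the first count grows with the square of
the number of lines, the second only linearly.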

Anyway, I don't think we should go back to C's readline/fgets. This
is just too messy wrt. buffering and text vs. binary mode. I wish
Python would stop using stdio entirely.

Regards,
Martin


From walter at livinglogic.de  Wed Aug 24 20:16:39 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 24 Aug 2005 20:16:39 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430CB0AE.1040201@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
	<430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de>
	<430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de>
	<430CB0AE.1040201@v.loewis.de>
Message-ID: <430CB987.5000601@livinglogic.de>

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>>At least it would remove the quadratic number of calls to
>>_PyUnicodeUCS2_IsLinebreak(). For each character it would be called only
>>once.
> 
> Correct. However, I very much doubt that this is the cause of the
> slowdown.

Probably. We'd need a test with the original Argon source to really know.

>>The last part of the patch seems to be more related to bug #1235646.
> 
> You mean the last chunk (linebuffer=None)? This is just the extension
> to reset.

Ouch, you're right: The part of "cvs diff" was part of my checkout, not 
your patch. I have so many Python checkouts, that I sometimes forget 
which is which! ;)

>>With the patch test_pep263 and test_codecs fail (and test_parser, but
>>this might be unrelated):
> 
> Oops, I thought I ran the test suite, but apparently with the patch
> removed. New version uploaded.

Looks much better now.

>>Using collections.deque() should get rid of this problem.
> 
> Alright. There are so many types in Python I've never heard of :-)

The problem is that unicode.splitlines() returns a list, so the push/pop 
performance advantage of collections.deque might be eaten by having to 
create a collections.deque object in the first place.
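
Roughly what a deque-backed line buffer could look like. LineBuffer is
a hypothetical helper, not the StreamReader code from the patch, and
it ignores the corner case of a chunk ending in a lone "\r".

```python
from collections import deque


class LineBuffer:
    """Hypothetical helper: hand out already-split lines one at a time."""

    def __init__(self):
        self._lines = deque()

    def feed(self, text):
        # If the last buffered piece has no terminator yet, glue the new
        # text onto it before splitting again.
        if self._lines and self._lines[-1].splitlines() == [self._lines[-1]]:
            text = self._lines.pop() + text
        # splitlines() returns a list, so every feed pays for a list plus
        # the copy into the deque.
        self._lines.extend(text.splitlines(True))

    def readline(self):
        # popleft() is O(1); list.pop(0) would shift the whole list.
        return self._lines.popleft() if self._lines else ""


buf = LineBuffer()
buf.feed("one\ntw")
buf.feed("o\nthree")
```

Whether the O(1) popleft() outweighs the extra copy depends on how
many lines each chunk contains, so this would need measuring.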

>>You mean, in the test suite?
> 
> Right.
> 
>>BTW, why the decode() call? For a Python without unicode?
> 
> Right. Not sure what people think whether this should still be
> supported, but I keep supporting it whenever I think of it.

OK, so should we add this for 2.4.2 or only for 2.5?

Should this really be put into string.py, or should it be a class
attribute of unicode? (At least that's what was proposed for the other
strings in string.py (string.whitespace etc.) too.)

>>I wonder what happens, if calls to read() and readline() are mixed (e.g.
>>if I'm reading Fortran source or anything with a fixed line header):
>>read() would be used to read the first n character (which joins the line
>>buffer) and readline() reads the rest (which would split it again) etc.
>>(Of course this could be done via a single readline() call).
> 
> Then performance would drop again - it should still be correct, though.
> 
> If this becomes a frequent problem, we could satisfy read requests
> from the split lines as well (i.e. join as many lines as you need).
> However, I would rather expect that callers of read() typically want
> the entire file, or want to read in large chunks (with no line
> orientation at all).

Agreed! Don't fix a bug that hasn't been reported! ;)

>>But, I think a maxsplit argument for splitlines() would make sense
>>independent of this problem.
> 
> I'm not so sure anymore. It is good for consistency, but I doubt there
> are actual use cases: how often do you want only the first n lines
> of some string? Reading the first n lines of a file might be an
> application, but then, you would rather use .readline() directly.

Not every unicode string is read from a StreamReader.

> For readline, I don't think there is a clear case for splitting off
> only the first line (unless you want to return an index instead of
> the rest string): if the application eventually wants all of the
> data, we better split it right away into individual strings, instead
> of dealing with a gradually decreasing trailer.

True, this would be best for a readline loop.

Another solution would be to have a unicode.itersplitlines() and store 
the iterator. Then we wouldn't need a maxsplit because you simply can 
stop iterating once you have what you want.
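
A sketch of what such an iterator might do, written as a plain
function in modern Python. itersplitlines() does not exist as a
method; the regular expression below is my assumption of the
terminator set (it mirrors what Python 3's str.splitlines() accepts).

```python
import re
from itertools import islice

# Assumed terminator set; "\r\n" must be tried before the single characters.
_LINEBREAK = re.compile("\r\n|[\n\r\x0b\x0c\x1c\x1d\x1e\x85\u2028\u2029]")


def itersplitlines(text, keepends=False):
    """Yield the lines of text lazily, like splitlines() but as an iterator."""
    pos = 0
    for match in _LINEBREAK.finditer(text):
        yield text[pos:match.end() if keepends else match.start()]
        pos = match.end()
    if pos < len(text):
        yield text[pos:]          # final line without a terminator


def first_lines(text, n):
    """Take only the first n lines; the rest of text is never scanned."""
    return list(islice(itersplitlines(text), n))


sample = "a\r\nb\x85c\u2028\n\nd"
head = first_lines("1\n2\n3\n", 2)
```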

> Anyway, I don't think we should go back to C's readline/fgets. This
> is just too messy wrt. buffering and text vs. binary mode.

I don't know about C's readline, but StreamReader.read() and 
StreamReader.readline() are messy enough. But at least it's something we 
can fix ourselves.

> I wish
> Python would stop using stdio entirely.

So reverting to the 2.3 behaviour for simple codecs is out?

Bye,
    Walter Dörwald

From barry at python.org  Wed Aug 24 20:25:21 2005
From: barry at python.org (Barry Warsaw)
Date: Wed, 24 Aug 2005 14:25:21 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <430CA279.3080909@v.loewis.de>
References: <001d01c5a855$447a9d20$ab12c797@oemcomputer>
	<B10822D6-45B9-4F35-BE15-F0B09958D5C1@fuhm.net>
	<1124895832.19291.10.camel@geddy.wooz.org>
	<430C9D90.6090102@hathawaymix.org>
	<934F5C4E-FF88-41D0-939E-623D0AFCDAE2@alum.mit.edu>
	<430CA279.3080909@v.loewis.de>
Message-ID: <1124907921.19925.5.camel@geddy.wooz.org>

On Wed, 2005-08-24 at 12:38, "Martin v. Löwis" wrote:

> I personally dislike recording the execution path in
> local variables. This is like setting a flag in a loop
> before the break, and testing the flag afterwards.
> You can do this, but the else: clause of the loop is
> just more readable.

Agreed!
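
For anyone following along, the loop case Martin describes looks
something like this. Both functions are made-up examples; they return
the same result, one with a flag and one with the loop's else: clause.

```python
def find_with_flag(items, target):
    """Record the execution path in a local variable."""
    found = False
    index = -1
    for index, item in enumerate(items):
        if item == target:
            found = True
            break
    if not found:
        index = -1
    return index


def find_with_else(items, target):
    """Let the loop's else: clause handle the 'never broke out' case."""
    for index, item in enumerate(items):
        if item == target:
            break
    else:
        index = -1
    return index
```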

> This specific fragment has also the bug that a
> KeyboardInterrupt before the assignment to complete
> will cause a NameError/UnboundLocalError; this
> can easily be fixed by moving the assignment before
> the try block.

And that begs the question whether getting rid of this common idiom is
trading one common problem for another.

-Barry

From reinhold-birkenfeld-nospam at wolke7.net  Wed Aug 24 20:33:02 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Wed, 24 Aug 2005 20:33:02 +0200
Subject: [Python-Dev] Docs/Pointer to Tools/scripts?
Message-ID: <deiegu$jqf$1@sea.gmane.org>

Hi,

after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I wonder
whether the Tools directory is documented at all. There are many useful
scripts there which many people will not find if they are not listed
anywhere in the docs.

Just a thought.

Reinhold

-- 
Mail address is perfectly valid!


From phd at mail2.phd.pp.ru  Wed Aug 24 20:44:11 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Wed, 24 Aug 2005 22:44:11 +0400
Subject: [Python-Dev] Docs/Pointer to Tools/scripts?
In-Reply-To: <deiegu$jqf$1@sea.gmane.org>
References: <deiegu$jqf$1@sea.gmane.org>
Message-ID: <20050824184410.GB5666@phd.pp.ru>

Hello!

On Wed, Aug 24, 2005 at 08:33:02PM +0200, Reinhold Birkenfeld wrote:
> after adding Oleg Broytmann's findnocoding.py to Tools/scripts

   What's more, pysource.py is more than just a script - it's a generally
useful module.

   Thank you for committing the code.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From martin at v.loewis.de  Wed Aug 24 21:15:09 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 21:15:09 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
 (???)
In-Reply-To: <430CB987.5000601@livinglogic.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
	<430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de>
	<430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de>
	<430CB0AE.1040201@v.loewis.de> <430CB987.5000601@livinglogic.de>
Message-ID: <430CC73D.1050401@v.loewis.de>

Walter Dörwald wrote:
>> Right. Not sure what people think whether this should still be
>> supported, but I keep supporting it whenever I think of it.
> 
> 
> OK, so should we add this for 2.4.2 or only for 2.5?

You mean, string.unicodelinebreaks? I think something needs to be
done to fix the performance problem. In doing so, API changes
might occur. We should not add API changes in 2.4.2 unless they
contribute to the bug fix, and even then, the release manager
probably needs to approve them (in any case, they certainly
need to be backwards compatible)

> Should this really be put into string.py, or should it be a class
> attribute of unicode? (At least that's what was proposed for the other
> strings in string.py (string.whitespace etc.) too.)

If the 2.4.2 fix is based on this kind of data, I think it should go
into a private attribute of codecs.py. For 2.5, I would put it
into strings for tradition. There is no point in having some of these
constants in strings and others as class attributes (unless we also
add them as class attributes in 2.5, in which case adding
unicodelinebreaks into strings would be pointless).

So I think in 2.5, I would like to see

# string.py
ascii_letters = str.ascii_letters

in which case unicode.linebreaks would be the right spelling.

>> I'm not so sure anymore. It is good for consistency, but I doubt there
>> are actual use cases: how often do you want only the first n lines
>> of some string? Reading the first n lines of a file might be an
>> application, but then, you would rather use .readline() directly.
> 
> 
> Not every unicode string is read from a StreamReader.

Sure: but how often do you want to fetch the first line of a Unicode
string you happen to have in memory, without iterating over all lines
eventually?

> Another solution would be to have a unicode.itersplitlines() and store
> the iterator. Then we wouldn't need a maxsplit because you simply can
> stop iterating once you have what you want.

That might work. I would then ask for itersplitlines to return pairs
of (line, truncated) so you can easily know whether you merely ran
into the end of the string, or whether you got a complete line
(although it might be a bit too specific for the readlines() case)

> So reverting to the 2.3 behaviour for simple codecs is out?

I'm -1, at least. It would also fix the problem at hand, for the reported
case. However, it does leave some codecs in the cold, most notably
UTF-8 (which, in turn, isn't an issue for PEP 262, since UTF-8 is
built-in in the parser). I think the UTF-8 stream reader should support
all Unicode line breaks, so it should continue to use the Python
approach. However, UTF-8 is fairly common, so that reading an
UTF-8-encoded file line-by-line shouldn't suck.

Regards,
Martin

From raymond.hettinger at verizon.net  Wed Aug 24 21:15:12 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 24 Aug 2005 15:15:12 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>
Message-ID: <005a01c5a8e0$27480860$b729cb97@oemcomputer>

[Guido van Rossum]
> > OK, I'm convinced. Let's drop bare except for Python 3.0, and
> > deprecate them until then, without changing the meaning.
> >
> > The deprecation message (to be generated by the compiler!) should
> > steer people in the direction of specifying one particular exception
> > (e.g. KeyError etc.) rather than Exception.

[James Y Knight]
> I agree but there's the minor nit of non-Exception exceptions.
> 
> I think it must be the case that raising an object which does not
> derive from an exception class must be deprecated as well in order
> for "except:" to be deprecated. Otherwise, there is nothing you can
> change "except:" to in order not to get a deprecation warning and
> still have your code be correct in the face of documented features of
> python.

Hmm, that may not be a killer.  I wonder if it is possible to treat
BaseException as a constant (like we do with None) and teach the
compiler to interpret it as catching anything that gets raised so that
"except BaseException" will work like a bare except clause does now.


Raymond


From barry at python.org  Wed Aug 24 21:21:47 2005
From: barry at python.org (Barry Warsaw)
Date: Wed, 24 Aug 2005 15:21:47 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <005a01c5a8e0$27480860$b729cb97@oemcomputer>
References: <005a01c5a8e0$27480860$b729cb97@oemcomputer>
Message-ID: <1124911307.19921.11.camel@geddy.wooz.org>

On Wed, 2005-08-24 at 15:15, Raymond Hettinger wrote:

> Hmm, that may not be a killer.  I wonder if it is possible to treat
> BaseException as a constant (like we do with None) and teach the
> compiler to interpret it as catching anything that gets raised so that
> "except BaseException" will work like a bare except clause does now.

Sorry Raymond, but my first reaction is "ick" :).  That seems to be a
big change in the semantics of exception matching.  I think I'd rather
keep bare except than add that!

-Barry

From raymond.hettinger at verizon.net  Wed Aug 24 21:30:28 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 24 Aug 2005 15:30:28 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <1124911307.19921.11.camel@geddy.wooz.org>
Message-ID: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer>

> > Hmm, that may not be a killer.  I wonder if it is possible to treat
> > BaseException as a constant (like we do with None) and teach the
> > compiler to interpret it as catching anything that gets raised so that
> > "except BaseException" will work like a bare except clause does now.
> 
> Sorry Raymond, but my first reaction is "ick" :).  That seems to be a
> big change in the semantics of exception matching.  I think I'd rather
> keep bare except than add that!

That may be your only other option if we're waiting until 3.0 to
eliminate string exceptions and class exceptions not derived from the
hierarchy.


Raymond


From mwh at python.net  Wed Aug 24 21:52:13 2005
From: mwh at python.net (Michael Hudson)
Date: Wed, 24 Aug 2005 20:52:13 +0100
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer> (Raymond
	Hettinger's message of "Wed, 24 Aug 2005 15:30:28 -0400")
References: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer>
Message-ID: <2mbr3nru36.fsf@starship.python.net>

"Raymond Hettinger" <raymond.hettinger at verizon.net> writes:

>> > Hmm, that may not be a killer.  I wonder if it is possible to treat
>> > BaseException as a constant (like we do with None) and teach the
>> > compiler to interpret it as catching anything that gets raised so that
>> > "except BaseException" will work like a bare except clause does now.
>> 
>> Sorry Raymond, but my first reaction is "ick" :).  That seems to be a
>> big change in the semantics of exception matching.  I think I'd rather
>> keep bare except than add that!
>
> That may be your only other option if we're waiting until 3.0 to
> eliminate string exceptions and class exceptions not derived from the
> hierarchy.

I really hope string exceptions can be killed off before 3.0.  They
should be fully deprecated in 2.5.

Cheers,
mwh

-- 
  The Oxford Bottled Beer Database heartily disapproves of the 
  excessive consumption of alcohol.  No, really.
                        -- http://www.bottledbeer.co.uk/beergames.html
                           (now sadly gone to the big 404 in the sky)

From walter at livinglogic.de  Wed Aug 24 22:14:32 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 24 Aug 2005 22:14:32 +0200
Subject: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak()
	(???)
In-Reply-To: <430CC73D.1050401@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu>	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
	<430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de>
	<430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de>
	<430CB0AE.1040201@v.loewis.de> <430CB987.5000601@livinglogic.de>
	<430CC73D.1050401@v.loewis.de>
Message-ID: <671A6329-ED68-491F-84CB-1D2CF00A2F6A@livinglogic.de>

On 24.08.2005 at 21:15, Martin v. Löwis wrote:

> Walter Dörwald wrote:
>
>
>>> Right. Not sure what people think whether this should still be
>>> supported, but I keep supporting it whenever I think of it.
>>>
>>
>> OK, so should we add this for 2.4.2 or only for 2.5?
>>
>
> You mean, string.unicodelinebreaks?
>

Yes.

> I think something needs to be
> done to fix the performance problem. In doing so, API changes
> might occur. We should not add API changes in 2.4.2 unless they
> contribute to the bug fix, and even then, the release manager
> probably needs to approve them (in any case, they certainly
> need to be backwards compatible)
>

OK. Your version of the patch (without replacing
line = line.splitlines(False)[0] with something better) might be enough
for 2.4.2.

>> Should this really be put into string.py, or should it be a class
>> attribute of unicode? (At least that's what was proposed for the other
>> strings in string.py (string.whitespace etc.) too.)
>>
>
> If the 2.4.2 fix is based on this kind of data, I think it should go
> into a private attribute of codecs.py.
>

I think codecs.unicodelinebreaks has one big problem: it will not
work for codecs that do str->str decoding.

> For 2.5, I would put it
> into strings for tradition. There is no point in having some of these
> constants in strings and others as class attributes (unless we also
> add them as class attributes in 2.5, in which case adding
> unicodelinebreaks into strings would be pointless).
>
> So I think in 2.5, I would like to see
>
> # string.py
> ascii_letters = str.ascii_letters
>
> in which case unicode.linebreaks would be the right spelling.
>

And it would have the advantage that it could work with both str and
unicode if we had both str.linebreaks and unicode.linebreaks.

>>> I'm not so sure anymore. It is good for consistency, but I doubt there
>>> are actual use cases: how often do you want only the first n lines
>>> of some string? Reading the first n lines of a file might be an
>>> application, but then, you would rather use .readline() directly.
>>>
>>
>> Not every unicode string is read from a StreamReader.
>>
>
> Sure: but how often do you want to fetch the first line of a Unicode
> string you happen to have in memory, without iterating over all lines
> eventually?
>

I don't know. The only obvious spot in the standard library (apart
from codecs.py) seems to be
    def shortdescription(self): return self.description().splitlines()[0]
in Lib/plat-mac/pimp.py

>> Another solution would be to have a unicode.itersplitlines() and store
>> the iterator. Then we wouldn't need a maxsplit because you simply can
>> stop iterating once you have what you want.
>>
>
> That might work. I would then ask for itersplitlines to return pairs
> of (line, truncated) so you can easily know whether you merely ran
> into the end of the string, or whether you got a complete line
> (although it might be a bit too specific for the readlines() case)
>

Or maybe (line, terminatorlength) which gives you the same info  
(terminatorlength == 0 means truncated) and makes it easy to strip  
the terminator.
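
A sketch of the (line, terminatorlength) variant in modern Python. The
function name and the terminator regex are assumptions for
illustration; no such method exists on str or unicode.

```python
import re

# Assumed terminator set; "\r\n" must be tried before the single characters.
_LINEBREAK = re.compile("\r\n|[\n\r\x0b\x0c\x1c\x1d\x1e\x85\u2028\u2029]")


def iterlines_with_terminator(text):
    """Yield (line, terminatorlength); a length of 0 means 'truncated'."""
    pos = 0
    for match in _LINEBREAK.finditer(text):
        yield text[pos:match.end()], match.end() - match.start()
        pos = match.end()
    if pos < len(text):
        yield text[pos:], 0       # ran into the end of the string


def strip_terminator(line, terminatorlength):
    """Remove the terminator without re-scanning the line."""
    return line[:len(line) - terminatorlength]


pairs = list(iterlines_with_terminator("ab\r\ncd\u2028ef"))
stripped = [strip_terminator(line, n) for line, n in pairs]
```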

>> So reverting to the 2.3 behaviour for simple codecs is out?
>>
>
> I'm -1, at least. It would also fix the problem at hand, for the reported
> case. However, it does leave some codecs in the cold, most notably
> UTF-8 (which, in turn, isn't an issue for PEP 262, since UTF-8 is
> built-in in the parser).
>

You meant PEP 263, right?

> I think the UTF-8 stream reader should support
> all Unicode line breaks, so it should continue to use the Python
> approach.
>

OK.

> However, UTF-8 is fairly common, so that reading an
> UTF-8-encoded file line-by-line shouldn't suck.
>

OK, so what's missing is a solution for str->str codecs (or we keep  
line = line.splitlines(False)[0] and test, whether this is fast enough).

Bye,
    Walter Dörwald


From tzot at mediconsa.com  Wed Aug 24 22:48:28 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Wed, 24 Aug 2005 23:48:28 +0300
Subject: [Python-Dev] Docs/Pointer to Tools/scripts?
References: <deiegu$jqf$1@sea.gmane.org>
Message-ID: <deimf1$dpd$1@sea.gmane.org>

"Reinhold Birkenfeld" <reinhold-birkenfeld-nospam at wolke7.net> wrote in 
message news:deiegu$jqf$1 at sea.gmane.org...
> Hi,
>
> after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I wonder
> whether the Tools directory is documented at all. There are many useful
> scripts there which many people will not find if they are not listed
> anywhere in the docs.

AFAIK the only documentation is the README file in said directory. 


From walter at livinglogic.de  Wed Aug 24 23:12:38 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 24 Aug 2005 23:12:38 +0200
Subject: [Python-Dev] [Argon] Re: 51 Million calls to
	_PyUnicodeUCS2_IsLinebreak() (???)
In-Reply-To: <Pine.GSO.4.58.0508241419430.27108@dvp.cs>
References: <20050823201021.GE32195@cs.toronto.edu>
	<430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de>
	<430C5E6E.2040405@livinglogic.de> <430C6E7E.7070106@v.loewis.de>
	<430C7F4C.9010703@livinglogic.de> <430C854E.1080200@v.loewis.de>
	<430CA75F.7090900@livinglogic.de> <430CB0AE.1040201@v.loewis.de>
	<430CB987.5000601@livinglogic.de>
	<Pine.GSO.4.58.0508241419430.27108@dvp.cs>
Message-ID: <8FD4A0C3-D54B-403C-9BC7-052D2FB1F0E5@livinglogic.de>

On 24.08.2005 at 20:20, Greg Wilson wrote:

>>> Walter Dörwald wrote:
>>>
>>>> At least it would remove the quadratic number of calls to
>>>> _PyUnicodeUCS2_IsLinebreak(). For each character it would be called only
>>>> once.
>
>> Martin v. Löwis wrote:
>>
>>> Correct. However, I very much doubt that this is the cause of the
>>> slowdown.
>
>> Walter Dörwald wrote:
>> Probably. We'd need a test with the original Argon source to really know.
>
> We can do that.

So, can you try Martin's patch?

>> OK, so should we add this for 2.4.2 or only for 2.5?
>
> 2.4.2 please ;-)

If we use the patch as is, I think it can go into 2.4.2.

Bye,
    Walter Dörwald


From martin at v.loewis.de  Wed Aug 24 23:37:53 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 Aug 2005 23:37:53 +0200
Subject: [Python-Dev] Docs/Pointer to Tools/scripts?
In-Reply-To: <deiegu$jqf$1@sea.gmane.org>
References: <deiegu$jqf$1@sea.gmane.org>
Message-ID: <430CE8B1.4030409@v.loewis.de>

Reinhold Birkenfeld wrote:
> after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I
> wonder whether the Tools directory is documented at all. There are
> many useful scripts there which many people will not find if they are
> not listed anywhere in the docs.

Christos already mentioned it: there is a README file in both Tools
and Tools/scripts; you should update it whenever you add something.

Regards,
Martin

From gvanrossum at gmail.com  Thu Aug 25 02:28:35 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 17:28:35 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <2mbr3nru36.fsf@starship.python.net>
References: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer>
	<2mbr3nru36.fsf@starship.python.net>
Message-ID: <ca471dc20508241728b6ed4df@mail.gmail.com>

On 8/24/05, Michael Hudson <mwh at python.net> wrote:
> I really hope string exceptions can be killed off before 3.0.  They
> should be fully deprecated in 2.5.

But what about class exceptions that don't inherit from Exception?
That will take a while before we can deprecate that.

Anyway, there have been plenty of cases where I was only interested in
catching arbitrary exceptions generated by *Python* (as opposed to
broken 3rd party code or even obscure Python library code) and those
all inherit from Exception. And in those cases I've written "except
Exception:" and so far never regretted it.
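
A quick sketch of the distinction, using the hierarchy as it exists in
current Python (KeyboardInterrupt and SystemExit sit under
BaseException but outside Exception). The classify() helper is only
for illustration.

```python
def classify(exc):
    """Report which except clause ends up handling exc."""
    try:
        raise exc
    except Exception:
        return "caught by 'except Exception'"
    except BaseException:
        return "only caught by 'except BaseException'"


ordinary = classify(KeyError("missing"))
interrupt = classify(KeyboardInterrupt())
```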

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Thu Aug 25 03:39:35 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 24 Aug 2005 18:39:35 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc2050824101579f8304@mail.gmail.com>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
	<D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>
	<ca471dc2050824101579f8304@mail.gmail.com>
Message-ID: <bbaeab100508241839436eea14@mail.gmail.com>

On 8/24/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/24/05, James Y Knight <foom at fuhm.net> wrote:
> > I think it must be the case that raising an object which does not
> > derive from an exception class must be deprecated as well in order
> > for "except:" to be deprecated. Otherwise, there is nothing you can
> > change "except:" to in order not to get a deprecation warning and
> > still have your code be correct in the face of documented features of
> > python.
> 
> I agree; isn't that already in the PEP? This surely has been the
> thinking all along.
> 

Requiring inheritance of BaseException in order to pass it to 'raise'
has been in the PEP since the beginning.
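
As a sketch of what that requirement means in practice, here is how
current Python 3 behaves when 'raise' is given something outside the
hierarchy. NotAnException and attempt() are made-up names; the exact
wording of the error message may differ between implementations.

```python
class NotAnException:
    """A plain class that does not inherit from BaseException."""


def attempt(obj):
    """Try to raise obj and return the interpreter's complaint, if any."""
    try:
        raise obj
    except TypeError as exc:
        return str(exc)


string_result = attempt("a string exception")
class_result = attempt(NotAnException())
```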

-Brett

From bcannon at gmail.com  Thu Aug 25 03:43:23 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 24 Aug 2005 18:43:23 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc205082408103f0adc81@mail.gmail.com>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
Message-ID: <bbaeab1005082418433c41a5b4@mail.gmail.com>

On 8/24/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/24/05, Michael Chermside <mcherm at mcherm.com> wrote:
> > Explicit is better than Implicit. I think that in newly written code
> > "except Exception:" is better (more explicit and easier to understand)
> > than "except:" Legacy code that uses "except:" can remain unchanged *IF*
> > the meaning of "except:" is unchanged... but I think we all agree that
> > this is unwise because the existing meaning is a tempting trap for the
> > unwary. So I don't see any advantage to keeping bare "except:" in the
> > long run. What we do to ease the transition is a different question,
> > but one more easily resolved.
> 
> OK, I'm convinced. Let's drop bare except for Python 3.0, and
> deprecate them until then, without changing the meaning.
> 

Woohoo!  I am currently on vacation before school starts (orientation
is Sept 1., classes start Sept. 6), so it might take me a little while
to edit the PEP, but I will try to fit it into my schedule ASAP (assuming
the tide doesn't turn on me before then).

> The deprecation message (to be generated by the compiler!) should
> steer people in the direction of specifying one particular exception
> (e.g. KeyError etc.) rather than Exception.

Is there any desire for a __future__ statement that makes it a syntax
error?  How about making 'raise' statements only work with objects
that inherit from BaseException?

-Brett

From gvanrossum at gmail.com  Thu Aug 25 04:02:00 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 19:02:00 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <bbaeab1005082418433c41a5b4@mail.gmail.com>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
	<bbaeab1005082418433c41a5b4@mail.gmail.com>
Message-ID: <ca471dc205082419021c4cf553@mail.gmail.com>

On 8/24/05, Brett Cannon <bcannon at gmail.com> wrote:
> Is there any desire for a __future__ statement that makes it a syntax
> error?  How about making 'raise' statements only work with objects
> that inherit from BaseException?

I doubt it. Few people are going to put a __future__ statement in to
make sure that they *don't* use a particular feature: it's just as easy to
grep your source code for "except:". __future__ is in general only
used to enable new syntax that previously had a different meaning.

Anyway, you can make it an error globally by using the -W option creatively.
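
A minimal sketch of that last point, for illustration only: the -W option
and the warnings module share the same filter machinery, so a warning can be
promoted to an error with, for example, "python -W error::DeprecationWarning"
or programmatically as below.  The function warn_bare_except() is a
hypothetical stand-in for the compiler-generated warning discussed in this
thread, not an existing API.

```python
import warnings


def warn_bare_except():
    # Hypothetical stand-in for the deprecation warning proposed in the thread.
    warnings.warn("bare 'except:' is deprecated", DeprecationWarning, stacklevel=2)


def check_strict():
    """Describe what happens once DeprecationWarning is escalated to an error."""
    with warnings.catch_warnings():
        # Same effect as -W error::DeprecationWarning, limited to this block.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            warn_bare_except()
        except DeprecationWarning as exc:
            return "error: %s" % exc
    return "no error"
```

Scoping the filter with catch_warnings() keeps the stricter behaviour local;
the command-line form applies it to the whole process.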
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From amk at amk.ca  Thu Aug 25 04:33:42 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Wed, 24 Aug 2005 22:33:42 -0400
Subject: [Python-Dev] New mailbox module
Message-ID: <20050825023342.GA20941@rogue.amk.ca>

Gregory K. Johnson, who's been working on the mailbox module in
nondist/sandbox/mailbox for Google's Summer of Code, thinks his
project is essentially complete.  He's added the ability to modify
mailboxes by adding and removing messages, added test cases for the
new features, and written the corresponding documentation.

So, it's time to start considering it for inclusion in the standard
library.  This is a big change to a non-obscure module, so I don't feel
able to make this decision on my own.

I believe the code quality is acceptable, but would appreciate
comments on any cleanups that need to be made.  I still need to read
through the docs and make editing suggestions, and check that the code
is still backward-compatible with the old version of the module.

--amk

From barry at python.org  Thu Aug 25 06:22:43 2005
From: barry at python.org (Barry Warsaw)
Date: Thu, 25 Aug 2005 00:22:43 -0400
Subject: [Python-Dev] New mailbox module
In-Reply-To: <20050825023342.GA20941@rogue.amk.ca>
References: <20050825023342.GA20941@rogue.amk.ca>
Message-ID: <1124943762.10479.0.camel@geddy.wooz.org>

On Wed, 2005-08-24 at 22:33, A.M. Kuchling wrote:

> So, it's time to start considering it for inclusion in the standard
> library.  This is a big change to a non-obscure module, so I don't feel
> able to make this decision on my own.
> 
> I believe the code quality is acceptable, but would appreciate
> comments on any cleanups that need to be made.  I still need to read
> through the docs and make editing suggestions, and check that the code
> is still backward-compatible with the old version of the module.

I plan to take a look at it, but won't get a chance to do so for several
days.

-Barry

From foom at fuhm.net  Thu Aug 25 06:45:12 2005
From: foom at fuhm.net (James Y Knight)
Date: Thu, 25 Aug 2005 00:45:12 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <bbaeab100508241839436eea14@mail.gmail.com>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
	<D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>
	<ca471dc2050824101579f8304@mail.gmail.com>
	<bbaeab100508241839436eea14@mail.gmail.com>
Message-ID: <5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net>


On Aug 24, 2005, at 9:39 PM, Brett Cannon wrote:
> On 8/24/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
>> On 8/24/05, James Y Knight <foom at fuhm.net> wrote:
>>> I think it must be the case that raising an object which does not
>>> derive from an exception class must be deprecated as well in order
>>> for "except:" to be deprecated. Otherwise, there is nothing you can
>>> change "except:" to in order not to get a deprecation warning and
>>> still have your code be correct in the face of documented  
>>> features of
>>> python.
>>>
>>
>> I agree; isn't that already in the PEP? This surely has been the
>> thinking all along.
>>
>>
>
> Requiring inheritance of BaseException in order to pass it to 'raise'
> has been in the PEP since the beginning.

Yes, it talks about that as a change that will happen in Python 3.0.  
I was responding to

>> OK, I'm convinced. Let's drop bare except for Python 3.0, and
>> deprecate them until then, without changing the meaning.

which is talking about deprecating bare excepts in Python 2.5. Now  
maybe it's the idea that everything that's slated for removal in  
Python 3.0 by PEP 348 is supposed to be getting a deprecation warning  
in Python 2.5, but that certainly isn't stated. The transition plan  
section says that all that will happen in Python 2.5 is the addition  
of "BaseException".

James

From gvanrossum at gmail.com  Thu Aug 25 07:13:09 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed, 24 Aug 2005 22:13:09 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
	<D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>
	<ca471dc2050824101579f8304@mail.gmail.com>
	<bbaeab100508241839436eea14@mail.gmail.com>
	<5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net>
Message-ID: <ca471dc2050824221313a2c9a1@mail.gmail.com>

On 8/24/05, James Y Knight <foom at fuhm.net> wrote:
> On Aug 24, 2005, at 9:39 PM, Brett Cannon wrote:
> > On 8/24/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> >> On 8/24/05, James Y Knight <foom at fuhm.net> wrote:
> >>> I think it must be the case that raising an object which does not
> >>> derive from an exception class must be deprecated as well in order
> >>> for "except:" to be deprecated. Otherwise, there is nothing you can
> >>> change "except:" to in order not to get a deprecation warning and
> >>> still have your code be correct in the face of documented
> >>> features of
> >>> python.
> >>>
> >>
> >> I agree; isn't that already in the PEP? This surely has been the
> >> thinking all along.
> >>
> >>
> >
> > Requiring inheritance of BaseException in order to pass it to 'raise'
> > has been in the PEP since the beginning.
> 
> Yes, it talks about that as a change that will happen in Python 3.0.
> I was responding to
> 
> >> OK, I'm convinced. Let's drop bare except for Python 3.0, and
> >> deprecate them until then, without changing the meaning.
> 
> which is talking about deprecating bare excepts in Python 2.5. Now
> maybe it's the idea that everything that's slated for removal in
> Python 3.0 by PEP 348 is supposed to be getting a deprecation warning
> in Python 2.5, but that certainly isn't stated. The transition plan
> section says that all that will happen in Python 2.5 is the addition
> of "BaseException".

Then maybe the PEP isn't perfect just yet. :-)

It's never too early to start deprecating a feature we know will
disappear in 3.0.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mwh at python.net  Thu Aug 25 10:16:02 2005
From: mwh at python.net (Michael Hudson)
Date: Thu, 25 Aug 2005 09:16:02 +0100
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc20508241728b6ed4df@mail.gmail.com> (Guido van Rossum's
	message of "Wed, 24 Aug 2005 17:28:35 -0700")
References: <005e01c5a8e2$49102fc0$b729cb97@oemcomputer>
	<2mbr3nru36.fsf@starship.python.net>
	<ca471dc20508241728b6ed4df@mail.gmail.com>
Message-ID: <2m7jeasa7x.fsf@starship.python.net>

Guido van Rossum <gvanrossum at gmail.com> writes:

> On 8/24/05, Michael Hudson <mwh at python.net> wrote:
>> I really hope string exceptions can be killed off before 3.0.  They
>> should be fully deprecated in 2.5.
>
> But what about class exceptions that don't inherit from Exception?
> That will take a while before we can deprecate that.

Oh, for sure.  I didn't mean to imply anything else.

Cheers,
mwh

-- 
 "Sturgeon's Law (90% of everything is crap) applies to Usenet."
 "Nothing guarantees that the 10% isn't crap, too."
                -- Gene Spafford's Axiom #2 of Usenet, and a corollary

From t-meyer at ihug.co.nz  Thu Aug 25 10:51:18 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Thu, 25 Aug 2005 20:51:18 +1200
Subject: [Python-Dev] python-dev Summary for 2005-08-01 through 2005-08-15
	[draft]
Message-ID: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04E3@its-xchg4.massey.ac.nz>

Here's August Part One.  As usual, if anyone can spare the time to proofread
this, that would be great!  Please send any corrections or suggestions to
Steve (steven.bethard at gmail.com) and/or me, rather than cluttering the
list.  Ta!

=============
Announcements
=============

----------------------------
QOTF: Quote of the Fortnight
----------------------------

Some wise words from Donovan Baarda in the PEP 347 discussions:

    It is true that some well designed/developed software becomes reliable
    very quickly. However, it still takes heavy use over time to prove that.

Contributing thread:

- `PEP: Migrating the Python CVS to Subversion
<http://mail.python.org/pipermail/python-dev/2005-August/055105.html>`__

[SJB]

------------
Process PEPs
------------

The PEP editors have introduced a new PEP category: "Process", for PEPs that
don't fit into the "Standards Track" and "Informational" categories.  More
detail can be found in `PEP 1`_, which is itself a Process PEP.

.. _PEP 1: http://www.python.org/peps/pep-0001.html

Contributing thread:

- `new PEP type: Process
<http://mail.python.org/pipermail/python-dev/2005-August/055361.html>`__

[TAM]

-----------------------------------------------
Tentative Schedule for 2.4.2 and 2.5a1 Releases
-----------------------------------------------

Python 2.4.2 is tentatively scheduled for a mid-to-late September release,
and a first alpha of Python 2.5 for March 2006 (with a final release around
May/June).  This means that a PEP for the 2.5 release, detailing what will
be included, will likely be created soon; at present there are various
accepted PEPs that have not yet been implemented.

Contributing thread:

- `plans for 2.4.2 and 2.5a1
<http://mail.python.org/pipermail/python-dev/2005-August/055342.html>`__

[TAM]

=========
Summaries
=========

-------------------------------
Moving Python CVS to Subversion
-------------------------------

The `PEP 347`_ discussion from last fortnight continued this fortnight, with
a revision of the PEP, and a lot more discussion about possible revision
control systems (RCS) for the Python repository, and where the repository
should be hosted.  Note that this is not a discussion about bug trackers,
which will remain with Sourceforge (unless a separate PEP is developed for
moving that).

Many revision control systems were extensively discussed, including
`Subversion`_ (SVN), `Perforce`_, and `Monotone`_.  Whichever system is
moved to, it should be able to be hosted somewhere (if *.python.org, then
it needs to be easily installable), needs to have software available to
convert a repository from CVS, and ideally would be open-source; similarity
to CVS is also an advantage in that it requires a smaller learning curve
for existing developers.  While Martin isn't willing to discuss every
system there is, he will investigate those that make him curious, and will
add other people's submissions to the PEP, where appropriate.

The thread included a short discussion about the authentication mechanism
that svn.python.org will use; svn+ssh seems to be a clear winner, and a
test repository will be set up by Martin next fortnight.

The possibility of moving to a distributed revision control system
(particularly `Bazaar-NG`_) was also brought up.  Many people liked the
idea of using a distributed revision control system, but it seems unlikely
that Bazaar-NG is mature enough to be used for the main Python repository at
the current time (a move to it at a later time is possible, but outside the
scope of the PEP).  Distributed RCS are meant to reduce the barrier to
participation (anyone can create their own branches, for example);
Bazaar-NG is also implemented in Python, which is of some benefit.  James Y
Knight pointed out `svk`_, which lets developers create their own branches
within SVN.

In general, the python-dev crowd is in favour of moving to SVN.  Initial
concerns about the demands on the volunteer admins should the repository be
hosted at svn.python.org were addressed by Barry Warsaw, who believes that
the load will be easily managed with the existing volunteers.  Various
alternative hosts were discussed, and if detailed reports about any of them
are created, these can be added to the PEP.

While the fate of all PEPs lies with the BDFL (Guido), it is likely that the
preferences of those that frequently check in changes, the pydotorg admins,
and the release managers (who have all given favourable reports so far),
will have a significant effect on the pronouncement of this PEP.

.. _PEP 347: http://www.python.org/peps/pep-0347.html
.. _svk: http://svk.elixus.org/
.. _Perforce: http://www.perforce.com/
.. _Subversion: http://subversion.tigris.org/
.. _Monotone: http://venge.net/monotone/
.. _Bazaar-NG: http://www.bazaar-ng.org/

Contributing threads:

- `PEP: Migrating the Python CVS to Subversion
<http://mail.python.org/pipermail/python-dev/2005-August/055064.html>`__
- `PEP 347: Migration to Subversion
<http://mail.python.org/pipermail/python-dev/2005-August/055211.html>`__
- `Hosting svn.python.org
<http://mail.python.org/pipermail/python-dev/2005-August/055360.html>`__
- `Fwd: Distributed RCS
<http://mail.python.org/pipermail/python-dev/2005-August/055372.html>`__
- `cvs to bzr?
<http://mail.python.org/pipermail/python-dev/2005-August/055373.html>`__
- `Distributed RCS
<http://mail.python.org/pipermail/python-dev/2005-August/055377.html>`__
- `Fwd: PEP: Migrating the Python CVS to Subversion
<http://mail.python.org/pipermail/python-dev/2005-August/055388.html>`__
- `On distributed vs centralised SCM for Python
<http://mail.python.org/pipermail/python-dev/2005-August/055432.html>`__

[TAM]

------------------------------------------
PEP 348: Exception Hierarchy in Python 3.0
------------------------------------------

This fortnight mostly concluded the previous discussion about `PEP 348`_,
which sets out a roadmap for changes to the exception hierarchy in Python
3.0. The proposal was heavily scaled back to retain most of the current
exception hierarchy unchanged.  A new exception, BaseException, will be
introduced above Exception in the current hierarchy, and KeyboardInterrupt
and SystemExit will become siblings of Exception.  The goal here is that::

    except Exception:

will now do the right thing for most cases, that is, it will catch all the
exceptions that you can generally recover from.  The PEP would also move
NotImplementedError out from under RuntimeError, and alter the semantics of
the bare except so that::

    except:

is the equivalent of::

    except Exception:

Only BaseException will appear in Python 2.5.  The remaining modifications
will not occur until Python 3.0.
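
As a rough illustration of the hierarchy described above, the sketch below
shows which handler catches which exception once KeyboardInterrupt and
SystemExit sit beside Exception under BaseException.  The helper name
classify() is made up for this example, and the code assumes an interpreter
in which BaseException already exists::

```python
def classify(exc):
    """Report which handler catches exc under the BaseException hierarchy."""
    try:
        raise exc
    except Exception:
        # Ordinary, generally recoverable errors land here.
        return "Exception"
    except BaseException:
        # KeyboardInterrupt and SystemExit are only caught at this level.
        return "BaseException"


results = {
    "KeyError": classify(KeyError("missing")),
    "KeyboardInterrupt": classify(KeyboardInterrupt()),
    "SystemExit": classify(SystemExit(0)),
}
```

With this layout, a handler for Exception leaves interpreter-exit and
interrupt requests alone without any explicit re-raise.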

.. _PEP 348: http://www.python.org/peps/pep-0348.html

Contributing threads:

- `Pre-PEP: Exception Reorganization for Python 3.0
<http://mail.python.org/pipermail/python-dev/2005-August/055063.html>`__
- `PEP, take 2: Exception Reorganization for Python 3.0
<http://mail.python.org/pipermail/python-dev/2005-August/055103.html>`__
- `Exception Reorg PEP checked in
<http://mail.python.org/pipermail/python-dev/2005-August/055138.html>`__
- `PEP 348: Exception Reorganization for Python 3.0
<http://mail.python.org/pipermail/python-dev/2005-August/055162.html>`__
- `Major revision of PEP 348 committed
<http://mail.python.org/pipermail/python-dev/2005-August/055199.html>`__
- `Exception Reorg PEP revised yet again
<http://mail.python.org/pipermail/python-dev/2005-August/055292.html>`__
- `PEP 348 and ControlFlow
<http://mail.python.org/pipermail/python-dev/2005-August/055310.html>`__
- `PEP 348 (exception reorg) revised again
<http://mail.python.org/pipermail/python-dev/2005-August/055412.html>`__

[SJB]

----------------------
Moving towards Unicode
----------------------

Neil Schemenauer presented `PEP 349`_, which tries to ease the transition to
Python 3.0, in which there will be a bytes() type for byte data and a str()
type for text data.  Currently to convert an object to text, you have one
of three options:

* Call str(). This breaks with a UnicodeEncodeError if the object is of type
  unicode (or a subtype) or can only represent itself in unicode and
  therefore returns unicode from __str__.
* Call unicode(). This can break external code that is not yet Unicode-safe
  and that passed a str object to your code but got a unicode object back.
* Use the "%s" format specifier. This breaks with a UnicodeEncodeError if
  the object can only represent itself in unicode and therefore returns
  unicode from __str__.

`PEP 349`_ attempts to address this problem by introducing a text() builtin
which returns str or unicode instances unmodified, and returns the result
of calling __str__() on the object otherwise.  Guido preferred to instead
relax the restrictions on str() to allow it to return unicode objects. Neil
implemented such a patch, and found that it broke only two test cases. The
discussion stopped shortly after Neil's report however, so it was unclear
if any permanent changes had been agreed upon.
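
The behaviour described for text() can be sketched roughly as follows.  This
is only an illustration of the summary's one-sentence description, not the
PEP's reference implementation; the _TEXT_TYPES name is an assumption of this
example, and the fallback lets the sketch run on interpreters that have no
separate unicode type::

```python
try:
    _TEXT_TYPES = (str, unicode)  # interpreters with a separate unicode type
except NameError:
    _TEXT_TYPES = (str,)  # interpreters where str is the only text type


def text(obj):
    """Return text instances unmodified, otherwise the result of __str__()."""
    if isinstance(obj, _TEXT_TYPES):
        return obj
    return obj.__str__()
```

The point of returning text instances unchanged is that no implicit encoding
step ever runs, so a unicode value cannot trigger UnicodeEncodeError here.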

Guido made a few other Python 3.0 suggestions in this thread:

* The bytes() type should be mutable with a corresponding frozenbytes()
  immutable type
* Opening a file in binary or text mode would cause it to return bytes() or
  str() objects, respectively
* The file type should grow a getpos()/setpos() pair that are identical to
  tell()/seek() when a file is open in binary mode, and which work like
  tell()/seek() but on characters instead of bytes when a file is open in
  text mode

However, none of these seemed to be solid commitments.

.. _PEP 349: http://www.python.org/peps/pep-0349.html

Contributing threads:

- `PEP: Generalised String Coercion
<http://mail.python.org/pipermail/python-dev/2005-August/055186.html>`__
- `Generalised String Coercion
<http://mail.python.org/pipermail/python-dev/2005-August/055194.html>`__

[SJB]

----------------------------
PEP 344 and reference cycles
----------------------------

Armin Rigo brought up an issue with `PEP 344`_ which proposes, among other
things, adding a __traceback__ attribute to exceptions to avoid the hassle
of extracting it from sys.exc_info(). Armin pointed out that if exceptions
grow a __traceback__ attribute, every statement::

    except Exception, e:

will create a cycle::

    e.__traceback__.tb_frame.f_locals['e']
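
The cycle can be observed directly on an interpreter where exceptions do
carry a __traceback__ attribute.  The sketch below is illustrative only: the
function name demo_cycle() is made up, and it uses the "except ... as e"
spelling of such interpreters rather than the "except Exception, e" form
shown above::

```python
def demo_cycle():
    """Return True if the caught exception is reachable from its own traceback."""
    try:
        raise ValueError("demo")
    except ValueError as e:
        # The traceback refers to this function's frame, and the frame's
        # locals refer back to the exception through the name 'e'.
        frame = e.__traceback__.tb_frame
        return frame.f_locals["e"] is e
```

Interpreters that use this spelling unbind the name at the end of the except
block, which breaks the loop once the handler finishes.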

Despite the fact that Python has cyclic garbage collection, there are still
some situations where cycles like this can cause problems. Armin showed an
example of such a case::

    import sys

    class X:
        def __del__(self):
            try:
                typo  # undefined name: deliberately raises NameError
            except Exception, e:
                e_type, e_value, e_tb = sys.exc_info()

Even in current Python, instances of the X class are uncollectible. When
garbage collection runs and tries to collect an X object, it calls the
__del__() method.  This creates the cycle::

    e_tb.tb_frame.f_locals['e_tb']

The X object itself is available through this cycle (in
``f_locals['self']``), so the X object's refcount does not drop to 0 when
__del__() returns, so it cannot be collected.  The next time garbage
collection runs, it finds that the X object has not been collected, calls
its __del__() method again and repeats the process.

Tim Peters suggested this problem could be solved by declaring that
__del__() methods are called exactly once. This allows the above X object
to be collected because on the second run of the garbage collection,
__del__() is not called again.  Thus, the refcount of the X object is not
incremented, and so it is collected by garbage collection.  However,
guaranteeing that __del__() is called only once means keeping track somehow
of which objects' __del__() methods were called, which seemed somewhat
unattractive.

There was also brief talk about removing __del__ in favor of weakrefs, but
those waters seemed about as murky as the garbage collection ones.

.. _PEP 344: http://www.python.org/peps/pep-0344.html

Contributing thread:

- `__traceback__ and reference cycles
<http://mail.python.org/pipermail/python-dev/2005-August/055251.html>`__

[SJB]

----------------------------
Style for raising exceptions
----------------------------

Guido explained that these days exceptions should always be raised as::

    raise SomeException("some argument")

instead of::

    raise SomeException, "some argument"

The second will go away in Python 3.0, and is only present now for backwards
compatibility.  (It was necessary when strings could be exceptions, in
order to pass both the exception "type" and message.)  PEPs 8_ and 3000_
were accordingly updated.

.. _8: http://www.python.org/peps/pep-0008.html
.. _3000: http://www.python.org/peps/pep-3000.html

Contributing threads:

- `PEP 8: exception style
<http://mail.python.org/pipermail/python-dev/2005-August/055187.html>`__
- `FW: PEP 8: exception style
<http://mail.python.org/pipermail/python-dev/2005-August/055191.html>`__

[SJB]

-----------------------------------
Skipping list comprehensions in pdb
-----------------------------------

When using pdb, the only way to skip to the end of a loop is to set a
breakpoint on the line after the loop.  Ilya Sandler suggested adding an
optional numeric argument to pdb's "next" command to indicate how many lines
of code should be skipped.  Martin v. Löwis pointed out that this differs
from gdb's "next <n>" command, which does "next" n times.  Ilya suggested
implementing gdb's "until" command instead, which gained Martin's approval.

It was also pointed out that pdb is one of the less Pythonic modules,
particularly in terms of the ability to subclass/extend, and would be a
good candidate for rewriting, if anyone had the inclination and time.

Contributing threads:

- `pdb: should next command be extended?
<http://mail.python.org/pipermail/python-dev/2005-August/055218.html>`__
- `an alternative suggestion, Re: pdb: should next command be extended?
<http://mail.python.org/pipermail/python-dev/2005-August/055286.html>`__

[TAM]

------------------
Sets in Python 2.5
------------------

Raymond Hettinger has been checking in the new implementation for sets in
Python 2.5.  The implementation is based heavily on dictobject.c, the code
for Python dict() objects, and generally deviates only when there is an
obvious gain in doing so.  Raymond posted his new API for discussion, but
there didn't appear to be any comments.

Contributing threads:

- `[Python-checkins] python/dist/src/Objects setobject.c, 1.45, 1.46
<http://mail.python.org/pipermail/python-dev/2005-August/055337.html>`__
- `Discussion draft: Proposed Py2.5 C API for set and frozenset objects
<http://mail.python.org/pipermail/python-dev/2005-August/055365.html>`__

[SJB]

================================
Deferred Threads (for next time)
================================

- `SWIG and rlcompleter
<http://mail.python.org/pipermail/python-dev/2005-August/055413.html>`__

===============
Skipped Threads
===============

- `Extension of struct to handle non byte aligned values?
<http://mail.python.org/pipermail/python-dev/2005-August/055062.html>`__
- `Syscall Proxying in Python
<http://mail.python.org/pipermail/python-dev/2005-August/055069.html>`__
- `__autoinit__ (Was: Proposal: reducing self.x=x; self.y=y; self.z=z
boilerplate code)
<http://mail.python.org/pipermail/python-dev/2005-August/055093.html>`__
- `Weekly Python Patch/Bug Summary
<http://mail.python.org/pipermail/python-dev/2005-August/055110.html>`__
- `PEP 342 Implementation
<http://mail.python.org/pipermail/python-dev/2005-August/055151.html>`__
- `String exceptions in Python source
<http://mail.python.org/pipermail/python-dev/2005-August/055155.html>`__
- `[ python-Patches-790710 ] breakpoint command lists in pdb
<http://mail.python.org/pipermail/python-dev/2005-August/055157.html>`__
- `[C++-sig] GCC version compatibility
<http://mail.python.org/pipermail/python-dev/2005-August/055219.html>`__
- `PyTuple_Pack added references undocumented
<http://mail.python.org/pipermail/python-dev/2005-August/055232.html>`__
- `PEP-- Context Managment variant
<http://mail.python.org/pipermail/python-dev/2005-August/055271.html>`__
- `Sourceforge CVS down?
<http://mail.python.org/pipermail/python-dev/2005-August/055307.html>`__
- `PSF grant / contacts
<http://mail.python.org/pipermail/python-dev/2005-August/055311.html>`__
- `Python + Ping
<http://mail.python.org/pipermail/python-dev/2005-August/055319.html>`__
- `Terminology for PEP 343
<http://mail.python.org/pipermail/python-dev/2005-August/055347.html>`__
- `dev listinfo page (was: Re: Python + Ping)
<http://mail.python.org/pipermail/python-dev/2005-August/055348.html>`__
- `set.remove feature/bug
<http://mail.python.org/pipermail/python-dev/2005-August/055352.html>`__
- `Extension to dl module to allow passing strings from native function
<http://mail.python.org/pipermail/python-dev/2005-August/055363.html>`__
- `build problems on macosx (CVS HEAD)
<http://mail.python.org/pipermail/python-dev/2005-August/055385.html>`__
- `request for code review - hashlib - patch #1121611
<http://mail.python.org/pipermail/python-dev/2005-August/055410.html>`__
- `python-dev Summary for 2005-07-16 through 2005-07-31 [draft]
<http://mail.python.org/pipermail/python-dev/2005-August/055411.html>`__
- `string_join overrides TypeError exception thrown in generator
<http://mail.python.org/pipermail/python-dev/2005-August/055414.html>`__
- `implementation of copy standard lib
<http://mail.python.org/pipermail/python-dev/2005-August/055450.html>`__
- `xml.parsers.expat no userdata in callback functions
<http://mail.python.org/pipermail/python-dev/2005-August/055362.html>`__


From mal at egenix.com  Thu Aug 25 11:35:59 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 25 Aug 2005 11:35:59 +0200
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary
 for 2005-08-01 through 2005-08-15	[draft])
In-Reply-To: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04E3@its-xchg4.massey.ac.nz>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04E3@its-xchg4.massey.ac.nz>
Message-ID: <430D90FF.6060206@egenix.com>

I must have missed this one:

> ----------------------------
> Style for raising exceptions
> ----------------------------
> 
> Guido explained that these days exceptions should always be raised as::
> 
>     raise SomeException("some argument")
> 
> instead of::
> 
>     raise SomeException, "some argument"
> 
> The second will go away in Python 3.0, and is only present now for backwards
> compatibility.  (It was  necessary when strings could be exceptions, in
> order to pass both the exception "type" and message.)   PEPs 8_ and 3000_
> were accordingly updated.

AFAIR, the second form was also meant to be able to defer
the instantiation of the exception class until really
needed in order to reduce the overhead related to raising
exceptions in Python.

However, that optimization never made it into the implementation,
I guess.

> .. _8: http://www.python.org/peps/pep-0008.html
> .. _3000: http://www.python.org/peps/pep-3000.html
> 
> Contributing threads:
> 
> - `PEP 8: exception style
> <http://mail.python.org/pipermail/python-dev/2005-August/055187.html>`__
> - `FW: PEP 8: exception style
> <http://mail.python.org/pipermail/python-dev/2005-August/055191.html>`__

-- 
Marc-Andre Lemburg
eGenix.com

From raymond.hettinger at verizon.net  Thu Aug 25 11:35:31 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 25 Aug 2005 05:35:31 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <bbaeab1005082418433c41a5b4@mail.gmail.com>
Message-ID: <000d01c5a958$566e1bc0$b729cb97@oemcomputer>

> > OK, I'm convinced. Let's drop bare except for Python 3.0, and
> > deprecate them until then, without changing the meaning.
> >
> 
> Woohoo

That's no cause for celebration.  Efforts to improve Py3.0 have spilled
over into breaking Py2.x code with no compensating benefits.  Bare
except clauses appear in almost every Python book that has ever been
written and occur at least once in most major Python applications.

I had thought the plan was to introduce Py3.0 capabilities into 2.x as
they become possible but not to break anything.  Isn't that why string
exceptions, buffer(), and repr() still live and breathe?

We don't have to wreck 2.x in order to make 3.0 better.  I wish the 3.0
PEPs would stop until we are actually working on the project and have
some chance of making people's lives better.  If people avoid 2.5 just
to avert unnecessary breakage, then Py3.0 doesn't benefit at all.

I propose that the transition plan be as simple as introducing
BaseException.  This allows people to write code that will work on both
2.x and 3.0.  It doesn't break anything.  

The guidance for cross-version (2.5 to 3.0) code would be:

* To catch all but terminating exceptions, write:

    except (KeyboardInterrupt, SystemExit):
        raise
    except Exception:
        ...

* To catch all exceptions, write:

    except BaseException:
        ...


To make the code also run on 2.4 and prior, add transition code:

    try:
        BaseException
    except NameError:
        class BaseException(Exception):
            pass

With that minimal guidance, people can write code that works from 2.0
to 3.0 and not break anything that is currently working.  No
deprecations are necessary.
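
Put together, the guidance above might look like the sketch below.  It is
illustrative only: it assumes KeyboardInterrupt and SystemExit are the
terminating exceptions meant, and the helper names run_guarded() and
catch_everything() are made up for this example.

```python
import sys

try:
    BaseException
except NameError:
    # Transition shim for interpreters that predate BaseException.
    class BaseException(Exception):
        pass


def run_guarded(func):
    """Catch all but terminating exceptions."""
    try:
        return func()
    except (KeyboardInterrupt, SystemExit):
        raise  # let interrupt and exit requests propagate
    except Exception:
        return "recovered"


def catch_everything(func):
    """Catch all exceptions, including terminating ones."""
    try:
        return func()
    except BaseException:
        return "caught " + sys.exc_info()[0].__name__
```

Neither helper binds the exception to a name in the except clause, so the
same source avoids the spelling difference between old and new syntax.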

Remember, the ONLY benefit from the whole PEP is that in 3.0, it will no
longer be necessary to write "except (KeyboardInterrupt, SystemExit):  raise".
Steven and Jack's research shows that that doesn't arise much in practice
anyway.  IOW, there's nothing worth inflicting destruction on tons of
2.x code.


Raymond


From mcherm at mcherm.com  Thu Aug 25 14:28:44 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu, 25 Aug 2005 05:28:44 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
Message-ID: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>

Raymond writes:
> Efforts to improve Py3.0 have spilled
> over into breaking Py2.x code with no compensating benefits. [...]
> We don't have to wreck 2.x in order to make 3.0 better.

I think you're overstating things a bit here.

> Remember, the ONLY benefit from the whole PEP is that in 3.0, it will no
> longer be necessary to write "except (KeyboardInterrupt, SystemExit):  raise".
> [...] IOW, there's nothing worth inflicting destruction on tons of
> 2.x code.

And now I *KNOW* you're overstating things. There are LOTS of benefits
to the PEP in 3.0. My own personal favorite is that users can be
guaranteed that all exceptions thrown will share a particular common
ancestor and type. And no one is proposing "destruction" of 2.x code.

On the other hand, I thought these were very good points:
> Bare
> except clauses appear in almost every Python book that has ever been
> written and occur at least once in most major Python applications.
         [...]
> I had thought the plan was to introduce Py3.0 capabilities into 2.x as
> they become possible but not to break anything.
         [...]
> I propose that the transition plan be as simple as introducing
> BaseException.  This allows people to write code that will work on both
> 2.x and 3.0.

I think the situation is both better than and worse than you describe. The
PEP is now proposing that bare "except:" be removed in Python 3.0. If I
understand Guido correctly, he is proposing that in 2.5 the use of
bare "except:" generate a PendingDeprecationWarning so that conscientious
developers who want to write code now that will continue to work in
Python 3.0 can avoid using bare "except:". Perhaps I'm misreading him
here, but I presume this was intended as a PENDINGDeprecationWarning so
that it's easy to ignore.

But it's a bit worse than it might seem, because conscientious users
aren't ABLE to write safe 2.5 code that will run in 3.0. The problem
arises when you need to write code that calls someone else's library
but then unconditionally recovers from errors in it. Correct 2.4 syntax
for this reads as follows:

    try:
        my_result = call_some_library(my_data)
    except (KeyboardInterrupt, MemoryError, SystemError):
        raise
    except:
        report_error()

Correct 3.0 syntax will read like this:

    try:
        my_result = call_some_library(my_data)
    except (KeyboardInterrupt, MemoryError, SystemError):
        raise
    except BaseException:
        report_error()

But no syntax will work in BOTH 2.5 and 3.0. The 2.4 syntax is
illegal in 3.0, and the 3.0 syntax fails to catch exceptions that
do not inherit from BaseException. Such exceptions are deprecated
(by documentation, if not by code) so our conscientious programmer
will never raise them and the standard library avoids doing so.
But "call_some_library()" was written by some less careful
developer, and may well contain these atavisms.

The only complete solution that comes to mind immediately is for
the raising of anything not extending BaseException to raise a
PendingDeprecationWarning as well. Then the conscientious developer
can feel confident again so long as her unit tests are reasonably
exhaustive. If we cannot produce a warning for these, then I'd
rather not produce the warning for the use of bare "except:".
After all, as it's been pointed out, if the use of bare "except:"
is all you are interested in it is quite easy to grep the code to
find all uses.
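
As a slightly sturdier alternative to grep, the same search can be run
on the syntax tree.  This is only a sketch, assuming a modern Python
whose standard library includes the ast module (it postdates this
thread); find_bare_excepts and SAMPLE are hypothetical names.

```python
import ast


def find_bare_excepts(source):
    """Return the line numbers of every bare "except:" clause in source."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # A bare "except:" is an ExceptHandler node with no exception type.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


SAMPLE = '''\
try:
    risky()
except ValueError:
    pass
except:
    pass
'''

print(find_bare_excepts(SAMPLE))
```

Unlike a plain text search, this does not trip over "except:" appearing
inside strings or comments.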

-- Michael Chermside


From raymond.hettinger at verizon.net  Thu Aug 25 15:03:36 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 25 Aug 2005 09:03:36 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>
Message-ID: <001b01c5a975$68909220$b729cb97@oemcomputer>

> > Efforts to improve Py3.0 have spilled
> > over into breaking Py2.x code with no compensating benefits. [...]
> > We don't have to wreck 2.x in order to make 3.0 better.
> 
> I think you're overstating things a bit here.

It's only an overstatement if Guido didn't mean what he said.  If bare
except clauses are deprecated in 2.x, it WILL affect tons of existing
code and invalidate a portion of almost all Python books.


> > Remember, the ONLY benefit from the whole PEP is that in 3.0, it will no
> > longer be necessary to write "except (KeyError, SystemExit):  raise".
> > [...] IOW, there's nothing worth inflicting destruction on tons of
> > 2.x code.
> 
> And now I *KNOW* you're overstating things. There are LOTS of benefits
> to the PEP in 3.0. My own personal favorite is that users can be
> guaranteed that all exceptions thrown will share a particular common
> ancestor and type. 

Right, there are a couple of parts of the PEP that were
non-controversial from the start and would likely have happened even in
the absence of the PEP.

My point was that a lot of machinery is being thrown at a tiny problem.
To eliminate the need for "except (KeyError, SystemExit):  raise", we're
rearranging the tree, introducing a new builtin, banning an existing and
popular form of an except clause, and introducing a non-trivial
deprecation that will affect most users.  This is a lot of firepower
directed at a somewhat small problem.


> But no syntax will work in BOTH 2.5 and 3.0. 

There's the rub.  If you can't write code that will work for both, then
there is no reason to force 2.x users to make any changes to their
existing code, especially given that they won't see any benefit from the
mass edits.


> If we cannot produce a warning for these, then I'd
> rather not produce the warning for the use of bare "except:".
> After all, as it's been pointed out, if the use of bare "except:"
> is all you are interested in it is quite easy to grep the code to
> find all uses.

Bingo.  A bare except clause is well known as a consenting adults
construct.  If Guido feels driven to eliminate it from Py3.0, then that
is the way it is.  But for 2.x, why introduce unnecessary pain?

Of course, if bare except clauses weren't banned for 3.0, then we would
have no problem writing code that works on all versions of Python from
2.0 to 3.0, that doesn't break existing code, and that doesn't invalidate
the text in Python books.  IMO, that is a nice situation.  Just how
badly do you want to kill bare except clauses?

I propose that we leave them alone, and be happy that in 3.0 we can write
"except Exception" and get what we want without any fuss.


Raymond


From sjoerd at acm.org  Thu Aug 25 16:13:40 2005
From: sjoerd at acm.org (Sjoerd Mullender)
Date: Thu, 25 Aug 2005 16:13:40 +0200
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>
References: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>
Message-ID: <430DD214.2050208@acm.org>

Michael Chermside wrote:
> Raymond writes:
> 
>>Efforts to improve Py3.0 have spilled
>>over into breaking Py2.x code with no compensating benefits. [...]
>>We don't have to wreck 2.x in order to make 3.0 better.
> 
> 
> I think you're overstating things a bit here.

There is an important point, though.  Recently I read complaints about
the lack of backward compatibility in Python on the fedora-list (mailing
list for users of Fedora Core).  Somebody asked what language he should
learn and people answered, don't learn Python because it changes too
often in backward incompatible ways.  They even suggested using that
other P language because that was much more backward compatible.

Check out the thread starting at
https://www.redhat.com/archives/fedora-list/2005-August/msg01682.html .

-- 
Sjoerd Mullender

From gvanrossum at gmail.com  Thu Aug 25 17:10:52 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 25 Aug 2005 08:10:52 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <001b01c5a975$68909220$b729cb97@oemcomputer>
References: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>
	<001b01c5a975$68909220$b729cb97@oemcomputer>
Message-ID: <ca471dc2050825081076aa5ff3@mail.gmail.com>

On 8/25/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> It's only an overstatement if Guido didn't mean what he said.  If bare
> except clauses are deprecated in 2.x, it WILL affect tons of existing
> code and invalidate a portion of almost all Python books.

Deprecation means your code will still work.  I hope every book that
documents "except:" also adds "but don't use this except under very
special circumstances".

I think you're overreacting (again), Raymond. 3.0 will be much more
successful if we can introduce many of its features into 2.x. Many of
those features are in fact improvements of the language even if they
break old code. We're trying to balance between breaking old code and
introducing new features; deprecation is the accepted way to do this.

Regarding the complaint that Python is changing too fast, that really
sounds like FUD to me. With a new release every 18 months Python is
about as stable as it gets barring dead languages. PHP is in the
throes of the 4->5 conversion which breaks worse than Python 2->3 will
(Rasmus is changing object assignment semantics from copying to
sharing).  Maybe they should be warned not to learn Perl because Larry
is deconstructing it all for Perl 6? :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tlesher at gmail.com  Thu Aug 25 17:16:10 2005
From: tlesher at gmail.com (Tim Lesher)
Date: Thu, 25 Aug 2005 11:16:10 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <430DD214.2050208@acm.org>
References: <20050825052844.sby2v04s568sgg00@login.werra.lunarpages.com>
	<430DD214.2050208@acm.org>
Message-ID: <9613db6005082508162794cd5b@mail.gmail.com>

On 8/25/05, Sjoerd Mullender <sjoerd at acm.org> wrote:
> There is an important point, though.  Recently I read complaints about
> the lack of backward compatibility in Python on the fedora-list (mailing
> list for users of Fedora Core).  Somebody asked what language he should
> learn and people answered, don't learn Python because it changes too
> often in backward incompatible ways.  They even suggested using that
> other P language because that was much more backward compatible.

I think you're overstating what actually happened there.  Here's the
actual quote from the thread:

: perl is more portable than python - programs written for perl are far
: more likely to run on a new version of perl than the equivalent for
: python. However, python is probably more readable and writable than perl
: for a new user, and is the language most Fedora system utilities (e.g.
: yum) are written in. Both perl and python run on Windows too.
: 
: You have to be very careful about how you write your code to make it
: portable to both environments. If you need a GUI, you'll need a
: cross-platform GUI toolkit like Qt too.
: 
: If it's only one language to learn, and you're a Fedora user, I'd go for
: python.

Yes, later there were additional posts about portability and
backwards-compatibility, but they were for the most part factually
incorrect (reliance on new 2.x features, not
backwards-incompatibility, was the issue with CML1) and relied on "I
heard that..." information.

So your point is well-taken, but the problem is one of user
perception.  That's not a dismissal of the problem--witness the
"JAVA/LISP/Python is too slow" and "all PERL code is cryptic" memes.

To me, this perception problem alone raises the bar on backwards
compatibility. Even if obsoleted features are seldom used, "$language
breaks old code!" is a virulent meme, in both senses of the word.
-- 
Tim Lesher <tlesher at gmail.com>

From gvanrossum at gmail.com  Thu Aug 25 17:17:24 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 25 Aug 2005 08:17:24 -0700
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary
	for 2005-08-01 through 2005-08-15 [draft])
In-Reply-To: <430D90FF.6060206@egenix.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04E3@its-xchg4.massey.ac.nz>
	<430D90FF.6060206@egenix.com>
Message-ID: <ca471dc205082508174cb1d240@mail.gmail.com>

On 8/25/05, M.-A. Lemburg <mal at egenix.com> wrote:
> I must have missed this one:
> 
> > ----------------------------
> > Style for raising exceptions
> > ----------------------------
> >
> > Guido explained that these days exceptions should always be raised as::
> >
> >     raise SomeException("some argument")
> >
> > instead of::
> >
> >     raise SomeException, "some argument"
> >
> > The second will go away in Python 3.0, and is only present now for backwards
> > compatibility.  (It was  necessary when strings could be exceptions, in
> > order to pass both the exception "type" and message.)   PEPs 8_ and 3000_
> > were accordingly updated.
> 
> AFAIR, the second form was also meant to be able to defer
> the instantiation of the exception class until really
> needed in order to reduce the overhead related to raising
> exceptions in Python.
> 
> However, that optimization never made it into the implementation,
> I guess.

Something equivalent is used internally in the C code, but that
doesn't mean we'll need it in Python code. The optimization only works
if the exception is also *caught* in C code, BTW (it is instantiated
as soon as it is handled by a Python except clause).

Originally, the second syntax was the only available syntax, because
all we had were string exceptions. Now that string exceptions are dead
(although not yet buried :) I really don't see why we need to keep
both versions of the syntax; Python 3.0 will only have one version.
(We're still debating what to do with the traceback argument; wanna
revive PEP 344?)

If you need to raise exceptions fast, pre-instantiate an instance.
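
A minimal sketch of what pre-instantiating might look like; Found,
_FOUND and contains are hypothetical names chosen for illustration, not
anything from the C implementation mentioned above.

```python
class Found(Exception):
    """Signals an early exit from a nested search (hypothetical)."""


# Built once at import time and reused, so no new exception object is
# constructed each time the search succeeds.
_FOUND = Found("match located")


def contains(grid, target):
    """Return True if target occurs anywhere in the nested lists."""
    try:
        for row in grid:
            for cell in row:
                if cell == target:
                    raise _FOUND   # raise the shared instance
    except Found:
        return True
    return False
```

The shared instance is shared state, so this only suits exceptions that
carry no per-raise data.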

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Thu Aug 25 17:58:48 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 25 Aug 2005 11:58:48 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc2050825081076aa5ff3@mail.gmail.com>
Message-ID: <000a01c5a98d$e1c20940$b729cb97@oemcomputer>

> Deprecation means your code will still work.  I hope every book that
> documents "except:" also adds "but don't use this except under very
> special circumstances".
> 
> I think you're overreacting (again), Raymond. 3.0 will be much more
> successful if we can introduce many of its features into 2.x. Many of
> those features are in fact improvements of the language even if they
> break old code. We're trying to balance between breaking old code and
> introducing new features; deprecation is the accepted way to do this.

IMO, the proponents of 2.x deprecation are underreacting.  Deprecation
has a cost -- there needs to be a corresponding payoff.  Deprecation is
warranted if the substitute code would still run on future Pythons
(Michael explained the issues here).  Deprecation is only warranted if
the interim substitute works -- AFAICT, there is no other way to broadly
catch exceptions not derived from Exception.  The effort is only
warranted if it makes the code better -- but here nothing is currently
broken and the new code will be much less attractive and less readable
(if the changes are done correctly); only 3.0 will offer the tools to do
it readably and beautifully.  Also, as we learned with apply(), even if
ignored, the deprecation machinery has a tremendous runtime cost.  None
of this will make upgrading to Py2.5 an attractive option.

There is a reason that over 120 bare except clauses remain in the
standard library despite a number of attempts to get rid of them.  It
won't be trivial to properly evaluate whether each should be Exception
or BaseException; to catch string exceptions; to write the test cases;
to follow other PEPs requiring compatibility with older Pythons; or to
do this in a way that it won't have to be done again for Py3.0.  If the
proponents don't have time to fix the standard library, how can they in
good conscience mandate change for the rest of the world?

Besides, I thought Guido was opposed to efforts to roam through
mountains of code, making alterations in a non-holistic way.  With a
change this complex, the odds of introducing errors are very high.

Fredrik, please speak up.  Someone should represent the users here.  I've
reached my limit on how much time I can devote to thinking out the
implications of these proposals.  Someone else needs to "overreact". 


From nas at arctrix.com  Thu Aug 25 18:33:03 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Aug 2005 10:33:03 -0600
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <000a01c5a98d$e1c20940$b729cb97@oemcomputer>
References: <ca471dc2050825081076aa5ff3@mail.gmail.com>
	<000a01c5a98d$e1c20940$b729cb97@oemcomputer>
Message-ID: <20050825163302.GA21089@mems-exchange.org>

On Thu, Aug 25, 2005 at 11:58:48AM -0400, Raymond Hettinger wrote:
> Deprecation is only warranted if the interim substitute works --
> AFAICT, there is no other way to broadly catch exceptions not
> derived from Exception.

This seems to get to the heart of the problem.  I'm no fan of bare
excepts but I think we could handle them in 2.x (at least for the
next few releases) by providing a workable alternative and then
strongly discouraging their use (like we do for "from x import *").

  Neil

From dieter at handshake.de  Wed Aug 24 21:11:18 2005
From: dieter at handshake.de (Dieter Maurer)
Date: 24 Aug 2005 21:11:18 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
In-Reply-To: <mailman.3386.1124746314.10512.python-list@python.org>
References: <mailman.3386.1124746314.10512.python-list@python.org>
Message-ID: <x7br3ngnft.fsf@handshake.de>

The following message is a courtesy copy of an article
that has been posted to comp.lang.python as well.

Neil Schemenauer <nas at arctrix.com> writes on Mon, 22 Aug 2005 15:31:42 -0600:
> ...
>     Some code may require that str() returns a str instance.  In the
>     standard library, only one such case has been found so far.  The
>     function email.header_decode() requires a str instance and the
>     email.Header.decode_header() function tries to ensure this by
>     calling str() on its argument.  The code was fixed by changing
>     the line "header = str(header)" to:
> 
>         if isinstance(header, unicode):
>             header = header.encode('ascii')

Note, that this is not equivalent to the old "str(header)":

  "str(header)" used Python's "default encoding" while the
  new code uses 'ascii'.

  The new code might be more correct than the old one has been.


> ...
> Alternative Solutions
> 
>     A new built-in function could be added instead of changing str().
>     Doing so would introduce virtually no backwards compatibility
>     problems.  However, since the compatibility problems are expected to
>     be rare, changing str() seems preferable to adding a new built-in.

Can we get a new builtin with the exact same behaviour as
the current "str" which can be used when we do require an "str"
(and cannot use a "unicode")?


Dieter

From gvwilson at cs.utoronto.ca  Wed Aug 24 14:05:23 2005
From: gvwilson at cs.utoronto.ca (Greg Wilson)
Date: Wed, 24 Aug 2005 08:05:23 -0400 (EDT)
Subject: [Python-Dev] [Argon] Re: 51 Million calls to
 _PyUnicodeUCS2_IsLinebreak() (???)
In-Reply-To: <430C48F9.8060801@v.loewis.de>
References: <20050823201021.GE32195@cs.toronto.edu>
	<430C41BD.8010602@livinglogic.de> <430C48F9.8060801@v.loewis.de>
Message-ID: <Pine.GSO.4.58.0508240802461.27108@dvp.cs>

Hi Martin (and everyone else); thanks for your mail.  The N*N/2
invocations would explain why we saw such a large number of invocations
--- thanks for figuring it out.  W.r.t. how we're invoking our script:

> > But if you're using CGI, you're importing your source on every
> > invocation.
>
> Well, no. Only the CGI script needs to be parsed every time; all modules
> could load off bytecode files.
>
> Which suggests that Keir Mierle doesn't use bytecode files, I think he
> should.

Yes, mod_python and .pyc's are the obvious way to go --- once the code
actually works ;-).  I just wanted students to have as few moving parts as
possible while debugging.

Thanks again,
Greg

From gvwilson at cs.utoronto.ca  Wed Aug 24 20:20:59 2005
From: gvwilson at cs.utoronto.ca (Greg Wilson)
Date: Wed, 24 Aug 2005 14:20:59 -0400 (EDT)
Subject: [Python-Dev] [Argon] Re: 51 Million calls to
 _PyUnicodeUCS2_IsLinebreak() (???)
In-Reply-To: <430CB987.5000601@livinglogic.de>
References: <20050823201021.GE32195@cs.toronto.edu>
	<430C41BD.8010602@livinglogic.de>
	<430C48F9.8060801@v.loewis.de> <430C5E6E.2040405@livinglogic.de>
	<430C6E7E.7070106@v.loewis.de> <430C7F4C.9010703@livinglogic.de>
	<430C854E.1080200@v.loewis.de> <430CA75F.7090900@livinglogic.de>
	<430CB0AE.1040201@v.loewis.de> <430CB987.5000601@livinglogic.de>
Message-ID: <Pine.GSO.4.58.0508241419430.27108@dvp.cs>

> > Walter Dörwald wrote:
> >>At least it would remove the quadratic number of calls to
> >>_PyUnicodeUCS2_IsLinebreak(). For each character it would be called only
> >>once.

> Martin v. Löwis wrote:
> > Correct. However, I very much doubt that this is the cause of the
> > slowdown.

> Walter Dörwald wrote:
> Probably. We'd need a test with the original Argon source to really know.

We can do that.

> OK, so should we add this for 2.4.2 or only for 2.5?

2.4.2 please ;-)

Thanks,
Greg

From gvanrossum at gmail.com  Thu Aug 25 19:01:33 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 25 Aug 2005 10:01:33 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <000a01c5a98d$e1c20940$b729cb97@oemcomputer>
References: <ca471dc2050825081076aa5ff3@mail.gmail.com>
	<000a01c5a98d$e1c20940$b729cb97@oemcomputer>
Message-ID: <ca471dc205082510014b39a72@mail.gmail.com>

On 8/25/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
>[...] AFAICT, there is no other way to broadly
> catch exceptions not derived from Exception.

But there is rarely a need to do so. I bet you that 99 out of 100 bare
excepts in the stdlib could be replaced by "except Exception" without
breaking anything, since they only expect a wide variety of standard
exceptions, and don't care about string exceptions or user exceptions.

The exception is the first of the two bare except: clauses in code.py.

> The effort is only
> warranted if it makes the code better -- but here nothing is currently
> broken and the new code will be much less attractive and less readable
> (if the changes are done correctly); only 3.0 will offer the tools to do
> it readably and beautifully.

Please explain? If 9 out of 10 bare excepts can safely be replaced by
"except Exception", what's not beautiful about that?

> Also, as we learned with apply(), even if
> ignored, the deprecation machinery has a tremendous runtime cost.  None
> of this will make upgrading to Py2.5 an attractive option.

Not in this case; bare except: can be flagged by the parser so the
warning happens only once per compilation.
 
> There is a reason that over 120 bare except clauses remain in the
> standard library despite a number of attempts to get rid of them.

I betcha almost all of them can safely be replaced with "except Exception".

> Besides, I thought Guido was opposed to efforts to roam through
> mountains of code, making alterations in a non-holistic way.

This is trumped by the need to keep the standard library warning-free.

But how about the following compromise: make it a silent deprecation
in 2.5, and a full deprecation in 2.6.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nas at arctrix.com  Thu Aug 25 19:03:32 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Aug 2005 11:03:32 -0600
Subject: [Python-Dev] Revised PEP 349: Allow str() to return unicode
	strings
In-Reply-To: <x7br3ngnft.fsf@handshake.de>
References: <mailman.3386.1124746314.10512.python-list@python.org>
	<x7br3ngnft.fsf@handshake.de>
Message-ID: <20050825170332.GA21225@mems-exchange.org>

On Wed, Aug 24, 2005 at 09:11:18PM +0200, Dieter Maurer wrote:
> Neil Schemenauer <nas at arctrix.com> writes on Mon, 22 Aug 2005 15:31:42 -0600:
> >     The code was fixed by changing
> >     the line "header = str(header)" to:
> > 
> >         if isinstance(header, unicode):
> >             header = header.encode('ascii')
> 
> Note, that this is not equivalent to the old "str(header)":
> 
>   "str(header)" used Python's "default encoding" while the
>   new code uses 'ascii'.

It also doesn't call __str__ if the object is not a basestring
instance.  I have a hard time understanding the exact purpose of
calling str() here.  Maybe Barry can comment.

> Can we get a new builtin with the exact same behaviour as
> the current "str" which can be used when we do require an "str"
> (and cannot use a "unicode").

The fact that no code in the standard library requires such a
function (AFAIK), leads me to believe that it would not be useful
enough to be made a built-in.  You would just write it yourself:

    import sys

    def mystr(s):
        s = str(s)
        if isinstance(s, unicode):
            s = s.encode(sys.getdefaultencoding())
        return s

Cheers,

  Neil

From reinhold-birkenfeld-nospam at wolke7.net  Thu Aug 25 19:17:14 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Thu, 25 Aug 2005 19:17:14 +0200
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <000a01c5a98d$e1c20940$b729cb97@oemcomputer>
References: <ca471dc2050825081076aa5ff3@mail.gmail.com>
	<000a01c5a98d$e1c20940$b729cb97@oemcomputer>
Message-ID: <dekuer$a03$1@sea.gmane.org>

Raymond Hettinger wrote:
>> Deprecation means your code will still work.  I hope every book that
>> documents "except:" also adds "but don't use this except under very
>> special circumstances".
>> 
>> I think you're overreacting (again), Raymond. 3.0 will be much more
>> successful if we can introduce many of its features into 2.x. Many of
>> those features are in fact improvements of the language even if they
>> break old code. We're trying to balance between breaking old code and
>> introducing new features; deprecation is the accepted way to do this.

> Fredrik, please speak up.  Someone should represent the users here.  I've
> reached my limit on how much time I can devote to thinking out the
> implications of these proposals.  Someone else needs to "overreact".

Perhaps I may add a pragmatic POV (yes, I know that "pragmatic" is usually
attributed to another language ;-).

If "except:" issues a deprecation warning in 2.5, many people will come and
say "woohoo, Python breaks backwards compatibility" and "I knew it, Python
is unreliable, my script issues 1,233 warnings now" and such.

You can see this effect looking at the discussion that broke out when Guido
announced that map, filter and reduce would vanish (as builtins) in 3.0.
People spoke up and said, "if that's going to be the plan, I'll stop using
Python" etc.

That said, I think that unless it is a new feature (like with statements)
transitions to Python 3.0 shouldn't be enforced in the 2.x series. With 3.0,
everyone expects a clear cut and a compatibility breach.

Reinhold

-- 
Mail address is perfectly valid!


From raymond.hettinger at verizon.net  Thu Aug 25 19:28:15 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 25 Aug 2005 13:28:15 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc205082510014b39a72@mail.gmail.com>
Message-ID: <000a01c5a99a$61080360$b729cb97@oemcomputer>


> > Also, as we learned with apply(), even if
> > ignored, the deprecation machinery has a tremendous runtime cost.  None
> > of this will make upgrading to Py2.5 an attractive option.
> 
> Not in this case; bare except: can be flagged by the parser so the
> warning happens only once per compilation.

That's good news.  It mitigates runtime cost completely.


> > There is a reason that over 120 bare except clauses remain in the
> > standard library despite a number of attempts to get rid of them.
> 
> I betcha almost all of them can safely be replaced with "except
> Exception".

Because the tree is not being re-arranged until 3.0, those cases should
also introduce a preceding:

   except (KeyboardInterrupt, SystemExit):
       raise

Anywhere that doesn't apply will need:

   except BaseException:
      . . .

and also some corresponding backwards compatibility code to work with
older pythons.

If any are expected to work with user or third-party modules, then they
cannot safely ignore string exceptions and exceptions not derived from
Exception.  

Each of those changes needs to be accompanied by test cases so that all
code paths get exercised.

After the change, we should run Zope, Twisted, Gadfly, etc to make sure
no major application got broken.  Long running apps should verify that
their recover and restart routines haven't been compromised.  This is
doubly true if the invariant for a bare except was being relied upon as
a security measure (this may or may not be a real issue).


> But how about the following compromise: make it a silent deprecation
> in 2.5, and a full deprecation in 2.6.

I'd love to compromise but it's your language.  If you're going to
deprecate, just do it.  Pulling the band-aid off slowly doesn't lessen
the total pain.

My preference is of course, to leave 2.x alone and make this part of the
attraction to 3.0.  Remember, none of the code changes buys us anything
in 2.x.  It is an exercise without payoff.

My even stronger preference is to leave bare excepts in for Py3.0.  That
buys us a happy world where old code continues to work and new code
can be written that functions as intended on all pythons new and old.

I'm no fan of bare exceptions, but I'm not inclined to shoot myself in
the foot to be rid of them.  I wish Fredrik would chime in.  He would
have something pithy, angry, and incisive to say about this.


Raymond


From Scott.Daniels at Acm.Org  Thu Aug 25 19:30:22 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Thu, 25 Aug 2005 10:30:22 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <000d01c5a958$566e1bc0$b729cb97@oemcomputer>
References: <bbaeab1005082418433c41a5b4@mail.gmail.com>
	<000d01c5a958$566e1bc0$b729cb97@oemcomputer>
Message-ID: <dekv7d$ckk$1@sea.gmane.org>

Raymond Hettinger wrote:
>... I propose that the transition plan be as simple as introducing
> BaseException.  This allows people to write code that will work on both
> 2.x and 3.0.  It doesn't break anything.  
> 
> The guidance for cross-version (2.5 to 3.0) code would be:
> 
> * To catch all but terminating exceptions, write:
> 
>     except (KeyError, SystemExit):
>         raise
>     except Exception:               
>         ...
How about:
       except BaseException, error:
           if not isinstance(error, Exception):
               raise
           ...

This would accommodate other invented exceptions  such as
"FoundConvergance(BaseException)", which is my pseudo-example
for an exiting exception that is not properly a subclass of
either KeyError or SystemExit.  The idea is a relaxation stops
when it doesn't move and may start generating something silly
like divide-by-zero.  Not the end of an App, but the end of a Phase.
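
Spelled out as a runnable sketch (using the "as" form of the except
clause, which assumes Python 2.6 or later; run_phase is a hypothetical
helper, and FoundConvergence is the pseudo-example from above):

```python
class FoundConvergence(BaseException):
    """Pseudo-example: ends a phase, not the application."""


def run_phase(step, report_error):
    """Run step(); report ordinary errors, let everything else through."""
    try:
        return step()
    except BaseException as error:
        if not isinstance(error, Exception):
            # SystemExit, KeyboardInterrupt, FoundConvergence and any
            # other non-Exception signal propagate untouched.
            raise
        return report_error(error)
```

The point of the isinstance test is that new exiting exceptions need no
entry in a hand-maintained tuple of things to re-raise.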

--Scott David Daniels
Scott.Daniels at Acm.Org


From gvanrossum at gmail.com  Thu Aug 25 19:43:45 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 25 Aug 2005 10:43:45 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <000a01c5a99a$61080360$b729cb97@oemcomputer>
References: <ca471dc205082510014b39a72@mail.gmail.com>
	<000a01c5a99a$61080360$b729cb97@oemcomputer>
Message-ID: <ca471dc205082510433918986f@mail.gmail.com>

On 8/25/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:

> I wish Fredrik would chime in.  He would
> have something pithy, angry, and incisive to say about this.

Raymond, I'm sick of the abuse. Consider the PEP rejected.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mcherm at mcherm.com  Thu Aug 25 19:55:30 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu, 25 Aug 2005 10:55:30 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
Message-ID: <20050825105530.ev2xwy7r754wogo0@login.werra.lunarpages.com>

Guido:
> But how about the following compromise: make it a silent deprecation
> in 2.5, and a full deprecation in 2.6.

Reinhold Birkenfeld:
> That said, I think that unless it is a new feature (like with statements)
> transitions to Python 3.0 shouldn't be enforced in the 2.x series. With 3.0,
> everyone expects a clear cut and a compatibility breach.

Raymond:
> I'd love to compromise but it's your language.  If you're going to
> deprecate, just do it.  Pulling the band-aid off slowly doesn't lessen
> the total pain.

There are actually THREE possible levels of deprecation available. In order
of severity, they are:

  1. Modifying the documentation to advise people to avoid this feature.
     No one gets alerted.

  2. Using a PendingDeprecationWarning so people who explicitly request
     it can have the compiler alert them when they use it.

  3. Using a DeprecationWarning so people using it are alerted unless they
     explicitly request NOT to be alerted.

I think 3 is unwarranted in this case. For reasons I explained in a previous
posting, I would be in favor of 2 if we can *also* have a
PendingDeprecationWarning for use of string exceptions and arbitrary-object
exceptions (those not derived from BaseException). I am in favor of 1 in
any case. Of course, that's just one person's opinion...
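
The difference between levels 2 and 3 comes down to which warning
category is issued and how the warning filters treat it.  Below is a
minimal sketch using the standard warnings module; old_feature and
count_warnings are hypothetical names, and the filters are set explicitly
because the default treatment of these categories has varied between
Python versions.

```python
import warnings


def old_feature():
    """Hypothetical function that announces a pending deprecation."""
    warnings.warn("old_feature may be removed in a future release",
                  PendingDeprecationWarning, stacklevel=2)
    return 42


def count_warnings(action):
    """Call old_feature() under the given filter action and count what is reported."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, PendingDeprecationWarning)
        old_feature()
    return len(caught)


silent = count_warnings("ignore")   # nothing is reported to the user
loud = count_warnings("always")     # the user explicitly asked to be alerted
```

Swapping PendingDeprecationWarning for DeprecationWarning changes only
which category the filters have to match; the call itself still works
either way.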


Raymond also raised this excellent point:
> There is a reason that over 120 bare except clauses remain in the
> standard library despite a number of attempts to get rid of them. [...]
> If the proponents don't have time to fix the standard library, how can
> they in good conscience mandate change for the rest of the world.

That seems like a fair criticism to me. As we've already noted, it is
impossible to replace ALL uses of bare "except:" in 2.5 (particularly
the use in code.py that Guido referred to). But we ought to make an
extra effort to remove unnecessary uses of bare "except:" from the
standard library if we intend to deprecate it.

-- Michael Chermside


From steve at holdenweb.com  Thu Aug 25 20:30:53 2005
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 25 Aug 2005 14:30:53 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc205082510433918986f@mail.gmail.com>
References: <ca471dc205082510014b39a72@mail.gmail.com>	<000a01c5a99a$61080360$b729cb97@oemcomputer>
	<ca471dc205082510433918986f@mail.gmail.com>
Message-ID: <430E0E5D.8010503@holdenweb.com>

Guido van Rossum wrote:
> On 8/25/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> 
> 
>>I wish Fredrik would chime in.  He would
>>have something pithy, angry, and incisive to say about this.
> 
> 
> Raymond, I'm sick of the abuse. Consider the PEP rejected.
> 
Perhaps you should go for the £10 argument next door?

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/


From rrr at ronadam.com  Thu Aug 25 20:33:35 2005
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 25 Aug 2005 14:33:35 -0400
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <000a01c5a98d$e1c20940$b729cb97@oemcomputer>
References: <000a01c5a98d$e1c20940$b729cb97@oemcomputer>
Message-ID: <430E0EFF.1080707@ronadam.com>

Raymond Hettinger wrote:
>>Deprecation means your code will still work. I hope every book that
>>documents "except:" also adds "but don't use this except under very
>>special circumstances".
>>
>>I think you're overreacting (again), Raymond. 3.0 will be much more
>>successful if we can introduce many of its features into 2.x. Many of
>>those features are in fact improvements of the language even if they
>>break old code. We're trying to balance between breaking old code and
>>introducing new features; deprecation is the accepted way to do this.

> Fredrik, please speak up.  Someone should represent the users here.  I've
> reached my limit on how much time I can devote to thinking out the
> implications of these proposals.  Someone else needs to "overreact".

How about a middle-of-the-road (or thereabouts) opinion from an average
user?  Just my 2 cents anyway.

I get the impression that just how much existing code will work or not
work in 3.0 is still fairly up in the air.  Python 3.0 is still quite a
ways off from what I understand.

So to me, deprecating anything at this time that's not going to be
removed *before* Python 3.0 is possibly jumping the gun a bit. (IMHO) It
definitely makes sense to deprecate anything that will be removed prior
to Python 3.0, and also to document anything that will be changed in
3.0 (but not deprecate it yet).

If/when it is decided (maybe it already has been) that a smooth
transition can be made between 2.x and 3.0 with a high degree of
backwards compatibility, then deprecating 2.x features that will be
removed from 3.0 makes sense at some point, but maybe not in 2.5.

If it turns out that the changes in 3.0 amount to a "new but
non-backwards-compatible version of Python" with a lot of really great
new features, then deprecating items in 2.x that will not be removed
from 2.x seems like it gives a false sense of hope.  It might be better
to just document the differences (but not deprecate them) and make a
clean break.

Or to put it another way: having a lot of deprecated items in the
final 2.x version may send the message that 2.x is flawed, yet it may
not be possible for many programs to move to 3.0 easily for some time if
there are a lot of large changes.

My opinion is that I would rather see the final version of 2.x not have
any deprecated items, and efforts be made to make it the best and most
dependable 2.x version that will be around for a while.  And then have
Python 3.0 be a new beginning and an open book without the chains of
backwards compatibility holding it back.  That doesn't mean it won't be;
I think it's just too soon to tell to what degree.

At this time the efforts towards 3.0 seem to be towards those
improvements that may be included in some future version of 2.x which is
great.

Is it possible the big changes have yet to be considered for Python 3.0?


Cheers,
Ron


From ianb at colorstudy.com  Thu Aug 25 21:10:43 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 25 Aug 2005 14:10:43 -0500
Subject: [Python-Dev] PEP 342: simple example, closure alternative
Message-ID: <430E17B3.3080900@colorstudy.com>

I was trying to translate a pattern that uses closures in a language 
like Scheme (where closed values can be written to) to generators using 
PEP 342, but I'm not clear exactly how it works; the examples in the PEP 
have different motivations.  Since I can't actually run these examples, 
perhaps someone could confirm or debug these:

A closure based accumulator (using Scheme):

(define (accum n)
  (lambda (incr)
   (set! n (+ n incr))
   n))
(define s (accum 0))
(s 1) ; -> 1 == 0+1
(s 5) ; -> 6 == 1+5

So I thought the generator version might look like:

def accum(n):
     while 1:
         incr = (yield n) or 0
         n += incr

 >>> s = accum(0)
 >>> s.next()
 >>> s.send(1)
0
 >>> s.send(5)
1
 >>> s.send(1)
6

Is the order of the output correct?  Is there a better way to write 
accum, that makes it feel more like the closure-based version?

Is this for loop correct?

 >>> s = accum(0)
 >>> for i in s:
...     if i >= 10: break
...     print i,
...     assert s.send(2) == i
0 2 4 6 8

Hmm... maybe this would make it feel more closure-like:

def closure_like(func):
     def replacement(*args, **kw):
         return ClosureLike(func(*args, **kw))
     return replacement

class ClosureLike(object):
     def __init__(self, iterator):
         self.iterator = iterator
         # I think this initial .next() is required, but I'm
         # very confused on this point:
         assert self.iterator.next() is None
     def __call__(self, input):
         assert self.iterator.send(input) is None
         return self.iterator.next()

@closure_like
def accum(n):
     while 1:
         # yields should always be in pairs, the first yield is input
         # and the second yield is output.
         incr = (yield) # this line is equivalent to (lambda (incr)...
         n += incr      # equivalent to (set! ...)
         yield n        # equivalent to n; this yield always returns None

 >>> s = accum(0)
 >>> s(1)
1
 >>> s(5)
6

Everything before the first (yield) is equivalent to the closed values 
between "(define (accum n)" and "(lambda" (for this example there's 
nothing there; I guess a more interesting example would have closed 
variables that were written to that were not function parameters).
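
For reference, here is how the paired-yield wrapper behaves when run on a
Python with PEP 342 generators.  This is a sketch in Python 3 spelling
(next(gen) rather than gen.next()), and it assumes one change to the
__call__ above: send() itself returns the value of the following
"yield n", so the result is taken from send() and the extra next() only
advances to the next bare yield.

```python
class ClosureLike:
    def __init__(self, generator):
        self.generator = generator
        # Prime the generator: run to the first bare yield, which produces None.
        assert next(self.generator) is None

    def __call__(self, value):
        # send() resumes at the bare yield and returns what `yield n` produces.
        result = self.generator.send(value)
        # Step past `yield n` to the next bare yield, ready for the next call.
        assert next(self.generator) is None
        return result


def closure_like(func):
    def replacement(*args, **kw):
        return ClosureLike(func(*args, **kw))
    return replacement


@closure_like
def accum(n):
    while True:
        incr = yield   # input half of the pair
        n += incr
        yield n        # output half of the pair


s = accum(0)
outputs = [s(1), s(5)]
```

With the yields paired this way, each call consumes exactly one input
yield and one output yield.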

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From pje at telecommunity.com  Thu Aug 25 21:23:10 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 25 Aug 2005 15:23:10 -0400
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <430E17B3.3080900@colorstudy.com>
Message-ID: <5.1.1.6.0.20050825151456.028cf830@mail.telecommunity.com>

At 02:10 PM 8/25/2005 -0500, Ian Bicking wrote:
>I was trying to translate a pattern that uses closures in a language
>like Scheme (where closed values can be written to) to generators using
>PEP 342, but I'm not clear exactly how it works; the examples in the PEP
>have different motivations.  Since I can't actually run these examples,
>perhaps someone could confirm or debug these:
>
>A closure based accumulator (using Scheme):
>
>(define (accum n)
>   (lambda (incr)
>    (set! n (+ n incr))
>    n))
>(define s (accum 0))
>(s 1) ; -> 1 == 0+1
>(s 5) ; -> 6 == 1+5
>
>So I thought the generator version might look like:
>
>def accum(n):
>      while 1:
>          incr = (yield n) or 0
>          n += incr
>
>  >>> s = accum(0)
>  >>> s.next()

The initial next() will yield 0, not None.

>  >>> s.send(1)
>0

1

>  >>> s.send(5)
>1

6

>  >>> s.send(1)
>6

7


>Is the order of the output correct?  Is there a better way to write
>accum, that makes it feel more like the closure-based version?
>
>Is this for loop correct?
>
>  >>> s = accum(0)
>  >>> for i in s:
>...     if i >= 10: break
>...     print i,
>...     assert s.send(2) == i
>0 2 4 6 8

The assert will fail on the first pass.  s.send(2) will == i+2, e.g.:

 >>> s = accum(0)
 >>> for i in s:
...     if i>=10: break
...     print i,
...     assert s.send(2) == i+2
...
0 2 4 6 8
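
The corrected values can be checked with a short script.  This is a
sketch in Python 3 spelling (next(s) instead of s.next(), and a list
instead of "print i,"); the logic of accum is unchanged.

```python
def accum(n):
    while True:
        # A plain next() makes the yield expression evaluate to None, hence "or 0".
        incr = (yield n) or 0
        n += incr


s = accum(0)
first = next(s)                            # runs to the first yield
results = [s.send(1), s.send(5), s.send(1)]

seen = []
s2 = accum(0)
for i in s2:
    if i >= 10:
        break
    seen.append(i)
    # send() returns the updated total, which is two more than i.
    assert s2.send(2) == i + 2
```

Each send() both delivers the increment and returns the next yielded
total, which is why the original transcript appeared to be off by one
step.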


From ianb at colorstudy.com  Thu Aug 25 22:12:35 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 25 Aug 2005 15:12:35 -0500
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <5.1.1.6.0.20050825151456.028cf830@mail.telecommunity.com>
References: <5.1.1.6.0.20050825151456.028cf830@mail.telecommunity.com>
Message-ID: <430E2633.1020902@colorstudy.com>

Phillip J. Eby wrote:
> At 02:10 PM 8/25/2005 -0500, Ian Bicking wrote:
> 
>>I was trying to translate a pattern that uses closures in a language
>>like Scheme (where closed values can be written to) to generators using
>>PEP 342, but I'm not clear exactly how it works; the examples in the PEP
>>have different motivations.  Since I can't actually run these examples,
>>perhaps someone could confirm or debug these:
>>
>>A closure based accumulator (using Scheme):
>>
>>(define (accum n)
>>  (lambda (incr)
>>   (set! n (+ n incr))
>>   n))
>>(define s (accum 0))
>>(s 1) ; -> 1 == 0+1
>>(s 5) ; -> 6 == 1+5
>>
>>So I thought the generator version might look like:
>>
>>def accum(n):
>>     while 1:
>>         incr = (yield n) or 0
>>         n += incr

Bah, I don't know why this had me so confused.  Well, I kind of know 
why.  So maybe this example would be better written:

def accum(n):
     incr = yield       # wait to get the first incr to be sent in
     while 1:
         n += incr
         incr = yield n # return the new value, wait for next incr

This way it is more explicit all around -- the first call to .next() is 
just setup, kind of like __init__ in an object, except it has to be 
explicitly invoked.
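
The explicit setup step could be hidden behind a small decorator.  A
minimal sketch in Python 3 spelling follows; "primed" is a hypothetical
helper name and is not something PEP 342 itself provides.

```python
import functools


def primed(generator_func):
    """Hypothetical helper: create the generator and advance it to its first yield."""
    @functools.wraps(generator_func)
    def start(*args, **kwargs):
        gen = generator_func(*args, **kwargs)
        next(gen)  # the setup step; the bare yield produces None here
        return gen
    return start


@primed
def accum(n):
    incr = yield            # wait for the first increment to be sent in
    while True:
        n += incr
        incr = yield n      # hand back the new total, wait for the next increment


s = accum(0)
totals = [s.send(1), s.send(5)]
```

The caller then only ever uses send(), which reads a little closer to
calling the Scheme closure.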
-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org

From skip at pobox.com  Thu Aug 25 22:23:51 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 25 Aug 2005 15:23:51 -0500
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc2050824221313a2c9a1@mail.gmail.com>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
	<D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>
	<ca471dc2050824101579f8304@mail.gmail.com>
	<bbaeab100508241839436eea14@mail.gmail.com>
	<5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net>
	<ca471dc2050824221313a2c9a1@mail.gmail.com>
Message-ID: <17166.10455.121002.213724@montanaro.dyndns.org>


    Guido> It's never too early to start deprecating a feature we know will
    Guido> disappear in 3.0.

Though if it's a widely used feature the troops will be highly annoyed by
all the deprecation warnings.  (Or does deprecation not coincide with
emitting warnings?)

Skip


From skip at pobox.com  Thu Aug 25 22:45:00 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 25 Aug 2005 15:45:00 -0500
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary
 for 2005-08-01 through 2005-08-15 [draft])
In-Reply-To: <430D90FF.6060206@egenix.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04E3@its-xchg4.massey.ac.nz>
	<430D90FF.6060206@egenix.com>
Message-ID: <17166.11724.133034.374929@montanaro.dyndns.org>


    MAL> I must have missed this one:

That's because it was brief and to the point, so the discussion lasted for
maybe three messages.  Also, someone told us you were on holiday so we
thought we could squeak it through without you noticing.  Darn those
Aussies.  Late on the pydev summary again! <wink>

    >> ----------------------------
    >> Style for raising exceptions
    >> ----------------------------
    >> 
    >> Guido explained that these days exceptions should always be raised as::
    >> 
    >> raise SomeException("some argument")
    >> 
    >> instead of::
    >> 
    >> raise SomeException, "some argument"
    >> 
    >> The second will go away in Python 3.0, and is only present now for
    >> backwards compatibility.  (It was necessary when strings could be
    >> exceptions, in order to pass both the exception "type" and message.)
    >> PEPs 8_ and 3000_ were accordingly updated.

I do have a followup question on the style thing.  (I'll leave others to
answer MAL's question about optimization.)  If I want to raise an exception
without an argument, which of the following is the proper form?

    raise ValueError
    raise ValueError()

Skip

From gvanrossum at gmail.com  Fri Aug 26 01:59:31 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 25 Aug 2005 16:59:31 -0700
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary
	for 2005-08-01 through 2005-08-15 [draft])
In-Reply-To: <17166.11724.133034.374929@montanaro.dyndns.org>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04E3@its-xchg4.massey.ac.nz>
	<430D90FF.6060206@egenix.com>
	<17166.11724.133034.374929@montanaro.dyndns.org>
Message-ID: <ca471dc20508251659182995ec@mail.gmail.com>

On 8/25/05, skip at pobox.com <skip at pobox.com> wrote:
> I do have a followup question on the style thing.  (I'll leave others to
> answer MAL's question about optimization.)  If I want to raise an exception
> without an argument, which of the following is the proper form?
> 
>     raise ValueError
>     raise ValueError()

The latter.
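
For what it's worth, both spellings end up raising an instance; with the
bare class form the interpreter instantiates the class with no arguments.
A small check, written in Python 3 syntax, with catch, raise_class and
raise_instance as hypothetical helper names:

```python
def catch(raiser):
    """Call raiser() and return the ValueError instance that reaches the handler."""
    try:
        raiser()
    except ValueError as error:
        return error


def raise_class():
    raise ValueError      # class form: instantiated implicitly


def raise_instance():
    raise ValueError()    # instance form: the recommended style


from_class = catch(raise_class)
from_instance = catch(raise_instance)
```

So the choice between the two is one of style and explicitness rather
than of what the handler receives.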

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Fri Aug 26 02:04:44 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 25 Aug 2005 17:04:44 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <17166.10455.121002.213724@montanaro.dyndns.org>
References: <20050824060856.hex8yt68qc0c0g00@login.werra.lunarpages.com>
	<ca471dc205082408103f0adc81@mail.gmail.com>
	<D5E5D912-2D2E-4EA3-B05A-9A89AB21979F@fuhm.net>
	<ca471dc2050824101579f8304@mail.gmail.com>
	<bbaeab100508241839436eea14@mail.gmail.com>
	<5BB88A76-FB4E-41F7-B82D-4B7C5B6D28DD@fuhm.net>
	<ca471dc2050824221313a2c9a1@mail.gmail.com>
	<17166.10455.121002.213724@montanaro.dyndns.org>
Message-ID: <ca471dc205082517047b6cfc69@mail.gmail.com>

On 8/25/05, skip at pobox.com <skip at pobox.com> wrote:
> 
>     Guido> It's never too early to start deprecating a feature we know will
>     Guido> disappear in 3.0.
> 
> Though if it's a widely used feature the troops will be highly annoyed by
> all the deprecation warnings. (Or does deprecation not coincide with
> emitting warnings?)

See Michael Chermside's post.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ark at acm.org  Fri Aug 26 05:18:43 2005
From: ark at acm.org (Andrew Koenig)
Date: Thu, 25 Aug 2005 23:18:43 -0400
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <430E17B3.3080900@colorstudy.com>
Message-ID: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop>

> A closure based accumulator (using Scheme):
> 
> (define (accum n)
>   (lambda (incr)
>    (set! n (+ n incr))
>    n))
> (define s (accum 0))
> (s 1) ; -> 1 == 0+1
> (s 5) ; -> 6 == 1+5
> 
> So I thought the generator version might look like:
> 
> def accum(n):
>      while 1:
>          incr = (yield n) or 0
>          n += incr

Maybe I'm missing something but this example seems needlessly tricky to me.
How about doing it this way?

	def accum(n):
		acc = [n]
		def f(incr):
			acc[0] += incr
			return acc[0]
		return f

Here, the [0] turns "read-only" access into write access to a list element.
The list itself isn't written; only its element is.
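
As a point of comparison, later Python versions (3.0 onwards, via the
nonlocal statement) allow the enclosed name to be rebound directly, so
the one-element list is no longer needed.  A sketch, not available in the
2.x series discussed here:

```python
def accum(n):
    def add(incr):
        nonlocal n      # rebind the enclosing function's n instead of shadowing it
        n += incr
        return n
    return add


s = accum(0)
results = [s(1), s(5)]
```

This is close to a line-for-line translation of the Scheme version, with
nonlocal playing the role of set!.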


From ianb at colorstudy.com  Fri Aug 26 06:59:42 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 25 Aug 2005 23:59:42 -0500
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop>
References: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop>
Message-ID: <430EA1BE.9090804@colorstudy.com>

Andrew Koenig wrote:
>>A closure based accumulator (using Scheme):
>>
>>(define (accum n)
>>  (lambda (incr)
>>   (set! n (+ n incr))
>>   n))
>>(define s (accum 0))
>>(s 1) ; -> 1 == 0+1
>>(s 5) ; -> 6 == 1+5
>>
>>So I thought the generator version might look like:
>>
>>def accum(n):
>>     while 1:
>>         incr = (yield n) or 0
>>         n += incr
> 
> 
> Maybe I'm missing something but this example seems needlessly tricky to me.
> How about doing it this way?
> 
> 	def accum(n):
> 		acc = [n]
> 		def f(incr):
> 			acc[0] += incr
> 			return acc[0]
> 		return f
> 
> Here, the [0] turns "read-only" access into write access to a list element.
> The list itself isn't written; only its element is.

I was just exploring how it could be done with coroutines.  But also
because using lists as pointers isn't that elegant, and isn't something
I'd encourage people to do coming from other languages (where closures
are used more heavily).

More generally, I've been doing some language comparisons, and I don't 
like literal but non-idiomatic translations of programming patterns.  So 
I'm considering better ways to translate some of the same use cases.

-- 
Ian Bicking  /  ianb at colorstudy.com  / http://blog.ianbicking.org

From bcannon at gmail.com  Fri Aug 26 08:01:33 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 25 Aug 2005 23:01:33 -0700
Subject: [Python-Dev] Bare except clauses in PEP 348
In-Reply-To: <ca471dc205082510433918986f@mail.gmail.com>
References: <ca471dc205082510014b39a72@mail.gmail.com>
	<000a01c5a99a$61080360$b729cb97@oemcomputer>
	<ca471dc205082510433918986f@mail.gmail.com>
Message-ID: <bbaeab1005082523011c97a18@mail.gmail.com>

The PEP has been rejected.

-Brett

On 8/25/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/25/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> 
> > I wish Fredrik would chime in.  He would
> > have something pithy, angry, and incisive to say about this.
> 
> Raymond, I'm sick of the abuse. Consider the PEP rejected.
> 
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org
>

From mal at egenix.com  Fri Aug 26 10:12:12 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 26 Aug 2005 10:12:12 +0200
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary
 for	2005-08-01 through 2005-08-15 [draft])
In-Reply-To: <ca471dc205082508174cb1d240@mail.gmail.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04E3@its-xchg4.massey.ac.nz>	
	<430D90FF.6060206@egenix.com>
	<ca471dc205082508174cb1d240@mail.gmail.com>
Message-ID: <430ECEDC.7040206@egenix.com>

Guido van Rossum wrote:
> On 8/25/05, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>>I must have missed this one:
>>
>>
>>>----------------------------
>>>Style for raising exceptions
>>>----------------------------
>>>
>>>Guido explained that these days exceptions should always be raised as::
>>>
>>>    raise SomeException("some argument")
>>>
>>>instead of::
>>>
>>>    raise SomeException, "some argument"
>>>
>>>The second will go away in Python 3.0, and is only present now for backwards
>>>compatibility.  (It was  necessary when strings could be exceptions, in
>>>order to pass both the exception "type" and message.)   PEPs 8_ and 3000_
>>>were accordingly updated.
>>
>>AFAIR, the second form was also meant to be able to defer
>>the instantiation of the exception class until really
>>needed in order to reduce the overhead related to raising
>>exceptions in Python.
>>
>>However, that optimization never made it into the implementation,
>>I guess.
> 
> 
> Something equivalent is used internally in the C code, but that
> doesn't mean we'll need it in Python code. The optimization only works
> if the exception is also *caught* in C code, BTW (it is instantiated
> as soon as it is handled by a Python except clause).

Ah, I knew it was in there somewhere (just couldn't find it yesterday
when I was looking for the optimization :-).

> Originally, the second syntax was the only available syntax, because
> all we had were string exceptions. Now that string exceptions are dead
> (although not yet buried :) I really don't see why we need to keep
> both versions of the syntax; Python 3.0 will only have one version.

Actually, we do only have one version: the first syntax is just
a special case of the second (with the value argument set
to None).

I don't see a need for two or more syntaxes either, but most code
nowadays uses the second variant (I don't know of any code that
uses the traceback argument), which puts up a high barrier
for changes.

This is from a comment in ceval.c:

	/* We support the following forms of raise:
	   raise <class>, <classinstance>
	   raise <class>, <argument tuple>
	   raise <class>, None
	   raise <class>, <argument>
	   raise <classinstance>, None
	   raise <string>, <object>
	   raise <string>, None

	   An omitted second argument is the same as None.

	   In addition, raise <tuple>, <anything> is the same as
	   raising the tuple's first item (and it better have one!);
	   this rule is applied recursively.

	   Finally, an optional third argument can be supplied, which
	   gives the traceback to be substituted (useful when
	   re-raising an exception after examining it).  */

That's quite a list of combinations that will all break
in Python 3.0 if we only allow "raise <classinstance>".

I guess the reason for most code using the variant "raise
<class>, <argument tuple>" is that it simply looks a lot
like the corresponding "except <class>, errorobj" clause.

> (We're still debating what to do with the traceback argument; wanna
> revive PEP 344?)
> 
> If you need to raise exceptions fast, pre-instantiate an instance.

Ideally, I'd like Python to take care of such optimizations
rather than having to explicitly code for them:

If I write "raise ValueError, 'bad format'" and then
catch the error with just "except ValueError", there would
be no need for Python to actually instantiate the
exception object.

OTOH, lazy instantiation may have unwanted side-effects
(just like any lazy evaluation), e.g. the instantiation
could result in another exception to get raised.
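
To make the side-effect concern concrete, here is a small sketch in
Python 3 syntax.  StrictError and outcome are hypothetical names; the
point is only that constructing an exception object runs arbitrary code,
which can itself fail.

```python
class StrictError(Exception):
    """Hypothetical exception whose constructor validates its argument."""

    def __init__(self, message):
        if not isinstance(message, str):
            raise TypeError("message must be a string")
        super().__init__(message)


def outcome(argument):
    """Report which exception type actually reaches the handler."""
    try:
        raise StrictError(argument)
    except StrictError:
        return "StrictError"
    except TypeError:
        return "TypeError"
```

With eager instantiation, outcome(42) reports the TypeError at the raise
site.  If instantiation were deferred until a handler needed the object,
the same failure would presumably surface later, or not at all when the
handler never looks at the instance.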

Can't have 'em all, I guess.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 26 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From abkhd at hotmail.com  Fri Aug 26 14:35:22 2005
From: abkhd at hotmail.com (A.B., Khalid)
Date: Fri, 26 Aug 2005 12:35:22 +0000
Subject: [Python-Dev] operator.c for release24-maint and test_bz2 on Python
	2.4.1
Message-ID: <BAY23-F361A64131F96FC5D84BB37ABAA0@phx.gbl>

Hello there,


The release24-maint check-ins for today contained this typo:

===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/operator.c,v
retrieving revision 2.29
retrieving revision 2.29.4.1
diff -u -d -r2.29 -r2.29.4.1
--- operator.c 4 Dec 2003 22:17:49 -0000 2.29
+++ operator.c 26 Aug 2005 06:43:16 -0000 2.29.4.1
@@ -267,6 +267,9 @@
itemgetterobject *ig;
PyObject *item;

+ if (!_PyArg_NoKeywords("itemgetter()", kdws)) <----- kdws should be kwds
+ return NULL;
+
if (!PyArg_UnpackTuple(args, "itemgetter", 1, 1, &item))
return NULL;


Also I wish to report that testBug1191043 of test_bz2 still fails in some 
cases on Python 2.4.1 on both WinXP Pro and Win98. Following is the output 
of the said test.


#----------------------------Python 2.5a0------------------------------#
# In interpreted session mode
#-----------------------------------------------------------------------------#
$ python -i
Python 2.5a0 (#65, Aug 26 2005, 14:57:28)
[GCC 3.4.4 (mingw special)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>from test import test_bz2 as t
>>>t.test_main()
testBug1191043 (test.test_bz2.BZ2FileTest) ... ok
testIterator (test.test_bz2.BZ2FileTest) ... ok
testModeU (test.test_bz2.BZ2FileTest) ... ok
testOpenDel (test.test_bz2.BZ2FileTest) ... ok
testOpenNonexistent (test.test_bz2.BZ2FileTest) ... ok
testRead (test.test_bz2.BZ2FileTest) ... ok
testRead100 (test.test_bz2.BZ2FileTest) ... ok
testReadChunk10 (test.test_bz2.BZ2FileTest) ... ok
testReadLine (test.test_bz2.BZ2FileTest) ... ok
testReadLines (test.test_bz2.BZ2FileTest) ... ok
testSeekBackwards (test.test_bz2.BZ2FileTest) ... ok
testSeekBackwardsFromEnd (test.test_bz2.BZ2FileTest) ... ok
testSeekForward (test.test_bz2.BZ2FileTest) ... ok
testSeekPostEnd (test.test_bz2.BZ2FileTest) ... ok
testSeekPostEndTwice (test.test_bz2.BZ2FileTest) ... ok
testSeekPreStart (test.test_bz2.BZ2FileTest) ... ok
testUniversalNewlinesCRLF (test.test_bz2.BZ2FileTest) ... ok
testUniversalNewlinesLF (test.test_bz2.BZ2FileTest) ... ok
testWrite (test.test_bz2.BZ2FileTest) ... ok
testWriteChunks10 (test.test_bz2.BZ2FileTest) ... ok
testWriteLines (test.test_bz2.BZ2FileTest) ... ok
testXReadLines (test.test_bz2.BZ2FileTest) ... ok
testCompress (test.test_bz2.BZ2CompressorTest) ... ok
testCompressChunks10 (test.test_bz2.BZ2CompressorTest) ... ok
testDecompress (test.test_bz2.BZ2DecompressorTest) ... ok
testDecompressChunks10 (test.test_bz2.BZ2DecompressorTest) ... ok
testDecompressUnusedData (test.test_bz2.BZ2DecompressorTest) ... ok
testEOFError (test.test_bz2.BZ2DecompressorTest) ... ok
test_Constructor (test.test_bz2.BZ2DecompressorTest) ... ok
testCompress (test.test_bz2.FuncTest) ... ok
testDecompress (test.test_bz2.FuncTest) ... ok
testDecompressEmpty (test.test_bz2.FuncTest) ... ok
testDecompressIncomplete (test.test_bz2.FuncTest) ... ok

----------------------------------------------------------------------
Ran 33 tests in 4.730s

OK


#----------------------------Python 2.4.1 from CVS ----------------#
# In interpreted session mode
#-----------------------------------------------------------------------------#
$ python -i
Python 2.4.1 (#65, Aug 26 2005, 14:38:48)
[GCC 3.4.4 (mingw special)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>from test import test_bz2 as t
>>>t.test_main()
testBug1191043 (test.test_bz2.BZ2FileTest) ... ok
testIterator (test.test_bz2.BZ2FileTest) ... ok
testModeU (test.test_bz2.BZ2FileTest) ... ok
testOpenDel (test.test_bz2.BZ2FileTest) ... ok
testOpenNonexistent (test.test_bz2.BZ2FileTest) ... ok
testRead (test.test_bz2.BZ2FileTest) ... ok
testRead100 (test.test_bz2.BZ2FileTest) ... ok
testReadChunk10 (test.test_bz2.BZ2FileTest) ... ok
testReadLine (test.test_bz2.BZ2FileTest) ... ok
testReadLines (test.test_bz2.BZ2FileTest) ... ok
testSeekBackwards (test.test_bz2.BZ2FileTest) ... ok
testSeekBackwardsFromEnd (test.test_bz2.BZ2FileTest) ... ok
testSeekForward (test.test_bz2.BZ2FileTest) ... ok
testSeekPostEnd (test.test_bz2.BZ2FileTest) ... ok
testSeekPostEndTwice (test.test_bz2.BZ2FileTest) ... ok
testSeekPreStart (test.test_bz2.BZ2FileTest) ... ok
testUniversalNewlinesCRLF (test.test_bz2.BZ2FileTest) ... ok
testUniversalNewlinesLF (test.test_bz2.BZ2FileTest) ... ok
testWrite (test.test_bz2.BZ2FileTest) ... ok
testWriteChunks10 (test.test_bz2.BZ2FileTest) ... ok
testWriteLines (test.test_bz2.BZ2FileTest) ... ok
testXReadLines (test.test_bz2.BZ2FileTest) ... ok
testCompress (test.test_bz2.BZ2CompressorTest) ... ok
testCompressChunks10 (test.test_bz2.BZ2CompressorTest) ... ok
testDecompress (test.test_bz2.BZ2DecompressorTest) ... ok
testDecompressChunks10 (test.test_bz2.BZ2DecompressorTest) ... ok
testDecompressUnusedData (test.test_bz2.BZ2DecompressorTest) ... ok
testEOFError (test.test_bz2.BZ2DecompressorTest) ... ok
test_Constructor (test.test_bz2.BZ2DecompressorTest) ... ok
testCompress (test.test_bz2.FuncTest) ... ok
testDecompress (test.test_bz2.FuncTest) ... ok
testDecompressEmpty (test.test_bz2.FuncTest) ... ok
testDecompressIncomplete (test.test_bz2.FuncTest) ... ok

----------------------------------------------------------------------
Ran 33 tests in 5.060s

OK


So here we have a passing test_bz2 test when invoked from inside a running 
Python.


#-------------------------- Python 2.4.1 from CVS -----------------#
# Not in interpreted session mode
#-----------------------------------------------------------------------------#
However, in Python 2.4.1 the following happens when the test is not 
invoked from an interpreted session:

$ python ../Lib/test/test_bz2.py
testBug1191043 (__main__.BZ2FileTest) ... ERROR
ERROR
testIterator (__main__.BZ2FileTest) ... ok
testModeU (__main__.BZ2FileTest) ... ok
testOpenDel (__main__.BZ2FileTest) ... ok
testOpenNonexistent (__main__.BZ2FileTest) ... ok
testRead (__main__.BZ2FileTest) ... ok
testRead100 (__main__.BZ2FileTest) ... ok
testReadChunk10 (__main__.BZ2FileTest) ... ok
testReadLine (__main__.BZ2FileTest) ... ok
testReadLines (__main__.BZ2FileTest) ... ok
testSeekBackwards (__main__.BZ2FileTest) ... ok
testSeekBackwardsFromEnd (__main__.BZ2FileTest) ... ok
testSeekForward (__main__.BZ2FileTest) ... ok
testSeekPostEnd (__main__.BZ2FileTest) ... ok
testSeekPostEndTwice (__main__.BZ2FileTest) ... ok
testSeekPreStart (__main__.BZ2FileTest) ... ok
testUniversalNewlinesCRLF (__main__.BZ2FileTest) ... ok
testUniversalNewlinesLF (__main__.BZ2FileTest) ... ok
testWrite (__main__.BZ2FileTest) ... ok
testWriteChunks10 (__main__.BZ2FileTest) ... ok
testWriteLines (__main__.BZ2FileTest) ... ok
testXReadLines (__main__.BZ2FileTest) ... ok
testCompress (__main__.BZ2CompressorTest) ... ok
testCompressChunks10 (__main__.BZ2CompressorTest) ... ok
testDecompress (__main__.BZ2DecompressorTest) ... ok
testDecompressChunks10 (__main__.BZ2DecompressorTest) ... ok
testDecompressUnusedData (__main__.BZ2DecompressorTest) ... ok
testEOFError (__main__.BZ2DecompressorTest) ... ok
test_Constructor (__main__.BZ2DecompressorTest) ... ok
testCompress (__main__.FuncTest) ... ok
testDecompress (__main__.FuncTest) ... ok
testDecompressEmpty (__main__.FuncTest) ... ok
testDecompressIncomplete (__main__.FuncTest) ... ok

======================================================================
ERROR: testBug1191043 (__main__.BZ2FileTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "../Lib/test/test_bz2.py", line 255, in testBug1191043
   lines = bz2f.readlines()
RuntimeError: wrong sequence of bz2 library commands used

======================================================================
ERROR: testBug1191043 (__main__.BZ2FileTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "../Lib/test/test_bz2.py", line 47, in tearDown
   os.unlink(self.filename)
OSError: [Errno 13] Permission denied: '@test'

----------------------------------------------------------------------
Ran 33 tests in 6.210s

FAILED (errors=2)
Traceback (most recent call last):
File "../Lib/test/test_bz2.py", line 357, in ?
   test_main()
File "../Lib/test/test_bz2.py", line 353, in test_main
   FuncTest
File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 290, in 
run_unittest
   run_suite(suite, testclass)
File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 274, in 
run_suite
   raise TestFailed(msg)
test.test_support.TestFailed: errors occurred; run in verbose mode for 
details


#-------------------------- Python 2.5a0 from CVS -----------------#
# Not in interpreted session mode
#-----------------------------------------------------------------------------#
That problem disappears in Python 2.5a0:


$ python ../Lib/test/test_bz2.py
testBug1191043 (__main__.BZ2FileTest) ... ok
testIterator (__main__.BZ2FileTest) ... ok
testModeU (__main__.BZ2FileTest) ... ok
testOpenDel (__main__.BZ2FileTest) ... ok
testOpenNonexistent (__main__.BZ2FileTest) ... ok
testRead (__main__.BZ2FileTest) ... ok
testRead100 (__main__.BZ2FileTest) ... ok
testReadChunk10 (__main__.BZ2FileTest) ... ok
testReadLine (__main__.BZ2FileTest) ... ok
testReadLines (__main__.BZ2FileTest) ... ok
testSeekBackwards (__main__.BZ2FileTest) ... ok
testSeekBackwardsFromEnd (__main__.BZ2FileTest) ... ok
testSeekForward (__main__.BZ2FileTest) ... ok
testSeekPostEnd (__main__.BZ2FileTest) ... ok
testSeekPostEndTwice (__main__.BZ2FileTest) ... ok
testSeekPreStart (__main__.BZ2FileTest) ... ok
testUniversalNewlinesCRLF (__main__.BZ2FileTest) ... ok
testUniversalNewlinesLF (__main__.BZ2FileTest) ... ok
testWrite (__main__.BZ2FileTest) ... ok
testWriteChunks10 (__main__.BZ2FileTest) ... ok
testWriteLines (__main__.BZ2FileTest) ... ok
testXReadLines (__main__.BZ2FileTest) ... ok
testCompress (__main__.BZ2CompressorTest) ... ok
testCompressChunks10 (__main__.BZ2CompressorTest) ... ok
testDecompress (__main__.BZ2DecompressorTest) ... ok
testDecompressChunks10 (__main__.BZ2DecompressorTest) ... ok
testDecompressUnusedData (__main__.BZ2DecompressorTest) ... ok
testEOFError (__main__.BZ2DecompressorTest) ... ok
test_Constructor (__main__.BZ2DecompressorTest) ... ok
testCompress (__main__.FuncTest) ... ok
testDecompress (__main__.FuncTest) ... ok
testDecompressEmpty (__main__.FuncTest) ... ok
testDecompressIncomplete (__main__.FuncTest) ... ok

----------------------------------------------------------------------
Ran 33 tests in 5.880s

OK


Regards
Khalid



From reinhold-birkenfeld-nospam at wolke7.net  Fri Aug 26 15:08:57 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Fri, 26 Aug 2005 15:08:57 +0200
Subject: [Python-Dev] operator.c for release24-maint and test_bz2 on
	Python 2.4.1
In-Reply-To: <BAY23-F361A64131F96FC5D84BB37ABAA0@phx.gbl>
References: <BAY23-F361A64131F96FC5D84BB37ABAA0@phx.gbl>
Message-ID: <den49a$n7q$1@sea.gmane.org>

A.B., Khalid wrote:
> Hello there,
> 
> 
> The release24-maint check-ins for today contained this typo:
> 
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Modules/operator.c,v
> retrieving revision 2.29
> retrieving revision 2.29.4.1
> diff -u -d -r2.29 -r2.29.4.1
> --- operator.c 4 Dec 2003 22:17:49 -0000 2.29
> +++ operator.c 26 Aug 2005 06:43:16 -0000 2.29.4.1
> @@ -267,6 +267,9 @@
> itemgetterobject *ig;
> PyObject *item;
> 
> + if (!_PyArg_NoKeywords("itemgetter()", kdws)) <----- kdws should be kwds
> + return NULL;
> +
> if (!PyArg_UnpackTuple(args, "itemgetter", 1, 1, &item))
> return NULL;

Thank you, that is corrected now.

> However, in Python 2.4.1 the following happens when the test is not 
> invoked from an interpreted session:
> 
> $ python ../Lib/test/test_bz2.py
> testBug1191043 (__main__.BZ2FileTest) ... ERROR
> ERROR

[...]
> ======================================================================
> ERROR: testBug1191043 (__main__.BZ2FileTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "../Lib/test/test_bz2.py", line 255, in testBug1191043
>    lines = bz2f.readlines()
> RuntimeError: wrong sequence of bz2 library commands used
> 
> ======================================================================
> ERROR: testBug1191043 (__main__.BZ2FileTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "../Lib/test/test_bz2.py", line 47, in tearDown
>    os.unlink(self.filename)
> OSError: [Errno 13] Permission denied: '@test'
> 
> ----------------------------------------------------------------------
> Ran 33 tests in 6.210s
> 
> FAILED (errors=2)
> Traceback (most recent call last):
> File "../Lib/test/test_bz2.py", line 357, in ?
>    test_main()
> File "../Lib/test/test_bz2.py", line 353, in test_main
>    FuncTest
> File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 290, in 
> run_unittest
>    run_suite(suite, testclass)
> File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 274, in 
> run_suite
>    raise TestFailed(msg)
> test.test_support.TestFailed: errors occurred; run in verbose mode for 
> details

Are you sure that you are calling the newly-built python.exe? It is strange that
the test should pass in interactive mode when it doesn't in normal mode.

For a confirmation, can you execute this piece of code both interactively and
from a file:


        import os
        from bz2 import BZ2File

        # A small bz2 stream whose contents are the single line 'Test'
        data = 'BZh91AY&SY\xd9b\x89]\x00\x00\x00\x03\x80\x04\x00\x02\x00\x0c\x00 \x00!\x9ah3M\x13<]\xc9\x14\xe1BCe\x8a%t'
        f = open('test.bz2', "wb")
        f.write(data)
        f.close()
        # Read it back with readlines() ...
        bz2f = BZ2File('test.bz2')
        lines = bz2f.readlines()
        bz2f.close()
        assert lines == ['Test']
        # ... and again with xreadlines()
        bz2f = BZ2File('test.bz2')
        xlines = list(bz2f.xreadlines())
        bz2f.close()
        assert xlines == ['Test']
        os.unlink('test.bz2')

Reinhold

-- 
Mail address is perfectly valid!


From mcherm at mcherm.com  Fri Aug 26 15:15:17 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Fri, 26 Aug 2005 06:15:17 -0700
Subject: [Python-Dev] Style for raising exceptions (python-dev
	Summary	for 2005-08-01 through 2005-08-15 [draft])
Message-ID: <20050826061517.jxvx356u9ow0wksw@login.werra.lunarpages.com>

Marc-Andre Lemburg writes:
> This is from a comment in ceval.c:
>
> 	/* We support the following forms of raise:
> 	   raise <class>, <classinstance>
> 	   raise <class>, <argument tuple>
> 	   raise <class>, None
> 	   raise <class>, <argument>
> 	   raise <classinstance>, None
> 	   raise <string>, <object>
> 	   raise <string>, None
>
> 	   An omitted second argument is the same as None.
>
> 	   In addition, raise <tuple>, <anything> is the same as
> 	   raising the tuple's first item (and it better have one!);
> 	   this rule is applied recursively.
>
> 	   Finally, an optional third argument can be supplied, which
> 	   gives the traceback to be substituted (useful when
> 	   re-raising an exception after examining it).  */
>
> That's quite a list of combinations that will all break
> in Python 3.0 if we only allow "raise <classinstance>".

Oh my GOD! Are you saying that in order to correctly read Python code
that a programmer must know all of THAT! I would be entirely
unsurprised to learn that NO ONE on this list... in fact, no one
in the whole world could have reproduced that specification from
memory accurately. I have never seen a more convincing argument for
why we should allow only limited forms in Python 3.0.

And next time that I find myself in need of an obfuscated python
entry, I've got a great trick up my sleeve.

-- Michael Chermside


From abkhd at hotmail.com  Fri Aug 26 15:45:25 2005
From: abkhd at hotmail.com (A.B., Khalid)
Date: Fri, 26 Aug 2005 13:45:25 +0000
Subject: [Python-Dev] test_bz2 on Python 2.4.1
Message-ID: <BAY23-F25EFABE15D39101A054A4CABAA0@phx.gbl>

Reinhold Birkenfeld wrote:
>Are you sure that you are calling the newly-built python.exe? It is strange 
>that
>the test should pass in interactive mode when it doesn't in normal mode.

>For a confirmation, can you execute this piece of code both interactively 
>and
>from a file:

Yes, both Pythons tested are fresh from CVS. Here is the output of the test 
you asked me to run:

#----------------------
# File: testbz2.py
#----------------------
"""
import os
from bz2 import BZ2File

data = 'BZh91AY&SY\xd9b\x89]\x00\x00\x00\x03\x80\x04\x00\x02\x00\x0c\x00 \x00!\x9ah3M\x13<]\xc9\x14\xe1BCe\x8a%t'
f = open('test.bz2', "wb")
f.write(data)
f.close()
bz2f = BZ2File('test.bz2')
lines = bz2f.readlines()
bz2f.close()
assert lines == ['Test']
bz2f = BZ2File('test.bz2')
xlines = list(bz2f.xreadlines())
bz2f.close()
assert lines == ['Test']
os.unlink('test.bz2')
"""


-------------
RESULTS:
-------------


#--------------------------- Python 2.5a0 from CVS -----------------#
# Result: passes
$ /g/projs/py25/python/dist/src/MinGW/python testbz2.py


#--------------------------- Python 2.4.1 from CVS -----------------#
# Result: fails
$ /g/projs/py24/python/dist/src/MinGW/python testbz2.py
Traceback (most recent call last):
  File "testbz2.py", line 9, in ?
    lines = bz2f.readlines()
RuntimeError: wrong sequence of bz2 library commands used


#--------------------------- Python 2.4.1 from CVS -----------------#
# Interpreted session: testbz2 fails here as well now
$ /g/projs/py24/python/dist/src/MinGW/python -i
Python 2.4.1 (#65, Aug 26 2005, 14:38:48)
[GCC 3.4.4 (mingw special)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>import os
>>>from bz2 import BZ2File
>>>data = 'BZh91AY&SY\xd9b\x89]\x00\x00\x00\x03\x80\x04\x00\x02\x00\x0c\x00 \x00!\x9ah3M\x13<]\xc9\x14\xe1BCe\x8a%t'
>>>f = open('test.bz2', "wb")
>>>f.write(data)
>>>f.close()
>>>bz2f = BZ2File('test.bz2')
>>>lines = bz2f.readlines()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
RuntimeError: wrong sequence of bz2 library commands used
>>>raise SystemExit


#--------------------------- Python 2.5a0 from CVS -----------------#
# Interpreted session: testbz2 passes
$ /g/projs/py25/python/dist/src/MinGW/python -i
Python 2.5a0 (#65, Aug 26 2005, 14:57:28)
[GCC 3.4.4 (mingw special)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>import os
>>>from bz2 import BZ2File
>>>data = 'BZh91AY&SY\xd9b\x89]\x00\x00\x00\x03\x80\x04\x00\x02\x00\x0c\x00 \x00!\x9ah3M\x13<]\xc9\x14\xe1BCe\x8a%t'
>>>f = open('test.bz2', "wb")
>>>f.write(data)
>>>f.close()
>>>bz2f = BZ2File('test.bz2')
>>>lines = bz2f.readlines()
>>>bz2f.close()
>>>assert lines == ['Test']
>>>bz2f = BZ2File('test.bz2')
>>>xlines = list(bz2f.xreadlines())
>>>bz2f.close()
>>>assert lines == ['Test']
>>>os.unlink('test.bz2')
>>>raise SystemExit


Regards,
Khalid



From fdrake at acm.org  Fri Aug 26 16:01:37 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 26 Aug 2005 10:01:37 -0400
Subject: [Python-Dev] Style for raising exceptions (python-dev
	Summary	for 2005-08-01 through 2005-08-15 [draft])
In-Reply-To: <20050826061517.jxvx356u9ow0wksw@login.werra.lunarpages.com>
References: <20050826061517.jxvx356u9ow0wksw@login.werra.lunarpages.com>
Message-ID: <200508261001.37893.fdrake@acm.org>

On Friday 26 August 2005 09:15, Michael Chermside wrote:
 > Oh my GOD! Are you saying that in order to correctly read Python code
 > that a programmer must know all of THAT! I would be entirely
 > unsurprised to learn that NO ONE on this list... in fact, no one
 > in the whole world could have reproduced that specification from
 > memory accurately. I have never seen a more convincing argument for
 > why we should allow only limited forms in Python 3.0.

No kidding.

The stuff about the tuples is particularly painful, but is specifically there 
to deal with string exceptions and the idiom that an exception could be 
defined as a tuple of exceptions.  In fact, anydbm is particularly 
egregious:  it defines an error class derived from Exception, and then adds 
that to a tuple with the string exceptions from the specific modules it 
fronts for.  The tuple handling in raise allows anydbm.error to be raised and 
then caught again abstractly, in addition to allowing anydbm.error to act as a 
"base" exception that catches the specific errors raised by the backend 
databases.
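
A minimal sketch of the tuple idiom, for readers who have not met it.  The 
class names are hypothetical stand-ins (the real anydbm tuple mixed a class 
with string exceptions, which this sketch does not reproduce), and 
`raise error[0](...)` is an explicit spelling of what `raise error, msg` 
does implicitly under the "first item of the tuple" rule quoted from ceval.c:

```python
class BackendAError(Exception):
    """Stand-in for an error defined by one backend module."""


class BackendBError(Exception):
    """Stand-in for an error defined by another backend module."""


class FrontError(Exception):
    """The front-end module's own error class."""


# The front end exposes a tuple: its own class first, then the backends'.
error = (FrontError, BackendAError, BackendBError)


def open_db(kind):
    """Pretend to open a database; always fails, to show who raises what."""
    if kind == "a":
        raise BackendAError("backend a failed")
    if kind == "b":
        raise BackendBError("backend b failed")
    # Raising "the tuple" means raising its first item.
    raise error[0]("no backend for %r" % (kind,))


def try_open(kind):
    """Catch every failure through the single name `error`."""
    try:
        open_db(kind)
    except error as exc:  # a tuple in an except clause matches any member
        return type(exc).__name__
    return None
```

Callers only ever name `error`, yet `try_open("a")`, `try_open("b")` and 
`try_open("x")` each report a different concrete class.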


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From paragate at gmx.net  Tue Aug 23 14:59:28 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 23 Aug 2005 14:59:28 +0200
Subject: [Python-Dev] Revised PEP 349: Allow str() to
	return	unicode	strings
In-Reply-To: <430AFCC7.9030402@egenix.com>
References: <20050822213142.GA5702@mems-exchange.org>
	<wtmdot0g.fsf@python.net> <430AFCC7.9030402@egenix.com>
Message-ID: <op.svyo1ero0gn541@theta>


just tested the proposed implementation on a unicode-naive module
basically using

import sys	
import __builtin__
reload( sys ); sys.setdefaultencoding( 'utf-8' )
__builtin__.__dict__[ 'str' ] = new_str_function

et voilà, str() calls in the module are rewritten, and
print u'düsseldorf' does work as expected(*) (even on
systems where i have no access to sitecustomize, like
at my python-friendly isp's servers).

---
* my expectation is that unicode strings do print out
   as utf-8, as i can't see any better solution.

i suggest to make this option available e.g. via a module in
the standard lib to ease transition for people in case the pep
doesn't make it. it may be applied where deemed necessary and
left ignored otherwise.

if nobody thinks the reload hack is too awful and this solution
stands testing, i guess i'll post it to the aspn cookbook. after
all these countless hours of hunting down ordinal not in range,
finally i'm starting to see some light in the issue.

_wolf


On Tue, 23 Aug 2005 12:39:03 +0200, M.-A. Lemburg <mal at egenix.com> wrote:

> Thomas Heller wrote:
>> Neil Schemenauer <nas at arctrix.com> writes:
>>
>>
>>> [Please mail followups to python-dev at python.org.]
>>>
>>> The PEP has been rewritten based on a suggestion by Guido to change
>>> str() rather than adding a new built-in function.  Based on my
>>> testing, I believe the idea is feasible.  It would be helpful if
>>> people could test the patched Python with their own applications and
>>> report any incompatibilities.
>>>
>>
>>
>> I like the fact that currently unicode(x) is guaranteed to return a
>> unicode instance, or raises a UnicodeDecodeError.  Same for str(x),
>> which is guaranteed to return a (byte) string instance or raise an
>> error.
>>
>> Wouldn't also a new function make the intent clearer?
>>
>> So I think I'm +1 on the text() built-in, and -0 on changing str.
>
> Same here.
>
> A new API would also help make the transition easier from the
> current mixed data/text type (strings) to data-only (bytes)
> and text-only (text, renamed from unicode) in Py3.0.
>



From tim.peters at gmail.com  Fri Aug 26 16:32:37 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 26 Aug 2005 10:32:37 -0400
Subject: [Python-Dev] [Python-checkins] python/dist/src/Lib/test
	test_bz2.py, 1.18, 1.19
In-Reply-To: <20050826132405.30B221E4003@bag.python.org>
References: <20050826132405.30B221E4003@bag.python.org>
Message-ID: <1f7befae05082607323db5ceee@mail.gmail.com>

[birkenfeld at users.sourceforge.net]
> Update of /cvsroot/python/python/dist/src/Lib/test
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4822/Lib/test
> 
> Modified Files:
>        test_bz2.py
> Log Message:
> Add list() around xreadlines()
> 
> 
> 
> Index: test_bz2.py
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Lib/test/test_bz2.py,v
> retrieving revision 1.18
> retrieving revision 1.19
> diff -u -d -r1.18 -r1.19
> --- test_bz2.py 21 Aug 2005 14:16:04 -0000      1.18
> +++ test_bz2.py 26 Aug 2005 13:23:54 -0000      1.19
> @@ -191,7 +191,7 @@
>     def testSeekBackwardsFromEnd(self):
>         # "Test BZ2File.seek(-150, 2)"
>         self.createTempFile()
> -        bz2f = BZ2File(self.filename)
> +        )bz2f = BZ2File(self.filename)

Note that this added a right parenthesis to the start of the line. 
That creates a syntax error, so this test could not have been tried
before checking in.  It also causes test_compiler to fail.

From reinhold-birkenfeld-nospam at wolke7.net  Fri Aug 26 16:46:20 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Fri, 26 Aug 2005 16:46:20 +0200
Subject: [Python-Dev] [Python-checkins] python/dist/src/Lib/test
 test_bz2.py, 1.18, 1.19
In-Reply-To: <1f7befae05082607323db5ceee@mail.gmail.com>
References: <20050826132405.30B221E4003@bag.python.org>
	<1f7befae05082607323db5ceee@mail.gmail.com>
Message-ID: <den9vs$a2g$1@sea.gmane.org>

Tim Peters wrote:
> [birkenfeld at users.sourceforge.net]
>> Update of /cvsroot/python/python/dist/src/Lib/test
>> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4822/Lib/test
>> 
>> Modified Files:
>>        test_bz2.py
>> Log Message:
>> Add list() around xreadlines()
>> 
>> 
>> 
>> Index: test_bz2.py
>> ===================================================================
>> RCS file: /cvsroot/python/python/dist/src/Lib/test/test_bz2.py,v
>> retrieving revision 1.18
>> retrieving revision 1.19
>> diff -u -d -r1.18 -r1.19
>> --- test_bz2.py 21 Aug 2005 14:16:04 -0000      1.18
>> +++ test_bz2.py 26 Aug 2005 13:23:54 -0000      1.19
>> @@ -191,7 +191,7 @@
>>     def testSeekBackwardsFromEnd(self):
>>         # "Test BZ2File.seek(-150, 2)"
>>         self.createTempFile()
>> -        bz2f = BZ2File(self.filename)
>> +        )bz2f = BZ2File(self.filename)
> 
> Note that this added a right parenthesis to the start of the line. 
> That creates a syntax error, so this test could not have been tried
> before checking in.  It also causes test_compiler to fail.

Thank you for correcting it. The parenthesis must have accidentally slipped
in while I was reviewing the change for correctness.

Reinhold

-- 
Mail address is perfectly valid!


From gvanrossum at gmail.com  Fri Aug 26 16:57:08 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 26 Aug 2005 07:57:08 -0700
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <430EA1BE.9090804@colorstudy.com>
References: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop>
	<430EA1BE.9090804@colorstudy.com>
Message-ID: <ca471dc2050826075778b5efae@mail.gmail.com>

On 8/25/05, Ian Bicking <ianb at colorstudy.com> wrote:
> More generally, I've been doing some language comparisons, and I don't
> like literal but non-idiomatic translations of programming patterns. 

True. (But that doesn't mean I think using generators for this example
is great either.)

> So I'm considering better ways to translate some of the same use cases.

Remember that this particular example was invented to show the
superiority of Lisp; it has no practical value when taken literally.
If you substitute a method call for the "acc += incr" operation, the
Python translation using nested functions is very natural. For larger
examples, I'd recommend defining a class as always.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From alain.poirier at net-ng.com  Fri Aug 26 18:21:58 2005
From: alain.poirier at net-ng.com (Alain Poirier)
Date: Fri, 26 Aug 2005 18:21:58 +0200
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <ca471dc2050826075778b5efae@mail.gmail.com>
References: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop>
	<430EA1BE.9090804@colorstudy.com>
	<ca471dc2050826075778b5efae@mail.gmail.com>
Message-ID: <200508261821.58392.alain.poirier@net-ng.com>

On Friday 26 August 2005 16:57, Guido van Rossum wrote:
> On 8/25/05, Ian Bicking <ianb at colorstudy.com> wrote:
> > More generally, I've been doing some language comparisons, and I don't
> > like literal but non-idiomatic translations of programming patterns.
>
> True. (But that doesn't mean I think using generators for this example
> is great either.)
>
> > So I'm considering better ways to translate some of the same use cases.
>
> Remember that this particular example was invented to show the
> superiority of Lisp; it has no practical value when taken literally.
> If you substitute a method call for the "acc += incr" operation, the
> Python translation using nested functions is very natural. For larger
> examples, I'd recommend defining a class as always.

For example, I often use this class to help me in functional programming :

  _marker = ()

  class var:
      def __init__(self, v=None):
          self.v = v

      def __call__(self, v=_marker):
          if v is not _marker:
              self.v = v

          return self.v

and so the nested functions become very functional :

  def accum(n):
      acc = var(n)
      return lambda incr: acc(acc()+incr)


From nas at arctrix.com  Fri Aug 26 18:33:26 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Aug 2005 10:33:26 -0600
Subject: [Python-Dev] PEP 342: simple example, closure alternative
In-Reply-To: <200508261821.58392.alain.poirier@net-ng.com>
References: <006a01c5a9ec$e0a2f880$6402a8c0@arkdesktop>
	<430EA1BE.9090804@colorstudy.com>
	<ca471dc2050826075778b5efae@mail.gmail.com>
	<200508261821.58392.alain.poirier@net-ng.com>
Message-ID: <20050826163326.GA2382@mems-exchange.org>

On Fri, Aug 26, 2005 at 06:21:58PM +0200, Alain Poirier wrote:
> For example, I often use this class to help me in functional programming :
> 
>   _marker = ()
[...]

You should not use an immutable object here (e.g. the empty tuple is
shared).  My preferred idiom is:

    _marker = object()
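
To make the hazard concrete, here is a sketch of the class from the quoted 
message with that change applied; the names `_MISSING` and `Var` are 
illustrative, not from the original post.  With `_marker = ()`, a caller who 
passes an empty tuple as a real value may hand in the very object used as 
the marker (CPython shares the empty tuple), so the assignment would be 
silently skipped.  A fresh `object()` cannot collide with anything a caller 
supplies:

```python
_MISSING = object()  # unique sentinel; no caller can pass this by accident


class Var:
    """A one-slot mutable cell: call with no argument to read, one to write."""

    def __init__(self, value=None):
        self.value = value

    def __call__(self, value=_MISSING):
        # Identity test against the private sentinel, never against a
        # value that a caller could legitimately want to store.
        if value is not _MISSING:
            self.value = value
        return self.value


def accum(n):
    """Return an accumulator function starting at n."""
    acc = Var(n)
    return lambda incr: acc(acc() + incr)
```

Storing an empty tuple now behaves like storing any other value: after 
`v = Var(1); v(())`, a later `v()` returns `()`.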

Cheers,

  Neil

From tjreedy at udel.edu  Fri Aug 26 21:54:10 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 26 Aug 2005 15:54:10 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
Message-ID: <dens12$5kg$1@sea.gmane.org>

Can str.find be listed in PEP 3000 (under builtins) for removal?
Would anyone really object?

Reasons:

1. Str.find is essentially redundant with str.index.  The only difference 
is that str.index Pythonically indicates 'not found' by raising an 
exception while str.find does the same by anomalously returning -1.  As 
best as I can remember, this is common for Unix system calls but unique 
among Python builtin functions.  Learning and remembering both is a 
nuisance.

2. As is being discussed in a current c.l.p thread, -1 is a legal indexing 
subscript.  If one uses the return value as a subscript without checking, 
the bug is not caught.  None would be a better return value should find not 
be deleted.

3. Anyone who prefers to test return values instead of catching exceptions 
can write (simplified, without start,end params):

def sfind(string, target):
  try:
    return string.index(target)
  except ValueError:
    return None # or -1 for back compatibility, but None better

This can of course be done for any function/method that indicates input 
errors with exceptions instead of a special return value.  I see no reason 
other than history that this particular method should be doubled.
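
As a rough illustration of points 2 and 3 together, the sketch below shows 
the unchecked-subscript bug and a variant of sfind that forwards the 
optional start/end arguments.  Passing them through as `*bounds` is one 
possible design, not part of the proposal itself:

```python
def sfind(string, target, *bounds):
    """Like str.index, but return None instead of raising ValueError.

    `bounds` is an optional (start[, end]) pair forwarded to str.index.
    """
    try:
        return string.index(target, *bounds)
    except ValueError:
        return None


path = "README"

# Point 2: -1 is a valid subscript, so an unchecked find() result
# silently selects the last character instead of failing.
pos = path.find(".")
last_char = path[pos]

# With None as the "not found" value, the same mistake raises TypeError
# at the point of use rather than producing a wrong answer.
try:
    path[sfind(path, ".")]
    unchecked_use_failed = False
except TypeError:
    unchecked_use_failed = True
```

The None return makes the unchecked case fail loudly, which is the argument 
for preferring it over -1 if a find-like helper is kept at all.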

If .find is scheduled for the dustbin of history, I would be willing to 
suggest doc and docstring changes.  (str.index.__doc__ currently refers to 
str.find.__doc__.  This should be reversed.)

Terry J. Reedy


From gvanrossum at gmail.com  Fri Aug 26 22:10:00 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 26 Aug 2005 13:10:00 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <dens12$5kg$1@sea.gmane.org>
References: <dens12$5kg$1@sea.gmane.org>
Message-ID: <ca471dc20508261310809b1e3@mail.gmail.com>

On 8/26/05, Terry Reedy <tjreedy at udel.edu> wrote:
> Can str.find be listed in PEP 3000 (under builtins) for removal?

Yes please. (Except it's not technically a builtin but a string method.)

> Would anyone really object?

Not me.

> Reasons:
> 
> 1. Str.find is essentially redundant with str.index.  The only difference
> is that str.index Pythonically indicates 'not found' by raising an
> exception while str.find does the same by anomalously returning -1.  As
> best as I can remember, this is common for Unix system calls but unique
> among Python builtin functions.  Learning and remembering both is a
> nuisance.
> 
> 2. As is being discussed in a current c.l.p thread, -1 is a legal indexing
> subscript.  If one uses the return value as a subscript without checking,
> the bug is not caught.  None would be a better return value should find not
> be deleted.
> 
> 3. Anyone who prefers to test return values instead of catching exceptions
> can write (simplified, without start,end params):
> 
> def sfind(string, target):
>   try:
>     return string.index(target)
>   except ValueError:
>     return None # or -1 for back compatibility, but None better
> 
> This can of course be done for any function/method that indicates input
> errors with exceptions instead of a special return value.  I see no reason
> other than history that this particular method should be doubled.

I'd like to add:

4. The no. 1 use case for str.find() used to be testing whether a
substring was present or not; "if s.find(sub) >= 0" can now be written
as "if sub in s". This avoids the nasty bug in "if s.find(sub)".
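
A small sketch of the three spellings side by side; the function names are 
made up for the comparison.  The buggy form truth-tests the index itself, so 
a match at position 0 reads as "absent" and a miss (-1) reads as "present":

```python
def buggy_contains(s, sub):
    # Wrong: 0 (found at the very start) is false, -1 (not found) is true.
    return bool(s.find(sub))


def old_contains(s, sub):
    # The pre-"in" idiom: compare the index explicitly.
    return s.find(sub) >= 0


def new_contains(s, sub):
    # The substring test says what is meant and has no index to misuse.
    return sub in s


s = "spam and eggs"
results = [
    (sub, buggy_contains(s, sub), old_contains(s, sub), new_contains(s, sub))
    for sub in ("spam", "eggs", "ham")
]
```

For "spam" and "ham" the buggy form gives exactly the opposite of the 
correct answer; it is right for "eggs" only because the match is not at 
index 0.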

> If .find is scheduled for the dustbin of history, I would be willing to
> suggest doc and docstring changes.  (str.index.__doc__ currently refers to
> str.find.__doc__.  This should be reversed.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Fri Aug 26 22:08:33 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 26 Aug 2005 16:08:33 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <dens12$5kg$1@sea.gmane.org>
Message-ID: <000a01c5aa79$f0414020$a8bb9d8d@oemcomputer>

> Can str.find be listed in PEP 3000 (under builtins) for removal?


FWIW, here is a sample code transformation (extracted from zipfile.py).
Judge for yourself whether the index version is better:


Existing code:
--------------
    END_BLOCK = min(filesize, 1024 * 4)
    fpin.seek(filesize - END_BLOCK, 0)
    data = fpin.read()
    start = data.rfind(stringEndArchive)
    if start >= 0:     # Correct signature string was found
        endrec = struct.unpack(structEndArchive, data[start:start+22])
        endrec = list(endrec)
        comment = data[start+22:]
        if endrec[7] == len(comment):     # Comment length checks out
            # Append the archive comment and start offset
            endrec.append(comment)
            endrec.append(filesize - END_BLOCK + start)
            return endrec
    return      # Error, return None


Revised code:
-------------
    END_BLOCK = min(filesize, 1024 * 4)
    fpin.seek(filesize - END_BLOCK, 0)
    data = fpin.read()
    try:
        start = data.rindex(stringEndArchive)
    except ValueError:
        pass
    else:
        # Correct signature string was found
        endrec = struct.unpack(structEndArchive, data[start:start+22])
        endrec = list(endrec)
        comment = data[start+22:]
        if endrec[7] == len(comment):     # Comment length checks out
            # Append the archive comment and start offset
            endrec.append(comment)
            endrec.append(filesize - END_BLOCK + start)
            return endrec
    return      # Error, return None


From raymond.hettinger at verizon.net  Fri Aug 26 22:34:04 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 26 Aug 2005 16:34:04 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <dens12$5kg$1@sea.gmane.org>
Message-ID: <000d01c5aa7d$80564ae0$a8bb9d8d@oemcomputer>

> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?
> 
> Reasons:
  . . .


I had one further thought.  In addition to your excellent list of
reasons, it would be great if these kinds of requests were accompanied by
a patch that removed the offending construct from the standard library.

The most important reason for the patch is that looking at the context
diff will provide an objective look at how real code will look before
and after the change.  This would make subsequent discussions
substantially more informed and less anecdotal.

The second reason is that the revised library code becomes more likely
to survive the transition to 3.0.  Further, it can continue to serve as
example code which highlights current best practices.

This patch wouldn't take long.  I've tried about a half dozen cases
since you first posted.  Each provided a new insight (zipfile was not
improved, webbrowser was improved, and urlparse was about the same).


Raymond


From jcarlson at uci.edu  Fri Aug 26 22:54:35 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 26 Aug 2005 13:54:35 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <dens12$5kg$1@sea.gmane.org>
References: <dens12$5kg$1@sea.gmane.org>
Message-ID: <20050826134317.7DFD.JCARLSON@uci.edu>


"Terry Reedy" <tjreedy at udel.edu> wrote:
> 
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?

I would object to the removal of str.find() .  In fact, older versions
of Python which only allowed for single-character 'x in str' containment
tests offered 'str.find(...) != -1' as a suitable replacement option,
which is found in the standard library more than a few times...

Further, forcing users to use try/except when they are looking for the
offset of a substring seems at least a little strange (if not a lot
braindead, no offense to those who prefer their code to spew exceptions
at every turn).

I've been thinking for years that .find should be part of the set of
operations offered to most, if not all sequences (lists, buffers, tuples, ...). 
Considering the apparent dislike/hatred for str.find, it seems I was
wise in not requesting it in the past.

> 
> Reasons:
> 
> 1. Str.find is essentially redundant with str.index.  The only difference 
> is that str.index Pythonically indicates 'not found' by raising an 
> exception while str.find does the same by anomalously returning -1.  As 
> best as I can remember, this is common for Unix system calls but unique 
> among Python builtin functions.  Learning and remembering both is a 
> nuisance.

So pick one and forget the other.  I think of .index as a list method 
(because it doesn't offer .find), not a string method, even though it is.

> 2. As is being discussed in a current c.l.p thread, -1 is a legal indexing 
> subscript.  If one uses the return value as a subscript without checking, 
> the bug is not caught.  None would be a better return value should find not 
> be deleted.

And would break potentially thousands of lines of code in the wild which
expect -1 right now.  Look in the standard library for starting examples,
and google around for others.
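
To make point 2 concrete, here is a minimal sketch of how an unchecked
find() result slips through as a subscript, while index() fails loudly.
The sample string and variable names are made up for illustration, and
the code is written for a current Python 3 interpreter rather than the
2.x of this thread:

```python
s = "key=value"

# ":" does not occur in s, so find() reports -1 ...
sep = s.find(":")

# ... and -1 is a perfectly legal subscript, so nothing complains.
silently_wrong = s[sep]   # last character of s
head = s[:sep]            # everything except the last character

# index() refuses to hand back a usable-looking number.
caught = False
try:
    s.index(":")
except ValueError:
    caught = True
```

Neither subscript raises, which is exactly why the mistake goes
unnoticed until the data looks wrong somewhere downstream.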

> 3. Anyone who prefers to test return values instead of catch exceptions can 
> write (simplified, without start,end params):
> 
> def sfind(string, target):
>   try:
>     return string.index(target)
>   except ValueError:
>     return None # or -1 for back compatibility, but None better
> 
> This can of course be done for any function/method that indicates input 
> errors with exceptions instead of a special return value.  I see no reason 
> other than history that this particular method should be doubled.

I prefer my methods to stay on my instances, and I could have sworn that
the string module's functions were generally deprecated in favor of
string methods.  Now you are (implicitly) advocating the reversal of
such for one method which doesn't return an exception under a very
normal circumstance.

Would you further request that .rfind be removed from strings?  The
inclusion of .rindex?

 - Josiah


From tjreedy at udel.edu  Sat Aug 27 01:48:54 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 26 Aug 2005 19:48:54 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
References: <dens12$5kg$1@sea.gmane.org>
	<ca471dc20508261310809b1e3@mail.gmail.com>
Message-ID: <deo9p6$95c$1@sea.gmane.org>


"Guido van Rossum" <gvanrossum at gmail.com> wrote in message 
news:ca471dc20508261310809b1e3 at mail.gmail.com...
> On 8/26/05, Terry Reedy <tjreedy at udel.edu> wrote:
>> Can str.find be listed in PEP 3000 (under builtins) for removal?
>
> Yes please. (Except it's not technically a builtin but a string method.)

To avoid suggesting a new header, I interpreted Built-ins broadly to 
include builtin types.  The header could be expanded to Built-in Constants, 
Functions, and Types or Built-ins and Built-in Types but I leave such 
details to the PEP authors.

Terry J. Reedy


From tjreedy at udel.edu  Sat Aug 27 03:07:49 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 26 Aug 2005 21:07:49 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
References: <dens12$5kg$1@sea.gmane.org>
	<000a01c5aa79$f0414020$a8bb9d8d@oemcomputer>
Message-ID: <deoed5$hcl$1@sea.gmane.org>


"Raymond Hettinger" <raymond.hettinger at verizon.net> wrote in message 
news:000a01c5aa79$f0414020$a8bb9d8d at oemcomputer...
>> Can str.find be listed in PEP 3000 (under builtins) for removal?
>
> FWIW, here is a sample code transformation (extracted from zipfile.py).
> Judge for yourself whether the index version is better:

I am sure that we both could write similar code that would be smoother if 
the math module also had a 'powhalf' function that was the same as sqrt 
except for returning -1 instead of raising an error on negative or 
non-numerical input.

I'll continue in response to Josiah...

Terry J. Reedy


From tjreedy at udel.edu  Sat Aug 27 03:07:46 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 26 Aug 2005 21:07:46 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
Message-ID: <deoed1$hce$1@sea.gmane.org>


"Josiah Carlson" <jcarlson at uci.edu> wrote in message 
news:20050826134317.7DFD.JCARLSON at uci.edu...
>
> "Terry Reedy" <tjreedy at udel.edu> wrote:
>>
>> Can str.find be listed in PEP 3000 (under builtins) for removal?

Guido has already approved, but I will try to explain my reasoning a bit 
better for you.  There are basically two ways for a system, such as a 
Python function, to indicate 'I cannot give a normal response.'  One (1a)
is to give an inband signal that is like a normal response except that it
is not (str.find returning -1).  A variation (1b) is to give an inband
response that is more obviously not a real response (many None returns). 
The other (2) is to not respond (never return normally) but to give an 
out-of-band signal of some sort (str.index raising ValueError).

Python as distributed usually chooses 1b or 2.  I believe str.find and 
.rfind are unique in the choice of 1a.  I am pretty sure that the choice 
of -1 as error return, instead of, for instance, None, goes back to the
need in static languages such as C to return something of the declared 
return type.  But Python is not C, etcetera.  I believe that this pair is 
also unique in having exact counterparts of type 2.  (But maybe I forgot 
something.)
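
The three signalling styles can be put side by side in a short sketch.
The sample data is invented, the 1a/1b/2 labels are the ones used
above, and the code assumes a current Python 3 interpreter:

```python
text = "spam and eggs"
prices = {"spam": 3}

# (1a) in-band value of the same type as a real answer
style_1a = text.find("ham")

# (1b) in-band value that is more obviously not a real answer
style_1b = prices.get("ham")

# (2) out-of-band signal: no value comes back at all
try:
    text.index("ham")
except ValueError:
    style_2_raised = True
else:
    style_2_raised = False
```

Only the first result can be mistaken for a real answer, because -1 is
an int just like every valid offset.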

>> Would anyone really object?

> I would object to the removal of str.find().

So, I wonder, what is your favored alternative?

A. Status quo: ignore the opportunity to streamline the language.

B. Change the return type of .find to None.

C. Remove .(r)index instead.

D. Add more redundancy for those who do not like exceptions.

> Further, forcing users to use try/except when they are looking for the
> offset of a substring seems at least a little strange (if not a lot
> braindead, no offense to those who prefer their code to spew exceptions
> at every turn).

So are you advocating D above or claiming that substring indexing is 
uniquely deserving of having two versions?  If the latter, why so special? 
If we only had str.index, would you actually suggest adding this particular
duplication?

> Considering the apparent dislike/hatred for str.find.

I don't hate str.find.  I simply (a) recognize that a function designed for 
static typing constraints is out of place in Python, which does not have 
those constraints and (b) believe that there is no reason other than 
history for the duplication and (c) believe that dropping .find is 
definitely better than dropping .index and changing .find.

> Would you further request that .rfind be removed from strings?

Of course.  Thanks for reminding me.

>  The inclusion of .rindex?

Yes, the continued inclusion of .rindex, which we already have.

Terry J. Reedy


From janssen at parc.com  Sat Aug 27 04:40:29 2005
From: janssen at parc.com (Bill Janssen)
Date: Fri, 26 Aug 2005 19:40:29 PDT
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: Your message of "Fri, 26 Aug 2005 18:07:46 PDT."
	<deoed1$hce$1@sea.gmane.org> 
Message-ID: <05Aug26.194031pdt."58617"@synergy1.parc.xerox.com>

> There are basically two ways for a system, such as a 
> Python function, to indicate 'I cannot give a normal response.'  One (1a)
> is to give an inband signal that is like a normal response except that it
> is not (str.find returning -1).  A variation (1b) is to give an inband
> response that is more obviously not a real response (many None returns). 
> The other (2) is to not respond (never return normally) but to give an 
> out-of-band signal of some sort (str.index raising ValueError).
> 
> Python as distributed usually chooses 1b or 2.  I believe str.find and 
> .rfind are unique in the choice of 1a.

Doubt it.  The problem with returning None is that it tests as False,
but so does 0, which is a valid string index position.  The reason
string.find() returns -1 is probably to allow a test:

      if line.find("\f"):
         ... do something

Might add a boolean "str.contains()" to cover this test case.

Bill

From gvanrossum at gmail.com  Sat Aug 27 05:05:29 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 26 Aug 2005 20:05:29 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <-1439615371039171070@unknownmsgid>
References: <deoed1$hce$1@sea.gmane.org> <-1439615371039171070@unknownmsgid>
Message-ID: <ca471dc205082620057d2ddf42@mail.gmail.com>

On 8/26/05, Bill Janssen <janssen at parc.com> wrote:
> Doubt it.  The problem with returning None is that it tests as False,
> but so does 0, which is a valid string index position.  The reason
> string.find() returns -1 is probably to allow a test:
> 
>       if line.find("\f"):
>          ... do something

This has a bug; it is equivalent to "if not line.startswith("\f"):".

This mistake (which I have made more than once myself and have seen
many times in code by others) is one of the main reasons to want to
get rid of this style of return value.
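
A quick way to convince yourself of the equivalence is to run both
tests over a few inputs. This is only a sketch: the helper name and the
sample lines are hypothetical, and it assumes a current Python 3
interpreter:

```python
def buggy_contains(line, sub="\f"):
    # Mirrors the quoted test: relies on the truthiness of find()'s result.
    return bool(line.find(sub))

samples = ["\fpage", "no feed", "mid\ffeed", ""]

# What the buggy test answers for each sample.
buggy = [buggy_contains(line) for line in samples]

# What "not line.startswith(...)" answers: the same thing.
not_startswith = [not line.startswith("\f") for line in samples]

# What was actually meant.
correct = ["\f" in line for line in samples]
```

The buggy test is false only when the form feed is the very first
character (offset 0) and true in every other case, including the -1
"not found" case.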

> Might add a boolean "str.contains()" to cover this test case.

We already got that: "\f" in line.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Sat Aug 27 05:14:53 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri, 26 Aug 2005 20:14:53 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000d01c5aa7d$80564ae0$a8bb9d8d@oemcomputer>
References: <dens12$5kg$1@sea.gmane.org>
	<000d01c5aa7d$80564ae0$a8bb9d8d@oemcomputer>
Message-ID: <ca471dc2050826201457b7afae@mail.gmail.com>

On 8/26/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> I had one further thought.  In addition to your excellent list of
> reasons, it would be great if these kinds of requests were accompanied by
> a patch that removed the offending construct from the standard library.

Um? Are we now requiring patches for PYTHON THREE DOT OH proposals?

Raymond, we all know and agree that Python 3.0 will be incompatible in
many ways. range() and keys() becoming iterators, int/int returning
float, and so on; we can safely say that it will break nearly every
module under the sun, and no amount of defensive coding in Python 2.x
will save us.

> The most important reason for the patch is that looking at the context
> diff will provide an objective look at how real code will look before
> and after the change.  This would make subsequent discussions
> substantially more informed and less anecdotal.

No, you're just artificially trying to raise the bar for Python 3.0
proposals to an unreasonable height.

> The second reason is that the revised library code becomes more likely
> to survive the transition to 3.0.  Further, it can continue to serve as
> example code which highlights current best practices.

But we don't *want* all of the library code to survive. Much of it is
10-15 years old and in dear need of a total rewrite. See Anthony
Baxter's lightning talk at OSCON (I'm sure Google can find it for
you).

> This patch wouldn't take long.  I've tried about a half dozen cases
> since you first posted.  Each provided a new insight (zipfile was not
> improved, webbrowser was improved, and urlparse was about the same).

So it's neutral in terms of code readability. Great. Given all the
other advantages for the proposal (an eminent member of this group
just posted a buggy example :-) I'm now doubly convinced that we
should do it.

Also remember, the standard library is rather atypical -- while some
of it makes great example code, other parts of it are highly contorted
in order to either maintain backwards compatibility or provide an
unusually high level of defensiveness.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From anthony at interlink.com.au  Sat Aug 27 05:55:10 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sat, 27 Aug 2005 13:55:10 +1000
Subject: [Python-Dev] [Python-checkins] python/dist/src/Lib/test
	test_bz2.py, 1.18, 1.19
In-Reply-To: <den9vs$a2g$1@sea.gmane.org>
References: <20050826132405.30B221E4003@bag.python.org>
	<1f7befae05082607323db5ceee@mail.gmail.com>
	<den9vs$a2g$1@sea.gmane.org>
Message-ID: <200508271355.12750.anthony@interlink.com.au>

On Saturday 27 August 2005 00:46, Reinhold Birkenfeld wrote:
> > Note that this added a right parenthesis to the start of the line.
> > That creates a syntax error, so this test could not have been tried
> > before checking in.  It also causes test_compiler to fail.
>
> Thank you for correcting. The parenthesis must have been accidentally
> slipped in while I was reviewing the change for correctness.

Please ensure that you run the test suite before checking code in!

Thanks,
Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From janssen at parc.com  Sat Aug 27 05:58:23 2005
From: janssen at parc.com (Bill Janssen)
Date: Fri, 26 Aug 2005 20:58:23 PDT
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: Your message of "Fri, 26 Aug 2005 20:05:29 PDT."
	<ca471dc205082620057d2ddf42@mail.gmail.com> 
Message-ID: <05Aug26.205831pdt."58617"@synergy1.parc.xerox.com>

Don't know *what* I wasn't thinking :-).

Bill

> On 8/26/05, Bill Janssen <janssen at parc.com> wrote:
> > Doubt it.  The problem with returning None is that it tests as False,
> > but so does 0, which is a valid string index position.  The reason
> > string.find() returns -1 is probably to allow a test:
> > 
> >       if line.find("\f"):
> >          ... do something
> 
> This has a bug; it is equivalent to "if not line.startswith("\f"):".
> 
> This mistake (which I have made more than once myself and have seen
> many times in code by others) is one of the main reasons to want to
> get rid of this style of return value.
> 
> > Might add a boolean "str.contains()" to cover this test case.
> 
> We already got that: "\f" in line.

From jcarlson at uci.edu  Sat Aug 27 06:48:27 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 26 Aug 2005 21:48:27 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <deoed1$hce$1@sea.gmane.org>
References: <20050826134317.7DFD.JCARLSON@uci.edu> <deoed1$hce$1@sea.gmane.org>
Message-ID: <20050826184634.7E06.JCARLSON@uci.edu>


"Terry Reedy" <tjreedy at udel.edu> wrote:
> 
> "Josiah Carlson" <jcarlson at uci.edu> wrote in message 
> news:20050826134317.7DFD.JCARLSON at uci.edu...
> >
> > "Terry Reedy" <tjreedy at udel.edu> wrote:
> >>
> >> Can str.find be listed in PEP 3000 (under builtins) for removal?
> 
> Guido has already approved,

I noticed, but he approved before anyone could say anything.  I
understand it is a dictatorship, but he seems to take advisement and
reverse (or not) his decisions on occasion based on additional
information. Whether this will lead to such, I don't know.


> but I will try to explain my reasoning a bit 
> better for you.  There are basically two ways for a system, such as a 
> Python function, to indicate 'I cannot give a normal response.'  One (1a)
> is to give an inband signal that is like a normal response except that it
> is not (str.find returning -1).  A variation (1b) is to give an inband
> response that is more obviously not a real response (many None returns). 
> The other (2) is to not respond (never return normally) but to give an 
> out-of-band signal of some sort (str.index raising ValueError).
> 
> Python as distributed usually chooses 1b or 2.  I believe str.find and 
> .rfind are unique in the choice of 1a.  I am pretty sure that the choice 
> of -1 as error return, instead of, for instance, None, goes back to the
> need in static languages such as C to return something of the declared 
> return type.  But Python is not C, etcetera.  I believe that this pair is 
> also unique in having exact counterparts of type 2.  (But maybe I forgot 
> something.)

Taking a look at the commits that Guido did way back in 1993, he doesn't
mention why he added .find, only that he did.  Maybe it was another of
the 'functional language additions' that he now regrets, I don't know.


> >> Would anyone really object?
> 
> > I would object to the removal of str.find().
> 
> So, I wonder, what is your favored alternative?
> 
> A. Status quo: ignore the opportunity to streamline the language.

str.find is not a language construct.  It is a method on a built-in type
that many people use.  This is my vote.


> B. Change the return type of .find to None.

Again, this would break potentially thousands of lines of user code that
is in the wild.  Are we talking about changes for 2.5 here, or 3.0?

> C. Remove .(r)index instead.

see below *

> D. Add more redundancy for those who do not like exceptions.

In 99% of the cases, such implementations would be minimal.  While I
understand that "There should be one-- and preferably only one --obvious
way to do it.", please see below *.


> > Further, forcing users to use try/except when they are looking for the
> > offset of a substring seems at least a little strange (if not a lot
> > braindead, no offense to those who prefer their code to spew exceptions
> > at every turn).
> 
> So are you advocating D above or claiming that substring indexing is 
> uniquely deserving of having two versions?  If the latter, why so special? 
> If we only had str.index, would you actually suggest adding this particular
> duplication?

Apparently everyone has forgotten the dozens of threads on similar
topics over the years.  I'll attempt to summarize.

Adding functionality that isn't used is harmful, but not nearly as
harmful as removing functionality that people use.

If you take just two seconds and do a search on '.find(' vs '.index(' in
the standard library, you will notice that '.find(' is used more often
than '.index(' regardless of type (I don't have the time this evening to
pick out which ones are string only, but I doubt the standard library
uses mmap.find, DocTestFinder.find, or gettext.find).  This example
seems to show that people find str.find to be more intuitive and/or
useful than str.index, even though you spent two large paragraphs
explaining that Python 'doesn't do it that way very often so it isn't
Pythonic'. Apparently the majority of people who have been working on
the standard library for the last decade disagree.
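
A count like this can be reproduced with a few lines of code. The
sketch below is an assumption about how one might do it, not the
search actually used for the numbers in this message; the function
names are hypothetical, it matches on text only (so it cannot tell
str.find from any other .find), and it assumes a current Python 3
interpreter:

```python
import re
from collections import Counter
from pathlib import Path

# Matches ".find(", ".index(", ".rfind(" and ".rindex(" as plain text.
CALL = re.compile(r"\.(find|index|rfind|rindex)\(")

def count_calls(source):
    """Count textual occurrences of the four method calls in one string."""
    return Counter(CALL.findall(source))

def count_in_tree(root):
    """Sum the counts over every .py file below the directory `root`."""
    total = Counter()
    for path in Path(root).rglob("*.py"):
        total += count_calls(path.read_text(encoding="utf-8", errors="replace"))
    return total

sample = 'i = s.find("x")\nj = s.rfind("y")\nk = items.index(3)\nm = s.find("z")\n'
counts = count_calls(sample)
```

Pointing count_in_tree() at a checkout of the standard library gives
raw totals to compare; they still need the by-hand filtering mentioned
above to separate string methods from everything else.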


> > Considering the apparent dislike/hatred for str.find.
> 
> I don't hate str.find.  I simply (a) recognize that a function designed for 
> static typing constraints is out of place in Python, which does not have 
> those constraints and (b) believe that there is no reason other than 
> history for the duplication and (c) believe that dropping .find is 
> definitely better than dropping .index and changing .find.

* I don't see why it is necessary to drop or change either one.  We've
got list() and [] for constructing a list.  Heck, we've even got
list(iterable) and [i for i in iterable] for making a list copy of any
arbitrary iterable.  This goes against TSBOOWTDI, so why don't we toss
list comprehensions now that we have list(generator expression)?  Or did
I miss something and this was already going to happen?


> > Would you further request that .rfind be removed from strings?
> 
> Of course.  Thanks for reminding me.

No problem, but again, do a search in the standard library...  I found 4
examples of str.rindex, but over 40 of str.rfind.  Koders.com offers
1153 and 100 for .rfind and .rindex respectively (probably not all
string methods, but I'm too lazy to check every one), a ratio of more
than 10 to 1. If koders.com had a decent way to search for the full name of a
method call, we could do a find vs. index as well, though I expect we
would see closer to 4:3 with .find winning (those are the approximate
numbers I get when checking the standard library).


The reason I'm making a stink is because you are proposing (and Guido
has agreed) to get rid of methods V,W which are used more often than
methods X,Y in order to 'streamline the language' for 3.0.  The removal
of two methods and their implementations will not go terribly far
towards streamlining the language, especially when all four methods
(find, rfind, index, rindex) call the same C function to do the actual
search.

 - Josiah


From martin at v.loewis.de  Sat Aug 27 08:54:12 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 27 Aug 2005 08:54:12 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <deoed1$hce$1@sea.gmane.org>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org>
Message-ID: <43100E14.6080009@v.loewis.de>

Terry Reedy wrote:
> One (1a) 
> is to give an inband signal that is like a normal response except that it 
> is not (str.find returning -1).
> 
> Python as distributed usually chooses 1b or 2.  I believe str.find and 
> .rfind are unique in the choice of 1a.

That is not true. str.find's choice is not 1a, and there are other
functions which chose 1a: -1 does *not* look like a normal response,
since a normal response is non-negative. It is *not* the only method
with choice 1a: dict.get returns None if the key is not found, even
though None could also be the value for the key.
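
The dict.get ambiguity is easy to show, along with the usual way
around it. In this sketch the dictionary contents and the _MISSING
sentinel are made up for illustration, and a current Python 3
interpreter is assumed:

```python
d = {"present": None}

# Both lookups give None, so the caller cannot tell them apart.
stored_none = d.get("present")
absent = d.get("absent")

# A private sentinel object as the default removes the ambiguity,
# because no stored value can be that exact object.
_MISSING = object()
present_is_missing = d.get("present", _MISSING) is _MISSING
absent_is_missing = d.get("absent", _MISSING) is _MISSING
```

With the sentinel, "key is there and maps to None" and "key is not
there" become distinguishable again without catching KeyError.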

For another example, file.read() returns an empty string at EOF.


> I am pretty sure that the choice 
> of -1 as error return, instead of, for instance, None, goes back to the
> need in static languages such as C to return something of the declared 
> return type.  But Python is not C, etcetera.  I believe that this pair is 
> also unique in having exact counterparts of type 2.

dict.__getitem__ is a counterpart of type 2 of dict.get.

> So, I wonder, what is your favored alternative?
> 
> A. Status quo: ignore the opportunity to streamline the language.

My favourite choice is the status quo. I probably don't fully
understand the word "to streamline", but I don't see this as
rationalizing. Instead, some applications will be more tedious
to write.

> So are you advocating D above or claiming that substring indexing is 
> uniquely deserving of having two versions?  If the latter, why so special? 

Because it is not exceptional for a string not to be part of another
string, and because the question I'm asking is "is the string in the other
string, and if so, where?". This is similar to the question "does the
dictionary have a value for that key, and if so, which?"

> If we only had str.index, would you actually suggest adding this particular
> duplication?

That is what happened to dict.get: it was not originally there (I
believe), but added later.

Regards,
Martin

From reinhold-birkenfeld-nospam at wolke7.net  Sat Aug 27 09:39:37 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 27 Aug 2005 09:39:37 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <05Aug26.194031pdt."58617"@synergy1.parc.xerox.com>
References: <deoed1$hce$1@sea.gmane.org>
	<05Aug26.194031pdt."58617"@synergy1.parc.xerox.com>
Message-ID: <dep5bp$op9$1@sea.gmane.org>

Bill Janssen wrote:
>> There are basically two ways for a system, such as a 
>> Python function, to indicate 'I cannot give a normal response.'  One (1a)
>> is to give an inband signal that is like a normal response except that it
>> is not (str.find returning -1).  A variation (1b) is to give an inband
>> response that is more obviously not a real response (many None returns). 
>> The other (2) is to not respond (never return normally) but to give an 
>> out-of-band signal of some sort (str.index raising ValueError).
>> 
>> Python as distributed usually chooses 1b or 2.  I believe str.find and 
>> .rfind are unique in the choice of 1a.
> 
> Doubt it.  The problem with returning None is that it tests as False,
> but so does 0, which is a valid string index position.

Heh. You know what the Perl6 folks would suggest in this case?

return 0 but true; # literally!

> Might add a boolean "str.contains()" to cover this test case.

There's already __contains__.

Reinhold

-- 
Mail address is perfectly valid!


From raymond.hettinger at verizon.net  Sat Aug 27 10:20:38 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 Aug 2005 04:20:38 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <ca471dc2050826201457b7afae@mail.gmail.com>
Message-ID: <002f01c5aae0$35aaa820$a8bb9d8d@oemcomputer>

> > The most important reason for the patch is that looking at the context
> > diff will provide an objective look at how real code will look before
> > and after the change.  This would make subsequent discussions
> > substantially more informed and less anecdotal.
> 
> No, you're just artificially trying to raise the bar for Python 3.0
> proposals to an unreasonable height.

Not really.  I'm mostly for the proposal (+0), but am certain the
conversation about the proposal would be substantially more informed if
we had a side-by-side comparison of what real-world code looks like
before and after the change.  There are not too many instances of
str.find() in the library and it is an easy patch to make.  I'm just
asking for a basic, objective investigative tool.

Unlike more complex proposals, this one doesn't rely on any new
functionality.  It just says don't use X anymore.  That makes it
particularly easy to investigate in an objective way.

BTW, this isn't unprecedented.  We've already done it once when
backticks got slated for removal in 3.0.  All instances of it got
changed in the standard library.  As a result of the patch, we were able
to 1) get an idea of how much work it took, 2) determine every category
of use case, 3) learn that the resulting code was more beautiful,
readable, and only microscopically slower, 4) learn about a handful of
cases that were unexpectedly difficult to convert, and 5) update the
library to be an example of what we think modern code looks like.  That
patch painlessly informed the decision making and validated that we were
doing the right thing.

The premise of Terry's proposal is that Python code is better when
str.find() is not used.  This is a testable proposition.  Why not use
the wealth of data at our fingertips to augment a priori reasoning and
anecdotes?  I'm not at all arguing against the proposal; I'm just asking
for a thoughtful design process.

 
Raymond


P.S.  Josiah was not alone.  The comp.lang.python discussion had other
posts expressing distaste for raising exceptions instead of using return
codes.  While I don't feel the same way, I don't think the respondents
should be ignored.


"Those people who love sausage and respect the law should not watch
either one being made."


From raymond.hettinger at verizon.net  Sat Aug 27 10:28:28 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 Aug 2005 04:28:28 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <43100E14.6080009@v.loewis.de>
Message-ID: <003001c5aae1$4d843280$a8bb9d8d@oemcomputer>

[Martin]
> For another example, file.read() returns an empty string at EOF.

When my turn comes for making 3.0 proposals, I'm going to recommend
nixing the "empty string at EOF" API.  That is a carry-over from C that
made some sense before there were iterators.  Now, we have the option of
introducing much cleaner iterator versions of these methods that use
compact, fast, and readable for-loops instead of multi-statement
while-loop boilerplate.
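
For what it is worth, something close to this can already be written
with the two-argument form of the built-in iter(), which calls a
function repeatedly until it returns a given sentinel. The sketch
below is only an illustration of that idiom, not the API being
proposed here; read_chunks is a hypothetical helper, the stream is an
in-memory stand-in for a file, and a current Python 3 interpreter is
assumed:

```python
import io
from functools import partial

def read_chunks(fileobj, size=4096):
    """Return an iterator over chunks; it stops when read() returns b""."""
    return iter(partial(fileobj.read, size), b"")

stream = io.BytesIO(b"abcdefghij")

chunks = []
for chunk in read_chunks(stream, 4):
    chunks.append(chunk)
```

The empty-result-at-EOF convention is still there underneath, but the
caller only sees a plain for-loop.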


Raymond


From reinhold-birkenfeld-nospam at wolke7.net  Sat Aug 27 12:01:55 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 27 Aug 2005 12:01:55 +0200
Subject: [Python-Dev] test_bz2 on Python 2.4.1
In-Reply-To: <BAY23-F25EFABE15D39101A054A4CABAA0@phx.gbl>
References: <BAY23-F25EFABE15D39101A054A4CABAA0@phx.gbl>
Message-ID: <depdmj$cms$1@sea.gmane.org>

A.B., Khalid wrote:

> #--------------------------- Python 2.5a0 from CVS -----------------#
> # Result: passes
> $ /g/projs/py25/python/dist/src/MinGW/python testbz2.py
> 
> 
> #--------------------------- Python 2.4.1 from CVS -----------------#
> # Result: fails
> $ /g/projs/py24/python/dist/src/MinGW/python testbz2.py
> Traceback (most recent call last):
>   File "testbz2.py", line 9, in ?
>     lines = bz2f.readlines()
> RuntimeError: wrong sequence of bz2 library commands used

I don't understand this. The sources for the bz2 modules are exactly equal
in both branches.

How do you check out the 2.4.1 from CVS?

Reinhold

-- 
Mail address is perfectly valid!


From paragate at gmx.net  Sat Aug 27 12:21:03 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Sat, 27 Aug 2005 12:21:03 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <43100E14.6080009@v.loewis.de>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org> <43100E14.6080009@v.loewis.de>
Message-ID: <op.sv5wddyq0gn541@theta>

On Sat, 27 Aug 2005 08:54:12 +0200, Martin v. Löwis <martin at v.loewis.de> wrote:
> with choice 1a): dict.get returns None if the key is not found, even
> though None could also be the value for the key.

that's a bug! i had to *test* it to find out it's true! i've been writing
code for *years* all in the understanding that dict.get(x) acts precisely
like dict['x'] *except* you get a chance to define a default value. which,
for me, has become sort of a standard solution to the problem the last ten
or so postings were all about: when i write a function and realize that's
one of the cases where python philosophy strongly favors raising an
exception because something e.g. could not be found where expected, i
make it so that a reasonable exception is raised *and* where meaningful
i give consumers a chance to pass in a default value to eschew
exceptions. i believe this is the way to go to resolve this .index/.find
conflict. and, no, returning -1 when a substring is not found and None
when a key is not found is *highly* problematic. i'd sure like to see
cases like that go.

i'm not sure why .rindex() should go (correct?), and how to do what it
does (reverse the string before doing .index()? is that what is done
internally?)

and of course, like always, there is the question why these are methods
at all and why there is a function len(str) but a method str.index(); one
could just as well have *either* str.length and str.index() *or*
length(str) and, say, a builtin

  locate( x, element, start = 0, stop = None, reversed = False,
          default = Misfit )

(where Misfit indicates a 'meta-None', so None is still a valid default
value; i also like to indicate 'up to the end' with stop=None) that does
on iterables (or only on sequences) what the methods do now, but with
this strange pattern:

------------------------------------------------------------------
         .index() .find() .get() .pop()
list       +               ?(3)   +
tuple                      ?(3)   ??(1)
str        +        +      ?(3)   ??(1)
dict       x(2)     x(2)   +      +

(1) one could argue this should return a copy of a tuple or str,
but doubtful. (2) index/find meaningless for dicts. (3) there
is no .get() for list, tuple, str, although it would make sense:
return the indexed element, or raise IndexError where not found
if no default return value given.
------------------------------------------------------------------

what bites me here is especially that we have both index and find
for str *but a gaping hole* for tuples. assuming tuples are not slated
for removal, i suggest to move in a direction that makes things look
more like this:

------------------------------------------------------------------
         .index() .get() .pop()
list       +       +      +
tuple      +       +
str        +       +
dict               +      +
------------------------------------------------------------------

where .index() looks like locate, above:

------------------------------------------------------------------
{list,tuple,str}.index(
     element,            # element in the collection
     start = 0,          # where to start searching; default is zero
     stop = None,        # where to end; the default, None, indicates
                         # 'to the end'
     reversed = False,   # should we search from the back? *may* cause
                         # reversion of sequence, depending on impl.
     default = _Misfit,  # default value, when given, prevents
                         # IndexError from being raised
     )
------------------------------------------------------------------
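
A minimal sketch of the locate()/index() signature proposed above, written
for Python 3 so it can be run today. The body is an assumption: the post
only gives the signature, so the name locate, the linear scan, and the
clamping of stop to len(seq) are illustrative choices. It matches single
elements only, not multi-character substrings.

```python
_Misfit = object()  # 'meta-None' sentinel, so that None stays a valid default


def locate(seq, element, start=0, stop=None, reversed=False, default=_Misfit):
    """Return the index of element within seq[start:stop].

    Raises IndexError when the element is absent, unless a default is
    given, in which case the default is returned instead.
    """
    # stop=None means 'to the end'; larger values are clamped.
    stop = len(seq) if stop is None else min(stop, len(seq))
    positions = range(start, stop)
    if reversed:
        # Search from the back without copying the sequence.
        positions = positions[::-1]
    for i in positions:
        if seq[i] == element:
            return i
    if default is _Misfit:
        raise IndexError("%r not found" % (element,))
    return default
```

With this shape, locate(t, x) raises like .index() does, while
locate(t, x, default=-1) behaves like .find(), for lists, tuples and
strings alike.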

hope i didn't miss out crucial points here.

_wolf



From martin at v.loewis.de  Sat Aug 27 12:35:30 2005
From: martin at v.loewis.de (Martin v. Löwis)
Date: Sat, 27 Aug 2005 12:35:30 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <op.sv5wddyq0gn541@theta>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org> <43100E14.6080009@v.loewis.de>
	<op.sv5wddyq0gn541@theta>
Message-ID: <431041F2.6060307@v.loewis.de>

Wolfgang Lipp wrote:
> that's a bug! i had to *test* it to find out it's true! i've been writing
> code for *years* all in the understanding that dict.get(x) acts precisely
> like dict['x'] *except* you get a chance to define a default value.

Clearly, your understanding *all* these years *was* wrong. If you don't
specify *a* default value, *it* defaults to None.

Regards,
Martin

P.S. Emphasis mine :-)

From paragate at gmx.net  Sat Aug 27 12:47:55 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Sat, 27 Aug 2005 12:47:55 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <431041F2.6060307@v.loewis.de>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org> <43100E14.6080009@v.loewis.de>
	<op.sv5wddyq0gn541@theta> <431041F2.6060307@v.loewis.de>
Message-ID: <op.sv5xl5cq0gn541@theta>

On Sat, 27 Aug 2005 12:35:30 +0200, Martin v. Löwis <martin at v.loewis.de>
wrote:
> P.S. Emphasis mine :-)

no, emphasis all **mine** :-) just to reflect i never expected .get()
to work that way (return an unsolicited None) -- i do consider this
behavior harmful and suggest it be removed.

_wolf


From just at letterror.com  Sat Aug 27 13:01:02 2005
From: just at letterror.com (Just van Rossum)
Date: Sat, 27 Aug 2005 13:01:02 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <op.sv5xl5cq0gn541@theta>
Message-ID: <r01050400-1039-DCD0371A16E911DA86CA001124365170@[10.0.0.24]>

Wolfgang Lipp wrote:

> On Sat, 27 Aug 2005 12:35:30 +0200, Martin v. Löwis
> <martin at v.loewis.de> wrote:
> > P.S. Emphasis mine :-)
> 
> no, emphasis all **mine** :-) just to reflect i never expected .get()
> to work that way (return an unsolicited None) -- i do consider this
> behavior harmful and suggest it be removed.

Just because you don't read the documentation and guessed wrong d.get()
needs to be removed?!?

It's a *feature* of d.get(k) to never raise KeyError. If you need an
exception, why not just use d[k]?

Just

From paragate at gmx.net  Sat Aug 27 13:33:40 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Sat, 27 Aug 2005 13:33:40 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <r01050400-1039-DCD0371A16E911DA86CA001124365170@[10.0.0.24]>
References: <r01050400-1039-DCD0371A16E911DA86CA001124365170@[10.0.0.24]>
Message-ID: <op.sv5zqejk0gn541@theta>

On Sat, 27 Aug 2005 13:01:02 +0200, Just van Rossum <just at letterror.com>  
wrote:

> Just because you don't read the documentation and guessed wrong d.get()
> needs to be removed?!?

no, not removed... never said that.

> It's a *feature* of d.get(k) to never raise KeyError. If you need an
> exception, why not just use d[k]?

i agree i misread the specs, but then, i read the specs a lot, and
i guess everyone here agrees that if it's in the specs doesn't mean
it's automatically what we want or expect -- else there's nothing to
discuss. i say

	d.get('x') == None
	<==
	{ ( 'x' not in d ) OR ( d['x'] == None ) }

is not what i expect (even tho the specs say so) especially since
d.pop('x') *does* throw a KeyError when 'x' is not a key in d.
ok, pop is not get and so on but still i perceive this a problematic
behavior (to the point i call it a 'bug' in a jocular way, no offense
implied). the reason of being for d.get() -- to me -- is simply so you
get a chance to pass a default value, which is syntactically well-nigh
impossible with d['x'].

_wolf

From skip at pobox.com  Sat Aug 27 14:48:20 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 27 Aug 2005 07:48:20 -0500
Subject: [Python-Dev] Style for raising exceptions (python-dev Summary
 for 2005-08-01 through 2005-08-15 [draft])
In-Reply-To: <430ECEDC.7040206@egenix.com>
References: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04E3@its-xchg4.massey.ac.nz>
	<430D90FF.6060206@egenix.com>
	<ca471dc205082508174cb1d240@mail.gmail.com>
	<430ECEDC.7040206@egenix.com>
Message-ID: <17168.24852.883667.347938@montanaro.dyndns.org>


    MAL> I don't see a need for two or more syntaxes either, but most code
    MAL> nowadays uses the second variant (I don't know of any code that
    MAL> uses the traceback argument), which puts up a high barrier for
    MAL> changes.

Earlier this week I managed to fix all the instances in the projects I'm
involved with at my day job in a couple rounds of grep/emacs macro sessions.
It took all of about 20 minutes, so I don't think the conversion will be
onerous.

Skip

From kay.schluehr at gmx.net  Sat Aug 27 14:57:08 2005
From: kay.schluehr at gmx.net (Kay Schluehr)
Date: Sat, 27 Aug 2005 14:57:08 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <deoed1$hce$1@sea.gmane.org>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org>
Message-ID: <depnv5$236$1@sea.gmane.org>

Terry Reedy wrote:

>>I would object to the removal of str.find().
> 
> 
> So, I wonder, what is your favored alternative?
> 
> A. Status quo: ignore the opportunity to streamline the language.

I actually don't see much benefit from the user's perspective. The 
discourse about Python3000 has shrunk from the expectation of the 
"next big thing" into a depressing rhetoric of feature elimination.
The language doesn't seem to become deeper, smaller and more powerful 
but just smaller.


> B. Change the return type of .find to None.
> 
> C. Remove .(r)index instead.
> 
> D. Add more redundancy for those who do not like exceptions.

Why not turn index() into an iterator that yields indices 
successively? From this generalized perspective we can try to reconstruct 
the behaviour of Python 2.X.

Sometimes I use a custom keep() function if I want to prevent defining a 
block for catching StopIteration. The keep() function takes an iterator 
and returns a default value in case of StopIteration:

def keep(iter, default=None):
     try:
         return iter.next()
     except StopIteration:
         return default

Together with an index iterator the user can mimic the behaviour he 
wants. Instead of a ValueError, a StopIteration exception can serve as
the "external" signal ( as opposed to a default value ):

 >>> keep( "abcdabc".index("bc"), default=-1)  # current behaviour of the
                                               # find() function
 >>> (idx for idx in "abcdabc".rindex("bc"))   # generator expression


Since the find() method acts on a string literal it is not easy to
replace it syntactically. But why not add functions that can be hooked 
into classes whose objects are represented by literals?

def find( string, substring):
     return keep( string.index( substring), default=-1)

str.register(find)

 >>> "abcdabc".find("bc")
1

Now find() can be stored in a pure Python module without maintaining it 
on interpreter level ( same as with reduce, map and filter ).
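
The index-iterator idea can be tried out without changing str. The
following is only a sketch for Python 3: iter_indices is a hypothetical
free function standing in for the proposed iterator-returning index(),
keep() is rewritten with the built-in next(), and the str.register() hook
from the proposal is not modelled because no such method exists.

```python
def iter_indices(string, substring):
    """Yield every index at which substring occurs in string, left to right."""
    pos = 0
    while True:
        try:
            pos = string.index(substring, pos)
        except ValueError:
            return  # no further match: the generator simply ends
        yield pos
        pos += 1  # continue after this match (overlapping matches allowed)


def keep(iterator, default=None):
    """Return the iterator's next item, or default once it is exhausted."""
    return next(iterator, default)


def find(string, substring):
    """Mimic str.find(): first index of substring, or -1 if absent."""
    return keep(iter_indices(string, substring), default=-1)
```

Here find("abcdabc", "bc") gives the first match, and
list(iter_indices("abcdabc", "bc")) gives all of them.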

Kay


From raymond.hettinger at verizon.net  Sat Aug 27 14:56:28 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 Aug 2005 08:56:28 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <op.sv5zqejk0gn541@theta>
Message-ID: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>

FWIW, here are three more comparative code fragments.  They are
presented without judgment as an evaluation tool to let everyone form
their own opinion about the merits of each:


--- From CGIHTTPServer.py ---------------

def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    . . .


def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    try:
        i = rest.rindex('?')
    except ValueError:
        query = ''
    else:
        rest, query = rest[:i], rest[i+1:]
    try:
        i = rest.index('/')
    except ValueError:
        script, rest = rest, ''
    else:
        script, rest = rest[:i], rest[i:]
    . . .


--- From ConfigParser.py ---------------

optname, vi, optval = mo.group('option', 'vi', 'value')
if vi in ('=', ':') and ';' in optval:
    # ';' is a comment delimiter only if it follows
    # a spacing character
    pos = optval.find(';')
    if pos != -1 and optval[pos-1].isspace():
        optval = optval[:pos]
optval = optval.strip()
. . .


optname, vi, optval = mo.group('option', 'vi', 'value')
if vi in ('=', ':') and ';' in optval:
    # ';' is a comment delimiter only if it follows
    # a spacing character
    try:
        pos = optval.index(';')
    except ValueError:
        pass
    else:
        if optval[pos-1].isspace():
            optval = optval[:pos]
optval = optval.strip()
. . .


--- StringIO.py ---------------

i = self.buf.find('\n', self.pos)
if i < 0:
    newpos = self.len
else:
    newpos = i+1
. . .


try:
    i = self.buf.index('\n', self.pos)
except ValueError:
    newpos = self.len
else:
    newpos = i+1
. . .


My notes so far weren't meant to judge the proposal.  I'm just
suggesting that examining fragments like the ones above will help inform
the design process.

Peace,


Raymond


From just at letterror.com  Sat Aug 27 15:08:34 2005
From: just at letterror.com (Just van Rossum)
Date: Sat, 27 Aug 2005 15:08:34 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <op.sv5zqejk0gn541@theta>
Message-ID: <r01050400-1039-ADAAF1E416FB11DA86CA001124365170@[10.0.0.24]>

Wolfgang Lipp wrote:

> > Just because you don't read the documentation and guessed wrong
> > d.get() needs to be removed?!?
> 
> no, not removed... never said that.

Fair enough, you proposed to remove the behavior. Not sure how that's
all that much less bad, though...

> implied). the reason of being for d.get() -- to me -- is simply so you
> get a chance to pass a default value, which is syntactically well-nigh
> impossible with d['x'].

Close, but the main reason to add d.get() was to avoid the exception.
The need to specify a default value followed from that.

Just

From paragate at gmx.net  Sat Aug 27 15:16:13 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Sat, 27 Aug 2005 15:16:13 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <depnv5$236$1@sea.gmane.org>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org> <depnv5$236$1@sea.gmane.org>
Message-ID: <op.sv54hbhb0gn541@theta>


kay,

your suggestion makes perfect sense for me, i haven't actually tried
the examples tho. guess there could be a find() or index() or
indices() or iterIndices() ??? function 'f' roughly with these arguments:

def f( x, element, start = 0, stop = None, default = _Misfit,
       maxcount = None, reverse = False )

that iterates over the indices of x where element (a substring, key, or
value in a sequence or iterator) is found, raising sth. like IndexError
when nothing at all is found except when default is not '_Misfit'
(meta-None), and starts looking from the right end when reverse is True
(this *may* imply that reversed(x) is done on x where no better
implementation is available). not quite sure whether it makes sense to
me to always return default as the last value of the iteration -- i tend
to say rather not. ah yes, only up to maxcount indices are yielded.

be it said that passing an iterator for x would mean that the iterator
is gone up to where the last index was yielded; passing an iterator is
not acceptable for reverse = True.

MHO,

_wolf


On Sat, 27 Aug 2005 14:57:08 +0200, Kay Schluehr <kay.schluehr at gmx.net>  
wrote:
>
> def keep(iter, default=None):
>      try:
>          return iter.next()
>      except StopIteration:
>          return default
>
> Together with an index iterator the user can mimic the behaviour he
> wants. Instead of a ValueError, a StopIteration exception can serve as
> the "external" signal ( as opposed to a default value ):
>
>  >>> keep( "abcdabc".index("bc"), default=-1)  # current behaviour of the
>                                                # find() function
>  >>> (idx for idx in "abcdabc".rindex("bc"))   # generator expression
>
>
> Since the find() method acts on a string literal it is not easy to
> replace it syntactically. But why not add functions that can be hooked
> into classes whose objects are represented by literals?
>
> def find( string, substring):
>      return keep( string.index( substring), default=-1)
>
> str.register(find)
>
>  >>> "abcdabc".find("bc")
> 1
>
> Now find() can be stored in a pure Python module without maintaining it
> on interpreter level ( same as with reduce, map and filter ).
>
> Kay


From abkhd at hotmail.com  Sat Aug 27 16:18:49 2005
From: abkhd at hotmail.com (A.B., Khalid)
Date: Sat, 27 Aug 2005 14:18:49 +0000
Subject: [Python-Dev] test_bz2 on Python 2.4.1
Message-ID: <BAY23-F189A51C8C2C4251E5710B2ABAD0@phx.gbl>

Reinhold Birkenfeld wrote:

>>#--------------------------- Python 2.5a0 from CVS -----------------#
>># Result: passes
>>$ /g/projs/py25/python/dist/src/MinGW/python testbz2.py
>>
>>
>>#--------------------------- Python 2.4.1 from CVS -----------------#
>># Result: fails
>>$ /g/projs/py24/python/dist/src/MinGW/python testbz2.py
>>Traceback (most recent call last):
>>   File "testbz2.py", line 9, in ?
>>     lines = bz2f.readlines()
>>RuntimeError: wrong sequence of bz2 library commands used
>
>I don't understand this. The sources for the bz2 modules are exactly equal
>in both branches.

I know. Even the tests are equal. I didn't say that these files are to 
blame, I just said that the test is failing in Python 2.4.1 on Windows.


>How do you check out the 2.4.1 from CVS?


Well, I've been updating Python from CVS for more than a year now and I 
doubt that this is the problem. After all, Python 2.3.5 is passing the 
regrtests, and last time I checked, so is Python 2.5a0. Python 2.4.1 was 
also passing all the regrtests until recently (not sure exactly when, but it 
could be about a month ago). But anyway, here is how I update my copy of 
Python 2.4 from CVS. Roughly,

cvs -d :pserver:anonymous@cvs.sourceforge.net:/cvsroot/python login
[Enter]
cvs -z7 -d :pserver:anonymous@cvs.sourceforge.net:/cvsroot/python update -dP -r release24-maint python


And it is, more or less, the same way I check out other branches.

I will download the Python 2.4.1 source archive and build it to see what 
happens. I'll report back when I am done.


Regards,
Khalid



From gvanrossum at gmail.com  Sat Aug 27 16:29:07 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 07:29:07 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <003001c5aae1$4d843280$a8bb9d8d@oemcomputer>
References: <43100E14.6080009@v.loewis.de>
	<003001c5aae1$4d843280$a8bb9d8d@oemcomputer>
Message-ID: <ca471dc20508270729502f22f@mail.gmail.com>

On 8/27/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> [Martin]
> > For another example, file.read() returns an empty string at EOF.
> 
> When my turn comes for making 3.0 proposals, I'm going to recommend
> nixing the "empty string at EOF" API.  That is a carry-over from C that
> made some sense before there were iterators.  Now, we have the option of
> introducing much cleaner iterator versions of these methods that use
> compact, fast, and readable for-loops instead of multi-statement
> while-loop boilerplate.

-1.

For reading lines we already have that in the status quo.

For reading bytes, I *know* that a lot of code would become uglier if
the API changed to raise EOFError exceptions. It's not a coincidence
that raw_input() raises EOFError but readline() doesn't -- the
readline API was designed after extensive experience with
raw_input().

The situation is different than for find():

- there aren't two APIs that only differ in their handling of the
exceptional case

- the error return value tests false and all non-error return values test true

- in many cases processing the error return value the same as
non-error return values works just fine (as long as you have another
way to test for termination)

Also, even if read() raised EOFError instead of returning '', code
that expects certain data wouldn't be simplified -- after attempting
to read e.g. 4 bytes, you'd still have to check that you got exactly
4, so there'd be three cases to handle (EOFError, short, good) instead
of two (short, good).
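
To illustrate the two-case handling described above, here is a sketch for
Python 3. The helper name read_record and the choice of ValueError for
truncated input are assumptions made for the example, not part of any
existing API.

```python
import io


def read_record(f, n):
    """Read exactly n bytes from f, treating anything shorter as truncated."""
    data = f.read(n)
    # One length check covers both the empty result at EOF and a short
    # read, so there are only two cases to handle: short and good.
    if len(data) != n:
        raise ValueError("truncated input: wanted %d bytes, got %d"
                         % (n, len(data)))
    return data


stream = io.BytesIO(b"abcdef")
first = read_record(stream, 4)   # good: four bytes available
try:
    read_record(stream, 4)       # short: only two bytes remain
    truncated = False
except ValueError:
    truncated = True
```

If read() raised EOFError at end of file, the same helper would need an
extra except clause on top of the length check.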

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Sat Aug 27 16:36:46 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 07:36:46 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
References: <op.sv5zqejk0gn541@theta>
	<000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
Message-ID: <ca471dc20508270736c4fe03@mail.gmail.com>

On 8/27/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:

> --- From ConfigParser.py ---------------
> 
> optname, vi, optval = mo.group('option', 'vi', 'value')
> if vi in ('=', ':') and ';' in optval:
>     # ';' is a comment delimiter only if it follows
>     # a spacing character
>     pos = optval.find(';')
>     if pos != -1 and optval[pos-1].isspace():
>         optval = optval[:pos]
> optval = optval.strip()
> . . .
> 
> 
> optname, vi, optval = mo.group('option', 'vi', 'value')
> if vi in ('=', ':') and ';' in optval:
>     # ';' is a comment delimiter only if it follows
>     # a spacing character
>     try:
>         pos = optval.index(';')
>     except ValueError():

I'm sure you meant "except ValueError:"

>         pass
>     else:
>         if optval[pos-1].isspace():
>             optval = optval[:pos]
> optval = optval.strip()
> . . .

That code is buggy before and after the transformation -- consider
what happens if optval *starts* with a semicolon. Also, the code is
searching optval for ';' twice. Suggestion:

if vi in ('=',':'):
  try: pos = optval.index(';')
  except ValueError: pass
  else:
    if pos > 0 and optval[pos-1].isspace():
      optval = optval[:pos]
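
The suggestion above can be exercised in isolation. This is a sketch for
Python 3: the wrapper function strip_inline_comment is hypothetical (the
real code lives inline in ConfigParser), and the final strip() is carried
over from the quoted fragment.

```python
def strip_inline_comment(vi, optval):
    """Drop a trailing '; comment' from optval when ';' follows whitespace."""
    if vi in ('=', ':'):
        try:
            pos = optval.index(';')
        except ValueError:
            pass  # no semicolon at all: nothing to strip
        else:
            # pos > 0 guards against optval[-1] being tested when the
            # value *starts* with a semicolon.
            if pos > 0 and optval[pos - 1].isspace():
                optval = optval[:pos]
    return optval.strip()
```

For example, 'value ; note' becomes 'value', while 'a;b' and a value
beginning with ';' are left alone.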

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From reinhold-birkenfeld-nospam at wolke7.net  Sat Aug 27 16:40:36 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 27 Aug 2005 16:40:36 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <003001c5aae1$4d843280$a8bb9d8d@oemcomputer>
References: <43100E14.6080009@v.loewis.de>
	<003001c5aae1$4d843280$a8bb9d8d@oemcomputer>
Message-ID: <depu14$ekb$1@sea.gmane.org>

Raymond Hettinger wrote:
> [Martin]
>> For another example, file.read() returns an empty string at EOF.
> 
> When my turn comes for making 3.0 proposals, I'm going to recommend
> nixing the "empty string at EOF" API.  That is a carry-over from C that
> made some sense before there were iterators.  Now, we have the option of
> introducing much cleaner iterator versions of these methods that use
> compact, fast, and readable for-loops instead of multi-statement
> while-loop boilerplate.

I think

for char in iter(lambda: f.read(1), ''):
    pass

is not bad, too.

Reinhold

-- 
Mail address is perfectly valid!


From gvanrossum at gmail.com  Sat Aug 27 16:42:48 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 07:42:48 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <depnv5$236$1@sea.gmane.org>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org> <depnv5$236$1@sea.gmane.org>
Message-ID: <ca471dc205082707427e5630e5@mail.gmail.com>

On 8/27/05, Kay Schluehr <kay.schluehr at gmx.net> wrote:
> The discourse about Python3000 has shrunk from the expectation of the
> "next big thing" into a depressing rhetoric of feature elimination.
> The language doesn't seem to become deeper, smaller and more powerful
> but just smaller.

I understand how your perception reading python-dev would make you
think that, but it's not true.

There is much focus on removing things, because we want to be able to
add new stuff but we don't want the language to grow. Python-dev is
(correctly) very focused on the status quo and the near future, so
discussions on what can be removed without hurting are valuable here.

Discussions on what to add should probably happen elsewhere, since the
proposals tend to range from genius to insane (sometimes within one
proposal :-) and the discussion tends to become even more rampant than
the discussions about changes in 2.5.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Sat Aug 27 16:46:07 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 07:46:07 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <op.sv5xl5cq0gn541@theta>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org> <43100E14.6080009@v.loewis.de>
	<op.sv5wddyq0gn541@theta> <431041F2.6060307@v.loewis.de>
	<op.sv5xl5cq0gn541@theta>
Message-ID: <ca471dc20508270746f31bac3@mail.gmail.com>

On 8/27/05, Wolfgang Lipp <paragate at gmx.net> wrote:
> i never expected .get()
> to work that way (return an unsolicited None) -- i do consider this
> behavior harmful and suggest it be removed.

That's a bizarre attitude. You don't read the docs and hence you want
a feature you weren't aware of to be removed? I'm glad you're not on
*my* team. (Emphasis mine. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From reinhold-birkenfeld-nospam at wolke7.net  Sat Aug 27 16:50:58 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 27 Aug 2005 16:50:58 +0200
Subject: [Python-Dev] test_bz2 on Python 2.4.1
In-Reply-To: <BAY23-F189A51C8C2C4251E5710B2ABAD0@phx.gbl>
References: <BAY23-F189A51C8C2C4251E5710B2ABAD0@phx.gbl>
Message-ID: <depuki$gfi$1@sea.gmane.org>

A.B., Khalid wrote:

>>>#--------------------------- Python 2.4.1 from CVS -----------------#
[test_bz2]
>>>RuntimeError: wrong sequence of bz2 library commands used
>>
>>I don't understand this. The sources for the bz2 modules are exactly equal
>>in both branches.
> 
> I know. Even the tests are equal. I didn't say that these files are to 
> blame, I just said that the test is failing in Python 2.4.1 on Windows.


> cvs -d :pserver:anonymous@cvs.sourceforge.net:/cvsroot/python login
> cvs -z7 -d :pserver:anonymous@cvs.sourceforge.net:/cvsroot/python update -dP -r release24-maint python
> 
> And it is, more or less, the same way I check out other branches.

No problem here, just eliminating possibilities.

Could anyone else on Windows please try the test_bz2, too?

Reinhold

-- 
Mail address is perfectly valid!


From gvanrossum at gmail.com  Sat Aug 27 17:03:35 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 08:03:35 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <20050826184634.7E06.JCARLSON@uci.edu>
References: <20050826134317.7DFD.JCARLSON@uci.edu> <deoed1$hce$1@sea.gmane.org>
	<20050826184634.7E06.JCARLSON@uci.edu>
Message-ID: <ca471dc205082708035df3cfb7@mail.gmail.com>

On 8/26/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> Taking a look at the commits that Guido did way back in 1993, he doesn't
> mention why he added .find, only that he did.  Maybe it was another of
> the 'functional language additions' that he now regrets, I don't know.

There's nothing functional about it. I remember adding it after
finding it cumbersome to write code using index/rindex. However, that
was long before we added startswith(), endswith(), and 's in t' for
multichar s. Clearly all sorts of varieties of substring matching are
important, or we wouldn't have so many methods devoted to it! (Not to
mention the 're' module.)

However, after 12 years, I believe that the small benefit of having
find() is outweighed by the frequent occurrence of bugs in its use.
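
The message does not list the bugs it has in mind. Two that are commonly
seen with find() are sketched below for Python 3; the function names are
hypothetical and exist only for this illustration.

```python
def contains_buggy(s, sub):
    """Wrong: treats the result of find() as a truth value."""
    # -1 (not found) is truthy, while 0 (found at the start) is falsy.
    return bool(s.find(sub))


def contains_fixed(s, sub):
    """Right: compare against -1 explicitly (or simply use 'sub in s')."""
    return s.find(sub) != -1


def tail_buggy(s, sub):
    """Wrong: slices from find() without checking for -1."""
    # When sub is absent, s[-1:] silently returns the last character.
    return s[s.find(sub):]


def tail_with_index(s, sub):
    """index() raises ValueError instead of returning a usable-looking -1."""
    return s[s.index(sub):]
```

With index() the missing-substring case cannot be ignored by accident,
because the exception has to be handled or it propagates.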

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Sat Aug 27 17:04:54 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 Aug 2005 11:04:54 -0400
Subject: [Python-Dev] empty string api for files
In-Reply-To: <ca471dc20508270729502f22f@mail.gmail.com>
Message-ID: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer>

> For reading bytes, I *know* that a lot of code would become uglier if
> the API changed to raise EOFError exceptions


I had StopIteration in mind.  Instead of writing:

    while 1:
        block = f.read(20)
        if block == '':
            break
        . . .

We would use:

    for block in f.readblocks(20):
        . . .


More beauty, a little faster, more concise, and less error-prone.  Of
course, there are likely better choices for the method name, but you get
the gist of it.
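
As a rough sketch of the idea, the proposed method can be approximated
today by a generator function. This is written for Python 3; readblocks
is the placeholder name from the message, and making it a free function
that takes the file object is an assumption.

```python
import io


def readblocks(f, size):
    """Yield successive blocks of up to size items read from f."""
    while True:
        block = f.read(size)
        if not block:   # empty str or bytes means end of file
            return      # ends the generator, which raises StopIteration
        yield block


source = io.StringIO("x" * 45)
lengths = [len(block) for block in readblocks(source, 20)]
```

The while-loop boilerplate is written once inside the generator, and
callers only see the for-loop.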


From paragate at gmx.net  Sat Aug 27 17:12:45 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Sat, 27 Aug 2005 17:12:45 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <ca471dc20508270746f31bac3@mail.gmail.com>
References: <dens12$5kg$1@sea.gmane.org> <20050826134317.7DFD.JCARLSON@uci.edu>
	<deoed1$hce$1@sea.gmane.org> <43100E14.6080009@v.loewis.de>
	<op.sv5wddyq0gn541@theta> <431041F2.6060307@v.loewis.de>
	<op.sv5xl5cq0gn541@theta>
	<ca471dc20508270746f31bac3@mail.gmail.com>
Message-ID: <op.sv59vja10gn541@theta>

On Sat, 27 Aug 2005 16:46:07 +0200, Guido van Rossum  
<gvanrossum at gmail.com> wrote:

> On 8/27/05, Wolfgang Lipp <paragate at gmx.net> wrote:
>> i never expected .get()
>> to work that way (return an unsolicited None) -- i do consider this
>> behavior harmful and suggest it be removed.
>
> That's a bizarre attitude. You don't read the docs and hence you want
> a feature you weren't aware of to be removed?

i do read the docs, and i believe i do keep a lot of detail in my
head. every now and then, tho, you piece sth together using a logic
that is not 100% the way it was intended, or the way it came about.
let me say that for someone who did development for python for
a while it is natural to know that ~.get() is there for avoidance
of exceptions, and default values are an afterthought, but for someone
who did development *with* python (and lacks experience of the other
side) this ain't necessarily so. that said, i believe it to be
more expressive and safer to demand ~.get('x',None) to be written
to achieve the present behavior, and let ~.get('x') raise an
exception. personally, i can live with either way, and am happier
the second. just my thoughts.

> I'm glad you're not on *my* team. (Emphasis mine. :-)

i wonder what that would be like.

_wolf


From abkhd at hotmail.com  Sat Aug 27 17:23:35 2005
From: abkhd at hotmail.com (A.B., Khalid)
Date: Sat, 27 Aug 2005 15:23:35 +0000
Subject: [Python-Dev] test_bz2 fails on Python 2.4.1 from CVS,
	passes on same from source archieve
Message-ID: <BAY23-F37ECD0D1ED0BE3585D12A7ABAD0@phx.gbl>

Okay, here is the output of test_bz2 on Python 2.4.1 updated and compiled 
fresh from CVS, and on Python 2.4.1 from the source archive from python.org 
(http://www.python.org/ftp/python/2.4.1/Python-2.4.1.tar.bz2).


#-----------------------------------------------------------------------------
# Python 2.4.1 compiled from source archive:
# Result: passes
#-----------------------------------------------------------------------------
$ cd /g/projs/py241-src-arc/mingw
$ python ../Lib/test/test_bz2.py
testIterator (__main__.BZ2FileTest) ... ok
testOpenDel (__main__.BZ2FileTest) ... ok
testOpenNonexistent (__main__.BZ2FileTest) ... ok
testRead (__main__.BZ2FileTest) ... ok
testRead100 (__main__.BZ2FileTest) ... ok
testReadChunk10 (__main__.BZ2FileTest) ... ok
testReadLine (__main__.BZ2FileTest) ... ok
testReadLines (__main__.BZ2FileTest) ... ok
testSeekBackwards (__main__.BZ2FileTest) ... ok
testSeekBackwardsFromEnd (__main__.BZ2FileTest) ... ok
testSeekForward (__main__.BZ2FileTest) ... ok
testSeekPostEnd (__main__.BZ2FileTest) ... ok
testSeekPostEndTwice (__main__.BZ2FileTest) ... ok
testSeekPreStart (__main__.BZ2FileTest) ... ok
testUniversalNewlinesCRLF (__main__.BZ2FileTest) ... ok
testUniversalNewlinesLF (__main__.BZ2FileTest) ... ok
testWrite (__main__.BZ2FileTest) ... ok
testWriteChunks10 (__main__.BZ2FileTest) ... ok
testWriteLines (__main__.BZ2FileTest) ... ok
testXReadLines (__main__.BZ2FileTest) ... ok
testCompress (__main__.BZ2CompressorTest) ... ok
testCompressChunks10 (__main__.BZ2CompressorTest) ... ok
testDecompress (__main__.BZ2DecompressorTest) ... ok
testDecompressChunks10 (__main__.BZ2DecompressorTest) ... ok
testDecompressUnusedData (__main__.BZ2DecompressorTest) ... ok
testEOFError (__main__.BZ2DecompressorTest) ... ok
test_Constructor (__main__.BZ2DecompressorTest) ... ok
testCompress (__main__.FuncTest) ... ok
testDecompress (__main__.FuncTest) ... ok
testDecompressEmpty (__main__.FuncTest) ... ok
testDecompressIncomplete (__main__.FuncTest) ... ok

----------------------------------------------------------------------
Ran 31 tests in 6.430s

OK


#-----------------------------------------------------------------------------
# Python 2.4.1 compiled from CVS updated even today:
# Result: fails
#-----------------------------------------------------------------------------
$ cd /g/projs/py24/python/dist/src/MinGW
$ python ../Lib/test/test_bz2.py
testBug1191043 (__main__.BZ2FileTest) ... ERROR
ERROR
testIterator (__main__.BZ2FileTest) ... ok
testModeU (__main__.BZ2FileTest) ... ok
testOpenDel (__main__.BZ2FileTest) ... ok
testOpenNonexistent (__main__.BZ2FileTest) ... ok
testRead (__main__.BZ2FileTest) ... ok
testRead100 (__main__.BZ2FileTest) ... ok
testReadChunk10 (__main__.BZ2FileTest) ... ok
testReadLine (__main__.BZ2FileTest) ... ok
testReadLines (__main__.BZ2FileTest) ... ok
testSeekBackwards (__main__.BZ2FileTest) ... ok
testSeekBackwardsFromEnd (__main__.BZ2FileTest) ... ok
testSeekForward (__main__.BZ2FileTest) ... ok
testSeekPostEnd (__main__.BZ2FileTest) ... ok
testSeekPostEndTwice (__main__.BZ2FileTest) ... ok
testSeekPreStart (__main__.BZ2FileTest) ... ok
testUniversalNewlinesCRLF (__main__.BZ2FileTest) ... ok
testUniversalNewlinesLF (__main__.BZ2FileTest) ... ok
testWrite (__main__.BZ2FileTest) ... ok
testWriteChunks10 (__main__.BZ2FileTest) ... ok
testWriteLines (__main__.BZ2FileTest) ... ok
testXReadLines (__main__.BZ2FileTest) ... ok
testCompress (__main__.BZ2CompressorTest) ... ok
testCompressChunks10 (__main__.BZ2CompressorTest) ... ok
testDecompress (__main__.BZ2DecompressorTest) ... ok
testDecompressChunks10 (__main__.BZ2DecompressorTest) ... ok
testDecompressUnusedData (__main__.BZ2DecompressorTest) ... ok
testEOFError (__main__.BZ2DecompressorTest) ... ok
test_Constructor (__main__.BZ2DecompressorTest) ... ok
testCompress (__main__.FuncTest) ... ok
testDecompress (__main__.FuncTest) ... ok
testDecompressEmpty (__main__.FuncTest) ... ok
testDecompressIncomplete (__main__.FuncTest) ... ok

======================================================================
ERROR: testBug1191043 (__main__.BZ2FileTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "../Lib/test/test_bz2.py", line 255, in testBug1191043
    lines = bz2f.readlines()
RuntimeError: wrong sequence of bz2 library commands used

======================================================================
ERROR: testBug1191043 (__main__.BZ2FileTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "../Lib/test/test_bz2.py", line 47, in tearDown
    os.unlink(self.filename)
OSError: [Errno 13] Permission denied: '@test'

----------------------------------------------------------------------
Ran 33 tests in 5.980s

FAILED (errors=2)
Traceback (most recent call last):
  File "../Lib/test/test_bz2.py", line 357, in ?
    test_main()
  File "../Lib/test/test_bz2.py", line 353, in test_main
    FuncTest
  File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 290, in run_unittest
    run_suite(suite, testclass)
  File "G:\PROJS\PY24\PYTHON\DIST\SRC\lib\test\test_support.py", line 274, in run_suite
    raise TestFailed(msg)
test.test_support.TestFailed: errors occurred; run in verbose mode for details



From raymond.hettinger at verizon.net  Sat Aug 27 17:54:39 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 Aug 2005 11:54:39 -0400
Subject: [Python-Dev]  Remove str.find in 3.0?
Message-ID: <001801c5ab1f$a2375bc0$a8bb9d8d@oemcomputer>

[Guido]
> However, after 12 years, I believe that the small benefit of having
> find() is outweighed by the frequent occurrence of bugs in its use.

My little code transformation exercise is bearing that out.  Two of the
first four cases in the standard library were buggy :-(


Raymond


From tim.peters at gmail.com  Sat Aug 27 18:38:47 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 27 Aug 2005 12:38:47 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
References: <op.sv5zqejk0gn541@theta>
	<000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
Message-ID: <1f7befae05082709382c6701b5@mail.gmail.com>

[Raymond Hettinger, rewrites some code]
> ...
> --- StringIO.py ---------------
> 
> i = self.buf.find('\n', self.pos)
> if i < 0:
>    newpos = self.len
> else:
>    newpos = i+1
> . . .
> 
> 
> try:
>    i = self.buf.index('\n', self.pos)
> except ValueError():
>    newpos = self.len
> else:
>    newpos = i+1
> . . .

You probably want "except ValueError:" in all these, not "except ValueError():".

Leaving that alone, the last example particularly shows one thing I
dislike about try/except here:  in a language with properties, how is
the code reader supposed to guess that it's specifically and only the
.index() call that _can_ raise ValueError in

    i = self.buf.index('\n', self.pos)

?  I agree it's clear enough here from context, but there's no
confusion possible on this point in the original spelling:  it's
immediately obvious that the result of find() is the only thing being
tested.  There's also strong temptation to slam everything into the
'try' block, and reduce nesting:

newpos = self.len
try:
    newpos = self.buf.index('\n', self.pos) + 1
except ValueError:
    pass

I've often seen code in the wild with, say, two-three dozen lines in a
``try`` block, with an "except AttributeError:" that was _intended_ to
catch an expected AttributeError only in the second of those lines. 
Of course that hides legitimate bugs too.  Like ``object.attr``, the
result of ``string.find()`` is normally used in further computation,
so the temptation is to slam the computation inside the ``try`` block
too.

.find() is a little delicate to use, but IME sloppy try/except
practice (putting much more in the ``try`` block than the specific
little operation where an exception is expected) is common, and harder
to get people to change because it requires thought instead of just
reading the manual to see that -1 means "not there" <0.5 wink>.
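
A minimal sketch of the two spellings being compared, written for
Python 3.  The function names are made up for illustration and are not
taken from StringIO.py.  The point is only that the ``try`` block wraps
the single call expected to raise, with the follow-on computation kept
in ``else``:

```python
def next_line_end_find(buf, pos):
    """Return the position just past the next newline, or len(buf)."""
    i = buf.find('\n', pos)
    if i < 0:                      # -1 means "not there"
        return len(buf)
    return i + 1


def next_line_end_index(buf, pos):
    """Same result, using index() with a deliberately narrow try block."""
    try:
        i = buf.index('\n', pos)   # the only call expected to raise
    except ValueError:
        return len(buf)
    else:
        return i + 1               # further computation stays outside 'try'
```

Both functions should agree on every input; they differ only in how the
"not found" case is signalled to the reader.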

Another consideration is code that needs to use .find() a _lot_.  In
my programs of that sort, try/except is a lot more expensive than
letting -1 signal "not there".
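
One way to check the cost claim on a given interpreter is a small
timeit comparison of the "miss" case.  This is only a sketch: the
helper names, the test string and the repeat count are arbitrary
assumptions, and the numbers will vary by Python version and machine.

```python
import timeit


def with_find(s, sub):
    """Report presence using the -1 return value of find()."""
    return s.find(sub) >= 0


def with_index(s, sub):
    """Report presence using index() and catching ValueError."""
    try:
        s.index(sub)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    haystack = "x" * 100
    for fn in (with_find, with_index):
        # Time only the "not found" path, where the exception is raised.
        seconds = timeit.timeit(lambda: fn(haystack, "y"), number=100000)
        print("%s: %.3fs for 100000 misses" % (fn.__name__, seconds))
```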

From raymond.hettinger at verizon.net  Sat Aug 27 18:46:17 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 Aug 2005 12:46:17 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <1f7befae05082709382c6701b5@mail.gmail.com>
Message-ID: <001a01c5ab26$d901adc0$a8bb9d8d@oemcomputer>

[Tim]
> You probably want "except ValueError:" in all these, not "except
> ValueError():".

Right.  I was misremembering the new edict to write:

   raise ValueError()


Raymond


From tim.peters at gmail.com  Sat Aug 27 19:09:20 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 27 Aug 2005 13:09:20 -0400
Subject: [Python-Dev] test_bz2 on Python 2.4.1
In-Reply-To: <depuki$gfi$1@sea.gmane.org>
References: <BAY23-F189A51C8C2C4251E5710B2ABAD0@phx.gbl>
	<depuki$gfi$1@sea.gmane.org>
Message-ID: <1f7befae05082710091228649@mail.gmail.com>

[Reinhold Birkenfeld]
> Could anyone else on Windows please try the test_bz2, too?

test_bz2 works fine here, on WinXP Pro SP2, under release and debug
builds, on current CVS HEAD and on current CVS release24-maint branch.
 I built those 4 Pythons with the MS compiler, not MinGW.

From jcarlson at uci.edu  Sat Aug 27 19:16:34 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 27 Aug 2005 10:16:34 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <ca471dc205082708035df3cfb7@mail.gmail.com>
References: <20050826184634.7E06.JCARLSON@uci.edu>
	<ca471dc205082708035df3cfb7@mail.gmail.com>
Message-ID: <20050827095203.7E0C.JCARLSON@uci.edu>


Guido van Rossum <gvanrossum at gmail.com> wrote:
> 
> On 8/26/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > Taking a look at the commits that Guido did way back in 1993, he doesn't
> > mention why he added .find, only that he did.  Maybe it was another of
> > the 'functional language additions' that he now regrets, I don't know.
> 
> There's nothing functional about it. I remember adding it after
> finding it cumbersome to write code using index/rindex. However, that
> was long before we added startswith(), endswith(), and 's in t' for
> multichar s. Clearly all sorts of varieties of substring matching are
> important, or we wouldn't have so many methods devoted to it! (Not to
> mention the 're' module.)
> 
> However, after 12 years, I believe that the small benefit of having
> find() is outweighed by the frequent occurrence of bugs in its use.

Oh, there's a good thing to bring up; regular expressions!  re.search
returns a match object on success, None on failure.  With this "failure
-> Exception" idea, shouldn't they raise exceptions instead?  And
goodness, defining a good regular expression can be quite hard, possibly
leading to not insignificant "my regular expression doesn't do what I
want it to do" bugs.  Just look at all of those escape sequences and the
syntax! It's enough to make a new user of Python gasp.

Most of us are consenting adults here.  If someone writes buggy code
with str.find, that is unfortunate, maybe they should have used regular
expressions and tested for None, maybe they should have used
str.startswith (which is sometimes slower than m == n[:len(m)], but I
digress), maybe they should have used str.index. But just because buggy
code can be written with it, doesn't mean that it should be removed. 
Buggy code can, will, and has been written with every Python mechanism
that has ever existed or will ever exist.

With the existence of literally thousands of uses of .find and .rfind in
the wild, any removal consideration should be weighed heavily - which
honestly doesn't seem to be the case here with the ~15 minute reply time
yesterday (just my observation and opinion).  If you had been ruminating
over this previously, great, but that did not seem clear to me in your
original reply to Terry Reedy.

 - Josiah


From bcannon at gmail.com  Sat Aug 27 20:28:20 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Sat, 27 Aug 2005 11:28:20 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <ca471dc20508261310809b1e3@mail.gmail.com>
References: <dens12$5kg$1@sea.gmane.org>
	<ca471dc20508261310809b1e3@mail.gmail.com>
Message-ID: <bbaeab10050827112877ab7a1b@mail.gmail.com>

On 8/26/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
> On 8/26/05, Terry Reedy <tjreedy at udel.edu> wrote:
> > Can str.find be listed in PEP 3000 (under builtins) for removal?
> 
> Yes please. (Except it's not technically a builtin but a string method.)
> 

Done.  Added an "Atomic Types" section to the PEP as well.

-Brett

From aahz at pythoncraft.com  Sat Aug 27 20:51:47 2005
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 27 Aug 2005 11:51:47 -0700
Subject: [Python-Dev]  Python 3.0 blocks?
In-Reply-To: <ca471dc20508270736c4fe03@mail.gmail.com>
References: <op.sv5zqejk0gn541@theta>
	<000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
	<ca471dc20508270736c4fe03@mail.gmail.com>
Message-ID: <20050827185146.GA28094@panix.com>

On Sat, Aug 27, 2005, Guido van Rossum wrote:
>
> if vi in ('=',':'):
>   try: pos = optval.index(';')
>   except ValueError: pass
>   else:
>     if pos > 0 and optval[pos-1].isspace():
>       optval = optval[:pos]

IIRC, one of your proposals for Python 3.0 was that single-line blocks
would be banned.  Is my memory wrong?
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From Scott.Daniels at Acm.Org  Sat Aug 27 23:08:08 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sat, 27 Aug 2005 14:08:08 -0700
Subject: [Python-Dev] empty string api for files
In-Reply-To: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer>
References: <ca471dc20508270729502f22f@mail.gmail.com>
	<000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer>
Message-ID: <deqkno$6gf$1@sea.gmane.org>

Raymond Hettinger wrote:
> We would use:
>     for block in f.readblocks(20):
>         . . .
What would be nice is a reader that allows a range
of bytes.  Often when you read a chunk, you don't care
about the exact size you get; example uses include the
re-blocking that makes reading from compressed data
sources unnecessarily inefficient.
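
For reference, the fixed-size readblocks() idea quoted above could be
sketched in Python 3 with the two-argument form of iter().  This is an
assumption-laden illustration: readblocks is the hypothetical name from
the quote, written here as a free function rather than a file method,
and it does not address the "range of bytes" variant.

```python
import io
from functools import partial


def readblocks(f, size):
    """Return an iterator over blocks of at most `size` bytes until EOF."""
    # iter(callable, sentinel) stops when read() returns empty bytes.
    return iter(partial(f.read, size), b"")


# Usage with an in-memory binary file; the last block may be short.
blocks = list(readblocks(io.BytesIO(b"abcdefgh"), 3))
```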

--Scott David Daniels
Scott.Daniels at Acm.Org


From gvanrossum at gmail.com  Sat Aug 27 23:54:13 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 14:54:13 -0700
Subject: [Python-Dev] empty string api for files
In-Reply-To: <000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer>
References: <ca471dc20508270729502f22f@mail.gmail.com>
	<000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer>
Message-ID: <ca471dc2050827145430417645@mail.gmail.com>

On 8/27/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > For reading bytes, I *know* that a lot of code would become uglier if
> > the API changed to raise EOFError exceptions
> 
> I had StopIteration in mind.  Instead of writing:
> 
>     while 1:
>         block = f.read(20)
>         if block == '':
>             break
>         . . .
> 
> We would use:
> 
>     for block in f.readblocks(20):
>         . . .
> 
> More beauty, a little faster, more concise, and less error-prone.  Of
> course, there are likely better choices for the method name, but you get
> the gist of it.

I'm not convinced. Where would you ever care about reading a file in
N-byte chunks? I really think you've got a solution in search of a
problem by the horns here.

While this would be useful for a copying loop, it falls down for most
practical uses of reading bytes (e.g. reading GIF or WAVE file).

I've thought a lot about redesigning the file/stream API, but the
problems this API change would solve just aren't high on my list. Much
more important are transparency of the buffering (for better
integration with select()), and various translations like universal
newlines or character set encodings. Some of my work on this is in
nondist/sandbox/sio/.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Sat Aug 27 23:58:23 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 14:58:23 -0700
Subject: [Python-Dev] Python 3.0 blocks?
In-Reply-To: <20050827185146.GA28094@panix.com>
References: <op.sv5zqejk0gn541@theta>
	<000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
	<ca471dc20508270736c4fe03@mail.gmail.com>
	<20050827185146.GA28094@panix.com>
Message-ID: <ca471dc20508271458207509d5@mail.gmail.com>

On 8/27/05, Aahz <aahz at pythoncraft.com> wrote:
> On Sat, Aug 27, 2005, Guido van Rossum wrote:
> >
> > if vi in ('=',':'):
> >   try: pos = optval.index(';')
> >   except ValueError: pass
> >   else:
> >     if pos > 0 and optval[pos-1].isspace():
> >       optval = optval[:pos]
> 
> IIRC, one of your proposals for Python 3.0 was that single-line blocks
> would be banned.  Is my memory wrong?

It's a proposal. I'm on the fence about it. I was just trying to get
the posting out quick before my family came knocking on my door. :)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com  Sun Aug 28 00:54:41 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sat, 27 Aug 2005 15:54:41 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <20050827095203.7E0C.JCARLSON@uci.edu>
References: <20050826184634.7E06.JCARLSON@uci.edu>
	<ca471dc205082708035df3cfb7@mail.gmail.com>
	<20050827095203.7E0C.JCARLSON@uci.edu>
Message-ID: <ca471dc205082715542abee518@mail.gmail.com>

On 8/27/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> With the existence of literally thousands of uses of .find and .rfind in
> the wild, any removal consideration should be weighed heavily - which
> honestly doesn't seem to be the case here with the ~15 minute reply time
> yesterday (just my observation and opinion).  If you had been ruminating
> over this previously, great, but that did not seem clear to me in your
> original reply to Terry Reedy.

I hadn't been ruminating about deleting it previously, but I was well
aware of the likelihood of writing buggy tests for find()'s return
value. I believe that str.find() is not just something that can be
used to write buggy code, but something that *causes* bugs over and
over again. (However, see below.)

The argument that there are thousands of usages in the wild doesn't
carry much weight when we're talking about Python 3.0.

There are at least a similar number of modules that expect
dict.keys(), zip() and range() to return lists, or that depend on the
distinction between Unicode strings and 8-bit strings, or on bare
except:, or on any other feature that is slated for deletion in Python
3.0 for which the replacement requires careful rethinking of the code
rather than a mechanical translation.

The *premise* of Python 3.0 is that it drops backwards compatibility
in order to make the language better in the long term. Surely you
believe that the majority of all Python programs have yet to be
written?

The only argument in this thread in favor of find() that made sense to
me was Tim Peters' observation that the requirement to use a
try/except clause leads to another kind of sloppy code. It's hard to
judge which is worse -- the buggy find() calls or the buggy/cumbersome
try/except code.

Note that all code (unless it needs to be backwards compatible to
Python 2.2 and before) which is using find() to merely detect whether
a given substring is present should be using 's1 in s2' instead.
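
A short illustration of the kind of buggy test meant here, next to the
's1 in s2' spelling (the sample strings are arbitrary):

```python
s = "spam and eggs"

# Buggy: find() returns 0 for a match at the start, which is falsy.
buggy_hit = bool(s.find("spam"))    # False, although "spam" is present

# Buggy the other way: -1 ("not found") is truthy.
buggy_miss = bool(s.find("ham"))    # True, although "ham" is absent

# The membership test says what is meant.
hit = "spam" in s
miss = "ham" in s
```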

Another observation: despite the derogatory remarks about regular
expressions, they have one thing going for them: they provide a higher
level of abstraction for string parsing, which this is all about.
(They are higher level in that you don't have to be counting
characters, which is about the lowest-level activity in programming --
only counting bytes is lower!)

Maybe if we had a *good* way of specifying string parsing we wouldn't
be needing to call find() or index() so much at all! (A good example
is the code that Raymond lifted from ConfigParser: a semicolon
preceded by whitespace starts a comment, other semicolons don't.
Surely there ought to be a better way to write that.)

All in all, I'm still happy to see find() go in Python 3.0, but I'm
leaving the door ajar: if you read this post carefully, you'll know
what arguments can be used to persuade me.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From abo at minkirri.apana.org.au  Sun Aug 28 03:52:25 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Sun, 28 Aug 2005 11:52:25 +1000
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <20050827095203.7E0C.JCARLSON@uci.edu>
References: <20050826184634.7E06.JCARLSON@uci.edu>
	<ca471dc205082708035df3cfb7@mail.gmail.com>
	<20050827095203.7E0C.JCARLSON@uci.edu>
Message-ID: <1125193945.5215.26.camel@localhost>

On Sat, 2005-08-27 at 10:16 -0700, Josiah Carlson wrote:
> Guido van Rossum <gvanrossum at gmail.com> wrote:
[...]
> Oh, there's a good thing to bring up; regular expressions!  re.search
> returns a match object on success, None on failure.  With this "failure
> -> Exception" idea, shouldn't they raise exceptions instead?  And
> goodness, defining a good regular expression can be quite hard, possibly
> leading to not insignificant "my regular expression doesn't do what I
> want it to do" bugs.  Just look at all of those escape sequences and the
> syntax! It's enough to make a new user of Python gasp.

I think re.match() returning None is an example of 1b (as categorised by
Terry Reedy). In this particular case a 1b style response is OK. Why;

1) any successful match evaluates to "True", and None evaluates to
"False". This allows simple code like;

  if myreg.match(s):
    do something.

Note you can't do this for find, as 0 is a successful "find" and
evaluates to False, whereas other results including -1 evaluate to True.
Even worse, -1 is a valid index.

2) exceptions are for unexpected events, where unexpected means "much
less likely than other possibilities". The re.match() operation asks
"does this match this", which implies you have an about even chance of
not matching... ie a failure to match is not unexpected. The result None
makes sense... "what match did we get? None, OK".

For str.index() you are asking "give me the index of this inside this",
which implies you expect it to be in there... ie not finding it _is_
unexpected, and should raise an exception.

Note that re.match() returning None will raise exceptions if the rest of
your code doesn't expect it;

index = myreg.match(s).start()
tail = s[index:]

This will raise an exception if there was no match.

Unlike str.find();

index = s.find(r)
tail = s[index:]

Which will happily return the last character if there was no match. This
is why find() should return None instead of -1.
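
The difference can be seen in a small Python 3 sketch (the sample
string and pattern are arbitrary): the find() version fails silently,
while the regular expression version fails loudly.

```python
import re

s = "hello world"

index = s.find("xyz")       # -1: substring not present
tail = s[index:]            # silently slices off the last character

m = re.search("xyz", s)     # None: no match
try:
    m.start()               # None has no start() method
except AttributeError:
    failed_loudly = True
else:
    failed_loudly = False
```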

> With the existence of literally thousands of uses of .find and .rfind in
> the wild, any removal consideration should be weighed heavily - which
> honestly doesn't seem to be the case here with the ~15 minute reply time
> yesterday (just my observation and opinion).  If you had been ruminating
> over this previously, great, but that did not seem clear to me in your
> original reply to Terry Reedy.

Bear in mind they are talking about Python 3.0... I think :-)

-- 
Donovan Baarda <abo at minkirri.apana.org.au>
http://minkirri.apana.org.au/~abo/


From sharprazor at gmail.com  Sun Aug 28 06:02:02 2005
From: sharprazor at gmail.com (FAN)
Date: Sun, 28 Aug 2005 12:02:02 +0800
Subject: [Python-Dev] Any detail list of change between version
	2.1-2.2-2.3-2.4 of Python?
Message-ID: <51ec6a95050827210210a408e9@mail.gmail.com>

hi, all

You know Jython (the Java version of Python) has only a stable version
of 2.1, and two alpha versions were released after 3 years.
So if it wants to evolve to 2.2, 2.3 or 2.4 like Python, a detailed
change list is needed, and it would be great if there were some test
case scripts to test the new implementation.
So does Python have this kind of thing? Where can I find them or
something like this?


Regards

FAN

From jcarlson at uci.edu  Sun Aug 28 07:52:31 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 27 Aug 2005 22:52:31 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <1125193945.5215.26.camel@localhost>
References: <20050827095203.7E0C.JCARLSON@uci.edu>
	<1125193945.5215.26.camel@localhost>
Message-ID: <20050827215414.7E27.JCARLSON@uci.edu>


Donovan Baarda <abo at minkirri.apana.org.au> wrote:
> 
> On Sat, 2005-08-27 at 10:16 -0700, Josiah Carlson wrote:
> > Guido van Rossum <gvanrossum at gmail.com> wrote:
> [...]
> > Oh, there's a good thing to bring up; regular expressions!  re.search
> > returns a match object on success, None on failure.  With this "failure
> > -> Exception" idea, shouldn't they raise exceptions instead?  And
> > goodness, defining a good regular expression can be quite hard, possibly
> > leading to not insignificant "my regular expression doesn't do what I
> > want it to do" bugs.  Just look at all of those escape sequences and the
> > syntax! It's enough to make a new user of Python gasp.
> 
> I think re.match() returning None is an example of 1b (as categorised by
> Terry Reedy). In this particular case a 1b style response is OK. Why;

My tongue was firmly planted in my cheek during my discussion of regular
expressions.  I was using it as an example of when one starts applying
some arbitrary rule to one example, and not noticing other examples that
do very similar, if not the same thing.

[snip discussion of re.match, re.search, str.find]

If you are really going to compare re.match, re.search and str.find, you
need to point out that neither re.match nor re.search raise an exception
when something isn't found (only when you try to work with None).  This
puts str.index as the odd-man-out in this discussion of searching a
string - so the proposal of tossing str.find as the 'weird one' is a
little strange.


One thing that has gotten my underwear in a twist is that no one has
really offered up a transition mechanism from "str.find working like now"
and some future "str.find or lack of" other than "use str.index". 
Obviously, I personally find the removal of str.find to be a nonstarter
(don't make me catch exceptions or use regular expressions when both are
unnecessary, please), but a proper transition of str.find from -1 to
None on failure would be beneficial (can which one be chosen at runtime
via __future__ import?).

During a transition which uses __future__, it would encourage the
/proper/ use of str.find in all modules and extensions which use it...

    x = y.find(z)
    if x >= 0:
        #...

Forcing people to use the proper semantic in their modules so as to be
compatible with other modules which may or may not use str.find returns
None, would (I believe) result in an overall reduction (if not
elimination) of bugs stemming from str.find, and would prevent former
str.find users from stumbling down the try/except/else misuse that Tim
Peters highlighted.

Heck, if you can get the __future__ import working for choosing which
str.find to use (on a global, not per-module basis), I say toss it into
2.6, or even 2.5 if there is really a push for this prior to 3.0 .

 - Josiah


From tjreedy at udel.edu  Sun Aug 28 07:51:48 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 28 Aug 2005 01:51:48 -0400
Subject: [Python-Dev] Any detail list of change between
	version2.1-2.2-2.3-2.4 of Python?
References: <51ec6a95050827210210a408e9@mail.gmail.com>
Message-ID: <derjdk$tm4$1@sea.gmane.org>


"FAN" <sharprazor at gmail.com> wrote in message 
news:51ec6a95050827210210a408e9 at mail.gmail.com...
> You know Jython (the Java version of Python) has only a stable version
> of 2.1, and two alpha versions were released after 3 years.
> So if it wants to evolve to 2.2, 2.3 or 2.4 like Python, a detailed
> change list is needed, and it would be great if there were some test
> case scripts to test the new implementation.
> So does Python have this kind of thing? Where can I find them or
> something like this?

I believe this question is off-topic here, which is for discussion of future 
changes.  If you ask the same question on comp.lang.python or the mail or 
gmane.org  equivalent, or perhaps in the search box at python.org, I am 
sure you will get an answer.

Terry J. Reedy


From steve at holdenweb.com  Sun Aug 28 08:58:39 2005
From: steve at holdenweb.com (Steve Holden)
Date: Sun, 28 Aug 2005 02:58:39 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <20050827215414.7E27.JCARLSON@uci.edu>
References: <20050827095203.7E0C.JCARLSON@uci.edu>	<1125193945.5215.26.camel@localhost>
	<20050827215414.7E27.JCARLSON@uci.edu>
Message-ID: <dernce$45k$1@sea.gmane.org>

Josiah Carlson wrote:
> Donovan Baarda <abo at minkirri.apana.org.au> wrote:
[...]
> 
> One thing that has gotten my underwear in a twist is that no one has
> really offered up a transition mechanism from "str.find working like now"
> and some future "str.find or lack of" other than "use str.index". 
> Obviously, I personally find the removal of str.find to be a nonstarter
> (don't make me catch exceptions or use regular expressions when both are
> unnecessary, please), but a proper transition of str.find from -1 to
> None on failure would be beneficial (can which one be chosen at runtime
> via __future__ import?).
> 
> During a transition which uses __future__, it would encourage the
> /proper/ use of str.find in all modules and extensions which use it...
> 
>     x = y.find(z)
>     if x >= 0:
>         #...
> 
It does seem rather fragile to rely on the continuation of the current 
behavior

  >>> None >= 0
False

for the correctness of "proper usage". Is this guaranteed in future 
implementations? Especially when:

  >>> type(None) >= 0
True

> Forcing people to use the proper semantic in their modules so as to be
> compatible with other modules which may or may not use str.find returns
> None, would (I believe) result in an overall reduction (if not
> elimination) of bugs stemming from str.find, and would prevent former
> str.find users from stumbling down the try/except/else misuse that Tim
> Peters highlighted.
> 
Once "str.find() returns None to fail" becomes the norm then surely the 
correct usage would be

     x = y.find(z)
     if x is not None:
         #...

which is still a rather ugly paradigm, but acceptable. So the transition 
is bound to be troubling.
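
A sketch of a test that does not depend on how None orders against
integers.  The helper name is hypothetical.  As an aside that is
outside the 2.x behaviour shown above: in Python 3, ordering
comparisons between None and an int raise TypeError, which the second
half of the sketch probes for.

```python
def found(x):
    """Hypothetical helper: treat either -1 or None as 'not found'."""
    return x is not None and x >= 0


# Probe whether this interpreter lets None be ordered against an int.
try:
    None >= 0
except TypeError:
    none_is_orderable = False
else:
    none_is_orderable = True
```

Code written as found(y.find(z)) would keep working whichever failure
value find() returns.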

> Heck, if you can get the __future__ import working for choosing which
> str.find to use (on a global, not per-module basis), I say toss it into
> 2.6, or even 2.5 if there is really a push for this prior to 3.0 .

The real problem is surely that one of find()'s legitimate return values 
evaluates false in a Boolean context. It's especially troubling that the 
value that does so doesn't indicate search failure. I'd prefer to live 
with the wart until 3.0 introduces something more satisfactory, or 
simply removes find() altogether. Otherwise the resulting code breakage 
when the future arrives just causes unnecessary pain.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/


From jcarlson at uci.edu  Sun Aug 28 12:50:17 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 28 Aug 2005 03:50:17 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <dernce$45k$1@sea.gmane.org>
References: <20050827215414.7E27.JCARLSON@uci.edu> <dernce$45k$1@sea.gmane.org>
Message-ID: <20050828030405.7E2D.JCARLSON@uci.edu>


Steve Holden <steve at holdenweb.com> wrote:
> 
> Josiah Carlson wrote:
> > Donovan Baarda <abo at minkirri.apana.org.au> wrote:
> [...]
> > 
> > One thing that has gotten my underwear in a twist is that no one has
> > really offered up a transition mechanism from "str.find working like now"
> > and some future "str.find or lack of" other than "use str.index". 
> > Obviously, I personally find the removal of str.find to be a nonstarter
> > (don't make me catch exceptions or use regular expressions when both are
> > unnecessary, please), but a proper transition of str.find from -1 to
> > None on failure would be beneficial (can which one be chosen at runtime
> > via __future__ import?).
> > 
> > During a transition which uses __future__, it would encourage the
> > /proper/ use of str.find in all modules and extensions which use it...
> > 
> >     x = y.find(z)
> >     if x >= 0:
> >         #...
> > 
> It does seem rather fragile to rely on the continuation of the current 
> behavior
> 
>   >>> None >= 0
> False

Please see this previous post on None comparisons and why it is unlikely
to change:
http://mail.python.org/pipermail/python-dev/2003-December/041374.html


> for the correctness of "proper usage". Is this guaranteed in future 
> implementations? Especially when:
> 
>   >>> type(None) >= 0
> True

That is an interesting, but subjectively useless comparison:

>>> type(0) >= 0
True
>>> type(int) >= 0
True

When do you ever compare the type of an object with the value of another
object?


> > Forcing people to use the proper semantic in their modules so as to be
> > compatible with other modules which may or may not use str.find returns
> > None, would (I believe) result in an overall reduction (if not
> > elimination) of bugs stemming from str.find, and would prevent former
> > str.find users from stumbling down the try/except/else misuse that Tim
> > Peters highlighted.
> > 
> Once "str.find() returns None to fail" becomes the norm then surely the 
> correct usage would be
> 
>      x = y.find(z)
>      if x is not None:
>          #...
> 
> which is still a rather ugly paradigm, but acceptable. So the transition 
> is bound to be troubling.

Perhaps, which is why I offered "x >= 0".


> > Heck, if you can get the __future__ import working for choosing which
> > str.find to use (on a global, not per-module basis), I say toss it into
> > 2.6, or even 2.5 if there is really a push for this prior to 3.0 .
> 
> The real problem is surely that one of find()'s legitimate return values 
> evaluates false in a Boolean context. It's especially troubling that the 
> value that does so doesn't indicate search failure. I'd prefer to live 
> with the wart until 3.0 introduces something more satisfactory, or 
> simply removes find() altogether. Otherwise the resulting code breakage 
> when the future arrives just causes unnecessary pain.

Here's a current (horrible but common) solution:

x = string.find(substring) + 1
if x:
    x -= 1
    ...


...I'm up way too late.
 - Josiah


From gregory.lielens at fft.be  Sun Aug 28 13:09:02 2005
From: gregory.lielens at fft.be (Gregory Lielens)
Date: Sun, 28 Aug 2005 13:09:02 +0200
Subject: [Python-Dev] info/advices about python readline implementation
Message-ID: <1125227342.13393.6.camel@Athlon64.home>

Hi,

Lisandro Dalcin and I are working on a common version of our patches
([1232343], [955928]) that we plan to submit soon (this would close the
two previously proposed patches, and we also plan to review 5 other
patches to push this one before 2.5 ;-) ).

We would like this new patch to be as clean and as safe as possible, but
to do so we need some information and advice from the list, in
particular from people who have worked on the readline C implementation,
i.e. Modules/readline.c, Parser/myreadline.c (PyOS_StdioReadline,
vms__StdioReadline) and Python/bltinmodule.c (builtin_raw_input).

First, a general question about implementation guidelines for CPython:
   - Is it OK to initialize a static pointer to a non-null value (the
address of a predefined function) at compile time?  ANSI C (and even
pre-ANSI C, afaik) accepts this, we have tested it on various Linux and
Unix systems, and there are occurrences of similar constructs in the
Python C sources, but not many, and not for function pointers (or I did
not find them ;) ).
We wonder whether this can cause problems on platforms that do not
correctly implement the C standard(s) but that Python has to support
nonetheless, or whether there is a feeling against it.  The idea is to
initialize PyOS_ReadlineFunctionPointer this way.

Then a question about VMS platform support:
  - readline seems to use the extern function vms__StdioReadline() on
VMS.  Where can we find the source or documentation for this function?
In particular, we would like to know whether this function calls (or can
call) PyOS_StdioReadline, which would cause infinite recursion in some
versions of our patch.  Without access to VMS for testing, or
information about vms__StdioReadline, this is impossible to know.

Thanks for any info,

Greg.


From mozbugbox at yahoo.com.au  Sun Aug 28 12:17:21 2005
From: mozbugbox at yahoo.com.au (JustFillBug)
Date: Sun, 28 Aug 2005 10:17:21 +0000 (UTC)
Subject: [Python-Dev] Remove str.find in 3.0?
References: <dens12$5kg$1@sea.gmane.org>
Message-ID: <slrndh34lh.5ua.mozbugbox@mozbugbox.somehost.org>

On 2005-08-26, Terry Reedy <tjreedy at udel.edu> wrote:
> Can str.find be listed in PEP 3000 (under builtins) for removal?
> Would anyone really object?
>

With all the discussion, I think you guys should realize that the
find/index methods are actually convenience functions which do 2 things
in one call:
1) Does the key exist?
2) If the key exists, find where it is.

But whether you use find or index, in the end you *have to* break it
into 2 steps in order to write bug-free code. Without find, you can
do:

if s in txt:
   i = txt.index(s)
   ...
else:
   pass

or:
try:
   i = txt.index(s)
   ...
except ValueError:
   pass

With find:
i = txt.find(s)
if i >= 0:
  ...
else:
  pass

The code is about the same, except that with the exception version the
failure test is pushed far away from the call instead of immediately
following it. Not much coding is saved.
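
A minimal sketch of the three idioms compared above, wrapped as functions
so they can be run side by side.  The function names are made up for
illustration, and the code assumes a current Python 3 interpreter rather
than the 2.4 of this thread.  All three return None when the substring is
missing:

```python
def locate_with_in(txt, s):
    # Two searches: one for the membership test, one for the index.
    if s in txt:
        return txt.index(s)
    return None


def locate_with_index(txt, s):
    # One search; failure is reported through an exception.
    try:
        return txt.index(s)
    except ValueError:
        return None


def locate_with_find(txt, s):
    # One search; failure is reported as -1, so the test must be ">= 0"
    # and not a plain truth test (a match at position 0 is falsy).
    i = txt.find(s)
    if i >= 0:
        return i
    return None
```

The match at position 0 is the case that trips up "if txt.find(s):",
which is why the find version needs the explicit comparison.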


From abkhd at hotmail.com  Sun Aug 28 13:24:36 2005
From: abkhd at hotmail.com (A.B., Khalid)
Date: Sun, 28 Aug 2005 11:24:36 +0000
Subject: [Python-Dev] test_bz2 and Python 2.4.1
Message-ID: <BAY23-F3320BE1788E21BA5E5F054ABAC0@phx.gbl>

Okay. Even though I know that most people here would probably find it
difficult to give input when MinGW is used to build Python, I am going
to post what I found out so far anyway concerning the test_bz2 situation
for referencing purposes.

--------------------------------------------------------------------
Python version     Mode used        Location of test        Result
  from CVS
--------------------------------------------------------------------
Python 2.5a0        normal           ../Lib/test/            PASS
Python 2.5a0        normal           CWD of Python           PASS
Python 2.5a0      interactive        ../Lib/test/            PASS
Python 2.5a0      interactive        CWD of Python           PASS

Python 2.4.1        normal           ../Lib/test/            FAIL
Python 2.4.1        normal           CWD of Python           PASS
Python 2.4.1      interactive        ../Lib/test/            PASS
Python 2.4.1      interactive        CWD of Python           PASS
--------------------------------------------------------------------


For python 2.4.1, tried both bzip2-1.0.2, and bzip2-1.0.3 on Win98 SE,
and WinXP Pro SP2, using MinGW 3.4.4.

I'll try to see what else I can find out.



From raymond.hettinger at verizon.net  Sun Aug 28 14:05:54 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 28 Aug 2005 08:05:54 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <ca471dc205082715542abee518@mail.gmail.com>
Message-ID: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer>

[Guido]
> Another observation: despite the derogatory remarks about regular
> expressions, they have one thing going for them: they provide a higher
> level of abstraction for string parsing, which this is all about.
> (They are higher level in that you don't have to be counting
> characters, which is about the lowest-level activity in programming --
> only counting bytes is lower!)
> 
> Maybe if we had a *good* way of specifying string parsing we wouldn't
> be needing to call find() or index() so much at all! (A good example
> is the code that Raymond lifted from ConfigParser: a semicolon
> preceded by whitespace starts a comment, other semicolons don't.
> Surely there ought to be a better way to write that.)

A higher level abstraction is surely the way to go.

I looked over the use cases for find and index.  Aside from cases which
are now covered by the "in" operator, it looks like you almost always
want the index to support a subsequent partition of the string.

That suggests that we need a variant of split() that has been customized
for typical find/index use cases.  Perhaps introduce a new pair of
methods, partition() and rpartition() which work like this:

    >>> s = 'http://www.python.org'
    >>> s.partition('://')
    ('http', '://', 'www.python.org')
    >>> s.rpartition('.')
    ('http://www.python', '.', 'org')
    >>> s.partition('?')
    ('http://www.python.org', '', '')

The idea is still preliminary and I have only applied it to a handful of
the library's find() and index() examples.  Here are some of the design
considerations:

* The function always succeeds unless the separator argument is not a
string type or is an empty string.  So, a typical call doesn't have to
be wrapped in a try-suite for normal usage.

* The split invariant is:   s == ''.join(s.partition(t))

* The result of the partition is always a three element tuple.  This
allows the results to be unpacked directly:

   head, sep, tail = s.partition(t)

* The use cases for find() indicate a need both to test for the
presence of the split element and then to make a slice at that point.
If we used a contains test for the first step, we could end up having to
search the string twice (once for detection and once for splitting).
However, by providing the middle element of the result tuple, we can
determine found or not-found without an additional search.  Accordingly,
the middle element has a nice Boolean interpretation with '' for
not-found and a non-empty string meaning found.  Given
(a,b,c)=s.partition(p), the following invariant holds:

   b == '' or b is p
   
* Returning the left, center, and right portions of the split supports a
simple programming pattern for repeated partitions:

   while s:
       head, part, s = s.partition(t)
       . . .
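
The design points above can be tried on any interpreter where partition()
exists as a string method (one did ship later, in Python 2.5).  The
sketch below is only an illustration under that assumption: the helper
names check_partition_invariants and fields are hypothetical, and the
identity test "b is p" is relaxed to equality so the check does not
depend on object identity.

```python
def check_partition_invariants(s, t):
    """Partition s on t and assert the properties listed in the proposal."""
    head, sep, tail = s.partition(t)
    assert head + sep + tail == s          # the split invariant
    assert sep == '' or sep == t           # middle element is '' or the separator
    assert bool(sep) == (t in s)           # middle element doubles as found/not-found
    assert t not in head                   # the split is at the first occurrence
    return head, sep, tail


def fields(s, t):
    """Collect the heads produced by the repeated-partition loop."""
    pieces = []
    while s:
        head, sep, s = s.partition(t)
        pieces.append(head)
    return pieces
```

One consequence of looping on "while s": a trailing separator does not
produce a final empty field, so fields('a,b,', ',') gives ['a', 'b']
where 'a,b,'.split(',') would give ['a', 'b', ''].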

Of course, if this idea survives the day, then I'll meet my own
requirements and write a context diff on the standard library.  That
ought to give a good indication of how well the new methods meet
existing needs and whether the resulting code is better, cleaner,
clearer, faster, etc.


Raymond


From pinard at iro.umontreal.ca  Sun Aug 28 14:21:05 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Sun, 28 Aug 2005 08:21:05 -0400
Subject: [Python-Dev] Python 3.0 blocks?
In-Reply-To: <ca471dc20508271458207509d5@mail.gmail.com>
References: <op.sv5zqejk0gn541@theta>
	<000101c5ab06$bdff3620$a8bb9d8d@oemcomputer>
	<ca471dc20508270736c4fe03@mail.gmail.com>
	<20050827185146.GA28094@panix.com>
	<ca471dc20508271458207509d5@mail.gmail.com>
Message-ID: <20050828122105.GA6786@phenix.progiciels-bpi.ca>

[Guido van Rossum]
> [Aahz]

> > IIRC, one of your proposals for Python 3.0 was that single-line
> > blocks would be banned.  Is my memory wrong?

> It's a proposal. I'm on the fence about it.

A difficult decision indeed.  Most single line blocks I've seen would be
more legible if they were written with two lines each, so I'm carefully
avoiding them as a personal rule.

But each rule has exceptions.  There are a few rare cases, usually
sequences of repetitive code, where single-line blocks succeed well in
stressing both the repetitive structure and the differences, making the
code more legible.

As someone well put it already, this is all about Python helping good
coders write good code, versus Python preventing bad coders from
writing bad code.  Sadly enough, looking around, it seems Python could
be a bit more aggressive against bad practices in this particular case,
even if this might hurt good coders once in a while.  But I'm not sure!

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From mal at egenix.com  Sun Aug 28 15:10:14 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 28 Aug 2005 15:10:14 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer>
References: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer>
Message-ID: <4311B7B6.8070503@egenix.com>

Raymond Hettinger wrote:
> [Guido]
> 
>>Another observation: despite the derogatory remarks about regular
>>expressions, they have one thing going for them: they provide a higher
>>level of abstraction for string parsing, which this is all about.
>>(They are higher level in that you don't have to be counting
>>characters, which is about the lowest-level activity in programming --
>>only counting bytes is lower!)
>>
>>Maybe if we had a *good* way of specifying string parsing we wouldn't
>>be needing to call find() or index() so much at all! (A good example
>>is the code that Raymond lifted from ConfigParser: a semicolon
>>preceded by whitespace starts a comment, other semicolons don't.
>>Surely there ought to be a better way to write that.)
>  
> A higher level abstraction is surely the way to go.

I may be missing something, but why invent yet another parsing
method - we already have the re module. I'd suggest to
use it :-)

If re is not fast enough or you want more control over the
parsing process, you could also have a look at mxTextTools:

    http://www.egenix.com/files/python/mxTextTools.html

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 28 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From stephen at xemacs.org  Sun Aug 28 15:27:04 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 28 Aug 2005 22:27:04 +0900
Subject: [Python-Dev] [Python-checkins] python/dist/src setup.py, 1.219,
 1.220
In-Reply-To: <430B9186.3010106@v.loewis.de> (
	=?iso-8859-1?q?Martin_v=2E_L=F6wis's_message_of?= "Tue, 23 Aug 2005
	23:13:42 +0200")
References: <000201c5a816$3caacaa0$8901a044@oemcomputer>
	<430B9186.3010106@v.loewis.de>
Message-ID: <87hddambtj.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Martin" == Martin v Löwis <martin at v.loewis.de> writes:

    Martin> Raymond Hettinger wrote:

    >> Do you have an ANSI-strict option with your compiler?

    Martin> gcc does have an option to force c89 compliance, but there
    Martin> is a good chance that Python stops compiling with that option:
    Martin> on many systems, essential system headers fail to comply
    Martin> with C89 (in addition, activating that mode also makes
    Martin> many extensions unavailable).

However, it might be a reasonable pre-checkin test to try compiling
changed files with the option enabled, depending on the number of
nonconforming system headers, etc., and grep the output for whinging
about c89-nonconformance.


-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From aahz at pythoncraft.com  Sun Aug 28 16:00:41 2005
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 28 Aug 2005 07:00:41 -0700
Subject: [Python-Dev] Any detail list of change between
	version2.1-2.2-2.3-2.4 of Python?
In-Reply-To: <derjdk$tm4$1@sea.gmane.org>
References: <51ec6a95050827210210a408e9@mail.gmail.com>
	<derjdk$tm4$1@sea.gmane.org>
Message-ID: <20050828140041.GA25264@panix.com>

On Sun, Aug 28, 2005, Terry Reedy wrote:
> "FAN" <sharprazor at gmail.com> wrote in message 
> news:51ec6a95050827210210a408e9 at mail.gmail.com...
>>
>> You know Jython (Java version of Python) has only a stable version
>> of 2.1, and two alpha version was release after 3 years.  So if it
>> wants to evolve to 2.2 , 2.3 or 2.4 as Python, some detail change
>> list was need, and it's great if there are some test case script to
>> test the new implemention version.  So does Python has this kinds of
>> things? Where can I find them or something like this?

All changes are supposed to be in Misc/NEWS.  You should also be able to
use most of the test cases in Python itself, which are in Lib/test/

However, you should also read
http://www.catb.org/~esr/faqs/smart-questions.html
Had you read the various docs about Python development, you would
certainly have figured out Lib/test/ on your own.

> I believe this question is off-topic here, which is for discussion of
> future changes.  If you ask the same question on comp.lang.python or
> the mail or gmane.org equivalent, or perhaps in the search box at
> python.org, I am sure you will get an answer.

Because this is about the future of Jython, it's entirely appropriate
for discussion here -- python-dev is *NOT* just for CPython.  (It's
similar to questions about porting.)  As long as people ask questions of
the appropriate level, that is.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From tjreedy at udel.edu  Sun Aug 28 16:29:59 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 28 Aug 2005 10:29:59 -0400
Subject: [Python-Dev] empty string api for files
References: <ca471dc20508270729502f22f@mail.gmail.com><000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer>
	<ca471dc2050827145430417645@mail.gmail.com>
Message-ID: <deshp7$thj$1@sea.gmane.org>

> I'm not convinced. Where would you ever care about reading a file in
> N-byte chunks?

This was once a standard paradigm for IBM mainframe files.  I vaguely 
remember having to specify the block/record size when opening such files. 
I have no idea of today's practice though.

Terry J. Reedy
 

From raymond.hettinger at verizon.net  Sun Aug 28 16:32:24 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 28 Aug 2005 10:32:24 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <4311B7B6.8070503@egenix.com>
Message-ID: <000701c5abdd$4f0b7440$d206a044@oemcomputer>

[Marc-Andre Lemburg]
> I may be missing something, but why invent yet another parsing
> method - we already have the re module. I'd suggest to
> use it :-)
> 
> If re is not fast enough or you want more control over the
> parsing process, you could also have a look at mxTextTools:
> 
>     http://www.egenix.com/files/python/mxTextTools.html

Both are excellent tools.  Neither is as lightweight, as trivial to
learn, or as transparently obvious as the proposed s.partition(sep).
The idea is to find a viable replacement for s.find().

Looking at sample code transformations shows that the high-power
mxTextTools and re approaches do not simplify code that currently uses
s.find().  In contrast, the proposed partition() method is a joy to use
and has no surprises.  The following code transformation shows
unbeatable simplicity and clarity.


--- From CGIHTTPServer.py ---------------

def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    . . .


def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    rest, _, query = rest.rpartition('?')
    script, _, rest = rest.partition('/')
    . . .
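
One caveat when trying this rewrite on a current interpreter: the
proposal does not spell out what rpartition() returns when the separator
is absent.  In str.rpartition as it later shipped (Python 2.5 and up),
the whole string lands in the LAST slot, i.e. ('', '', s), so the
one-line unpacking is not a drop-in replacement for the rfind() branch
when there is no '?'.  A small sketch of the difference, with
hypothetical function names:

```python
def split_query_find(rest):
    # The original rfind()-based logic from run_cgi().
    i = rest.rfind('?')
    if i >= 0:
        return rest[:i], rest[i + 1:]
    return rest, ''


def split_query_rpartition(rest):
    # rpartition() as shipped puts an unmatched string in the last slot,
    # so the not-found case has to be mapped back to (rest, '').
    head, sep, tail = rest.rpartition('?')
    if not sep:
        return tail, ''
    return head, tail
```

The forward direction has no such wrinkle: rest.partition('/') leaves an
unmatched string in the first slot, which matches the find() branch.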


The new proposal does not help every use case though.  In
ConfigParser.py, the problem description reads, "a semi-colon is a
comment delimiter only if it follows a spacing character".  This cries
out for a regular expression.  In StringIO.py, since the task at hand IS
calculating an index, an indexless higher level construct doesn't help.
However, many of the other s.find() use cases in the library simplify as
readily and directly as the above cgi server example.
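
For the ConfigParser rule quoted above, a regular expression version
could look roughly like the sketch below.  This is not ConfigParser's
actual code: the function name is hypothetical, and stripping the
whitespace left in front of the comment is an assumption made for the
example.

```python
import re

# A semicolon counts as a comment delimiter only when a whitespace
# character comes immediately before it.
_INLINE_COMMENT = re.compile(r'\s;')


def strip_inline_comment(value):
    """Drop an inline ';' comment from value, if there is one."""
    match = _INLINE_COMMENT.search(value)
    if match is None:
        return value
    # Cut at the whitespace character and tidy what is left.
    return value[:match.start()].rstrip()
```

A semicolon with no whitespace before it is left alone, so a value such
as 'x;y ; z' is cut at the second semicolon, not the first.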


Raymond


-------------------------------------------------------

P.S.  FWIW, if you want to experiment with it, here is a concrete
implementation of partition() expressed as a function:

def partition(s, t):
    """ Returns a three element tuple, (head, sep, tail) where:

        head + sep + tail == s
        t not in head
        sep == '' or sep is t
        bool(sep) == (t in s)       # sep indicates if the string was found

    >>> s = 'http://www.python.org'
    >>> partition(s, '://')
    ('http', '://', 'www.python.org')
    >>> partition(s, '?')
    ('http://www.python.org', '', '')
    >>> partition(s, 'http://')
    ('', 'http://', 'www.python.org')
    >>> partition(s, 'org')
    ('http://www.python.', 'org', '')

    """
    if not isinstance(t, basestring) or not t:
        raise ValueError('partition argument must be a non-empty string')
    parts = s.split(t, 1)
    if len(parts) == 1:
        result = (s, '', '')
    else:
        result = (parts[0], t, parts[1])
    assert len(result) == 3
    assert ''.join(result) == s
    assert result[1] == '' or result[1] is t
    assert t not in result[0]
    return result


import doctest
print doctest.testmod()


From raymond.hettinger at verizon.net  Sun Aug 28 16:35:05 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 28 Aug 2005 10:35:05 -0400
Subject: [Python-Dev] empty string api for files
In-Reply-To: <deshp7$thj$1@sea.gmane.org>
Message-ID: <000801c5abdd$af18fe20$d206a044@oemcomputer>

> > I'm not convinced. Where would you ever care about reading a file in
> > N-byte chunks?
> 
> This was once a standard paradigm for IBM mainframe files.  I vaguely
> remember having to specify the block/record size when opening such
> files.  I have no idea of today's practice though.

I believe this still comes up in 100% of the cases where you're
buffering reads of large binary files.  Given lots of RAM, this probably
doesn't come up as much nowadays.


Raymond


From guido at python.org  Sun Aug 28 17:23:23 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 28 Aug 2005 08:23:23 -0700
Subject: [Python-Dev] info/advices about python readline implementation
In-Reply-To: <1125227342.13393.6.camel@Athlon64.home>
References: <1125227342.13393.6.camel@Athlon64.home>
Message-ID: <ca471dc205082808236c035b5d@mail.gmail.com>

On 8/28/05, Gregory Lielens <gregory.lielens at fft.be> wrote:
>    -is it ok to initialize a static pointer to a non-null value (the
> address of a predefined function) at compile time?

Yes. All of Python's standard types and modules use this idiom.

> We wonder if this can cause problem on some platforms not correctly
> implementing C standard(s) but that python have to support nonetheless,

If a platform doesn't have a working C89 compiler, we generally wait
for the compiler to be fixed (or for GCC to be ported). We might
compromise when a platform doesn't support full POSIX, but this seems
purely a language issue and there can be no excuses -- C89 is older
than Python!

> Then something about the VMS platform support:
>   -readline seems to make uses of the extern function
> vms__StdioReadline() on VMS...Where can we find the source or doc about
> this function? In particular, we would like to know if this function
> call (or can call) PyOS_StdioReadline, which would cause infinite
> recursion in some version of our patch....without having access to VMS
> for testing or info about vms__StdioReadline, this is impossible to
> know...

I have no idea; Googling for it only showed up discussions of
readline.c. You might write the authors of the patch that introduced
it (the same Google query will find the info); if they don't respond,
I'm not sure that it's worth worrying about.

My personal guess is that it's probably a VMS internal function, which
would reduce the probability of it calling back to PyOS_StdioReadline
to zero. It can't be a Python specific thing, because it doesn't have
a 'Py' prefix.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org  Sun Aug 28 18:05:03 2005
From: arigo at tunes.org (Armin Rigo)
Date: Sun, 28 Aug 2005 18:05:03 +0200
Subject: [Python-Dev] PyPy release 0.7.0
Message-ID: <20050828160503.GA4908@code1.codespeak.net>

Hi Python-dev'ers,

The first Python implementation of Python is now also the
second C implementation of Python :-)


Samuele & Armin (& the rest of the team)

-+-+-


pypy-0.7.0: first PyPy-generated Python Implementations
==============================================================

What was once just an idea between a few people discussing 
on some nested mailing list thread and in a pub became reality ... 
the PyPy development team is happy to announce its first
public release of a fully translatable self contained Python
implementation.  The 0.7 release showcases the results of our
efforts in the last few months since the 0.6 preview release
which have been partially funded by the European Union:

- whole program type inference on our Python Interpreter 
  implementation with full translation to two different 
  machine-level targets: C and LLVM 

- a translation choice of using a refcounting or Boehm 
  garbage collectors

- the ability to translate with or without thread support 

- very complete language-level compliancy with CPython 2.4.1 


What is PyPy (about)? 
------------------------------------------------

PyPy is a MIT-licensed research-oriented reimplementation of
Python written in Python itself, flexible and easy to
experiment with.  It translates itself to lower level
languages.  Our goals are to target a large variety of
platforms, small and large, by providing a compilation toolsuite
that can produce custom Python versions.  Platform, Memory and
Threading models are to become aspects of the translation
process - as opposed to encoding low level details into a
language implementation itself.  Eventually, dynamic
optimization techniques - implemented as another translation
aspect - should become robust against language changes.

Note that PyPy is mainly a research and development project
and does not by itself focus on getting a production-ready
Python implementation although we do hope and expect it to
become a viable contender in that area sometime next year. 


Where to start? 
-----------------------------

Getting started:    http://codespeak.net/pypy/dist/pypy/doc/getting-started.html

PyPy Documentation: http://codespeak.net/pypy/dist/pypy/doc/ 

PyPy Homepage:      http://codespeak.net/pypy/

The interpreter and object model implementations shipped with
the 0.7 version can run on their own and implement the core
language features of Python as of CPython 2.4.  However, we still
do not recommend using PyPy for anything else than for education, 
playing or research purposes.  

Ongoing work and near term goals
---------------------------------

PyPy has been developed during approximately 15 coding sprints
across Europe and the US.  It continues to be a very
dynamically and incrementally evolving project with many
one-week meetings to follow.  You are invited to consider coming to 
the next such meeting in Paris mid October 2005 where we intend to 
plan and head for an even more intense phase of the project
involving building a JIT-Compiler and enabling unique
features not found in other Python language implementations.

PyPy has been a community effort from the start and it would
not have got that far without the coding and feedback support
from numerous people.   Please feel free to give feedback and 
raise questions. 

    contact points: http://codespeak.net/pypy/dist/pypy/doc/contact.html

    contributor list: http://codespeak.net/pypy/dist/pypy/doc/contributor.html

have fun, 
    
    the pypy team, of which here is a partial snapshot
    of mainly involved persons: 

    Armin Rigo, Samuele Pedroni, 
    Holger Krekel, Christian Tismer, 
    Carl Friedrich Bolz, Michael Hudson, 
    Eric van Riet Paap, Richard Emslie, 
    Anders Chrigstroem, Anders Lehmann, 
    Ludovic Aubry, Adrien Di Mascio, 
    Niklaus Haldimann, Jacob Hallen, 
    Bea During, Laura Creighton, 
    and many contributors ... 

PyPy development and activities happen as an open source project  
and with the support of a consortium partially funded by a two 
year European Union IST research grant. Here is a list of 
the full partners of that consortium: 
        
    Heinrich-Heine University (Germany), AB Strakt (Sweden)
    merlinux GmbH (Germany), tismerysoft GmbH (Germany)
    Logilab Paris (France), DFKI GmbH (Germany)
    ChangeMaker (Sweden), Impara (Germany)

From gregory.lielens at fft.be  Sun Aug 28 18:06:53 2005
From: gregory.lielens at fft.be (Gregory Lielens)
Date: Sun, 28 Aug 2005 18:06:53 +0200
Subject: [Python-Dev] info/advices about python readline implementation
In-Reply-To: <ca471dc205082808236c035b5d@mail.gmail.com>
References: <1125227342.13393.6.camel@Athlon64.home>
	<ca471dc205082808236c035b5d@mail.gmail.com>
Message-ID: <1125245214.13393.18.camel@Athlon64.home>


> > Then something about the VMS platform support:
> >   -readline seems to make uses of the extern function
> > vms__StdioReadline() on VMS...Where can we find the source or doc about
> > this function? In particular, we would like to know if this function
> > call (or can call) PyOS_StdioReadline, which would cause infinite
> > recursion in some version of our patch....without having access to VMS
> > for testing or info about vms__StdioReadline, this is impossible to
> > know...
> 
> I have no idea; Googling for it only showed up discussions of
> readline.c. You might write the authors of the patch that introduced
> it (the same Google query will find the info); if they don't respond,
> I'm not sure that it's worth worrying about.

Googling only returned comments or queries by either Lisandro or me, but
it was loewis (Martin v. Löwis?) who committed this change in May 2003
with the comment "Patch #708495: Port more stuff to OpenVMS".

The patch was introduced by Jean-François Piéronne, who explained:

myreadline.c
Use of vms__StdioReadline

> My personal guess is that it's probably a VMS internal function, which
> would reduce the probability of it calling back to PyOS_StdioReadline
> to zero. It can't be a Python specific thing, because it doesn't have
> a 'Py' prefix.

My guess too, especially as using PyOS_StdioReadline (which is not in
the Python API) would be asking for trouble.  We will thus assume that
there is no risk of infinite recursion, unless someone says
otherwise.

Thanks a lot for this fast and useful information!


Greg.


From jcarlson at uci.edu  Sun Aug 28 20:31:46 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 28 Aug 2005 11:31:46 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000301c5abc8$d7fe3fe0$d206a044@oemcomputer>
References: <ca471dc205082715542abee518@mail.gmail.com>
	<000301c5abc8$d7fe3fe0$d206a044@oemcomputer>
Message-ID: <20050828105627.7E33.JCARLSON@uci.edu>


"Raymond Hettinger" <raymond.hettinger at verizon.net> wrote:
> [Guido]
> > Another observation: despite the derogatory remarks about regular
> > expressions, they have one thing going for them: they provide a higher
> > level of abstraction for string parsing, which this is all about.
> > (They are higher level in that you don't have to be counting
> > characters, which is about the lowest-level activity in programming --
> > only counting bytes is lower!)
> > 
> > Maybe if we had a *good* way of specifying string parsing we wouldn't
> > be needing to call find() or index() so much at all! (A good example
> > is the code that Raymond lifted from ConfigParser: a semicolon
> > preceded by whitespace starts a comment, other semicolons don't.
> > Surely there ought to be a better way to write that.)
> 
> A higher level abstraction is surely the way to go.

Perhaps...

> Of course, if this idea survives the day, then I'll meet my own
> requirements and write a context diff on the standard library.  That
> ought to give a good indication of how well the new methods meet
> existing needs and whether the resulting code is better, cleaner,
> clearer, faster, etc.


My first thought when reading the proposal was "that's just
str.split/str.rsplit with maxsplit=1, returning the thing you split on,
with 3 items always returned, what's the big deal?"  Two seconds later it
hit me: that is the big deal.

Right now it is a bit of a pain to get string.split to return a consistent
number of return values; I myself have used:
  l,r = (x.split(y, 1)+[''])[:2]
...around 10 times - 10 times more than I really should have.
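
For comparison, here is a minimal sketch of that padding idiom next to the
proposed method.  It assumes an interpreter where str.partition() already
exists (it is not in 2.4), and the helper names split_pair and
partition_pair are made up for illustration:

```python
def split_pair(x, y):
    """Padding idiom: force exactly two items out of split() (hypothetical helper)."""
    left, right = (x.split(y, 1) + [''])[:2]
    return left, right


def partition_pair(x, y):
    """Same result via partition(); the separator slot is simply ignored."""
    left, _, right = x.partition(y)
    return left, right


# Both helpers agree whether or not the separator is present.
for text in ('key=value', 'key', 'a=b=c', ''):
    assert split_pair(text, '=') == partition_pair(text, '=')
```

The partition() form needs no list concatenation or slicing to guarantee
the number of returned values.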

Taking a wander through my code, this improves the look and flow in
almost every case (the exceptions being where I should have rewritten to
use 'substr in str' after I started using Python 2.3). Taking a walk
through examples of str.rfind at koders.com leads me to believe that
.partition/.rpartition would generally improve the flow, correctness,
and beauty of code which had previously been using .find/.rfind.


I hope the idea survives the day.
 - Josiah


From mal at egenix.com  Sun Aug 28 20:33:58 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 28 Aug 2005 20:33:58 +0200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000701c5abdd$4f0b7440$d206a044@oemcomputer>
References: <000701c5abdd$4f0b7440$d206a044@oemcomputer>
Message-ID: <43120396.30406@egenix.com>

Raymond Hettinger wrote:
> [Marc-Andre Lemburg]
> 
>>I may be missing something, but why invent yet another parsing
>>method - we already have the re module. I'd suggest to
>>use it :-)
>>
>>If re is not fast enough or you want more control over the
>>parsing process, you could also have a look at mxTextTools:
>>
>>    http://www.egenix.com/files/python/mxTextTools.html
> 
> 
> Both are excellent tools.  Neither is as lightweight, as trivial to
> learn, or as transparently obvious as the proposed s.partition(sep).
> The idea is to find a viable replacement for s.find().

Your partition idea could be had with an additional argument
to .split() (e.g. keepsep=1); no need to learn a new method.

Also, as I understand Terry's request, .find() should be removed
in favor of just leaving .index() (which is the identical method
without the funny -1 return code logic).

So your proposal really doesn't have all that much to do
with Terry's request, but is a new and separate proposal
(which does have some value in few cases, but not enough
to warrant a new method).

> Looking at sample code transformations shows that the high-power
> mxTextTools and re approaches do not simplify code that currently uses
> s.find().  In contrast, the proposed partition() method is a joy to use
> and has no surprises.  The following code transformation shows
> unbeatable simplicity and clarity.
> 
> 
> --- From CGIHTTPServer.py ---------------
> 
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     i = rest.rfind('?')
>     if i >= 0:
>         rest, query = rest[:i], rest[i+1:]
>     else:
>         query = ''
>     i = rest.find('/')
>     if i >= 0:
>         script, rest = rest[:i], rest[i:]
>     else:
>         script, rest = rest, ''
>     . . .
> 
> 
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     rest, _, query = rest.rpartition('?')
>     script, _, rest = rest.partition('/')

Wouldn't this do the same ?! ...

rest, query = rest.rsplit('?', maxsplit=1)
script, rest = rest.split('/', maxsplit=1)

>     . . .
> 
> 
> The new proposal does not help every use case though.  In
> ConfigParser.py, the problem description reads, "a semi-colon is a
> comment delimiter only if it follows a spacing character".  This cries
> out for a regular expression.  In StringIO.py, since the task at hand IS
> calculating an index, an indexless higher level construct doesn't help.
> However, many of the other s.find() use cases in the library simplify as
> readily and directly as the above cgi server example.
> 
> 
> 
> Raymond
> 
> 
> -------------------------------------------------------
> 
> P.S.  FWIW, if you want to experiment with it, here a concrete
> implementation of partition() expressed as a function:
> 
> def partition(s, t):
>     """ Returns a three element tuple, (head, sep, tail) where:
> 
>         head + sep + tail == s
>         t not in head
>         sep == '' or sep is t
>         bool(sep) == (t in s)       # sep indicates if the string was found
> 
>     >>> s = 'http://www.python.org'
>     >>> partition(s, '://')
>     ('http', '://', 'www.python.org')
>     >>> partition(s, '?')
>     ('http://www.python.org', '', '')
>     >>> partition(s, 'http://')
>     ('', 'http://', 'www.python.org')
>     >>> partition(s, 'org')
>     ('http://www.python.', 'org', '')
> 
>     """
>     if not isinstance(t, basestring) or not t:
>         raise ValueError('partition argument must be a non-empty string')
>     parts = s.split(t, 1)
>     if len(parts) == 1:
>         result = (s, '', '')
>     else:
>         result = (parts[0], t, parts[1])
>     assert len(result) == 3
>     assert ''.join(result) == s
>     assert result[1] == '' or result[1] is t
>     assert t not in result[0]
>     return result
> 
> 
> import doctest
> print doctest.testmod()

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 28 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From rrr at ronadam.com  Sun Aug 28 20:46:53 2005
From: rrr at ronadam.com (Ron Adam)
Date: Sun, 28 Aug 2005 14:46:53 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000701c5abdd$4f0b7440$d206a044@oemcomputer>
References: <000701c5abdd$4f0b7440$d206a044@oemcomputer>
Message-ID: <4312069D.9030109@ronadam.com>

Raymond Hettinger wrote:

> Looking at sample code transformations shows that the high-power
> mxTextTools and re approaches do not simplify code that currently uses
> s.find().  In contrast, the proposed partition() method is a joy to use
> and has no surprises.  The following code transformation shows
> unbeatable simplicity and clarity.

+1

This doesn't cause any backward compatibility issues either!

> --- From CGIHTTPServer.py ---------------
> 
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     i = rest.rfind('?')
>     if i >= 0:
>         rest, query = rest[:i], rest[i+1:]
>     else:
>         query = ''
>     i = rest.find('/')
>     if i >= 0:
>         script, rest = rest[:i], rest[i:]
>     else:
>         script, rest = rest, ''
>     . . .
> 
> 
> def run_cgi(self):
>     """Execute a CGI script."""
>     dir, rest = self.cgi_info
>     rest, _, query = rest.rpartition('?')
>     script, _, rest = rest.partition('/')
>     . . .

+1

Much easier to read and understand!

Cheers,
    Ron


From raymond.hettinger at verizon.net  Sun Aug 28 21:04:10 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 28 Aug 2005 15:04:10 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <43120396.30406@egenix.com>
Message-ID: <001501c5ac03$4679e480$d206a044@oemcomputer>

[M.-A. Lemburg]
> Also, as I understand Terry's request, .find() should be removed
> in favor of just leaving .index() (which is the identical method
> without the funny -1 return code logic).
> 
> So your proposal really doesn't have all that much to do
> with Terry's request, but is a new and separate proposal
> (which does have some value in few cases, but not enough
> to warrant a new method).

It is new and separate, but it is also related.  The core of Terry's
request is the assertion that str.find() is bug-prone and should not be
used.  The principal arguments against accepting his request (advanced
by Tim) are that the str.index() alternative is slightly more awkward to
code, more likely to result in try-suites that catch more than intended,
and that the resulting code is slower.  Those arguments fall to the
wayside if str.partition() becomes available as a superior alternative.
IOW, it makes Terry's request much more palatable.
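
To make the three alternatives concrete, here is a rough sketch of the same
small task (return whatever follows the first colon, or '' if there is no
colon) written each way.  The function names are invented for this
illustration, and the partition() version assumes the method exists as
proposed:

```python
def after_colon_find(s):
    """Bug-prone: forgets that find() returns -1 when ':' is absent."""
    i = s.find(':')
    return s[i + 1:]          # with i == -1 this silently returns all of s


def after_colon_index(s):
    """Correct, but the search has to be wrapped in a try-suite."""
    try:
        i = s.index(':')
    except ValueError:
        return ''
    return s[i + 1:]


def after_colon_partition(s):
    """No index arithmetic and no exception handling."""
    _, _, tail = s.partition(':')
    return tail
```

With 'localhost:80' all three return '80'.  With 'localhost' the find()
version returns the whole string instead of '', which is the kind of silent
error the -1 return code invites.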


> > def run_cgi(self):
> >     """Execute a CGI script."""
> >     dir, rest = self.cgi_info
> >     rest, _, query = rest.rpartition('?')
> >     script, _, rest = rest.partition('/')

[MAL]
> Wouldn't this do the same ?! ...
> 
> rest, query = rest.rsplit('?', maxsplit=1)
> script, rest = rest.split('/', maxsplit=1)

No.  The split() versions are buggy.  They fail catastrophically when
the original string does not contain '?' or does not contain '/':

    >>> rest = 'http://www.example.org/subdir'
    >>> rest, query = rest.rsplit('?', 1)

    Traceback (most recent call last):
      File "<pyshell#10>", line 1, in -toplevel-
        rest, query = rest.rsplit('?', 1)
    ValueError: need more than 1 value to unpack


The whole point of str.partition() is to repackage str.split() in a way
that is conducive to fulfilling many of the existing use cases for
str.find() and str.index().

In going through the library examples, I've not found a single case
where a direct use of str.split() would improve code that currently uses
str.find().
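
As a sketch of what a correct split()-based version has to look like, the
missing-separator case needs its own branch, whereas the partition() form
handles it implicitly.  The function names below are hypothetical and only
the forward split on '/' is shown:

```python
def script_via_split(rest):
    """split()-based: the result may have one or two items, so test first."""
    parts = rest.split('/', 1)
    if len(parts) == 2:
        script, rest = parts
    else:                      # no '/': unpacking into two names would fail
        script, rest = parts[0], ''
    return script, rest


def script_via_partition(rest):
    """partition()-based: always three items, so unpacking cannot fail."""
    script, _, rest = rest.partition('/')
    return script, rest


for path in ('test.py/extra/path', 'test.py', ''):
    assert script_via_split(path) == script_via_partition(path)
```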


Raymond


From steve at holdenweb.com  Sun Aug 28 23:03:26 2005
From: steve at holdenweb.com (Steve Holden)
Date: Sun, 28 Aug 2005 17:03:26 -0400
Subject: [Python-Dev] empty string api for files
In-Reply-To: <deshp7$thj$1@sea.gmane.org>
References: <ca471dc20508270729502f22f@mail.gmail.com><000c01c5ab18$aee5fcc0$a8bb9d8d@oemcomputer>	<ca471dc2050827145430417645@mail.gmail.com>
	<deshp7$thj$1@sea.gmane.org>
Message-ID: <det8sf$p8i$2@sea.gmane.org>

Terry Reedy wrote:
>>I'm not convinced. Where would you ever care about reading a file in
>>N-byte chunks?
> 
> 
> This was once a standard paradigm for IBM mainframe files.  I vaguely 
> remember having to specify the block/record size when opening such files. 
> I have no idea of today's practice though.
> 
Indeed. Something like:

SYSIN   DD  *,BLKSIZE=80

IIRC (which I may well not do after thirty years or so). People used to 
solve generic programming problems in JCL just for the hell of it.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/


From pinard at iro.umontreal.ca  Mon Aug 29 01:05:25 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Sun, 28 Aug 2005 19:05:25 -0400
Subject: [Python-Dev] empty string api for files
In-Reply-To: <det8sf$p8i$2@sea.gmane.org>
References: <ca471dc2050827145430417645@mail.gmail.com>
	<deshp7$thj$1@sea.gmane.org> <det8sf$p8i$2@sea.gmane.org>
Message-ID: <20050828230525.GA14562@alcyon.progiciels-bpi.ca>

[Steve Holden]

> Terry Reedy wrote:

> > This was once a standard paradigm for IBM mainframe files.  I
> > vaguely remember having to specify the block/record size when
> > opening such files.  I have no idea of today's practice though.

> Indeed. Something like:

> SYSIN   DD  *,BLKSIZE=80

Oh!  The "*" is pretty magical, and came from HASP (Houston Automatic
Spooling Program, if I remember correctly), and not from IBM.  It took a
lot of years before IBM even acknowledged the existence of HASP (in
dark times when salesmen and engineers had to strictly obey company
mandated attitudes).  Nevertheless, almost every IBM customer was
installing HASP behind the scenes, because without the "*", people had
to specify on their DD cards the preallocation of disk space, even
for spool files, as a number of cylinders and sectors for the primary
extent, and a number of cylinders and sectors for all secondary extents.
I later learned that IBM gave in, including HASP facilities as standard.

> People used to solve generic programming problems in JCL just for the
> hell of it.

The hell is the right word to describe it! :-) I wonder if JCL could
emulate a Turing Machine, but it at least addressed the Halting Problem!

                    One-who-happily-forgot-all-bout-this-ly yours...

P.S. - How is this related to Python?  Luckily! -- that is: *not*! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From raymond.hettinger at verizon.net  Mon Aug 29 07:48:57 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 29 Aug 2005 01:48:57 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
Message-ID: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer>

As promised, here is a full set of real-world comparative code
transformations using str.partition().  The patch isn't intended to be
applied; rather, it is here to test/demonstrate whether the new
construct offers benefits under a variety of use cases.

Overall, I found that partition() usefully encapsulated commonly
occurring low-level programming patterns.  In most cases, it completely
eliminated the need for slicing and indices.  In several cases, code was
simplified dramatically; in some, the simplification was minor; and in a
few cases, the complexity was about the same.  No cases were made worse.

Most patterns using str.find() directly translated into an equivalent
using partition.  The only awkwardness that arose was in cases where the
original code had a test like, "if s.find(pat) > 0".  That case
translated to a double-term test, "if found and head".  Also, some
pieces of code needed a tail that included the separator.  That need was
met by inserting a line like "tail = sep + tail".  And that solution led
to a minor naming discomfort for the middle term of the result tuple: it
was being used both as a Boolean found flag and as a string containing
the separator (hence the conflicting choice of names between "found" and
"sep").

In most cases, there was some increase in efficiency resulting from fewer
total steps and tests, and from eliminating double searches.  However,
in a few cases, the new code was less efficient because the fragment
only needed either the head or tail but not both as provided by
partition().
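
The two translation patterns described above (the double-term test that
replaces "if s.find(pat) > 0", and gluing the separator back onto the tail)
can be sketched roughly as follows.  The function names are invented for
the illustration and are not part of the patch:

```python
def head_if_found_past_start(s, pat):
    """Old style: 'if s.find(pat) > 0' means found, and not at index 0."""
    i = s.find(pat)
    if i > 0:
        return s[:i]
    return None


def head_if_found_and_nonempty(s, pat):
    """partition() style: the same condition as a double-term test."""
    head, found, tail = s.partition(pat)
    if found and head:
        return head
    return None


def split_keeping_separator(s, sep_char):
    """When the tail must still start with the separator, glue it back on."""
    head, sep, tail = s.partition(sep_char)
    tail = sep + tail          # sep is '' on a miss, so tail stays ''
    return head, tail


for text in ('a,b', ',b', 'ab', ''):
    assert head_if_found_past_start(text, ',') == head_if_found_and_nonempty(text, ',')
```

The last helper reproduces the "s[:i], s[i:]" slicing pattern: for
'script/extra' it yields ('script', '/extra'), and for a string without
the separator it yields the whole string and ''.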

In every case, the code was clearer after the transformation.  Also,
none of the transformations required str.partition() to be used in a
tricky way.  In contrast, I found many contortions using str.find()
where I had to diagram every possible path to understand what the code
was trying to do or to assure myself that it worked.

The new methods excelled at reducing cyclomatic complexity by
eliminating conditional paths.  The methods were especially helpful in
the context of multiple finds (i.e. split at the leftmost colon if
present within a group following the rightmost forward slash if
present).  In several cases, the replaced code exactly matched the pure
python version of str.partition() -- this confirms that people are
routinely writing multi-step low-level in-line code that duplicates what
str.partition() does in a single step.
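
As an illustration of the multiple-find case, here is a sketch of "split at
the leftmost colon within the group following the rightmost forward slash"
written both ways.  The function names are hypothetical.  It also assumes
the rpartition() behaviour of current Python releases, where a miss leaves
the whole string in the last slot; the draft semantics discussed in this
thread may differ on that point:

```python
def group_host_port_find(s):
    """find()-based: two searches, two conditionals, four slices."""
    i = s.rfind('/')
    group = s[i + 1:] if i >= 0 else s
    j = group.find(':')
    if j >= 0:
        return group[:j], group[j + 1:]
    return group, ''


def group_host_port_partition(s):
    """partition()-based: no indices and no conditional paths."""
    # Assumption: on a miss, rpartition() returns ('', '', s), so the
    # whole string is treated as the group.
    _, _, group = s.rpartition('/')
    host, _, port = group.partition(':')
    return host, port


for text in ('http://example.org/host:8080', 'host:8080', 'a/b/c', ''):
    assert group_host_port_find(text) == group_host_port_partition(text)
```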

The more complex transformations were handled by first figuring out
exactly what the original code did under all possible cases and then
writing the partition() version to match that spec.  The lesson was that
it is much easier to program from scratch using partition() than it is
to code using find().  The new method more naturally expresses a series
of parsing steps interleaved with other code.

Without further ado, here are the comparative code fragments:

Index: CGIHTTPServer.py
===================================================================
*** 106,121 ****
      def run_cgi(self):
          """Execute a CGI script."""
          dir, rest = self.cgi_info
!         i = rest.rfind('?')
!         if i >= 0:
!             rest, query = rest[:i], rest[i+1:]
!         else:
!             query = ''
!         i = rest.find('/')
!         if i >= 0:
!             script, rest = rest[:i], rest[i:]
!         else:
!             script, rest = rest, ''
          scriptname = dir + '/' + script
          scriptfile = self.translate_path(scriptname)
          if not os.path.exists(scriptfile):
--- 106,113 ----
      def run_cgi(self):
          """Execute a CGI script."""
          dir, rest = self.cgi_info
!         rest, _, query = rest.rpartition('?')
!         script, _, rest = rest.partition('/')
          scriptname = dir + '/' + script
          scriptfile = self.translate_path(scriptname)
          if not os.path.exists(scriptfile):
Index: ConfigParser.py
===================================================================
*** 599,612 ****
          if depth > MAX_INTERPOLATION_DEPTH:
              raise InterpolationDepthError(option, section, rest)
          while rest:
!             p = rest.find("%")
!             if p < 0:
!                 accum.append(rest)
                  return
!             if p > 0:
!                 accum.append(rest[:p])
!                 rest = rest[p:]
!             # p is no longer used
              c = rest[1:2]
              if c == "%":
                  accum.append("%")
--- 599,611 ----
          if depth > MAX_INTERPOLATION_DEPTH:
              raise InterpolationDepthError(option, section, rest)
          while rest:
!             head, sep, rest = rest.partition("%")
!             if not sep:
!                 accum.append(head)
                  return
!             rest = sep + rest
!             if sep and head:
!                 accum.append(head)
              c = rest[1:2]
              if c == "%":
                  accum.append("%")
Index: cgi.py
===================================================================
*** 337,346 ****
      key = plist.pop(0).lower()
      pdict = {}
      for p in plist:
!         i = p.find('=')
!         if i >= 0:
!             name = p[:i].strip().lower()
!             value = p[i+1:].strip()
              if len(value) >= 2 and value[0] == value[-1] == '"':
                  value = value[1:-1]
                  value = value.replace('\\\\', '\\').replace('\\"', '"')
--- 337,346 ----
      key = plist.pop(0).lower()
      pdict = {}
      for p in plist:
!         name, found, value = p.partition('=')
!         if found:
!             name = name.strip().lower()
!             value = value.strip()
              if len(value) >= 2 and value[0] == value[-1] == '"':
                  value = value[1:-1]
                  value = value.replace('\\\\', '\\').replace('\\"', '"')
Index: cookielib.py
===================================================================
*** 610,618 ****
  
  def request_port(request):
      host = request.get_host()
!     i = host.find(':')
!     if i >= 0:
!         port = host[i+1:]
          try:
              int(port)
          except ValueError:
--- 610,617 ----
  
  def request_port(request):
      host = request.get_host()
!     _, sep, port = host.partition(':')
!     if sep:
          try:
              int(port)
          except ValueError:
***************
*** 670,681 ****
      '.local'
  
      """
!     i = h.find(".")
!     if i >= 0:
!         #a = h[:i]  # this line is only here to show what a is
!         b = h[i+1:]
!         i = b.find(".")
!         if is_HDN(h) and (i >= 0 or b == "local"):
              return "."+b
      return h
  
--- 669,677 ----
      '.local'
  
      """
!     a, found, b = h.partition('.')
!     if found:
!         if is_HDN(h) and ('.' in b or b == "local"):
              return "."+b
      return h
  
***************
*** 1451,1463 ****
          else:
              path_specified = False
              path = request_path(request)
!             i = path.rfind("/")
!             if i != -1:
                  if version == 0:
                      # Netscape spec parts company from reality here
!                     path = path[:i]
                  else:
!                     path = path[:i+1]
              if len(path) == 0: path = "/"
  
          # set default domain
--- 1447,1459 ----
          else:
              path_specified = False
              path = request_path(request)
!             head, sep, _ = path.rpartition('/')
!             if sep:
                  if version == 0:
                      # Netscape spec parts company from reality here
!                     path = head
                  else:
!                     path = head + sep
              if len(path) == 0: path = "/"
  
          # set default domain
Index: gopherlib.py
===================================================================
*** 57,65 ****
      """Send a selector to a given host and port, return a file with
the reply."""
      import socket
      if not port:
!         i = host.find(':')
!         if i >= 0:
!             host, port = host[:i], int(host[i+1:])
      if not port:
          port = DEF_PORT
      elif type(port) == type(''):
--- 57,65 ----
      """Send a selector to a given host and port, return a file with
the reply."""
      import socket
      if not port:
!         head, found, tail = host.partition(':')
!         if found:
!             host, port = head, int(tail)
      if not port:
          port = DEF_PORT
      elif type(port) == type(''):
Index: httplib.py
===================================================================
*** 490,498 ****
          while True:
              if chunk_left is None:
                  line = self.fp.readline()
!                 i = line.find(';')
!                 if i >= 0:
!                     line = line[:i] # strip chunk-extensions
                  chunk_left = int(line, 16)
                  if chunk_left == 0:
                      break
--- 490,496 ----
          while True:
              if chunk_left is None:
                  line = self.fp.readline()
!                 line, _, _ = line.partition(';')  # strip chunk-extensions
                  chunk_left = int(line, 16)
                  if chunk_left == 0:
                      break
***************
*** 586,599 ****
  
      def _set_hostport(self, host, port):
          if port is None:
!             i = host.rfind(':')
!             j = host.rfind(']')         # ipv6 addresses have [...]
!             if i > j:
                  try:
!                     port = int(host[i+1:])
                  except ValueError:
!                     raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
!                 host = host[:i]
              else:
                  port = self.default_port
              if host and host[0] == '[' and host[-1] == ']':
--- 584,595 ----
  
      def _set_hostport(self, host, port):
          if port is None:
!             head, found, tail = host.rpartition(':')
!             if found and ']' not in tail:   # ipv6 addresses have [...]
                  try:
!                     host, port = head, int(tail)
                  except ValueError:
!                     raise InvalidURL("nonnumeric port: '%s'" % tail)
              else:
                  port = self.default_port
              if host and host[0] == '[' and host[-1] == ']':
***************
*** 976,998 ****
          L = [self._buf]
          self._buf = ''
          while 1:
!             i = L[-1].find("\n")
!             if i >= 0:
                  break
              s = self._read()
              if s == '':
                  break
              L.append(s)
!         if i == -1:
              # loop exited because there is no more data
              return "".join(L)
          else:
!             all = "".join(L)
!             # XXX could do enough bookkeeping not to do a 2nd search
!             i = all.find("\n") + 1
!             line = all[:i]
!             self._buf = all[i:]
!             return line
  
      def readlines(self, sizehint=0):
          total = 0
--- 972,990 ----
          L = [self._buf]
          self._buf = ''
          while 1:
!             head, found, tail = L[-1].partition('\n')
!             if found:
                  break
              s = self._read()
              if s == '':
                  break
              L.append(s)
!         if not found:
              # loop exited because there is no more data
              return "".join(L)
          else:
!             self._buf = tail
!             return "".join(L[:-1]) + head + found
  
      def readlines(self, sizehint=0):
          total = 0
Index: ihooks.py
===================================================================
*** 426,438 ****
          return None
  
      def find_head_package(self, parent, name):
!         if '.' in name:
!             i = name.find('.')
!             head = name[:i]
!             tail = name[i+1:]
!         else:
!             head = name
!             tail = ""
          if parent:
              qname = "%s.%s" % (parent.__name__, head)
          else:
--- 426,432 ----
          return None
  
      def find_head_package(self, parent, name):
!         head, _, tail = name.partition('.')
          if parent:
              qname = "%s.%s" % (parent.__name__, head)
          else:
***************
*** 449,457 ****
      def load_tail(self, q, tail):
          m = q
          while tail:
!             i = tail.find('.')
!             if i < 0: i = len(tail)
!             head, tail = tail[:i], tail[i+1:]
              mname = "%s.%s" % (m.__name__, head)
              m = self.import_it(head, mname, m)
              if not m:
--- 443,449 ----
      def load_tail(self, q, tail):
          m = q
          while tail:
!             head, _, tail = tail.partition('.')
              mname = "%s.%s" % (m.__name__, head)
              m = self.import_it(head, mname, m)
              if not m:
Index: locale.py
===================================================================
*** 98,106 ****
      seps = 0
      spaces = ""
      if s[-1] == ' ':
!         sp = s.find(' ')
!         spaces = s[sp:]
!         s = s[:sp]
      while s and grouping:
          # if grouping is -1, we are done
          if grouping[0]==CHAR_MAX:
--- 98,105 ----
      seps = 0
      spaces = ""
      if s[-1] == ' ':
!         s, sep, tail = s.partition(' ')
!         spaces = sep + tail
      while s and grouping:
          # if grouping is -1, we are done
          if grouping[0]==CHAR_MAX:
***************
*** 148,156 ****
          # so, kill as much spaces as there where separators.
          # Leading zeroes as fillers are not yet dealt with, as it is
          # not clear how they should interact with grouping.
!         sp = result.find(" ")
!         if sp==-1:break
!         result = result[:sp]+result[sp+1:]
          seps -= 1
  
      return result
--- 147,156 ----
          # so, kill as much spaces as there where separators.
          # Leading zeroes as fillers are not yet dealt with, as it is
          # not clear how they should interact with grouping.
!         head, found, tail = result.partition(' ')
!         if not found:
!             break
!         result = head + tail
          seps -= 1
  
      return result
Index: mailcap.py
===================================================================
*** 105,117 ****
      key, view, rest = fields[0], fields[1], fields[2:]
      fields = {'view': view}
      for field in rest:
!         i = field.find('=')
!         if i < 0:
!             fkey = field
!             fvalue = ""
!         else:
!             fkey = field[:i].strip()
!             fvalue = field[i+1:].strip()
          if fkey in fields:
              # Ignore it
              pass
--- 105,113 ----
      key, view, rest = fields[0], fields[1], fields[2:]
      fields = {'view': view}
      for field in rest:
!         fkey, found, fvalue = field.partition('=')
!         fkey = fkey.strip()
!         fvalue = fvalue.strip()
          if fkey in fields:
              # Ignore it
              pass
Index: mhlib.py
===================================================================
*** 356,364 ****
          if seq == 'all':
              return all
          # Test for X:Y before X-Y because 'seq:-n' matches both
!         i = seq.find(':')
!         if i >= 0:
!             head, dir, tail = seq[:i], '', seq[i+1:]
              if tail[:1] in '-+':
                  dir, tail = tail[:1], tail[1:]
              if not isnumeric(tail):
--- 356,364 ----
          if seq == 'all':
              return all
          # Test for X:Y before X-Y because 'seq:-n' matches both
!         head, found, tail = seq.partition(':')
!         if found:
!             dir = ''
              if tail[:1] in '-+':
                  dir, tail = tail[:1], tail[1:]
              if not isnumeric(tail):
***************
*** 394,403 ****
                      i = bisect(all, anchor-1)
                      return all[i:i+count]
          # Test for X-Y next
!         i = seq.find('-')
!         if i >= 0:
!             begin = self._parseindex(seq[:i], all)
!             end = self._parseindex(seq[i+1:], all)
              i = bisect(all, begin-1)
              j = bisect(all, end)
              r = all[i:j]
--- 394,403 ----
                      i = bisect(all, anchor-1)
                      return all[i:i+count]
          # Test for X-Y next
!         head, found, tail = seq.partition('-')
!         if found:
!             begin = self._parseindex(head, all)
!             end = self._parseindex(tail, all)
              i = bisect(all, begin-1)
              j = bisect(all, end)
              r = all[i:j]
Index: modulefinder.py
===================================================================
*** 140,148 ****
              assert caller is parent
              self.msgout(4, "determine_parent ->", parent)
              return parent
!         if '.' in pname:
!             i = pname.rfind('.')
!             pname = pname[:i]
              parent = self.modules[pname]
              assert parent.__name__ == pname
              self.msgout(4, "determine_parent ->", parent)
--- 140,147 ----
              assert caller is parent
              self.msgout(4, "determine_parent ->", parent)
              return parent
!         pname, found, _ = pname.rpartition('.')
!         if found:
              parent = self.modules[pname]
              assert parent.__name__ == pname
              self.msgout(4, "determine_parent ->", parent)
***************
*** 152,164 ****
  
      def find_head_package(self, parent, name):
          self.msgin(4, "find_head_package", parent, name)
!         if '.' in name:
!             i = name.find('.')
!             head = name[:i]
!             tail = name[i+1:]
!         else:
!             head = name
!             tail = ""
          if parent:
              qname = "%s.%s" % (parent.__name__, head)
          else:
--- 151,157 ----
  
      def find_head_package(self, parent, name):
          self.msgin(4, "find_head_package", parent, name)
!         head, _, tail = name.partition('.')
          if parent:
              qname = "%s.%s" % (parent.__name__, head)
          else:
Index: pdb.py
===================================================================
*** 189,200 ****
          # split into ';;' separated commands
          # unless it's an alias command
          if args[0] != 'alias':
!             marker = line.find(';;')
!             if marker >= 0:
!                 # queue up everything after marker
!                 next = line[marker+2:].lstrip()
                  self.cmdqueue.append(next)
!                 line = line[:marker].rstrip()
          return line
  
      # Command definitions, called by cmdloop()
--- 189,200 ----
          # split into ';;' separated commands
          # unless it's an alias command
          if args[0] != 'alias':
!             line, found, next = line.partition(';;')
!             if found:
!                 # queue up everything after command separator
!                 next = next.lstrip()
                  self.cmdqueue.append(next)
!                 line = line.rstrip()
          return line
  
      # Command definitions, called by cmdloop()
***************
*** 217,232 ****
          filename = None
          lineno = None
          cond = None
!         comma = arg.find(',')
!         if comma > 0:
              # parse stuff after comma: "condition"
!             cond = arg[comma+1:].lstrip()
!             arg = arg[:comma].rstrip()
          # parse stuff before comma: [filename:]lineno | function
-         colon = arg.rfind(':')
          funcname = None
!         if colon >= 0:
!             filename = arg[:colon].rstrip()
              f = self.lookupmodule(filename)
              if not f:
                  print '*** ', repr(filename),
--- 217,232 ----
          filename = None
          lineno = None
          cond = None
!         arg, found, cond = arg.partition(',')
!         if found and arg:
              # parse stuff after comma: "condition"
!             arg = arg.rstrip()
!             cond = cond.lstrip()
          # parse stuff before comma: [filename:]lineno | function
          funcname = None
!         filename, found, arg = arg.rpartition(':')
!         if found:
!             filename = filename.rstrip()
              f = self.lookupmodule(filename)
              if not f:
                  print '*** ', repr(filename),
***************
*** 234,240 ****
                  return
              else:
                  filename = f
!             arg = arg[colon+1:].lstrip()
              try:
                  lineno = int(arg)
              except ValueError, msg:
--- 234,240 ----
                  return
              else:
                  filename = f
!             arg = arg.lstrip()
              try:
                  lineno = int(arg)
              except ValueError, msg:
***************
*** 437,445 ****
              return
          if ':' in arg:
              # Make sure it works for "clear C:\foo\bar.py:12"
!             i = arg.rfind(':')
!             filename = arg[:i]
!             arg = arg[i+1:]
              try:
                  lineno = int(arg)
              except:
--- 437,443 ----
              return
          if ':' in arg:
              # Make sure it works for "clear C:\foo\bar.py:12"
!             filename, _, arg = arg.rpartition(':')
              try:
                  lineno = int(arg)
              except:
Index: rfc822.py
===================================================================
*** 197,205 ****
          You may override this method in order to use Message parsing on tagged
          data in RFC 2822-like formats with special header formats.
          """
!         i = line.find(':')
!         if i > 0:
!             return line[:i].lower()
          return None
  
      def islast(self, line):
--- 197,205 ----
          You may override this method in order to use Message parsing on tagged
          data in RFC 2822-like formats with special header formats.
          """
!         head, found, tail = line.partition(':')
!         if found and head:
!             return head.lower()
          return None
  
      def islast(self, line):
***************
*** 340,348 ****
              else:
                  if raw:
                      raw.append(', ')
!                 i = h.find(':')
!                 if i > 0:
!                     addr = h[i+1:]
                  raw.append(addr)
          alladdrs = ''.join(raw)
          a = AddressList(alladdrs)
--- 340,348 ----
              else:
                  if raw:
                      raw.append(', ')
!                 head, found, tail = h.partition(':')
!                 if found and head:
!                     addr = tail
                  raw.append(addr)
          alladdrs = ''.join(raw)
          a = AddressList(alladdrs)
***************
*** 859,867 ****
              data = stuff + data[1:]
      if len(data) == 4:
          s = data[3]
!         i = s.find('+')
!         if i > 0:
!             data[3:] = [s[:i], s[i+1:]]
          else:
              data.append('') # Dummy tz
      if len(data) < 5:
--- 859,867 ----
              data = stuff + data[1:]
      if len(data) == 4:
          s = data[3]
!         head, found, tail = s.partition('+')
!         if found and head:
!             data[3:] = [head, tail]
          else:
              data.append('') # Dummy tz
      if len(data) < 5:
Index: robotparser.py
===================================================================
*** 104,112 ****
                      entry = Entry()
                      state = 0
              # remove optional comment and strip line
!             i = line.find('#')
!             if i>=0:
!                 line = line[:i]
              line = line.strip()
              if not line:
                  continue
--- 104,110 ----
                      entry = Entry()
                      state = 0
              # remove optional comment and strip line
!             line, _, _ = line.partition('#')
              line = line.strip()
              if not line:
                  continue
Index: smtpd.py
===================================================================
*** 144,156 ****
                  self.push('500 Error: bad syntax')
                  return
              method = None
!             i = line.find(' ')
!             if i < 0:
!                 command = line.upper()
                  arg = None
              else:
!                 command = line[:i].upper()
!                 arg = line[i+1:].strip()
              method = getattr(self, 'smtp_' + command, None)
              if not method:
                  self.push('502 Error: command "%s" not implemented' % command)
--- 144,155 ----
                  self.push('500 Error: bad syntax')
                  return
              method = None
!             command, found, arg = line.partition(' ')
!             command = command.upper()
!             if not found:
                  arg = None
              else:
!                 arg = arg.strip()
              method = getattr(self, 'smtp_' + command, None)
              if not method:
                  self.push('502 Error: command "%s" not implemented' % command)
***************
*** 495,514 ****
          usage(1, 'Invalid arguments: %s' % COMMASPACE.join(args))
  
      # split into host/port pairs
!     i = localspec.find(':')
!     if i < 0:
          usage(1, 'Bad local spec: %s' % localspec)
!     options.localhost = localspec[:i]
      try:
!         options.localport = int(localspec[i+1:])
      except ValueError:
          usage(1, 'Bad local port: %s' % localspec)
!     i = remotespec.find(':')
!     if i < 0:
          usage(1, 'Bad remote spec: %s' % remotespec)
!     options.remotehost = remotespec[:i]
      try:
!         options.remoteport = int(remotespec[i+1:])
      except ValueError:
          usage(1, 'Bad remote port: %s' % remotespec)
      return options
--- 494,513 ----
          usage(1, 'Invalid arguments: %s' % COMMASPACE.join(args))
  
      # split into host/port pairs
!     head, found, tail = localspec.partition(':')
!     if not found:
          usage(1, 'Bad local spec: %s' % localspec)
!     options.localhost = head
      try:
!         options.localport = int(tail)
      except ValueError:
          usage(1, 'Bad local port: %s' % localspec)
!     head, found, tail = remotespec.partition(':')
!     if not found:
          usage(1, 'Bad remote spec: %s' % remotespec)
!     options.remotehost = head
      try:
!         options.remoteport = int(tail)
      except ValueError:
          usage(1, 'Bad remote port: %s' % remotespec)
      return options
Index: smtplib.py
===================================================================
*** 276,284 ****
  
          """
          if not port and (host.find(':') == host.rfind(':')):
!             i = host.rfind(':')
!             if i >= 0:
!                 host, port = host[:i], host[i+1:]
                  try: port = int(port)
                  except ValueError:
                      raise socket.error, "nonnumeric port"
--- 276,283 ----
  
          """
          if not port and (host.find(':') == host.rfind(':')):
!             host, found, port = host.rpartition(':')
!             if found:
                  try: port = int(port)
                  except ValueError:
                      raise socket.error, "nonnumeric port"
Index: urllib2.py
===================================================================
*** 289,301 ****
      def add_handler(self, handler):
          added = False
          for meth in dir(handler):
!             i = meth.find("_")
!             protocol = meth[:i]
!             condition = meth[i+1:]
! 
              if condition.startswith("error"):
!                 j = condition.find("_") + i + 1
!                 kind = meth[j+1:]
                  try:
                      kind = int(kind)
                  except ValueError:
--- 289,297 ----
      def add_handler(self, handler):
          added = False
          for meth in dir(handler):
!             protocol, _, condition = meth.partition('_')
              if condition.startswith("error"):
!                 _, _, kind = condition.partition('_')
                  try:
                      kind = int(kind)
                  except ValueError:
Index: zipfile.py
===================================================================
*** 117,125 ****
          self.orig_filename = filename   # Original file name in archive
  # Terminate the file name at the first null byte.  Null bytes in file
  # names are used as tricks by viruses in archives.
!         null_byte = filename.find(chr(0))
!         if null_byte >= 0:
!             filename = filename[0:null_byte]
  # This is used to ensure paths in generated ZIP files always use
  # forward slashes as the directory separator, as required by the
  # ZIP format specification.
--- 117,123 ----
          self.orig_filename = filename   # Original file name in archive
  # Terminate the file name at the first null byte.  Null bytes in file
  # names are used as tricks by viruses in archives.
!         filename, _, _ = filename.partition(chr(0))
  # This is used to ensure paths in generated ZIP files always use
  # forward slashes as the directory separator, as required by the
  # ZIP format specification.
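
Several of the fragments above (pdb.py, smtplib.py) split at the last
separator with rpartition().  A minimal sketch of what that returns,
assuming the behaviour of str.rpartition() in current Python releases;
the semantics proposed in this thread may have differed in the
not-found case, and the variable names are only illustrative:

```python
# rpartition() splits at the LAST separator, so the drive-letter
# colon stays inside the file name part.
filename, _, lineno = r'C:\foo\bar.py:12'.rpartition(':')

# When the separator is absent, current Python returns the whole
# string in the last element and two empty strings before it.
missing = 'localhost'.rpartition(':')
```

Under that behaviour, code that unpacks into (host, found, port) has to
test "found" before trusting either name, because an input without a
colon ends up in the last slot.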


From jcarlson at uci.edu  Mon Aug 29 08:29:58 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 28 Aug 2005 23:29:58 -0700
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer>
References: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer>
Message-ID: <20050828231650.7E4B.JCARLSON@uci.edu>

"Raymond Hettinger" <raymond.hettinger at verizon.net> wrote:
> As promised, here is a full set of real-world comparative code
> transformations using str.partition().  The patch isn't intended to be
> applied; rather, it is here to test/demonstrate whether the new
> construct offers benefits under a variety of use cases.

Having looked at many of Raymond's transformations earlier today (just
emailing him a copy of my thoughts and changes minutes ago), I agree
that this simplifies essentially every example I have seen translated,
and translated myself.

There are a handful of errors I found during my pass, most of which seem
corrected in the version he has sent to python-dev (though not all). 
To those who are to reply in this thread, rather than nitpicking about
the correctness of individual transformations (though perhaps you should
email him directly about those), comment about how much better/worse
they look.

Vote to add str.partition to 2.5: +1
Vote to dump str.find sometime later if str.partition makes it: +1

 - Josiah


From ncoghlan at gmail.com  Mon Aug 29 13:16:18 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 29 Aug 2005 21:16:18 +1000
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer>
References: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer>
Message-ID: <4312EE82.90207@gmail.com>

Raymond Hettinger wrote:
> Most patterns using str.find() directly translated into an equivalent
> using partition.  The only awkwardness that arose was in cases where the
> original code had a test like, "if s.find(pat) > 0".  That case
> translated to a double-term test, "if found and head".

That said, the latter would give me much greater confidence that the test for 
"found, but not right at the start" was deliberate. With the original version 
I would need to study the surrounding code to satisfy myself that it wasn't a 
simple typo that resulted in '>' being written where '>=' was intended.
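
A minimal sketch of the two spellings side by side, assuming the
partition() semantics described in this thread (three strings are always
returned) and a non-empty separator; the function names are made up for
illustration:

```python
def found_after_start_find(s, pat):
    # Original idiom: true only when pat first occurs at index 1 or later.
    return s.find(pat) > 0


def found_after_start_partition(s, pat):
    # partition() idiom: the separator was found AND something precedes it.
    head, found, tail = s.partition(pat)
    return bool(found and head)
```

Both return the same answer for a separator in the middle, at the start,
at the end, or missing altogether; the second simply spells out both
conditions.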

> Without further ado, here are the comparative code fragments:

There's another one below that you previously tried rewriting to use str.index 
that also benefits from str.partition. This rewrite makes it easier to avoid 
the bug that afflicts the current code, and would make that bug raise an 
exception if it wasn't fixed - "head[-1]" would raise IndexError if the head 
was empty.

Cheers,
Nick.

--- From ConfigParser.py (current) ---------------

optname, vi, optval = mo.group('option', 'vi', 'value')
if vi in ('=', ':') and ';' in optval:
     # ';' is a comment delimiter only if it follows
     # a spacing character
     pos = optval.find(';')
     if pos != -1 and optval[pos-1].isspace():
         optval = optval[:pos]
optval = optval.strip()

--- From ConfigParser.py (with str.partition) ---------------

optname, vi, optval = mo.group('option', 'vi', 'value')
if vi in ('=', ':'):
     # ';' is a comment delimiter only if it follows
     # a spacing character
     head, found, _ = optval.partition(';')
     if found and head and head[-1].isspace():
         optval = head
optval = optval.strip()
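
A sketch of the two fragments wrapped as functions so they can be
compared directly.  The wrapper names are hypothetical, and the bug
shown is one reading of the problem mentioned above: when ';' is the
first character, optval[pos-1] is optval[-1], which wraps around to the
last character of the string.

```python
def strip_comment_find(optval):
    # Current idiom.  With ';' at index 0, pos-1 is -1, so the test
    # looks at the LAST character rather than one before the ';'.
    pos = optval.find(';')
    if pos != -1 and optval[pos-1].isspace():
        optval = optval[:pos]
    return optval.strip()


def strip_comment_partition(optval):
    # partition() idiom with the explicit "and head" guard.
    head, found, _ = optval.partition(';')
    if found and head and head[-1].isspace():
        optval = head
    return optval.strip()
```

The two agree on ordinary input such as "value ; note", but differ on a
value like ";note " (leading ';', trailing space): the find() version
discards the whole value, while the partition() version leaves it alone.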


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From python at dynkin.com  Mon Aug 29 15:07:55 2005
From: python at dynkin.com (George Yoshida)
Date: Mon, 29 Aug 2005 22:07:55 +0900
Subject: [Python-Dev] [Python-checkins] python/dist/src/Doc/whatsnew
 whatsnew25.tex, 1.18, 1.19
In-Reply-To: <20050827184558.6A9981E401F@bag.python.org>
References: <20050827184558.6A9981E401F@bag.python.org>
Message-ID: <431308AB.40401@dynkin.com>

akuchling at users.sourceforge.net wrote:
> Update of /cvsroot/python/python/dist/src/Doc/whatsnew
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv29055
> 
> Modified Files:
> 	whatsnew25.tex 
> Log Message:
> Write section on PEP 342
> 
> Index: whatsnew25.tex
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Doc/whatsnew/whatsnew25.tex,v
> retrieving revision 1.18
> retrieving revision 1.19
> diff -u -d -r1.18 -r1.19
> --- whatsnew25.tex	23 Aug 2005 00:56:06 -0000	1.18
> +++ whatsnew25.tex	27 Aug 2005 18:45:47 -0000	1.19
 > [snip]
> +\begin{verbatim}
> +>>> it = counter(10)
> +>>> print it.next()
> +0
> +>>> print it.next()
> +1
> +>>> print it.send(8)
> +8
> +>>> print it.next()
> +9
> +>>> print it.next()
> +Traceback (most recent call last):
> +  File ``t.py'', line 15, in ?
> +    print it.next()
> +StopIteration
>  
> +Because \keyword{yield} will often be returning \constant{None}, 
> +you shouldn't just use its value in expressions unless you're sure 
> +that only the \method{send()} method will be used.

This part creates a syntax error. \begin{verbatim} does not have its
end tag.

- george

From mcherm at mcherm.com  Mon Aug 29 21:53:08 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Mon, 29 Aug 2005 12:53:08 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
Message-ID: <20050829125308.uylb8wyc0yw4sosc@login.werra.lunarpages.com>

Raymond writes:
> That suggests that we need a variant of split() that has been customized
> for typical find/index use cases.  Perhaps introduce a new pair of
> methods, partition() and rpartition()

+1

My only suggestion is that when you're about to make a truly
inspired suggestion like this one, that you use a new subject
header. It will make it easier for the Python-Dev summary
authors and for the people who look back in 20 years to ask
"That str.partition() function is really swiggy! It's everywhere
now, but I wonder what language had it first and who came up with
it?"

-- Michael Chermside

[PS: To explain what "swiggy" means I'd probably have to borrow
  the time machine.]


From tdelaney at avaya.com  Tue Aug 30 01:31:34 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 30 Aug 2005 09:31:34 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F0742BA@au3010avexu1.global.avaya.com>

Michael Chermside wrote:

> Raymond writes:
>> That suggests that we need a variant of split() that has been
>> customized for typical find/index use cases.  Perhaps introduce a
>> new pair of methods, partition() and rpartition()
> 
> +1
> 
> My only suggestion is that when you're about to make a truly
> inspired suggestion like this one, that you use a new subject
> header. It will make it easier for the Python-Dev summary
> authors and for the people who look back in 20 years to ask
> "That str.partition() function is really swiggy! It's everywhere
> now, but I wonder what language had it first and who came up with
> it?"

+1

This is very useful behaviour IMO.

Have the precise return values of partition() been defined?
Specifically, given:

    'a'.partition('b')

we could get back:

    ('a', '', '')
    ('a', None, None)

Similarly:

    'ab'.partition('b')

could be either:

    ('a', 'b', '')
    ('a', 'b', None)

IMO the most useful (and intuitive) behaviour is to return strings in
all cases.

My major issue is with the names - partition() doesn't sound right to
me. split() of course sounds best, but it has additional stuff we don't
necessarily want. However, I think we should aim to get the idea
accepted first, then work out the best name.

Tim Delaney

From t-meyer at ihug.co.nz  Tue Aug 30 02:05:03 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Tue, 30 Aug 2005 12:05:03 +1200
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <ECBA357DDED63B4995F5C1F5CBE5B1E801B0F87B@its-xchg4.massey.ac.nz>
Message-ID: <ECBA357DDED63B4995F5C1F5CBE5B1E801DB04F3@its-xchg4.massey.ac.nz>

[Kay Schluehr]
>> The discourse about Python3000 has shrunken from the expectation
>> of the "next big thing" into a depressive rhetorics of feature 
>> elimination. The language doesn't seem to become deeper, smaller
>> and more powerful but just smaller.
 
[Guido]
> There is much focus on removing things, because we want to be able 
> to add new stuff but we don't want the language to grow.

ISTM that a major reason that the Python 3.0 discussion seems 
focused more on removal than addition is that a lot of 
addition can be (and is being) done in Python 2.x.  This is a 
huge benefit, of course, since people can start doing things 
the "new and improved" way in 2.x, even though it's not until 
3.0 that the "old and evil" ;) way is actually removed.

Removal of map/filter/reduce is an example - there isn't 
discussion about addition of new features, because list 
comps/gen expressions are already here...

=Tony.Meyer


From greg.ewing at canterbury.ac.nz  Tue Aug 30 02:49:26 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 30 Aug 2005 12:49:26 +1200
Subject: [Python-Dev] Alternative name for  str.partition()
In-Reply-To: <4312EE82.90207@gmail.com>
References: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer>
	<4312EE82.90207@gmail.com>
Message-ID: <4313AD16.3070608@canterbury.ac.nz>

A more descriptive name than 'partition' would be 'split_at'.

--
Greg

From raymond.hettinger at verizon.net  Tue Aug 30 03:26:35 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 29 Aug 2005 21:26:35 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BA@au3010avexu1.global.avaya.com>
Message-ID: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>

[Delaney, Timothy (Tim)]
> +1
> 
> This is very useful behaviour IMO.

Thanks.  It seems to be getting +1s all around.


> Have the precise return values of partition() been defined?
 . . .
> IMO the most useful (and intuitive) behaviour is to return strings in
> all cases.

Yes, there is a precise spec and yes it always returns three strings.  

Motivation and spec:
http://mail.python.org/pipermail/python-dev/2005-August/055764.html

Pure python implementation, sample invocations, and tests:
http://mail.python.org/pipermail/python-dev/2005-August/055764.html


> My major issue is with the names - partition() doesn't sound right to
> me. 

FWIW, I am VERY happy with the name partition().  It has a long and
delightful history in conjunction with the quicksort algorithm where it
does something very similar to what we're doing here:  partitioning data
into three groups (left,center,right) with a small center element
(called a pivot in the quicksort context and called a separator in our
string parsing context).  This name has enjoyed great descriptive
success in communicating that the total data size is unchanged and that
the parts can be recombined to the whole.  IOW, it is exactly the right
word.  I won't part with it easily.

    http://www.google.com/search?q=quicksort+partition
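
The "parts can be recombined to the whole" property can be sketched as a
one-line check.  This assumes partition() and rpartition() behave as
specified in this thread (always three strings); the helper name is
illustrative only:

```python
def recombines(s, sep):
    # Joining the three parts gives back the original string, whether
    # or not the separator was found, for both directions of the split.
    return (''.join(s.partition(sep)) == s and
            ''.join(s.rpartition(sep)) == s)
```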


Raymond


From tdelaney at avaya.com  Tue Aug 30 03:46:37 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 30 Aug 2005 11:46:37 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F0742BB@au3010avexu1.global.avaya.com>

Raymond Hettinger wrote:

> Yes, there is a precise spec and yes it always returns three strings.
> 
> Motivation and spec:
> http://mail.python.org/pipermail/python-dev/2005-August/055764.html

Ah - thanks. Missed that in the mass of emails.

>> My major issue is with the names - partition() doesn't sound right to
>> me.
> 
> FWIW, I am VERY happy with the name partition().  It has a long and
> delightful history in conjunction with the quicksort algorithm where
> it does something very similar to what we're doing here:

I guessed that the motivation came from quicksort. My concern is that
"partition" is not something that most users would associate with
strings. I know I certainly wouldn't (at least, not immediately). The
behaviour is obvious from the name, but I don't feel the name is obvious
from the behaviour.

If I were explaining the behaviour of partition() to someone, the words
I would use are something like:

    partition() splits a string into 3 parts - the bit before the
    first occurrence of the separator, the separator, and the bit
    after the separator. If the separator isn't in the string at
    all then the entire string is returned as "the bit before" and
    the returned separator and bit after are empty strings.

I'd probably also explain that if the separator is the very last thing
in the string the "bit after" would be an empty string, but that is
fairly intuitive in any case IMO.
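
The three cases in that description can be sketched as follows, assuming
the always-three-strings spec; the sample strings are invented:

```python
# Separator in the middle, absent, and as the very last character.
examples = [
    ('key=value', '='),
    ('novalue', '='),
    ('trailing=', '='),
]
results = [s.partition(sep) for s, sep in examples]
# results[0] == ('key', '=', 'value')
# results[1] == ('novalue', '', '')
# results[2] == ('trailing', '=', '')
```

An empty middle element is the only signal that the separator was
absent; an empty last element on its own just means nothing followed it.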

It's a pity split() is already taken - but then, you would want split()
to do more in any case (specifically, split multiple times).

Tim Delaney

From anthony at interlink.com.au  Tue Aug 30 04:09:16 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue, 30 Aug 2005 12:09:16 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
Message-ID: <200508301209.19693.anthony@interlink.com.au>

On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote:
> > My major issue is with the names - partition() doesn't sound right to
> > me.
>
> FWIW, I am VERY happy with the name partition().  

I'm +1 on the functionality, and +1 on the name partition(). The only other
name that comes to mind is 'separate()', but 
a) I always spell it 'seperate' (and I don't need another lamdba <wink>)
b) It's too similar in name to 'split()'

Anthony

-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From stephen at xemacs.org  Tue Aug 30 04:37:53 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 30 Aug 2005 11:37:53 +0900
Subject: [Python-Dev] partition()
In-Reply-To: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer> (Raymond
	Hettinger's message of "Mon, 29 Aug 2005 21:26:35 -0400")
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
Message-ID: <87hdd8jgji.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Raymond" == Raymond Hettinger <raymond.hettinger at verizon.net> writes:

    Raymond> FWIW, I am VERY happy with the name partition().
    Raymond> ... [I]t is exactly the right word.  I won't part with it
    Raymond> easily.

+1

I note that Emacs has a split-string function which does not have
those happy properties.  In particular it never preserves the
separator, and (by default) it discards empty strings.

    Raymond> It has a long and delightful history in conjunction with
    Raymond> the quicksort algorithm

Now, that is a delightful mnemonic!

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From ldlandis at gmail.com  Tue Aug 30 05:29:16 2005
From: ldlandis at gmail.com (LD "Gus" Landis)
Date: Mon, 29 Aug 2005 22:29:16 -0500
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <200508301209.19693.anthony@interlink.com.au>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
Message-ID: <a1ddf57e05082920294af7e492@mail.gmail.com>

Hi,

  How about piece() ?  Anthony can have his "e"s that way too! ;-)
  and it's the same number of characters as .split().

Cheers,
  --ldl

On 8/29/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote:
> > > My major issue is with the names - partition() doesn't sound right to
> > > me.
> >
> > FWIW, I am VERY happy with the name partition().
> 
> I'm +1 on the functionality, and +1 on the name partition(). The only other
> name that comes to mind is 'separate()', but
> a) I always spell it 'seperate' (and I don't need another lamdba <wink>)
> b) It's too similar in name to 'split()'
> 
> Anthony
> 
> --
> Anthony Baxter     <anthony at interlink.com.au>
> It's never too late to have a happy childhood.

-- 
LD Landis - N0YRQ - from the St Paul side of Minneapolis

From ldlandis at gmail.com  Tue Aug 30 05:33:25 2005
From: ldlandis at gmail.com (LD "Gus" Landis)
Date: Mon, 29 Aug 2005 22:33:25 -0500
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <a1ddf57e05082920294af7e492@mail.gmail.com>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
	<a1ddf57e05082920294af7e492@mail.gmail.com>
Message-ID: <a1ddf57e0508292033557965af@mail.gmail.com>

Hi,

  Re: multiples, etc...

  Check out (and Pythonify) the ANSI M[UMPS] $PIECE(). See:
  http://www.jacquardsystems.com/Examples/function/piece.htm

Cheers,
  --ldl

On 8/29/05, LD Gus Landis <ldlandis at gmail.com> wrote:
> Hi,
> 
>   How about piece() ?  Anthony can have his "e"s that way too! ;-)
>   and it's the same number of characters as .split().
> 
> Cheers,
>   --ldl
> 

-- 
LD Landis - N0YRQ - from the St Paul side of Minneapolis

From fdrake at acm.org  Tue Aug 30 05:53:36 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 29 Aug 2005 23:53:36 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <200508301209.19693.anthony@interlink.com.au>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
Message-ID: <200508292353.36549.fdrake@acm.org>

On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote:
 > FWIW, I am VERY happy with the name partition().

I like it too.  +1


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From pje at telecommunity.com  Tue Aug 30 06:00:19 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 30 Aug 2005 00:00:19 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <a1ddf57e0508292033557965af@mail.gmail.com>
References: <a1ddf57e05082920294af7e492@mail.gmail.com>
	<000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
	<a1ddf57e05082920294af7e492@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com>

At 10:33 PM 8/29/2005 -0500, LD "Gus" Landis wrote:
>Hi,
>
>   Re: multiples, etc...
>
>   Check out (and Pythonify) the ANSI M[UMPS] $PIECE(). See:
>   http://www.jacquardsystems.com/Examples/function/piece.htm
>
>Cheers,
>   --ldl

As far as I can see, either you misunderstand what partition() does, or I'm 
completely misunderstanding what $PIECE does.  As far as I can tell, $PIECE 
and partition() have absolutely nothing in common except that they take 
strings as arguments.  :)

-1 on piece(), +1 for partition().


From tdelaney at avaya.com  Tue Aug 30 06:07:59 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 30 Aug 2005 14:07:59 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>

Phillip J. Eby wrote:

> +1 for partition().

Looks like I'm getting seriously outvoted here ... Still, as I said I
don't think the name is overly important until the idea has been
accepted anyway. How long did we go with people in favour of "resource
manager" until "context manager" came up?

Of course, if I (or someone else) can't come up with an obviously better
name, partition() will win by default. I don't think it's a *bad* name -
just don't think it's a particularly *obvious* name.

I think that one of the things I have against it is that most times I
type it, I get a typo. If this function is accepted, I think it will
(and should!) become one of the most used string functions around. As
such, the name should be *very* easy to type.

Tim Delaney

From shane at hathawaymix.org  Tue Aug 30 06:47:44 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Mon, 29 Aug 2005 22:47:44 -0600
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
Message-ID: <4313E4F0.6070906@hathawaymix.org>

Delaney, Timothy (Tim) wrote:
> I think that one of the things I have against it is that most times I
> type it, I get a typo. If this function is accepted, I think it will
> (and should!) become one of the most used string functions around. As
> such, the name should be *very* easy to type.

FWIW, the analogy with quicksort convinced me that partition is a good 
name, even though I'm a terirlbe tpyist.  I'm a pretty good proofreader, 
though. ;-)

Shane

From aahz at pythoncraft.com  Tue Aug 30 06:50:05 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 29 Aug 2005 21:50:05 -0700
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
Message-ID: <20050830045005.GA7988@panix.com>

On Tue, Aug 30, 2005, Delaney, Timothy (Tim) wrote:
>
> Looks like I'm getting seriously outvoted here ... Still, as I said I
> don't think the name is overly important until the idea has been
> accepted anyway. How long did we go with people in favour of "resource
> manager" until "context manager" came up?

In that case, though, it was more, "Well, I'm not that happy with
'context manager', but there doesn't seem to be anything better."  This
time, it's closer to, "That's a good name for the concept, yup."  As you
say, if someone comes up with a clearly better name, it likely will win;
however, partition has been blessed by enough people that it's not worth
putting much effort into finding anything better.

> Of course, if I (or someone else) can't come up with an obviously
> better name, partition() will win by default. I don't think it's a
> *bad* name - just don't think it's a particularly *obvious* name.

It's at least as obvious as translate().  <shrug>
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.

From tjreedy at udel.edu  Tue Aug 30 06:54:50 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 30 Aug 2005 00:54:50 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
References: <43120396.30406@egenix.com>
	<001501c5ac03$4679e480$d206a044@oemcomputer>
Message-ID: <df0oqr$78n$1@sea.gmane.org>


"Raymond Hettinger" <raymond.hettinger at verizon.net> wrote in message 
news:001501c5ac03$4679e480$d206a044 at oemcomputer...
> [M.-A. Lemburg]
>> Also, as I understand Terry's request, .find() should be removed
>> in favor of just leaving .index() (which is the identical method
>> without the funny -1 return code logic).

My proposal is to use the 3.0 opportunity to improve the language in this 
particular area.  I considered and ranked five alternatives more or less as 
follows.

1. Keep .index and delete .find.
2. Keep .index and repair .find to return None instead of -1.
3.5 Delete .index and repair .find.
3.5 Keep .index and .find as is.
5. Delete .index and keep .find as is.

> It is new and separate, but it is also related.

I see it as a 6th option: keep .index, delete .find, and replace with 
.partition.  I rank this at least second and maybe first.  It is separable 
in that the replacement can be done now, while the deletion has to wait.

> The core of Terry's request is the assertion that str.find()
> is bug-prone and should not be used.

That and the redundancy, both of which bothered me a bit since I first 
learned the string module functions.
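
As an illustration of the bug-proneness being discussed, here is a minimal
sketch.  The helper names after_colon_find and after_colon_partition are
hypothetical, and the code assumes the partition() semantics that later
shipped as str.partition in Python 2.5.

```python
def after_colon_find(line):
    # Buggy on purpose: find() returns -1 when ':' is absent, so the
    # slice becomes line[0:] and the whole line comes back unnoticed.
    return line[line.find(':') + 1:]


def after_colon_partition(line):
    # partition() always returns three strings; tail is '' when the
    # separator is absent, so there is no sentinel index to forget.
    head, sep, tail = line.partition(':')
    return tail
```

With 'key:value' both helpers return 'value'; with 'novalue' the find()
version silently returns the whole input while the partition() version
returns an empty string.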

Terry J. Reedy


From tjreedy at udel.edu  Tue Aug 30 07:12:41 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 30 Aug 2005 01:12:41 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
References: <2773CAC687FD5F4689F526998C7E4E5F0742BA@au3010avexu1.global.avaya.com>
	<000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
Message-ID: <df0ps9$9bi$1@sea.gmane.org>


"Raymond Hettinger" <raymond.hettinger at verizon.net> wrote in
> Yes, there is a precise spec and yes it always returns three strings.

While the find/index discussion was about "what is the best way to indicate 
'cannot answer'", part of the conclusion is that any way can be awkward. 
So I am generally in favor of defining a function, when possible, so that 
it can always deliver an answer (given inputs of the appropriate types) 
and so that the 'best way' question is moot.  Nicely done.

I think the name 'partition' is fine too.  It does not preclude putting a 
quicksort-type partition function in a module of list functions.  The only 
alternative I can think of is 'tripart', but I do *not* prefer that.

Terry J. Reedy


From mozbugbox at yahoo.com.au  Tue Aug 30 07:43:43 2005
From: mozbugbox at yahoo.com.au (JustFillBug)
Date: Tue, 30 Aug 2005 05:43:43 +0000 (UTC)
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
Message-ID: <slrndh7tcv.7gl.mozbugbox@mozbugbox.somehost.org>

On 2005-08-30, Anthony Baxter <anthony at interlink.com.au> wrote:
> On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote:
>> > My major issue is with the names - partition() doesn't sound right to
>> > me.
>>
>> FWIW, I am VERY happy with the name partition().  
>
> I'm +1 on the functionality, and +1 on the name partition(). The only other
> name that comes to mind is 'separate()', but 
> a) I always spell it 'seperate' (and I don't need another lamdba <wink>)
> b) It's too similar in name to 'split()'
>

trisplit()


From tdelaney at avaya.com  Tue Aug 30 07:50:21 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 30 Aug 2005 15:50:21 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F05CCAA@au3010avexu1.global.avaya.com>

Raymond Hettinger wrote:

> Heh!  Maybe AttributeError and NameError should be renamed to
> TypoError ;-)   After all, the only time I get these exceptions is
> when the fingers press different buttons than the brain requested.

You misspelled TyopError ;)

Tim Delaney

From mwh at python.net  Tue Aug 30 09:54:46 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 30 Aug 2005 08:54:46 +0100
Subject: [Python-Dev] partition()
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	(Timothy Delaney's message of "Tue, 30 Aug 2005 14:07:59 +1000")
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
Message-ID: <2my86jq2pl.fsf@starship.python.net>

"Delaney, Timothy (Tim)" <tdelaney at avaya.com> writes:

> Phillip J. Eby wrote:
>
>> +1 for partition().
>
> Looks like I'm getting seriously outvoted here ... Still, as I said I
> don't think the name is overly important until the idea has been
> accepted anyway. How long did we go with people in favour of "resource
> manager" until "context manager" came up?

Certainly no longer than until I got up the morning after the
discussion started :)

partition() works for me.  It's not perfect, but it'll do.  The idea
works for me rather more; it even simplifies the 

if s.startswith(prefix):
    t = s[len(prefix):]
    ...

idiom I occasionally wince at.

Cheers,
mwh

-- 
  Gullible editorial staff continues to post links to any and all
  articles that vaguely criticize Linux in any way.
         -- Reason #4 for quitting slashdot today, from
            http://www.cs.washington.edu/homes/klee/misc/slashdot.html

From fredrik at pythonware.com  Tue Aug 30 10:01:03 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 Aug 2005 10:01:03 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
References: <a1ddf57e05082920294af7e492@mail.gmail.com><000a01c5ad01$dd2e51a0$8832c797@oemcomputer><200508301209.19693.anthony@interlink.com.au><a1ddf57e05082920294af7e492@mail.gmail.com>
	<a1ddf57e0508292033557965af@mail.gmail.com>
	<5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com>
Message-ID: <df13nq$jb$1@sea.gmane.org>

Phillip J. Eby wrote:

>>   Check out (and Pythonify) the ANSI M[UMPS] $PIECE(). See:
>>   http://www.jacquardsystems.com/Examples/function/piece.htm
>
> As far as I can see, either you misunderstand what partition() does, or 
> I'm
> completely misunderstanding what $PIECE does.  As far as I can tell, 
> $PIECE
> and partition() have absolutely nothing in common except that they take
> strings as arguments.  :)

both split on a given token.  partition splits once, and returns all three
parts, while piece returns the part you ask for (the 3-argument form is
similar to x.split(s)[i])
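
A small sketch of the difference described above.  The helper name piece is
hypothetical and only mirrors the x.split(s)[i] comparison (0-based here);
it is not a faithful port of M[UMPS] $PIECE.

```python
def piece(x, s, i):
    # Hypothetical helper: return only the i-th field (0-based),
    # following the x.split(s)[i] comparison.
    return x.split(s)[i]


record = "alpha^beta^gamma"

# partition() splits once and keeps the separator in the result.
whole = record.partition("^")

# piece() returns a single field and discards the separator.
second = piece(record, "^", 1)
```

Here whole is a 3-tuple whose middle element is the separator itself, while
second is just one field.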

</F> 


From pierre.barbier at cirad.fr  Tue Aug 30 10:11:23 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Tue, 30 Aug 2005 10:11:23 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
Message-ID: <431414AB.4010005@cirad.fr>

Well, I want to come back on a point that wasn't discussed. I only found
one positive comment here :
http://mail.python.org/pipermail/python-dev/2005-August/055775.html

It's about that :

Raymond Hettinger wrote:
> * The function always succeeds unless the separator argument is not a
> string type or is an empty string.  So, a typical call doesn't have to
> be wrapped in a try-suite for normal usage.

Well, I wonder if it's so good! Almost all the use cases I find would
require something like:

head, sep, tail = s.partition(t)
if sep:
   do something
else:
   do something else

Like, if you want to extract the drive letter from a windows path :

drive, sep, tail = path.partition(":")
if not sep:
   drive = get_current_drive() # Because it's a local path

Or, if I want to iterate over all the path parts in a UNIX path:

sep = '/'
while sep:
  head, sep, path = path.partition(sep)

IMO, that reads strangely ... partitioning until sep is empty :S
Then, testing with "if" in Python is always a lot slower than having an
exception raised from a C extension inside a try...except block.
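
The loop over path components can be sketched as runnable code.  The
function name path_parts is hypothetical, and the sketch assumes the
semantics that later shipped as str.partition, where a missing separator
yields an empty string rather than None.

```python
def path_parts(path, sep="/"):
    parts = []
    found = sep  # any non-empty value, so the loop body runs at least once
    while found:
        # When sep is absent, found comes back as '' and the loop ends
        # after the last component has been appended.
        head, found, path = path.partition(sep)
        parts.append(head)
    return parts
```

For a relative path such as 'usr/local/bin' this collects the same
components as 'usr/local/bin'.split('/').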

So both constructs would read like a lot of existing Python code:

try:
  head,sep,tail = s.partition(t)
  do something
except SeparatorException:
  do something else

and:

sep='/'
try:
   while 1:
      head, drop, path = path.partition(sep)
except SeparatorException:
  The end

To me, try..except blocks to test end or error conditions are just
part of Python's design. So I don't understand why you don't want them!

For the separator, keeping it in the return values may be very useful,
mainly because I would really like to use this function replacing string
with a regexp (like a simplified version of the Qt method
QStringList::split) and, in that case, the separator would be the actual
matched separator string.

Pierre

-- 
Pierre Barbier de Reuille

INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP
Botanique et Bio-informatique de l'Architecture des Plantes
TA40/PSII, Boulevard de la Lironde
34398 MONTPELLIER CEDEX 5, France

tel   : (33) 4 67 61 65 77    fax   : (33) 4 67 61 56 68

From oren.tirosh at gmail.com  Tue Aug 30 10:17:10 2005
From: oren.tirosh at gmail.com (Oren Tirosh)
Date: Tue, 30 Aug 2005 11:17:10 +0300
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <slrndh7tcv.7gl.mozbugbox@mozbugbox.somehost.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
	<slrndh7tcv.7gl.mozbugbox@mozbugbox.somehost.org>
Message-ID: <7168d65a0508300117411a04ad@mail.gmail.com>

On 30/08/05, JustFillBug <mozbugbox at yahoo.com.au> wrote:
> On 2005-08-30, Anthony Baxter <anthony at interlink.com.au> wrote:
> > On Tuesday 30 August 2005 11:26, Raymond Hettinger wrote:
> >> > My major issue is with the names - partition() doesn't sound right to
> >> > me.
> >>
> >> FWIW, I am VERY happy with the name partition().
> >
> > I'm +1 on the functionality, and +1 on the name partition(). The only other
> > name that comes to mind is 'separate()', but
> > a) I always spell it 'seperate' (and I don't need another lamdba <wink>)
> > b) It's too similar in name to 'split()'
> >
>
> trisplit()

split3() ?

I'm +1 on the name "partition" but I think this is shorter,
communicates the similarity to split and the fact that it always
returns exactly three parts.

  Oren

From jcarlson at uci.edu  Tue Aug 30 10:42:22 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 30 Aug 2005 01:42:22 -0700
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <431414AB.4010005@cirad.fr>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr>
Message-ID: <20050830011440.7E5E.JCARLSON@uci.edu>


Pierre Barbier de Reuille <pierre.barbier at cirad.fr> wrote:
> Well, I want to come back on a point that wasn't discussed. I only found
> one positive comment here :
> http://mail.python.org/pipermail/python-dev/2005-August/055775.html

You apparently haven't been reading python-dev for around 36 hours,
because there have been over a dozen positive comments in regards to
str.partition().


> Raymond Hettinger wrote:
> > * The function always succeeds unless the separator argument is not a
> > string type or is an empty string.  So, a typical call doesn't have to
> > be wrapped in a try-suite for normal usage.
> 
> Well, I wonder if it's so good! Almost all the use cases I find would
> require something like:
> 
> head, sep, tail = s.partition(t)
> if sep:
>    do something
> else:
>    do something else

Why don't you pause for a second and read Raymond's post here:
http://mail.python.org/pipermail/python-dev/2005-August/055781.html

In that email there is a listing of standard library translations from
str.find to str.partition, and in every case, it is improved.  If you
believe that str.index would be better used, take a moment and do a few
translations of the sections provided and compare them with the
str.partition examples.


> Like, if you want to extract the drive letter from a windows path :
> 
> drive, sep, tail = path.partition(":")
> if not sep:
>    drive = get_current_drive() # Because it's a local path
> 
> Or, if I want to iterate over all the path parts in a UNIX path:
> 
> sep = '/'
> while sep:
>   head, sep, path = path.partition(sep)
> 
> IMO, that reads strangely ... partitioning until sep is empty :S
> Then, testing with "if" in Python is always a lot slower than having an
> exception launched from C extension inside a try...except block.

In the vast majority of cases, all three portions of the returned
partition result are used.  The remaining few are generally split
between one or two instances.  In the microbenchmarks I've conducted,
manually generating the slices is measurably slower than when Python
does it automatically.  Also, exceptions are actually quite slow in
relation to comparisons, specifically in the case of find vs. index
(using 2.4)...

>>> import time
>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...             if x.find('i')>=0:
...                     pass
...     print time.time()-t
...
0.953000068665
>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...             try:
...                     x.index('i')
...             except ValueError:
...                     pass
...     print time.time()-t
...
6.53100013733
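
For anyone wanting to repeat the comparison on a current interpreter, a
rough equivalent using the standard timeit module might look like the
sketch below.  The function names check_with_find and check_with_index are
hypothetical; timings vary by machine and Python version, so no expected
numbers are given.

```python
import timeit


def check_with_find(x="h"):
    # Sentinel style: compare the returned index against -1.
    return x.find("i") >= 0


def check_with_index(x="h"):
    # Exception style: a missing substring raises ValueError.
    try:
        x.index("i")
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    n = 100000
    print("find :", timeit.timeit(check_with_find, number=n))
    print("index:", timeit.timeit(check_with_index, number=n))
```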


I urge you to take some time to read Raymond's translations.

 - Josiah


From pierre.barbier at cirad.fr  Tue Aug 30 11:02:11 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Tue, 30 Aug 2005 11:02:11 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <20050830011440.7E5E.JCARLSON@uci.edu>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu>
Message-ID: <43142093.4080104@cirad.fr>


Josiah Carlson wrote:
> Pierre Barbier de Reuille <pierre.barbier at cirad.fr> wrote:
> 
>>Well, I want to come back on a point that wasn't discussed. I only found
>>one positive comment here :
>>http://mail.python.org/pipermail/python-dev/2005-August/055775.html
> 
> 
> You apparently haven't been reading python-dev for around 36 hours,
> because there have been over a dozen positive comments in regards to
> str.partition().

Well, I wasn't criticizing the overall idea of str.partition, which I
found very useful ! I'm just discussing one particular idea, which is to
avoid the use of exceptions.

> 
>>Raymond Hettinger wrote:
>>
>>>* The function always succeeds unless the separator argument is not a
>>>string type or is an empty string.  So, a typical call doesn't have to
>>>be wrapped in a try-suite for normal usage.
>>
>>Well, I wonder if it's so good! Almost all the use cases I find would
>>require something like:
>>
>>head, sep, tail = s.partition(t)
>>if sep:
>>   do something
>>else:
>>   do something else
> 
> 
> Why don't you pause for a second and read Raymond's post here:
> http://mail.python.org/pipermail/python-dev/2005-August/055781.html
> 
> In that email there is a listing of standard library translations from
> str.find to str.partition, and in every case, it is improved.  If you
> believe that str.index would be better used, take a moment and do a few
> translations of the sections provided and compare them with the
> str.partition examples.


Well, what it does is exactly what I thought: you can express most of the
use-cases of partition with:

head, sep, tail = s.partition(sep)
if not sep:
  #do something when it does not work
else:
  #do something when it works

And I propose to replace it by :

try:
  head, sep, tail = s.partition(sep)
  # do something when it works
except SeparatorError:
  # do something when it does not work

What I'm talking about is consistency. In most cases in Python, or at
least AFAIU, error testing is avoided and raising exceptions is
preferred, mainly for efficiency reasons. So my question remains: why,
for that specific method, prefer returning an "error" value (i.e. an
empty separator) over raising an exception?
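
The exception-raising variant argued for here can be layered on top of the
proposed method.  This is only a sketch: SeparatorError and
partition_or_raise are hypothetical names that exist neither in the
proposal nor in Python, and the code assumes the semantics that later
shipped as str.partition.

```python
class SeparatorError(ValueError):
    """Hypothetical exception; not part of Python."""


def partition_or_raise(s, sep):
    head, found, tail = s.partition(sep)
    if not found:
        # An empty middle element means the separator was absent.
        raise SeparatorError("separator %r not found" % (sep,))
    return head, found, tail
```

Callers who prefer try/except can use such a wrapper, while the base method
keeps returning three strings for everyone else.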

Pierre

-- 
Pierre Barbier de Reuille

INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP
Botanique et Bio-informatique de l'Architecture des Plantes
TA40/PSII, Boulevard de la Lironde
34398 MONTPELLIER CEDEX 5, France

tel   : (33) 4 67 61 65 77    fax   : (33) 4 67 61 56 68

From ncoghlan at gmail.com  Tue Aug 30 14:27:27 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 30 Aug 2005 22:27:27 +1000
Subject: [Python-Dev] partition()
In-Reply-To: <2my86jq2pl.fsf@starship.python.net>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<2my86jq2pl.fsf@starship.python.net>
Message-ID: <431450AF.4020902@gmail.com>

Michael Hudson wrote:
> partition() works for me.  It's not perfect, but it'll do.  The idea
> works for me rather more; it even simplifies the 
> 
> if s.startswith(prefix):
>     t = s[len(prefix):]
>     ...

How would you do it? Something like:

   head, found, tail = s.partition(prefix)
   if found and not head:
     ...

I guess I agree that's an improvement - only a slight one, though.
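
The two spellings side by side, as a sketch with hypothetical helper names
(strip_prefix_slice, strip_prefix_partition), assuming the semantics that
later shipped as str.partition:

```python
def strip_prefix_slice(s, prefix):
    # The idiom being replaced: test, then slice by len(prefix).
    if s.startswith(prefix):
        return s[len(prefix):]
    return None


def strip_prefix_partition(s, prefix):
    # found is non-empty only if prefix occurs; head is empty only if
    # that occurrence is at the very start of s.
    head, found, tail = s.partition(prefix)
    if found and not head:
        return tail
    return None
```

One difference worth noting: with an empty prefix, startswith() returns
True, whereas str.partition raises ValueError for an empty separator.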

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Aug 30 14:42:20 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 30 Aug 2005 22:42:20 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43142093.4080104@cirad.fr>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>	<431414AB.4010005@cirad.fr>
	<20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr>
Message-ID: <4314542C.7080000@gmail.com>

Pierre Barbier de Reuille wrote:
> What I'm talking about is consistency. In most cases in Python, or at
> least AFAIU, error testing is avoided and raising exceptions is
> preferred, mainly for efficiency reasons. So my question remains: why,
> for that specific method, prefer returning an "error" value (i.e. an
> empty separator) over raising an exception?

Because, in many cases, there is more to it than just the separator not being 
found.

Given a non-empty some_str and some_sep:

   head, sep, tail = some_str.partition(some_sep)

There are actually five possible results:

   head and not sep and not tail (the separator was not found)
   head and sep and not tail (the separator is at the end)
   head and sep and tail (the separator is somewhere in the middle)
   not head and sep and tail (the separator is at the start)
   not head and sep and not tail (the separator is the whole string)
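
The five cases can be told apart with plain truth tests.  A minimal sketch,
where the function name classify and the returned labels are hypothetical
and the code assumes the semantics that later shipped as str.partition:

```python
def classify(some_str, some_sep):
    # Assumes non-empty some_str and some_sep, as in the list above.
    head, sep, tail = some_str.partition(some_sep)
    if not sep:
        return "not found"
    if head and tail:
        return "middle"
    if head:
        return "end"
    if tail:
        return "start"
    return "whole string"
```

A single exception for "not found" would cover only the first branch; the
other four still need the truth tests.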

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Aug 30 14:49:48 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 30 Aug 2005 22:49:48 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
Message-ID: <431455EC.6050402@gmail.com>

Delaney, Timothy (Tim) wrote:
> Of course, if I (or someone else) can't come up with an obviously better
> name, partition() will win by default. I don't think it's a *bad* name -
> just don't think it's a particularly *obvious* name.

What about simply "str.parts" and "str.rparts"? That is, rather than splitting 
the string on a separator, we are breaking it into parts - the part before the 
separator, the separator itself, and the part after the separator. Same 
concept as 'partition', just a shorter method name.

Another option would be simply "str.part()" and "str.rpart()". Then you could 
think of it as an abbreviation of either 'partition' or 'parts' depending on 
your inclination.

> I think that one of the things I have against it is that most times I
> type it, I get a typo. If this function is accepted, I think it will
> (and should!) become one of the most used string functions around. As
> such, the name should be *very* easy to type.

I've been typing 'partition' a lot lately at work, and Tim's right - typing 
this correctly is harder than you might think. It is very easy to only type 
the 'ti' in the middle once, so that you end up with 'partion'.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Aug 30 15:20:03 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 30 Aug 2005 23:20:03 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <431455EC.6050402@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<431455EC.6050402@gmail.com>
Message-ID: <43145D03.6090801@gmail.com>

Nick Coghlan wrote:
> Another option would be simply "str.part()" and "str.rpart()". Then you could 
> think of it as an abbreviation of either 'partition' or 'parts' depending on 
> your inclination.

I momentarily forgot that "part" is also a verb in its own right, with the
right meaning, too (think "parting your hair" and "parting the Red Sea").

So call it +1 for str.part and str.rpart, and +0 for str.partition and
str.rpartition.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From solipsis at pitrou.net  Tue Aug 30 15:23:36 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 30 Aug 2005 15:23:36 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43145D03.6090801@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<431455EC.6050402@gmail.com>  <43145D03.6090801@gmail.com>
Message-ID: <1125408216.17470.6.camel@p-dvsi-418-1.rd.francetelecom.fr>


(unlurking)

On Tuesday 30 August 2005 at 23:20 +1000, Nick Coghlan wrote:
> I momentarily forgot that "part" is also a verb in its own right, with the
> right meaning, too (think "parting your hair" and "parting the Red Sea").

"parts" sounds more obvious than the verb "part" which is little known
to non-native English speakers (at least to me anyway).

Just my 2 cents.

Regards

Antoine.


From jason.orendorff at gmail.com  Tue Aug 30 15:51:13 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 30 Aug 2005 09:51:13 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <1125408216.17470.6.camel@p-dvsi-418-1.rd.francetelecom.fr>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<431455EC.6050402@gmail.com> <43145D03.6090801@gmail.com>
	<1125408216.17470.6.camel@p-dvsi-418-1.rd.francetelecom.fr>
Message-ID: <bb8868b9050830065111873606@mail.gmail.com>

Concerning names for partition(), I immediately thought of break(). 
Unfortunately it's taken.

So, how about snap()?

head, sep, tail = line.snap(':')

-j

From eric.nieuwland at xs4all.nl  Tue Aug 30 16:28:05 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Tue, 30 Aug 2005 16:28:05 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <4314542C.7080000@gmail.com>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>	<431414AB.4010005@cirad.fr>
	<20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
Message-ID: <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>

I have some use cases with:
	cut_at = some_str.find(sep)
	head, tail = some_str[:cut_at], some_str[cut_at:]
and:
	cut_at = some_str.find(sep)
	head, tail = some_str[:cut_at], some_str[cut_at+offset:] # offset != 
len(sep)

So if partition() [or whatever it'll be called] could have an optional 
second argument that defines the width of the 'cut' made, I would be 
helped enormously. The default for this second argument would be 
len(sep), to preserve the current proposal.

--eric


From python at discworld.dyndns.org  Tue Aug 30 16:36:07 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Tue, 30 Aug 2005 08:36:07 -0600
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <bb8868b9050830065111873606@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<431455EC.6050402@gmail.com> <43145D03.6090801@gmail.com>
	<1125408216.17470.6.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<bb8868b9050830065111873606@mail.gmail.com>
Message-ID: <20050830143607.GB23985@discworld.dyndns.org>

Jason Orendorff <jason.orendorff at gmail.com> wrote:
> Concerning names for partition(), I immediately thought of break(). 
> Unfortunately it's taken.
> 
> So, how about snap()?

I like .part()/.rpart() (or failing that, .parts()/.rparts()).  But if you
really want something short that's similar in meaning, there's also .cut().

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python at discworld.dyndns.org>
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

From pierre.barbier at cirad.fr  Tue Aug 30 17:01:36 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Tue, 30 Aug 2005 17:01:36 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>	<431414AB.4010005@cirad.fr>	<20050830011440.7E5E.JCARLSON@uci.edu>	<43142093.4080104@cirad.fr>
	<4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
Message-ID: <431474D0.70300@cirad.fr>


Eric Nieuwland wrote:
> I have some use cases with:
> 	cut_at = some_str.find(sep)
> 	head, tail = some_str[:cut_at], some_str[cut_at:]
> and:
> 	cut_at = some_str.find(sep)
> 	head, tail = some_str[:cut_at], some_str[cut_at+offset:] # offset != 
> len(sep)
> 
> So if partition() [or whatever it'll be called] could have an optional 
> second argument that defines the width of the 'cut' made, I would be 
> helped enormously. The default for this second argument would be 
> len(sep), to preserve the current proposal.

Well, IMO, your example is much better written:

import re
rsep = re.compile(sep + '.'*offset)
lst = re.split(rsep, some_str, 1)
head = lst[0]
tail = lst[1]

Or you want to have some "partition" method which accept regular
expressions:

head, sep, tail = some_str.partition(re.compile(sep+'.'*offset))
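
A regex-aware partition can be approximated today with re.search.  This is
a sketch only: re_partition is a hypothetical helper, not an existing
function in the re module, and it mimics the proposed return shape by
giving back the text actually matched as the middle element.

```python
import re


def re_partition(pattern, s):
    m = re.search(pattern, s)
    if m is None:
        # Same shape as the proposed method when nothing matches.
        return s, "", ""
    return s[:m.start()], m.group(0), s[m.end():]
```

For example, re_partition(r"\s*=\s*", "key = value") gives back the text
before the match, the matched separator including its surrounding spaces,
and the text after it.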

> 
> --eric

-- 
Pierre Barbier de Reuille

INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP
Botanique et Bio-informatique de l'Architecture des Plantes
TA40/PSII, Boulevard de la Lironde
34398 MONTPELLIER CEDEX 5, France

tel   : (33) 4 67 61 65 77    fax   : (33) 4 67 61 56 68

From skip at pobox.com  Tue Aug 30 17:01:12 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 30 Aug 2005 10:01:12 -0500
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <431455EC.6050402@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<431455EC.6050402@gmail.com>
Message-ID: <17172.29880.406663.490117@montanaro.dyndns.org>


    Nick> What about simply "str.parts" and "str.rparts"? 

-1 because "parts" is not a verb.  When I see an attribute that is a noun I
generally expect it to be a data attribute.

Skip

From raymond.hettinger at verizon.net  Tue Aug 30 17:11:53 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 30 Aug 2005 11:11:53 -0400
Subject: [Python-Dev] partition()
In-Reply-To: <20050830143607.GB23985@discworld.dyndns.org>
Message-ID: <001d01c5ad75$2841b1a0$8832c797@oemcomputer>

Hey guys, don't get lost in random naming suggestions (cut, snap, part,
parts, yada yada yada).  Each of those is much less descriptive and
provides less differentiation from other string methods.  Saving a few
characters is not worth introducing ambiguity.

Also, the longer name provides a useful visual balance between the three
assigned variables and the separator argument.  As an extreme example,
contrast the following:

   head, found, tail = s.p(separator) 
   head, found, tail = s.partition(separator)

The verb gets lost if it doesn't have visual weight.

Also, for those suggesting alternate semantics (raising exceptions when
the separator is not found), I challenge you to prove their worth by
doing all the code transformations that I did.  It is a remarkably
informative exercise that quickly reveals that this alternative is
dead-on-arrival.

For the poster suggesting an optional length argument, I suggest writing
out the revised method invariants.  I think you'll find that it snarls
them into incomprehensibility and makes the tool much more difficult to
learn.  Also, I recommend scanning my sample library code
transformations to see if any of them would benefit from the length
argument.  I think you'll find that it comes up so infrequently and with
such differing needs that it would be a mistake to bake this into the
proposal.


Raymond


From pje at telecommunity.com  Tue Aug 30 17:22:25 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 30 Aug 2005 11:22:25 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <df13nq$jb$1@sea.gmane.org>
References: <a1ddf57e05082920294af7e492@mail.gmail.com>
	<000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
	<a1ddf57e05082920294af7e492@mail.gmail.com>
	<a1ddf57e0508292033557965af@mail.gmail.com>
	<5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050830112036.01b21aa8@mail.telecommunity.com>

At 10:01 AM 8/30/2005 +0200, Fredrik Lundh wrote:
>Phillip J. Eby wrote:
>
> >>   Check out (and Pythonify) the ANSI M[UMPS] $PIECE(). See:
> >>   http://www.jacquardsystems.com/Examples/function/piece.htm
> >
> > As far as I can see, either you misunderstand what partition() does, or
> > I'm
> > completely misunderstanding what $PIECE does.  As far as I can tell,
> > $PIECE
> > and partition() have absolutely nothing in common except that they take
> > strings as arguments.  :)
>
>both split on a given token.  partition splits once, and returns all three
>parts, while piece returns the part you ask for

No, because looking at that URL, there is no piece that is the token split 
on.  partition() always returns 3 parts for 1 occurrence of the token, 
whereas $PIECE only has 2.


>(the 3-argument form is
>similar to x.split(s)[i])

Which is quite thoroughly unlike partition.
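
To make the difference concrete, here is a minimal sketch. It assumes
partition() behaves as proposed in this thread (split once, always return
three parts), which is how str.partition behaves in current Python. The
sample string is made up for illustration.

```python
# Hypothetical sample data; partition() semantics are those proposed in this thread.
s = "alpha,beta,gamma"

# partition() splits once and always returns three parts, separator included.
head, sep, tail = s.partition(",")

# The $PIECE-like form returns only the requested piece; the separator is gone.
second_piece = s.split(",")[1]

print((head, sep, tail))
print(second_piece)
```

The first form keeps the separator and the whole remainder; the second
discards both, which is why the two are hard to treat as the same tool.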


From pje at telecommunity.com  Tue Aug 30 17:27:54 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 30 Aug 2005 11:27:54 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
References: <4314542C.7080000@gmail.com>
	<000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
Message-ID: <5.1.1.6.0.20050830112541.01b25cb0@mail.telecommunity.com>

At 04:28 PM 8/30/2005 +0200, Eric Nieuwland wrote:
>I have some use cases with:
>         cut_at = some_str.find(sep)
>         head, tail = some_str[:cut_at], some_str[cut_at:]
>and:
>         cut_at = some_str.find(sep)
>         head, tail = some_str[:cut_at], some_str[cut_at+offset:] # offset !=
>len(sep)
>
>So if partition() [or whatever it'll be called] could have an optional
>second argument that defines the width of the 'cut' made, I would be
>helped enormously. The default for this second argument would be
>len(sep), to preserve the current proposal.

Unrelated comment: maybe 'cut()' and rcut() would be nice short names.

I'm not seeing the offset parameter, though, because this:

     head,__,tail = some_str.cut(sep)
     tail = tail[offset:]

is still better than the original example.
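
A minimal sketch of that two-step form, using partition() as a stand-in
for the suggested cut() (which does not exist). The helper name
cut_with_offset is hypothetical, and here the offset is counted from the
end of the separator rather than from its start as in the original
find()-based example.

```python
def cut_with_offset(some_str, sep, offset=0):
    """Hypothetical helper: split at the first sep, then drop `offset`
    extra characters from the start of the tail."""
    head, _found, tail = some_str.partition(sep)
    return head, tail[offset:]

# Skip the two spaces that follow the colon.
print(cut_with_offset("name:  value", ":", 2))
```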


From bjourne at gmail.com  Tue Aug 30 17:29:07 2005
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Tue, 30 Aug 2005 17:29:07 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <17172.29880.406663.490117@montanaro.dyndns.org>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<431455EC.6050402@gmail.com>
	<17172.29880.406663.490117@montanaro.dyndns.org>
Message-ID: <740c3aec050830082963aa8b42@mail.gmail.com>

I like partition() but maybe even better would be if strings supported
slicing by string indices.

key, sep, val = 'foo = 32'.partition('=')

would be:

key, val = 'foo = 32'[:'='], 'foo = 32'['=':]

To me it feels very natural to extend Python's slices to string
indices and would cover most of partition()'s use cases. The if sep:
idiom of partition() could be solved by throwing an IndexError, e.g.:

_, sep, port = host.partition(':')
if sep:
    try:
        int(port)
    except ValueError:

becomes:

try:
    port = host[':':]
    int(port)
except IndexError:
    pass
except ValueError:

An advantage of using slices would be that you could specify both a
beginning and ending string like this:

>>> s
'http://192.168.12.22:8080'
>>> s['http://':':']
'192.168.12.22'
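
The built-in str type does not accept string indices, so the session
above is hypothetical. A rough sketch of the idea as a str subclass
might look like the following; the class name SliceStr and the choice
to raise IndexError are assumptions taken from the description here,
and integer bounds mixed with string bounds are simply ignored.

```python
class SliceStr(str):
    """Hypothetical str subclass that accepts strings as slice bounds."""

    def __getitem__(self, key):
        if isinstance(key, slice) and (
            isinstance(key.start, str) or isinstance(key.stop, str)
        ):
            begin, end = 0, len(self)
            if isinstance(key.start, str):
                i = self.find(key.start)
                if i < 0:
                    raise IndexError("start string not found: %r" % key.start)
                begin = i + len(key.start)   # slice begins after the start string
            if isinstance(key.stop, str):
                j = self.find(key.stop, begin)
                if j < 0:
                    raise IndexError("stop string not found: %r" % key.stop)
                end = j                      # slice ends before the stop string
            return str.__getitem__(self, slice(begin, end))
        # Plain integer indices and slices keep their usual meaning.
        return str.__getitem__(self, key)


s = SliceStr('http://192.168.12.22:8080')
print(s['http://':':'])
```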

Sorry if this idea has already been discussed.

-- 
mvh Björn

From eric.nieuwland at xs4all.nl  Tue Aug 30 17:35:24 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Tue, 30 Aug 2005 17:35:24 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <431474D0.70300@cirad.fr>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>	<431414AB.4010005@cirad.fr>	<20050830011440.7E5E.JCARLSON@uci.edu>	<43142093.4080104@cirad.fr>
	<4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr>
Message-ID: <f836662fc8ffa673250afeec9157b0d6@xs4all.nl>

Pierre Barbier de Reuille wrote:
> Or you want to have some "partition" method which accept regular
> expressions:
>
> head, sep, tail = some_str.partition(re.compile(sep+'.'*offset))

Neat!
+1 on regexps as an argument to partition().

--eric


From solipsis at pitrou.net  Tue Aug 30 17:40:26 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 30 Aug 2005 17:40:26 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr>	<20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
Message-ID: <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr>


> Neat!
> +1 on regexps as an argument to partition().

It sounds better to have a separate function and call it re.partition,
doesn't it ?
By the way, re.partition() is *really* useful compared to re.split()
because with the latter you don't know which string precisely matched the
pattern (it isn't an issue with str.split() since matching is exact).
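
There is no re.partition() in the re module; a minimal sketch of what
such a function could do, built on the existing search() method, is
below. The name re_partition and the not-found return value (whole
string plus two empty strings) are assumptions modelled on the
str.partition() proposal.

```python
import re

def re_partition(pattern, string):
    """Hypothetical regex analogue of the proposed str.partition()."""
    # re.compile() accepts either a pattern string or a compiled pattern.
    match = re.compile(pattern).search(string)
    if match is None:
        return string, '', ''
    # The middle element is the text that actually matched.
    return string[:match.start()], match.group(0), string[match.end():]

print(re_partition(r'=.', 'foo= 32'))
```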

Regards

Antoine.


From shane at hathawaymix.org  Tue Aug 30 17:42:26 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Tue, 30 Aug 2005 09:42:26 -0600
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>	<431414AB.4010005@cirad.fr>	<20050830011440.7E5E.JCARLSON@uci.edu>	<43142093.4080104@cirad.fr>	<4314542C.7080000@gmail.com>	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>	<431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
Message-ID: <43147E62.2060106@hathawaymix.org>

Eric Nieuwland wrote:
> Pierre Barbier de Reuille wrote:
> 
>>Or you want to have some "partition" method which accept regular
>>expressions:
>>
>>head, sep, tail = some_str.partition(re.compile(sep+'.'*offset))
> 
> 
> Neat!
> +1 on regexps as an argument to partition().

Are you sure?  I would instead expect to find a .partition method on a 
regexp object:

   head, sep, tail = re.compile(sep+'.'*offset).partition(some_str)

Shane

From pierre.barbier at cirad.fr  Tue Aug 30 17:50:13 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Tue, 30 Aug 2005 17:50:13 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43147E62.2060106@hathawaymix.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>	<431414AB.4010005@cirad.fr>	<20050830011440.7E5E.JCARLSON@uci.edu>	<43142093.4080104@cirad.fr>	<4314542C.7080000@gmail.com>	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>	<431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org>
Message-ID: <43148035.7020007@cirad.fr>


Shane Hathaway wrote:
> Eric Nieuwland wrote:
> 
>> Pierre Barbier de Reuille wrote:
>>
>>> Or you want to have some "partition" method which accept regular
>>> expressions:
>>>
>>> head, sep, tail = some_str.partition(re.compile(sep+'.'*offset))
>>
>>
>>
>> Neat!
>> +1 on regexps as an argument to partition().
> 
> 
> Are you sure?  I would instead expect to find a .partition method on a
> regexp object:
> 
>   head, sep, tail = re.compile(sep+'.'*offset).partition(some_str)

Well, to be consistent with current re module, it would be better to
follow Antoine's suggestion :

head, sep, tail = re.partition(re.compile(sep+'.'*offset), some_str)

Pierre

> 
> Shane
> 

-- 
Pierre Barbier de Reuille

INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP
Botanique et Bio-informatique de l'Architecture des Plantes
TA40/PSII, Boulevard de la Lironde
34398 MONTPELLIER CEDEX 5, France

tel   : (33) 4 67 61 65 77    fax   : (33) 4 67 61 56 68

From shane at hathawaymix.org  Tue Aug 30 17:55:28 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Tue, 30 Aug 2005 09:55:28 -0600
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43148035.7020007@cirad.fr>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>	<431414AB.4010005@cirad.fr>	<20050830011440.7E5E.JCARLSON@uci.edu>	<43142093.4080104@cirad.fr>	<4314542C.7080000@gmail.com>	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>	<431474D0.70300@cirad.fr>	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>	<43147E62.2060106@hathawaymix.org>
	<43148035.7020007@cirad.fr>
Message-ID: <43148170.1020903@hathawaymix.org>

Pierre Barbier de Reuille wrote:
> 
> Shane Hathaway wrote:
>>Are you sure?  I would instead expect to find a .partition method on a
>>regexp object:
>>
>>  head, sep, tail = re.compile(sep+'.'*offset).partition(some_str)
> 
> 
> Well, to be consistent with current re module, it would be better to
> follow Antoine's suggestion :
> 
> head, sep, tail = re.partition(re.compile(sep+'.'*offset), some_str)

Actually, consistency with the current re module requires new methods to 
be added in *both* places.  Apparently Python believes TMTOWTDI is the 
right practice here. ;-)  See search, match, split, findall, finditer, 
sub, and subn:

http://docs.python.org/lib/node114.html
http://docs.python.org/lib/re-objects.html
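
A short sketch of that duplication, using only calls that exist in the
re module. The sample text and pattern are made up.

```python
import re

text = "a1b22c333"            # hypothetical sample data
pattern = re.compile(r"\d+")

# Each module-level function has a matching method on the compiled pattern.
split_module, split_method = re.split(r"\d+", text), pattern.split(text)
find_module, find_method = re.findall(r"\d+", text), pattern.findall(text)
sub_module, sub_method = re.sub(r"\d+", "-", text), pattern.sub("-", text)

print(split_module == split_method,
      find_module == find_method,
      sub_module == sub_method)
```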

Shane

From tim.peters at gmail.com  Tue Aug 30 18:14:55 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 Aug 2005 12:14:55 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <43148170.1020903@hathawaymix.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr>
	<4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
Message-ID: <1f7befae05083009146a9c35ce@mail.gmail.com>

Anyone remember why setdefault's second argument is optional?

>>> d = {}
>>> d.setdefault(666)
>>> d
{666: None}

just doesn't seem useful.  In fact, it's so silly that someone calling
setdefault with just one arg seems far more likely to have a bug in
their code than to get an outcome they actually wanted.  Haven't found
any 1-arg uses of setdefault() either, except for test code verifying
that you _can_ omit the second arg.

This came up in ZODB-land, where someone volunteered to add
setdefault() to BTrees.  Some flavors of BTrees are specialized to
hold integer or float values, and then setting None as a value is
impossible.  I resolved it there by making BTree.setdefault() require
both arguments.  It was a surprise to me that dict.setdefault() didn't
also require both.

If there isn't a sane use case for leaving the second argument out,
I'd like to drop the possibility in P3K (assuming setdefault()
survives).

From jcarlson at uci.edu  Tue Aug 30 18:26:10 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 30 Aug 2005 09:26:10 -0700
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <1f7befae05083009146a9c35ce@mail.gmail.com>
References: <43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com>
Message-ID: <20050830092200.8B03.JCARLSON@uci.edu>


Tim Peters <tim.peters at gmail.com> wrote:
> 
> Anyone remember why setdefault's second argument is optional?
> 
> >>> d = {}
> >>> d.setdefault(666)
> >>> d
> {666: None}

For quick reference for other people,  d.setdefault(key [, value])
returns the value that is currently there, or just assigned.  The only
case where it makes sense to omit the value parameter is in the case
where value=None.

> just doesn't seem useful.  In fact, it's so silly that someone calling
> setdefault with just one arg seems far more likely to have a bug in
> their code than to get an outcome they actually wanted.  Haven't found
> any 1-arg uses of setdefault() either, except for test code verifying
> that you _can_ omit the second arg.
> 
> This came up in ZODB-land, where someone volunteered to add
> setdefault() to BTrees.  Some flavors of BTrees are specialized to
> hold integer or float values, and then setting None as a value is
> impossible.  I resolved it there by making BTree.setdefault() require
> both arguments.  It was a surprise to me that dict.setdefault() didn't
> also require both.
> 
> If there isn't a sane use case for leaving the second argument out,
> I'd like to drop the possibility in P3K (assuming setdefault()
> survives).

I agree, at least that in the case where people actually want None (the
only time where the second argument is really optional), I think that
they should have to specify it.  EIBTI and all that.

 - Josiah


From hoffman at ebi.ac.uk  Tue Aug 30 18:19:22 2005
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Tue, 30 Aug 2005 17:19:22 +0100
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43148170.1020903@hathawaymix.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr>
	<20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr>
	<4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr> <f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
Message-ID: <Pine.LNX.4.62.0508301716270.31737@qnzvnan.rov.np.hx>

[Shane Hathaway writes about the existence of both module-level
functions and object methods to do the same regex operations]

> Apparently Python believes TMTOWTDI is the right practice here. ;-)
> See search, match, split, findall, finditer, sub, and subn:
>
> http://docs.python.org/lib/node114.html
> http://docs.python.org/lib/re-objects.html

Dare I ask whether the uncompiled versions should be considered for
removal in Python 3.0?

*puts on his asbestos jacket*
-- 
Michael Hoffman <hoffman at ebi.ac.uk>
European Bioinformatics Institute


From tim.peters at gmail.com  Tue Aug 30 18:38:55 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 Aug 2005 12:38:55 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <20050830092200.8B03.JCARLSON@uci.edu>
References: <43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com>
	<20050830092200.8B03.JCARLSON@uci.edu>
Message-ID: <1f7befae05083009384acec6c9@mail.gmail.com>

[Tim Peters]
>> Anyone remember why setdefault's second argument is optional?
>>
>> >>> d = {}
>> >>> d.setdefault(666)
>> >>> d
>> {666: None}
>> ...
 
[Josiah Carlson]
> For quick reference for other people,  d.setdefault(key [, value])
> returns the value that is currently there, or just assigned.  The only
> case where it makes sense to omit the value parameter is in the case
> where value=None.

Yes, that's right.  Overwhelmingly most often in the wild, a
just-constructed empty container object is passed as the second
argument.  Rarely, I see 0 passed.  I've found no case where None is
wanted (except in the test suite, verifying that the 1-argument form
does indeed default to using None).
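
For reference, a small sketch of the two forms side by side. The word
list is invented sample data.

```python
words = ["apple", "avocado", "banana"]   # hypothetical sample data

# Common form: a fresh empty container as the second argument.
index = {}
for word in words:
    index.setdefault(word[0], []).append(word)

# One-argument form: stores None under the key and returns None.
d = {}
result = d.setdefault(666)

print(index)
print(d, result)
```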

> ...
> I agree, at least that in the case where people actually want None (the
> only time where the second argument is really optional), I think that
> they should have to specify it.  EIBTI and all that.

And since there apparently aren't any such cases outside of Python's
test suite, that wouldn't be much of a burden on them <wink>.

From raymond.hettinger at verizon.net  Tue Aug 30 18:35:05 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 30 Aug 2005 12:35:05 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <1f7befae05083009146a9c35ce@mail.gmail.com>
Message-ID: <003301c5ad80$c72c1020$8832c797@oemcomputer>

[Tim]
> Anyone remember why setdefault's second argument is optional?

IIRC, this is a vestige from its ancestor.  The proposal for
setdefault() described it as behaving like dict.get() but inserting the
key if not found.


> Haven't found
> any 1-arg uses of setdefault() either, except for test code verifying
> that you _can_ omit the second arg.

Likewise, I found zero occurrences in the library, in my cumulative code
base, and in the third-party packages on my system.


> If there isn't a sane use case for leaving the second argument out,
> I'd like to drop the possibility in P3K (assuming setdefault()
> survives).

Given a lack of legitimate use cases, do we have to wait for Py3.0?  It
could likely be fixed directly and not impact any code that people care
about.


Raymond


From mcherm at mcherm.com  Tue Aug 30 18:39:49 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue, 30 Aug 2005 09:39:49 -0700
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
Message-ID: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com>

Michael Hoffman writes:
> Dare I ask whether the uncompiled versions [of re object methods] should
> be considered for removal in Python 3.0?
>
> *puts on his asbestos jacket*

No flames here, but I'd rather leave them. The docs make it clear that
the two sets of functions/methods are equivalent, so the conceptual
overhead is small (at least it doesn't scale with the number of methods
in re). The docs make it clear that the compiled versions are faster, so
serious users should prefer them. But the uncompiled versions are
preferable in one special situation: short simple scripts -- the kind
of thing often done with shell scripting except that Python is Better (TM).
For these uses, performance is irrelevant and it turns a 2-line
construct into a single line.

Of course the uncompiled versions can be written as little 2-line
functions but that's even WORSE for short simple scripts.

Nearly everything I write these days is larger and more complex, but
I retain a soft spot for short simple scripts and want Python to
continue to be the best tool available for these tasks.

-- Michael Chermside

From tim.peters at gmail.com  Tue Aug 30 18:56:11 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 Aug 2005 12:56:11 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <003301c5ad80$c72c1020$8832c797@oemcomputer>
References: <1f7befae05083009146a9c35ce@mail.gmail.com>
	<003301c5ad80$c72c1020$8832c797@oemcomputer>
Message-ID: <1f7befae05083009565974978c@mail.gmail.com>

[Raymond]
> setdefault() described it as behaving like dict.get() but inserting the
> key if not found.

...

> Likewise, I found zero occurrences in the library, in my cumulative code
> base, and in the third-party packages on my system.

[Tim]
>> If there isn't a sane use case for leaving the second argument out,
>> I'd like to drop the possibility in P3K (assuming setdefault()
>> survives).

[Raymond]
> Given a lack of legitimate use cases, do we have to wait for Py3.0?  It
> could likely be fixed directly and not impact any code that people care
> about.

That would be fine by me, but any change really requires a
deprecation-warning release first.

Dang!  I may have just found a use, in Zope's
lib/python/docutils/parsers/rst/directives/images.py (which is part of
docutils, not really part of Zope):

    figwidth = options.setdefault('figwidth')
    figclass = options.setdefault('figclass')
    del options['figwidth']
    del options['figclass']

I'm still thinking about what that's trying to do <0.5 wink>. 
Assuming options is a dict-like thingie, it probably meant to do:

    figwidth = options.pop('figwidth', None)
    figclass = options.pop('figclass', None)
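
A quick sketch checking that the two spellings give the same result,
assuming options is a plain dict. The function names are hypothetical.

```python
def take_with_setdefault(options, key):
    value = options.setdefault(key)   # inserts None if the key is missing
    del options[key]                  # the key now always exists, so this is safe
    return value

def take_with_pop(options, key):
    return options.pop(key, None)     # same effect in one call

for take in (take_with_setdefault, take_with_pop):
    options = {'figwidth': '50%'}
    print(take(options, 'figwidth'), take(options, 'figclass'), options)
```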

David, are you married to that bizarre use of setdefault <wink>?

Whatever, I can't claim there are _no_ uses of 1-arg setdefault() in
the wild any more.

From jcarlson at uci.edu  Tue Aug 30 19:06:55 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 30 Aug 2005 10:06:55 -0700
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43142093.4080104@cirad.fr>
References: <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr>
Message-ID: <20050830091228.8B00.JCARLSON@uci.edu>


Pierre Barbier de Reuille <pierre.barbier at cirad.fr> wrote:
> Well, what it does is exactly what I thought, you can express most of the
> use-cases of partition with:
> 
> head, sep, tail = s.partition(sep)
> if not sep:
>   #do something when it does not work
> else:
>   #do something when it works
> 
> And I propose to replace it by :
> 
> try:
>   head, sep, tail = s.partition(sep)
>   # do something when it works
> except SeparatorError:
>   # do something when it does not work

No, you can't.  As Tim Peters pointed out, in order to be correct, you
need to use...

try:
    head, found, tail = s.partition(sep)
except ValueError:
    # do something when it can't find sep
else:
    # do something when it can find sep

By embedding the 'found' case inside the try/except clause as you offer,
you could be hiding another exception, which is incorrect.
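
Since the exception-raising partition() is only a proposal, here is the
same structure sketched with str.index(), which does raise ValueError.
The function name describe and the port-parsing task are made up for
illustration.

```python
def describe(s, sep):
    try:
        pos = s.index(sep)   # only the call that may fail to find sep
    except ValueError:
        return "no separator"
    else:
        # A ValueError raised here (for example by int()) is not swallowed
        # by the except clause above.
        return int(s[pos + len(sep):])

print(describe("host:80", ":"))
print(describe("host", ":"))
```

With the int() call moved inside the try block, describe("host:abc", ":")
would report "no separator" instead of surfacing the bad port number.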

> What I'm talking about is consistency. In most cases in Python, or at
> least AFAIU, error testing is avoided and exception launching is
> preferred mainly for efficiency reasons. So my question remains: why
> prefer for that specific method returning an "error" value (i.e. an
> empty separator) against an exception ?

It is known among those who tune their Python code that try/except is
relatively expensive when exceptions are raised, but not significantly
faster (if any) when they are not. I'll provide an updated set of
microbenchmarks...

>>> import time
>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...             _ = x.find('h')
...             if _ >= 0:
...                     pass
...             else:
...                     pass
...     print time.time()-t
...
0.84299993515
>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...             try:
...                     _ = x.index('h')
...             except ValueError:
...                     pass
...             else:
...                     pass
...     print time.time()-t
...
0.81299996376

BUT!
>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...             try:
...                     _ = x.index('i')
...             except ValueError:
...                     pass
...             else:
...                     pass
...     print time.time()-t
...
4.29700016975

We should subtract the time of the for loop, the method call overhead,
perhaps the integer object creation/fetch, and the assignment.
str.__len__() is pretty fast (really just a member check, which is at a
constant offset...), let us use that.

>>> if 1:
...     x = 'h'
...     t = time.time()
...     for i in xrange(1000000):
...             _ = x.__len__()
...     print time.time()-t
...
0.5

So, subtracting that .5 seconds from all the cases gives us...

0.343 seconds for .find's comparison
0.313 seconds for .index's exception handling when an exception is not
raised
3.797 seconds for .index's exception handling when an exception is
raised.

In the case of a string being found, .index is about 10% faster than
.find.  In the case of a string not being found, .index's exception
handling mechanics are over 11 times slower than .find's comparison.
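
For anyone wanting to repeat the comparison, a sketch using the timeit
module follows. It is written for current Python rather than the
interpreter used above, the helper names and iteration count are
arbitrary, and the numbers it prints will differ from the ones quoted
here.

```python
import timeit

N = 100000  # iteration count chosen arbitrarily for this sketch

def use_find(x, sub):
    # Test-the-return-value style.
    return x.find(sub) >= 0

def use_index(x, sub):
    # Exception style.
    try:
        x.index(sub)
    except ValueError:
        return False
    else:
        return True

timings = {
    "find, found": timeit.timeit(lambda: use_find("h", "h"), number=N),
    "find, missing": timeit.timeit(lambda: use_find("h", "i"), number=N),
    "index, found": timeit.timeit(lambda: use_index("h", "h"), number=N),
    "index, missing": timeit.timeit(lambda: use_index("h", "i"), number=N),
}

for label, seconds in timings.items():
    print("%-16s %.4f" % (label, seconds))
```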

Those numbers should speak for themselves.  In terms of the strings
being automatically chopped up vs. manually chopping them up with slices,
it is obvious which will be faster: C-level slicing.


I agree with Raymond that if you are going to poo-poo on str.partition()
not raising an exception, you should do some translations using the
correct structure that Tim Peters provided, and post them here on
python-dev as 'proof' that raising an exception in the cases provided is
better.

 - Josiah


From rrr at ronadam.com  Tue Aug 30 19:09:41 2005
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 30 Aug 2005 13:09:41 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
Message-ID: <431492D5.6090102@ronadam.com>

Raymond Hettinger wrote:
> [Delaney, Timothy (Tim)]
> 
>>+1
>>
>>This is very useful behaviour IMO.
> 
> 
> Thanks.  It seems to be getting +1s all around.

Wow, a lot of approvals!  :)

>>Have the precise return values of partition() been defined?

+1 on the name partition.  I considered split or parts, but I agree
partition reads better and since it's not so generic as something like
get_parts, it creates a stronger identity making the code clearer.


>>IMO the most useful (and intuitive) behaviour is to return strings in
>>all cases.

Wow, a lot of approvals!  :-)


A possibility to consider:

Instead of partition() and rpartition(), have just partition with an 
optional step or skip value which can be a positive or negative non zero 
integer.

    head, found, tail = partition(sep, [step=1])

step = -1 would look for sep from the right.

step = 2, would look for the second sep from left.

step = -2, would look for the second sep from the right.

Default of course would be 1, find first sep from the left.


This would allow creating an iterator that could iterate through a string
splitting on each sep from either the left, or right.

Whether a 0 or a |value|>len(string) causes an exception would need to
be decided.

I can't think of an obvious use for a partition iterator at the moment, 
maybe someone could find an example.  In any case, finding the second, 
or third sep is probably common enough.
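
A rough sketch of the idea as a plain function. The name partition_nth
is hypothetical, and two open questions are settled by assumption: a
step of 0 raises ValueError, and a step larger than the number of
separators is treated as "not found", returning empty strings in the
same positions as the proposed partition() and rpartition().

```python
def partition_nth(s, sep, step=1):
    """Hypothetical: split s around the step-th sep (negative = from the right)."""
    if not sep:
        raise ValueError("empty separator")
    if step == 0:
        raise ValueError("step must be non-zero")
    if step > 0:
        start = 0
        for _ in range(step):
            pos = s.find(sep, start)
            if pos < 0:
                return s, '', ''        # not found, counting from the left
            start = pos + len(sep)      # continue after this occurrence
    else:
        end = len(s)
        for _ in range(-step):
            pos = s.rfind(sep, 0, end)
            if pos < 0:
                return '', '', s        # not found, counting from the right
            end = pos                   # continue before this occurrence
    return s[:pos], sep, s[pos + len(sep):]

print(partition_nth('a.b.c.d', '.', 2))
print(partition_nth('a.b.c.d', '.', -1))
```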


Cheers,
Ron


From skip at pobox.com  Tue Aug 30 19:13:37 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 30 Aug 2005 12:13:37 -0500
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43145D03.6090801@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<431455EC.6050402@gmail.com> <43145D03.6090801@gmail.com>
Message-ID: <17172.37825.153046.857408@montanaro.dyndns.org>


    Nick> I momentarily forgot that "part" is also a verb in its own right,
    Nick> with the right meaning, too (think "parting your hair" and
    Nick> "parting the Red Sea").

If I remember correctly from watching "The Ten Commandments" as a kid, I
believe Charlton Heston only parted the Red Sea in one place...

Skip

From barry at python.org  Tue Aug 30 19:15:59 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Aug 2005 13:15:59 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <5.1.1.6.0.20050830112541.01b25cb0@mail.telecommunity.com>
References: <4314542C.7080000@gmail.com>
	<000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr>
	<20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr>
	<4314542C.7080000@gmail.com>
	<5.1.1.6.0.20050830112541.01b25cb0@mail.telecommunity.com>
Message-ID: <1125422159.10126.11.camel@geddy.wooz.org>

On Tue, 2005-08-30 at 11:27, Phillip J. Eby wrote:

> >So if partition() [or whatever it'll be called] could have an optional
> >second argument that defines the width of the 'cut' made, I would be
> >helped enormously. The default for this second argument would be
> >len(sep), to preserve the current proposal.

+1 on the concept -- very nice Raymond.

> Unrelated comment: maybe 'cut()' and rcut() would be nice short names.

FWIW, +1 on .cut(), +0 on .partition()

-Barry


From skip at pobox.com  Tue Aug 30 19:29:04 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 30 Aug 2005 12:29:04 -0500
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr>
Message-ID: <17172.38752.55260.62198@montanaro.dyndns.org>


    Antoine> By the way, re.partition() is *really* useful compared to
    Antoine> re.split() because with the latter you don't which string
    Antoine> precisely matched the pattern (it isn't an issue with
    Antoine> str.split() since matching is exact).

Just group your re:

    >>> import re
    >>>
    >>> re.split("ab", "abracadabra")
    ['', 'racad', 'ra']
    >>> re.split("(ab)", "abracadabra")
    ['', 'ab', 'racad', 'ab', 'ra']

and you get it in the return value.  In fact, re.split with a grouped re is
very much like Raymond's str.partition method without the guarantee of
returning a three-element list.

Skip

From skip at pobox.com  Tue Aug 30 19:30:26 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 30 Aug 2005 12:30:26 -0500
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr>
Message-ID: <17172.38834.592674.741120@montanaro.dyndns.org>


    In fact, re.split with a grouped re is very much like Raymond's
    str.partition method without the guarantee of returning a three-element
    list.

Whoops...  Should also have included the maxsplit=1 constraint.
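
Putting the two pieces together, a small sketch. The helper name
split_once is made up, and it assumes the pattern passed in contains no
groups of its own.

```python
import re

def split_once(pattern, string):
    # The added group makes re.split() keep the matched text;
    # maxsplit=1 stops after the first match.
    return re.split("(%s)" % pattern, string, maxsplit=1)

print(split_once("ab", "abracadabra"))
print(split_once("xy", "abracadabra"))
```

The first call yields three elements, the second only one, which is the
missing three-element guarantee mentioned above.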

Skip

From s.percivall at chello.se  Tue Aug 30 19:30:18 2005
From: s.percivall at chello.se (Simon Percivall)
Date: Tue, 30 Aug 2005 19:30:18 +0200
Subject: [Python-Dev] partition()
In-Reply-To: <001d01c5ad75$2841b1a0$8832c797@oemcomputer>
References: <001d01c5ad75$2841b1a0$8832c797@oemcomputer>
Message-ID: <8C21C161-7B36-45F1-AA64-9E21B3F2942E@chello.se>

On 30 aug 2005, at 17.11, Raymond Hettinger wrote:
> Hey guys, don't get lost in random naming suggestions (cut, snap,  
> part,
> parts, yada yada yada).  Each of those is much less descriptive and
> provides less differentiation from other string methods.  Saving a few
> characters is not worth introducing ambiguity.

Trisect would be pretty descriptive ...

//Simon


From python at discworld.dyndns.org  Tue Aug 30 19:32:35 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Tue, 30 Aug 2005 11:32:35 -0600
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com>
References: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com>
Message-ID: <20050830173235.GC25381@discworld.dyndns.org>

Michael Chermside <mcherm at mcherm.com> wrote:
> Michael Hoffman writes:
> > Dare I ask whether the uncompiled versions [of re object methods] should
> > be considered for removal in Python 3.0?
> >
> > *puts on his asbestos jacket*
> 
> No flames here, but I'd rather leave them.

Me too.  I have various programs that construct lots of large REs on the fly,
knowing they'll only be used once.  Not having to compile them to objects
inline makes the code cleaner and easier to read.

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python at discworld.dyndns.org>
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

From skip at pobox.com  Tue Aug 30 19:34:01 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 30 Aug 2005 12:34:01 -0500
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <Pine.LNX.4.62.0508301716270.31737@qnzvnan.rov.np.hx>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
	<Pine.LNX.4.62.0508301716270.31737@qnzvnan.rov.np.hx>
Message-ID: <17172.39049.164534.215793@montanaro.dyndns.org>


    >> http://docs.python.org/lib/re-objects.html

    Michael> Dare I ask whether the uncompiled versions should be considered
    Michael> for removal in Python 3.0?

It is quite convenient to not have to compile regular expressions in most
cases.  The module takes care of compiling your patterns and caching them
for you.
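
A small sketch of that equivalence, with a made-up pattern and sample
text (both are assumptions for illustration only):

```python
import re

text = "spam and eggs"

# Module-level function: re compiles the pattern and caches it internally.
m1 = re.search(r"e\w+", text)

# Explicit compilation: same match, but the compiled object is ours to keep.
pattern = re.compile(r"e\w+")
m2 = pattern.search(text)

print(m1.group() == m2.group())
```

Either spelling finds the same match; the explicit form mainly saves the
cache lookup when the same pattern is reused many times.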

Skip

From skip at pobox.com  Tue Aug 30 19:40:18 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 30 Aug 2005 12:40:18 -0500
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <1125422159.10126.11.camel@geddy.wooz.org>
References: <4314542C.7080000@gmail.com>
	<000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr>
	<5.1.1.6.0.20050830112541.01b25cb0@mail.telecommunity.com>
	<1125422159.10126.11.camel@geddy.wooz.org>
Message-ID: <17172.39426.532398.825596@montanaro.dyndns.org>


    >> Unrelated comment: maybe 'cut()' and rcut() would be nice short names.

    Barry> FWIW, +1 on .cut(), +0 on .partition()

As long as people are free associating: snip(), excise(), explode(),
invade_iraq()...

<wink>

Skip

From mwh at python.net  Tue Aug 30 19:43:18 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 30 Aug 2005 18:43:18 +0100
Subject: [Python-Dev] partition()
In-Reply-To: <431450AF.4020902@gmail.com> (Nick Coghlan's message of "Tue,
	30 Aug 2005 22:27:27 +1000")
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<2my86jq2pl.fsf@starship.python.net> <431450AF.4020902@gmail.com>
Message-ID: <2mu0h7pbgp.fsf@starship.python.net>

Nick Coghlan <ncoghlan at gmail.com> writes:

> Michael Hudson wrote:
>> partition() works for me.  It's not perfect, but it'll do.  The idea
>> works for me rather more; it even simplifies the 
>> 
>> if s.startswith(prefix):
>>     t = s[len(prefix):]
>>     ...
>
> How would you do it? Something like:
>
>    head, found, tail = s.partition(prefix)
>    if found and not head:
>      ...
>
> I guess I agree that's an improvement - only a slight one, though.

Yes.  I seem to fairly often[1] do this with prefix as a literal so
only having to mention it once would be a win for me.
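
Roughly what that would look like wrapped up, assuming partition() lands
as a str method with the semantics discussed here (strip_prefix is a
hypothetical name, not an existing function):

```python
def strip_prefix(s, prefix):
    """Return the text after prefix, or None if s does not start with it."""
    head, found, tail = s.partition(prefix)
    # found is non-empty only if prefix occurs; an empty head means it
    # occurred at the very start of s.
    if found and not head:
        return tail
    return None

print(strip_prefix("http://www.python.org", "http://"))
```

Note that partition() scans the whole string when the prefix is absent,
which startswith() does not, and that an empty prefix raises ValueError.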

Cheers,
mwh

[1] But not often enough to have defined a function to do this job, it
    seems.

-- 
  <teratorn> I must be missing something. It is not possible to be
             this stupid.  
  <Yhg1s> you don't meet a lot of actual people, do you?

From barry at python.org  Tue Aug 30 20:19:24 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Aug 2005 14:19:24 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com>
References: <20050830093949.1rqenezh01cs0w0c@login.werra.lunarpages.com>
Message-ID: <1125425964.10961.3.camel@geddy.wooz.org>

On Tue, 2005-08-30 at 12:39, Michael Chermside wrote:
> Michael Hoffman writes:
> > Dare I ask whether the uncompiled versions [of re object methods] should
> > be considered for removal in Python 3.0?

> No flames here, but I'd rather leave them. The docs make it clear that
> the two sets of functions/methods are equivalent, so the conceptual
> overhead is small (at least it doesn't scale with the number of methods
> in re). The docs make it clear that the compiled versions are faster, so
> serious users should prefer them. But the uncompiled versions are
> preferable in one special situation: short simple scripts -- the kind
> of thing often done with shell scripting except that Python is Better (TM).
> For these uses, performance is irrelevant and it turns a 2-line
> construct into a single line.

Although it's mildly annoying that the docs describe the compiled method
names in terms of the uncompiled functions.  I always find myself
looking up the regexp object's API only to be shuffled off to the
module's API and then having to do the argument remapping myself.
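
For illustration, a sketch of the remapping in question (the pattern and
text are made up):

```python
import re

text = "Spam and spam"

# Module-level function: pattern comes first, flags are an optional argument.
m1 = re.search("spam", text, re.IGNORECASE)

# Compiled object: flags go to compile(); search() instead accepts optional
# pos/endpos arguments, which the module-level function does not take.
pattern = re.compile("spam", re.IGNORECASE)
m2 = pattern.search(text)
m3 = pattern.search(text, 1)   # start looking at index 1

print(m1.start(), m2.start(), m3.start())
```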

-Barry

From hyeshik at gmail.com  Tue Aug 30 20:24:40 2005
From: hyeshik at gmail.com (Hye-Shik Chang)
Date: Wed, 31 Aug 2005 03:24:40 +0900
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <000701c5abdd$4f0b7440$d206a044@oemcomputer>
References: <4311B7B6.8070503@egenix.com>
	<000701c5abdd$4f0b7440$d206a044@oemcomputer>
Message-ID: <4f0b69dc0508301124423da48e@mail.gmail.com>

On 8/28/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
>     >>> s = 'http://www.python.org'
>     >>> partition(s, '://')
>     ('http', '://', 'www.python.org')
>     >>> partition(s, '?')
>     ('http://www.python.org', '', '')
>     >>> partition(s, 'http://')
>     ('', 'http://', 'www.python.org')
>     >>> partition(s, 'org')
>     ('http://www.python.', 'org', '')
> 

What would be the result of rpartition(s, '?')?
('', '', 'http://www.python.org')
or
('http://www.python.org', '', '')

BTW, I wrote a somewhat preliminary patch for this functionality
to save you a little of your time. :-)

http://people.freebsd.org/~perky/partition-r1.diff


Hye-Shik

From raymond.hettinger at verizon.net  Tue Aug 30 20:25:17 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 30 Aug 2005 14:25:17 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <431492D5.6090102@ronadam.com>
Message-ID: <004601c5ad90$2c847020$8832c797@oemcomputer>

[Ron Adam]
> This would allow creating an iterator that could iterate through a string,
> splitting on each sep from either the left or the right.

For uses more complex than basic partitioning, people should shift to
more powerful tools like re.finditer(), re.findall(), and re.split().


> I can't think of an obvious use for a partition iterator at the moment,
> maybe someone could find an example.

I prefer to avoid variants that are in search of a purpose.


> In any case, finding the second,
> or third sep is probably common enough.

That case should be handled with consecutive partitions:

# keep everything after the second 'X'
head, found, s = s.partition('X')
head, found, s = s.partition('X')
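
A worked run of the idea on a made-up string, using the same separator
both times and assuming partition() as a str method:

```python
s = "aXbXcXd"

# Each call discards everything up to and including the next 'X'.
head, found, s = s.partition("X")   # s is now "bXcXd"
head, found, s = s.partition("X")   # s is now "cXd"

print(s)
```

If the string holds fewer than two separators, s ends up empty and found
is '' after the call that failed to find one.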


Raymond


From raymond.hettinger at verizon.net  Tue Aug 30 20:30:46 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 30 Aug 2005 14:30:46 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <4f0b69dc0508301124423da48e@mail.gmail.com>
Message-ID: <004701c5ad90$f0faeec0$8832c797@oemcomputer>

[Hye-Shik Chang]
> What would be the result of rpartition(s, '?')?
> ('', '', 'http://www.python.org')
> or
> ('http://www.python.org', '', '')

The former.  The invariants for rpartition() are a mirror image of those
for partition().
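
Spelled out on the URL from the example, assuming both are available as
str methods with the semantics described here:

```python
s = "http://www.python.org"

# Separator missing: partition() leaves the whole string in the head ...
print(s.partition("?"))
# ... while rpartition() leaves the whole string in the tail.
print(s.rpartition("?"))
# Separator present: rpartition() splits at the last occurrence.
print(s.rpartition("."))
```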


> BTW, I wrote a somewhat preliminary patch for this functionality
> to save you a little of your time. :-)
> 
> http://people.freebsd.org/~perky/partition-r1.diff

Thanks.  I've got one running already, but it is nice to have another
for comparison.


Raymond


From paragate at gmx.net  Tue Aug 30 20:37:10 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 30 Aug 2005 20:37:10 +0200
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <1f7befae05083009146a9c35ce@mail.gmail.com>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com>
Message-ID: <op.swb3b8mv0gn541@theta>

On Tue, 30 Aug 2005 18:14:55 +0200, Tim Peters <tim.peters at gmail.com>  
wrote:
>>>> d = {}
>>>> d.setdefault(666)
>>>> d
> {666: None}
>
> just doesn't seem useful.  In fact, it's so silly that someone calling
> setdefault with just one arg seems far more likely to have a bug in
> their code than to get an outcome they actually wanted.  Haven't found

reminds me of dict.get()... i think in both cases being explicit::

     beast = d.setdefault( 666, None )
     beast = d.get( 666, None )

just reads better, all the more since at least in my code what comes
next is invariably a test 'if beast is None:...'. so

     beast = d.setdefault( 666 )
     if beast is None:
         ...
and

     beast = d.get( 666 )
     if beast is None:
         ...

are shorter but a tad too implicit for my taste.

_wolf

From eric.nieuwland at xs4all.nl  Tue Aug 30 20:41:21 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Tue, 30 Aug 2005 20:41:21 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr>	<20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr>
Message-ID: <14e7b895b32acb6b09e992cb570f4b99@xs4all.nl>

On 30 aug 2005, at 17:40, Antoine Pitrou wrote:
>> Neat!
>> +1 on regexps as an argument to partition().
>
> It sounds better to have a separate function and call it re.partition,
> doesn't it ?
> By the way, re.partition() is *really* useful compared to re.split()
> because with the latter you don't know which string precisely matched the
> pattern (it isn't an issue with str.split() since matching is exact).

Nice, too.
BUT, "spam! and eggs".partition(re.compile("!.*d"))
more closely resembles "xyz".split(), and that is the way things have 
evolved up to now.
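
Either spelling could be sketched on top of re.search(). The helper
below is hypothetical (no re.partition exists in the re module) and
only illustrates the intended result:

```python
import re

def re_partition(pattern, s):
    """Hypothetical helper: split s around the first match of pattern."""
    m = re.search(pattern, s)   # accepts a pattern string or compiled object
    if m is None:
        # Mirror str.partition(): everything in the head, rest empty.
        return s, "", ""
    return s[:m.start()], m.group(), s[m.end():]

print(re_partition(re.compile("!.*d"), "spam! and eggs"))
```

Unlike re.split(), the middle element reports exactly which text matched.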

--eric


From pje at telecommunity.com  Tue Aug 30 20:44:44 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 30 Aug 2005 14:44:44 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <004601c5ad90$2c847020$8832c797@oemcomputer>
References: <431492D5.6090102@ronadam.com>
Message-ID: <5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com>

At 02:25 PM 8/30/2005 -0400, Raymond Hettinger wrote:
>That case should be handled with consecutive partitions:
>
># keep everything after the second 'X'
>head, found, s = s.partition('X')
>head, found, s = s.partition('X')

Or:

      s=s.partition('X')[2].partition('X')[2]

which actually suggests a shorter, clearer way to do it:

      s = s.after('X').after('X')

And the corresponding 'before' method, of course, such that if sep in s:

      (s.before(sep), sep, s.after(sep)) == s.partition(sep)

Technically, these should probably be before_first and after_first, with 
the corresponding before_last and after_last corresponding to rpartition.
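
As a sketch only: before() and after() are hypothetical here, written as
plain functions over partition() rather than as str methods:

```python
def before(s, sep):
    """Hypothetical: text before the first sep (all of s if sep is absent)."""
    return s.partition(sep)[0]

def after(s, sep):
    """Hypothetical: text after the first sep ('' if sep is absent)."""
    return s.partition(sep)[2]

s = "aXbXc"
print(after(after(s, "X"), "X"))
```

The identity with partition() only holds when sep occurs in s; when it
does not, partition() returns '' as its middle element rather than sep.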


From tim.peters at gmail.com  Tue Aug 30 20:55:45 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 Aug 2005 14:55:45 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <op.swb3b8mv0gn541@theta>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com> <op.swb3b8mv0gn541@theta>
Message-ID: <1f7befae0508301155a7baca3@mail.gmail.com>

[Wolfgang Lipp]
> reminds me of dict.get()... i think in both cases being explicit::
>
>     beast = d.setdefault( 666, None )
>     beast = d.get( 666, None )
>
> just reads better, all the more since at least in my code what comes
> next is invariably a test 'if beast is None:...'. so
>
>     beast = d.setdefault( 666 )
>     if beast is None:
>         ...

Do you actually do this with setdefault()?  It's not at all the same
as the get() example next, because d.setdefault(666) may _also_ have
the side effect of permanently adding a 666->None mapping to d. 
d.get(...) never mutates d.
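
A quick illustration of the difference:

```python
d = {}

beast = d.get(666)          # returns None and leaves d alone
after_get = dict(d)         # snapshot: still empty

beast = d.setdefault(666)   # also returns None, but stores 666 -> None
after_setdefault = dict(d)

print(after_get, after_setdefault)
```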

> and
>
>     beast = d.get( 666 )
>     if beast is None:
>         ...
> 
> are shorter but a tad too implicit for my taste.

Nevertheless, 1-argument get() is used a lot.  Outside the test suite,
I've only found one use of 1-argument setdefault() so far, and it was
a poor use (used two lines of code to emulate what dict.pop() does
directly).

From fredrik at pythonware.com  Tue Aug 30 20:43:19 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 Aug 2005 20:43:19 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer><431414AB.4010005@cirad.fr><20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr><4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl><431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl><43147E62.2060106@hathawaymix.org>
	<43148035.7020007@cirad.fr><43148170.1020903@hathawaymix.org>
	<Pine.LNX.4.62.0508301716270.31737@qnzvnan.rov.np.hx>
Message-ID: <df29c0$2ue$1@sea.gmane.org>

Michael Hoffman wrote:

> Dare I ask whether the uncompiled versions should be considered for
> removal in Python 3.0?
>
> *puts on his asbestos jacket*

there are no uncompiled versions, so that's not a problem.

if you mean the function level api, it's there for convenience.  if you're
using less than 100 expressions in your program, you don't really have
to *explicitly* compile your expressions.  the function api will do that
for you all by itself.

</F> 


From fredrik at pythonware.com  Tue Aug 30 19:54:25 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 Aug 2005 19:54:25 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
References: <a1ddf57e05082920294af7e492@mail.gmail.com><000a01c5ad01$dd2e51a0$8832c797@oemcomputer><200508301209.19693.anthony@interlink.com.au><a1ddf57e05082920294af7e492@mail.gmail.com><a1ddf57e0508292033557965af@mail.gmail.com><5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com>
	<df13nq$jb$1@sea.gmane.org>
	<5.1.1.6.0.20050830112036.01b21aa8@mail.telecommunity.com>
Message-ID: <df26gb$lkd$1@sea.gmane.org>

Phillip J. Eby wrote:

>>both split on a given token.  partition splits once, and returns all three
>>parts, while piece returns the part you ask for
>
> No, because looking at that URL, there is no piece that is the token split
> on.  partition() always returns 3 parts for 1 occurrence of the token,
> whereas $PIECE only has 2.

so "absolutely nothing in common" has now turned into "does the
same thing but doesn't return the value you passed to it" ?

sorry for wasting my time.

</F> 


From fredrik at pythonware.com  Tue Aug 30 20:53:23 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 Aug 2005 20:53:23 +0200
Subject: [Python-Dev] setdefault's second argument
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer><20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr><4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr><f836662fc8ffa673250afeec9157b0d6@xs4all.nl><43147E62.2060106@hathawaymix.org>
	<43148035.7020007@cirad.fr><43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com>
Message-ID: <df29ut$5e4$1@sea.gmane.org>

Tim Peters wrote:
> Anyone remember why setdefault's second argument is optional?

Some kind of symmetry with get, probably.  if

    d.get(x)

returns None if x doesn't exist, it makes some kind of sense that

    d.setdefault(x)

returns None as well.

Anyone remember why nobody managed to come up with a better name
for setdefault (which is probably the worst name ever given to a method
in the standard Python distribution) ?

(if I were in charge, I'd rename it to something more informative.  I'd also
add a "join" built-in (similar to the good old string.join) and a "textfile"
built-in (similar to open("U") plus support for encodings).  but that's me.  I
want my code nice and tidy.)

</F> 


From fredrik at pythonware.com  Tue Aug 30 21:15:19 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 Aug 2005 21:15:19 +0200
Subject: [Python-Dev] Proof of the pudding:  str.partition()
References: <001901c5ac5d$5985bcc0$eb29c797@oemcomputer>
Message-ID: <df2b80$asi$1@sea.gmane.org>

Raymond Hettinger wrote:

> Overall, I found that partition() usefully encapsulated commonly
> occurring low-level programming patterns.  In most cases, it completely
> eliminated the need for slicing and indices.  In several cases, code was
> simplified dramatically; in some, the simplification was minor; and in a
> few cases, the complexity was about the same.  No cases were made worse.

it is, however, a bit worrying that you end up ignoring one or more
of the values in about 50% of your examples...

> !         rest, _, query = rest.rpartition('?')
> !         script, _, rest = rest.partition('/')
> !     _, sep, port = host.partition(':')
> !             head, sep, _ = path.rpartition('/')
> !                 line, _, _ = line.partition(';')  # strip chunk-extensions
> !             host, _, port = host.rpartition(':')
> !         head, _, tail = name.partition('.')
> !             head, _, tail = tail.partition('.')
> !         pname, found, _ = pname.rpartition('.')
> !         head, _, tail = name.partition('.')
> !             filename, _, arg = arg.rpartition(':')
> !             line, _, _ = line.partition('#')
> !             protocol, _, condition = meth.partition('_')
> !         filename, _, _ = filename.partition(chr(0))

this is also a bit worrying

> !         head, found, tail = seq.find('-')

but that's more a problem with the test suite.

</F> 


From rrr at ronadam.com  Tue Aug 30 21:37:18 2005
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 30 Aug 2005 15:37:18 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com>
References: <431492D5.6090102@ronadam.com>
	<5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com>
Message-ID: <4314B56E.6070609@ronadam.com>

Phillip J. Eby wrote:
> At 02:25 PM 8/30/2005 -0400, Raymond Hettinger wrote:
> 
>> That case should be handled with consecutive partitions:
>>
>> # keep everything after the second 'X'
>> head, found, s = s.partition('X')
>> head, found, s = s.partition('X')

I was thinking of cases where head is everything before the second 'X'.

A possible use case might be getting items in a comma-delimited string.


> Or:
> 
>      s=s.partition('X')[2].partition('X')[2]
> 
> which actually suggests a shorter, clearer way to do it:
> 
>      s = s.after('X').after('X')
> 
> And the corresponding 'before' method, of course, such that if sep in s:
> 
>      (s.before(sep), sep, s.after(sep)) == s.partition(sep)
> 
> Technically, these should probably be before_first and after_first, with 
> the corresponding before_last and after_last corresponding to rpartition.


Do you really think these are easier than:

     head, found, tail = s.partition('X',2)

I don't feel there is a need to avoid numbers entirely. In this case I
think it's the better way to find the n'th separator, and since it's an
optional value I feel it doesn't add a lot of complication.  Anyway...
It's just a suggestion.
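
To make the suggestion concrete, a sketch of what such an optional count
might do. partition_nth is a hypothetical stand-alone function; the
choice to count non-overlapping occurrences is an assumption:

```python
def partition_nth(s, sep, n=1):
    """Hypothetical: split s around the n-th occurrence of sep (n >= 1)."""
    if not sep:
        raise ValueError("empty separator")
    if n < 1:
        raise ValueError("n must be at least 1")
    start = 0
    for _ in range(n):
        pos = s.find(sep, start)
        if pos < 0:
            # Fewer than n separators: behave like a failed partition().
            return s, "", ""
        start = pos + len(sep)   # resume after this occurrence
    return s[:pos], sep, s[start:]

print(partition_nth("a,b,c,d", ",", 2))
```

With n left at its default this gives the same result as s.partition(sep).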

Cheers,
Ron


From paragate at gmx.net  Tue Aug 30 21:45:23 2005
From: paragate at gmx.net (Wolfgang Lipp)
Date: Tue, 30 Aug 2005 21:45:23 +0200
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <1f7befae0508301155a7baca3@mail.gmail.com>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com>
	<op.swb3b8mv0gn541@theta>
	<1f7befae0508301155a7baca3@mail.gmail.com>
Message-ID: <op.swb6hxsk0gn541@theta>

On Tue, 30 Aug 2005 20:55:45 +0200, Tim Peters <tim.peters at gmail.com>  
wrote:

> [Wolfgang Lipp]
>> reminds me of dict.get()... i think in both cases being explicit::
>>
>>     beast = d.setdefault( 666, None )
>>         ...
>
> Do you actually do this with setdefault()?

well, actually more like::

     def f( x ): return x % 3
     R = {}
     for x in range( 30 ):
         R.setdefault( f( x ), [] ).append( x )

still contrived, but you get the idea. i was really excited when finding out
that d.pop, d.get and d.setdefault work in very much the same way with respect
to the default argument, and my code has greatly benefitted from that. e.g.

     def f( **Q ):
         myoption = Q.pop( 'myoption', 42 )
         if Q:
             raise TypeError(...)

_w


From pje at telecommunity.com  Tue Aug 30 21:55:52 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 30 Aug 2005 15:55:52 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <df26gb$lkd$1@sea.gmane.org>
References: <a1ddf57e05082920294af7e492@mail.gmail.com>
	<000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
	<a1ddf57e05082920294af7e492@mail.gmail.com>
	<a1ddf57e0508292033557965af@mail.gmail.com>
	<5.1.1.6.0.20050829235726.029224c0@mail.telecommunity.com>
	<df13nq$jb$1@sea.gmane.org>
	<5.1.1.6.0.20050830112036.01b21aa8@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20050830154853.01b29a30@mail.telecommunity.com>

At 07:54 PM 8/30/2005 +0200, Fredrik Lundh wrote:
>Phillip J. Eby wrote:
>
> >>both split on a given token.  partition splits once, and returns all three
> >>parts, while piece returns the part you ask for
> >
> > No, because looking at that URL, there is no piece that is the token split
> > on.  partition() always returns 3 parts for 1 occurrence of the token,
> > whereas $PIECE only has 2.
>
>so "absolutely nothing in common" has now turned into "does the
>same thing but doesn't return the value you passed to it" ?

$PIECE returns exactly one value.  partition returns exactly 3.  partition 
always returns the separator as one of the three values.  $PIECE never 
does.  How many more differences does it have to have before you consider 
them to be nothing alike?


>sorry for wasting my time.

And sorry for you being either illiterate or humor-impaired, to have missed 
the smiley on the sentence that said "absolutely nothing in common except 
having string arguments".  You quoted it in your first reply, so it's not 
like it didn't make it into your email client.


From barry at python.org  Tue Aug 30 22:18:38 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Aug 2005 16:18:38 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <df29ut$5e4$1@sea.gmane.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr>
	<4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr><f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com>
	<df29ut$5e4$1@sea.gmane.org>
Message-ID: <1125433118.10961.11.camel@geddy.wooz.org>

On Tue, 2005-08-30 at 14:53, Fredrik Lundh wrote:

> Some kind of symmetry with get, probably.  if
> 
>     d.get(x)
> 
> returns None if x doesn't exist, it makes some kind of sense that
> 
>     d.setdefault(x)

I think that's right, and IIRC the specific detail about the optional
second argument was probably hashed out in private Pythonlabs email, or
over a tasty lunch of kung pao chicken.  I don't have access to my
private archives at the moment, though the public record seems to start
about here:

http://mail.python.org/pipermail/python-dev/2000-August/007819.html

> Anyone remember why nobody managed to come up with a better name
> for setdefault (which is probably the worst name ever given to a method
> in the standard Python distribution) ?

Heh.

http://mail.python.org/pipermail/python-dev/2000-August/008059.html

> (if I were in charge, I'd rename it to something more informative.

Maybe like getorset() <wink>.

Oh, and yeah, I don't care if we change .setdefault() to require its
second argument -- I've never used it without one.  But don't remove the
method, it's quite handy.

-Barry

From tim.peters at gmail.com  Tue Aug 30 22:27:44 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 Aug 2005 16:27:44 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <df29ut$5e4$1@sea.gmane.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com>
	<df29ut$5e4$1@sea.gmane.org>
Message-ID: <1f7befae05083013274c1b7521@mail.gmail.com>

[Fredrik Lundh]
> ...
> Anyone remember why nobody managed to come up with a better name
> for setdefault (which is probably the worst name ever given to a method
> in the standard Python distribution) ?

I suggested a perfect name at the time:

    http://mail.python.org/pipermail/python-dev/2000-August/008036.html

To save you from following that link, to this day I still mentally
translate "setdefault" to "getorset" whenever I see it.  That it
didn't get that name is probably Skip's fault, for whining that
"getorsetandget" would be "more accurate" <wink>.  Actually, there's
no evidence that Guido noticed:

    http://mail.python.org/pipermail/python-dev/2000-August/008059.html

> (if I were in charge, I'd rename it to something more informative.  I'd also
> add a "join" built-in (similar to the good old string.join) and a "textfile"
> built-in (similar to open("U") plus support for encodings).  but that's me.  I
> want my code nice and tidy.)

I'm not sure who is in charge, but I am sure they can be bribed ;-)

From raymond.hettinger at verizon.net  Tue Aug 30 22:27:48 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Tue, 30 Aug 2005 16:27:48 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <df2b80$asi$1@sea.gmane.org>
Message-ID: <005301c5ada1$4a52afc0$8832c797@oemcomputer>

[Fredrik Lundh]
> it is, however, a bit worrying that you end up ignoring one or more
> of the values in about 50% of your examples...
 
It drops to about 25% when you skip the ones that don't care about the
found/not-found field:


> > !     _, sep, port = host.partition(':')
> > !             head, sep, _ = path.rpartition('/')
> > !                 line, _, _ = line.partition(';')  # strip
> > !         pname, found, _ = pname.rpartition('.')
> > !             line, _, _ = line.partition('#')
> > !         filename, _, _ = filename.partition(chr(0))

The remaining cases don't bug me much.  They clearly say, ignore the
left piece or ignore the right piece.  We could, of course, make these
clearer and more efficient by introducing more methods:
   
   s.before(sep)  --> (left, sep)
   s.after(sep)   --> (right, sep)
   s.rbefore(sep) --> (left, sep)
   s.r_after(sep) --> (right, sep)

But who wants all of that?


Raymond


From fredrik at pythonware.com  Tue Aug 30 22:46:45 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 Aug 2005 22:46:45 +0200
Subject: [Python-Dev] setdefault's second argument
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer><4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr><f836662fc8ffa673250afeec9157b0d6@xs4all.nl><43147E62.2060106@hathawaymix.org>
	<43148035.7020007@cirad.fr><43148170.1020903@hathawaymix.org><1f7befae05083009146a9c35ce@mail.gmail.com><df29ut$5e4$1@sea.gmane.org>
	<1f7befae05083013274c1b7521@mail.gmail.com>
Message-ID: <df2gjf$tge$1@sea.gmane.org>

Tim Peters wrote:

>> Anyone remember why nobody managed to come up with a better name
>> for setdefault (which is probably the worst name ever given to a method
>> in the standard Python distribution) ?
>
> I suggested a perfect name at the time:
>
>    http://mail.python.org/pipermail/python-dev/2000-August/008036.html
>
> To save you from following that link, to this day I still mentally
> translate "setdefault" to "getorset" whenever I see it.

from this day, I'll do that as well.

I have to admit that I had to follow that link anyway, just to make sure
I wasn't involved in the decision at that time (which I wasn't, from what
I can tell).

But I stumbled upon this little naming protocol

    Protocol: if you have a suggestion for a name for this function, mail
    it to me.  DON'T MAIL THE LIST.  (If you mail it to the list, that
    name is disqualified.)  Don't explain me why the name is good -- if
    it's good, I'll know, if it needs an explanation, it's not good.

which I thought was most excellent, and something that we might PEP:ify
for future use, until I realized that it gave us the "worst name ever"...
oh well.

</F> 


From benji at benjiyork.com  Tue Aug 30 23:06:06 2005
From: benji at benjiyork.com (Benji York)
Date: Tue, 30 Aug 2005 17:06:06 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <005301c5ada1$4a52afc0$8832c797@oemcomputer>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer>
Message-ID: <4314CA3E.3020606@benjiyork.com>

Raymond Hettinger wrote:
> [Fredrik Lundh]
> 
>>it is, however, a bit worrying that you end up ignoring one or more
>>of the values in about 50% of your examples...
> 
> It drops to about 25% when you skip the ones that don't care about the
> found/not-found field:
> 
>>>!     _, sep, port = host.partition(':')
>>>!             head, sep, _ = path.rpartition('/')
>>>!                 line, _, _ = line.partition(';')  # strip
>>>!         pname, found, _ = pname.rpartition('.')
>>>!             line, _, _ = line.partition('#')
>>>!         filename, _, _ = filename.partition(chr(0))

I know it's been discussed in the past, but this makes me wonder about 
language support for "dummy" or "don't care" placeholders for tuple 
unpacking.  Would the above cases benefit from that, or (as has been 
suggested in the past) should slicing be used instead?

Original:
      _, sep, port = host.partition(':')
      head, sep, _ = path.rpartition('/')
      line, _, _ = line.partition(';')
      pname, found, _ = pname.rpartition('.')
      line, _, _ = line.partition('#')

Slicing:
      sep, port = host.partition(':')[1:]
      head, sep = path.rpartition('/')[:2]
      line = line.partition(';')[0]
      pname, found = pname.rpartition('.')[:2]
      line = line.partition('#')[0]

I think I like the original better, but can't use "_" in my code because 
it's used for internationalization.
--
Benji York


From barry at python.org  Tue Aug 30 23:05:52 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Aug 2005 17:05:52 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <df2gjf$tge$1@sea.gmane.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<4314542C.7080000@gmail.com><65c606d6ef54240378726f4e4ad91f3d@xs4all.nl>
	<431474D0.70300@cirad.fr><f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<43147E62.2060106@hathawaymix.org> <43148035.7020007@cirad.fr>
	<43148170.1020903@hathawaymix.org>
	<1f7befae05083009146a9c35ce@mail.gmail.com><df29ut$5e4$1@sea.gmane.org>
	<1f7befae05083013274c1b7521@mail.gmail.com>
	<df2gjf$tge$1@sea.gmane.org>
Message-ID: <1125435952.10961.13.camel@geddy.wooz.org>

On Tue, 2005-08-30 at 16:46, Fredrik Lundh wrote:
> But I stumbled upon this little naming protocol
> 
>     Protocol: if you have a suggestion for a name for this function, mail
>     it to me.  DON'T MAIL THE LIST.  (If you mail it to the list, that
>     name is disqualified.)  Don't explain me why the name is good -- if
>     it's good, I'll know, if it needs an explanation, it's not good.
> 
> which I thought was most excellent, and something that we might PEP:ify
> for future use, until I realized that it gave us the "worst name ever"... 

/And/ the rule was self-admittedly broken by Guido not a few posts after
that one. ;)

-Barry

From mcherm at mcherm.com  Tue Aug 30 23:35:42 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue, 30 Aug 2005 14:35:42 -0700
Subject: [Python-Dev] Revising RE docs (was: partition() (was: Remove
	str.find in 3.0?))
Message-ID: <20050830143542.niq7a9s8bsrkc8ok@login.werra.lunarpages.com>

Barry Warsaw writes:
> Although it's mildly annoying that the docs describe the compiled method
> names in terms of the uncompiled functions.  I always find myself
> looking up the regexp object's API only to be shuffled off to the
> module's API and then having to do the argument remapping myself.

An excellent point. Obviously, EITHER (1) the module functions ought to
be documented by reference to the RE object methods, or vice versa:
(2) document the RE object methods by reference to the module functions.

(2) is what we have today, but I would prefer (1) to gently encourage
people to use the precompiled objects (which are distinctly faster when
re-used).
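
To make the "argument remapping" concrete, here is a minimal sketch of the
same search written both ways.  The pattern and the sample text are made up
for illustration; only re.search(), re.compile() and the compiled object's
search() method are taken from the re module itself.

```python
import re

text = "host: example.org\nport: 8080\n"   # made-up sample input

# Module-level function: pattern and flags are passed on every call.
m1 = re.search(r"^port:\s*(\d+)$", text, re.MULTILINE)

# Precompiled object: pattern and flags are given once to re.compile();
# the method then takes only the string (plus optional pos/endpos).
port_re = re.compile(r"^port:\s*(\d+)$", re.MULTILINE)
m2 = port_re.search(text)

assert m1.group(1) == m2.group(1) == "8080"
```

The compiled form drops the pattern and flags arguments from each call,
which is the remapping a reader currently has to do by hand.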

Does anyone else think we ought to swap that around in the documentation?
I'm not trying to assign more work to Fred... but if there were a
python-dev consensus that this would be desirable, then perhaps someone
would be encouraged to supply a patch.

-- Michael Chermside


From fdrake at acm.org  Tue Aug 30 23:41:28 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 30 Aug 2005 17:41:28 -0400
Subject: [Python-Dev] Revising RE docs (was: partition() (was: Remove
	str.find in 3.0?))
In-Reply-To: <20050830143542.niq7a9s8bsrkc8ok@login.werra.lunarpages.com>
References: <20050830143542.niq7a9s8bsrkc8ok@login.werra.lunarpages.com>
Message-ID: <200508301741.28656.fdrake@acm.org>

On Tuesday 30 August 2005 17:35, Michael Chermside wrote:
 > An excellent point. Obviously, EITHER (1) the module functions ought to
 > be documented by reference to the RE object methods, or vice versa:
 > (2) document the RE object methods by reference to the module functions.

Agreed.  I think the current arrangement is primarily a historical accident 
more than anything else, but I didn't write that section, so could be wrong.

 > Does anyone else think we ought to swap that around in the documentation?
 > I'm not trying to assign more work to Fred... but if there were a
 > python-dev consensus that this would be desirable, then perhaps someone
 > would be encouraged to supply a patch.

I'd rather see it reversed from what it is as well.  While I don't have the 
time myself (and don't consider it a critical issue), I certainly won't 
revert a patch to make the change without good reason.  :-)


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From goodger at python.org  Tue Aug 30 22:24:44 2005
From: goodger at python.org (David Goodger)
Date: Tue, 30 Aug 2005 16:24:44 -0400
Subject: [Python-Dev] setdefault's second argument
In-Reply-To: <1f7befae05083009565974978c@mail.gmail.com>
References: <1f7befae05083009146a9c35ce@mail.gmail.com>
	<003301c5ad80$c72c1020$8832c797@oemcomputer>
	<1f7befae05083009565974978c@mail.gmail.com>
Message-ID: <4314C08C.6060302@python.org>

[Tim Peters]
> Dang!  I may have just found a use, in Zope's
> lib/python/docutils/parsers/rst/directives/images.py (which is part
> of docutils, not really part of Zope):
>
>     figwidth = options.setdefault('figwidth')
>     figclass = options.setdefault('figclass')
>     del options['figwidth']
>     del options['figclass']

If a feature is available, it *will* eventually be used!
Whose law is that?

> I'm still thinking about what that's trying to do <0.5 wink>.

The code needs to store the values of certain dict entries, then
delete them.  This is because the "options" dict is passed on to
another function, where those entries are not welcome.  The code above
is simply shorter than this:

    if options.has_key('figwidth'):
        figwidth = options['figwidth']
        del options['figwidth']
    # again for 'figclass'

Alternatively,

    try:
        figwidth = options['figwidth']
        del options['figwidth']
    except KeyError:
        pass

It saves between one line and three lines of code per entry.  But
since those entries are probably not so common, it would actually be
faster to use one of the above patterns.

> Assuming options is a dict-like thingie, it probably meant to do:
>
>     figwidth = options.pop('figwidth', None)
>     figclass = options.pop('figclass', None)

Yes, but the "pop" method was only added in Python 2.3.  Docutils
currently maintains compatibility with Python 2.1, so that's RIGHT
OUT!
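
For comparison, a small sketch of the three spellings side by side.  The
helper names and the sample dict are hypothetical, and the explicit version
uses "in" instead of has_key() and initialises the result to None, which
the snippets above do not do, so that it runs on current Pythons and always
returns a value.

```python
def take_setdefault(options, key):
    # The idiom from images.py: setdefault() inserts None when the key is
    # missing, so the following del can never raise KeyError.
    value = options.setdefault(key)
    del options[key]
    return value

def take_explicit(options, key):
    # Explicit test-then-delete; the result must be initialised first.
    value = None
    if key in options:
        value = options[key]
        del options[key]
    return value

def take_pop(options, key):
    # dict.pop() with a default (Python 2.3 and later).
    return options.pop(key, None)

for take in (take_setdefault, take_explicit, take_pop):
    opts = {'figwidth': 50, 'align': 'left'}
    assert take(opts, 'figwidth') == 50
    assert take(opts, 'figclass') is None
    assert opts == {'align': 'left'}
```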

> David, are you married to that bizarre use of setdefault <wink>?

No, not at all.  In fact, I will vehemently deny that I ever wrote
such code, and will continue to do so until someone looks up its
history and proves that I'm guilty, which I probably am.

--
David Goodger <http://python.net/~goodger>

From rrr at ronadam.com  Wed Aug 31 00:45:54 2005
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 30 Aug 2005 18:45:54 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <4314CA3E.3020606@benjiyork.com>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer>
	<4314CA3E.3020606@benjiyork.com>
Message-ID: <4314E1A2.4060409@ronadam.com>

Benji York wrote:

> Raymond Hettinger wrote:
> 
>>[Fredrik Lundh]
>>
>>
>>>it is, however, a bit worrying that you end up ignoring one or more
>>>of the values in about 50% of your examples...
>>
>>It drops to about 25% when you skip the ones that don't care about the
>>found/not-found field:
>>
>>
>>>>!     _, sep, port = host.partition(':')
>>>>!             head, sep, _ = path.rpartition('/')
>>>>!                 line, _, _ = line.partition(';')  # strip
>>>>!         pname, found, _ = pname.rpartition('.')
>>>>!             line, _, _ = line.partition('#')
>>>>!         filename, _, _ = filename.partition(chr(0))
> 
> 
> I know it's been discussed in the past, but this makes me wonder about 
> language support for "dummy" or "don't care" placeholders for tuple 
> unpacking.  Would the above cases benefit from that, or (as has been 
> suggested in the past) should slicing be used instead?
> 
> Original:
>       _, sep, port = host.partition(':')
>       head, sep, _ = path.rpartition('/')
>       line, _, _ = line.partition(';')
>       pname, found, _ = pname.rpartition('.')
>       line, _, _ = line.partition('#')
> 
> Slicing:
>       sep, port = host.partition(':')[1:]
>       head, sep = path.rpartition('/')[:2]
>       line = line.partition(';')[0]
>       pname, found = pname.rpartition('.')[:2]
>       line = line.partition('#')[0]
> 
> I think I like the original better, but can't use "_" in my code because 
> it's used for internationalization.
> --
> Benji York

For cases where single values are desired, attributes could work.

Slicing:
        line = line.partition(';').head
        line = line.partition('#').head

But it gets awkward as soon as you want more than one.

        sep, port = host.partition(':').head, host.partition(':').sep


Ron


From shane at hathawaymix.org  Wed Aug 31 01:00:43 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Tue, 30 Aug 2005 17:00:43 -0600
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <4314E1A2.4060409@ronadam.com>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer>	<4314CA3E.3020606@benjiyork.com>
	<4314E1A2.4060409@ronadam.com>
Message-ID: <4314E51B.1050507@hathawaymix.org>

Ron Adam wrote:
> For cases where single values are desired, attributes could work.
> 
> Slicing:
>         line = line.partition(';').head
>         line = line.partition('#').head
> 
> But it gets awkward as soon as you want more than one.
> 
>         sep, port = host.partition(':').head, host.partition(':').sep

You can do both: make partition() return a sequence with attributes, 
similar to os.stat().  However, I would call the attributes "before", 
"sep", and "after".

Shane

From tdelaney at avaya.com  Wed Aug 31 01:06:22 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 31 Aug 2005 09:06:22 +1000
Subject: [Python-Dev] Proof of the pudding:  str.partition()
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F05CCAE@au3010avexu1.global.avaya.com>

Shane Hathaway wrote:

> Ron Adam wrote:
>> For cases where single values are desired, attributes could work.
>> 
>> Slicing:
>>         line = line.partition(';').head
>>         line = line.partition('#').head
>> 
>> But it gets awkward as soon as you want more than one.
>> 
>>         sep, port = host.partition(':').head, host.partition(':').sep
> 
> You can do both: make partition() return a sequence with attributes,
> similar to os.stat().  However, I would call the attributes "before",
> "sep", and "after".

+0

I thought the same thing. I don't see a lot of use cases for it, but it
could be useful. I don't see how it could hurt.
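
A minimal sketch of what "a sequence with attributes" could look like.  It
assumes collections.namedtuple, which is only available in later Python
releases, and the Partition type and the wrapper function are hypothetical
names used purely for illustration.

```python
from collections import namedtuple

# Hypothetical result type; the field names follow the suggestion above.
Partition = namedtuple("Partition", ["before", "sep", "after"])

def partition(s, sep):
    """Like str.partition(), but the result also has named attributes."""
    return Partition(*s.partition(sep))

result = partition("localhost:8080", ":")
host, _, port = result                 # still unpacks like a 3-tuple
assert (host, port) == ("localhost", "8080")
assert result.before == "localhost"    # or pick out one field by name
assert result.after == "8080"
```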

Tim Delaney

From fredrik at pythonware.com  Wed Aug 31 01:05:20 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 31 Aug 2005 01:05:20 +0200
Subject: [Python-Dev] Proof of the pudding:  str.partition()
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer><4314CA3E.3020606@benjiyork.com>
	<4314E1A2.4060409@ronadam.com>
Message-ID: <df2ona$kmm$1@sea.gmane.org>

Ron Adam wrote:

> For cases where single values are desired, attributes could work.
>
> Slicing:
>        line = line.partition(';').head
>        line = line.partition('#').head
>
> But it gets awkward as soon as you want more than one.
>
>        sep, port = host.partition(':').head, host.partition(':').sep

unless you go for the piece approach

    host, port = host.piece(":", 1, 2)

(which, of course, is short for

    host, port = host.piece(":").group(1, 2)

)

and wait for Mr Eby to tell you that piece has nothing whatsoever
to do with string splitting.

</F> 


From tony at lownds.com  Wed Aug 31 03:09:39 2005
From: tony at lownds.com (tony@lownds.com)
Date: Tue, 30 Aug 2005 18:09:39 -0700 (PDT)
Subject: [Python-Dev] Proof of the pudding: str.partition()
Message-ID: <44572.67.127.59.200.1125450579.squirrel@lownds.com>

I once wrote a similar method called cleave(). My use case involved a
string-like class (Substr) whose instances could report their position in
the original string. The re module wasn't preserving my class, so I had
to provide a different API.

  def cleave(self, pattern, start=0):
    """return Substr until match, match, Substr after match

    If there is no match, return Substr, None, ''
    """

Here are some observations/questions on Raymond's partition() idea. First
of all, partition() is a much better name than cleave()!

Substr didn't copy, as partition() will have to; won't many uses of
partition() end up being O(N^2)?

One way to let the programmer avoid the copying would be to provide a
string method findspan().  findspan() would return the start and end of
the found position in the string.  start > end could signal no match;
and since 0-character strings are disallowed in partition, end == 0
could also signal no match.  partition() could be defined in terms of
findspan():

start, end = s.findspan(sep)
before, sep, after = s[:start], s[start:end], s[end:]

Just a quick thought,

-Tony


From greg.ewing at canterbury.ac.nz  Wed Aug 31 03:27:13 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Aug 2005 13:27:13 +1200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <slrndh7tcv.7gl.mozbugbox@mozbugbox.somehost.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<200508301209.19693.anthony@interlink.com.au>
	<slrndh7tcv.7gl.mozbugbox@mozbugbox.somehost.org>
Message-ID: <43150771.1000102@canterbury.ac.nz>

JustFillBug wrote:

> trisplit()

And then for when you need to record the result
somewhere, tricord(). :-)

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From jcarlson at uci.edu  Wed Aug 31 03:35:37 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 30 Aug 2005 18:35:37 -0700
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: <44572.67.127.59.200.1125450579.squirrel@lownds.com>
References: <44572.67.127.59.200.1125450579.squirrel@lownds.com>
Message-ID: <20050830182304.8B11.JCARLSON@uci.edu>


tony at lownds.com wrote:
> Substr didn't copy, as partition() will have to; won't many uses of
> partition() end up being O(N^2)?

Yes.  But if you look at most cases provided for in the standard library,
that isn't an issue.  In the case where it becomes an issue, it is
generally because a user wants to do repeated splitting on the same
token...which is better suited for str.split or re.split.


> One way to let the programmer avoid the copying would be to provide a
> string method findspan().  findspan() would return the start and end of
> the found position in the string.  start > end could signal no match;
> and since 0-character strings are disallowed in partition, end == 0
> could also signal no match.  partition() could be defined in terms of
> findspan():

> start, end = s.findspan(sep)
> before, sep, after = s[:start], s[start:end], s[end:]

Actually no.  When str.partition() doesn't find the separator, you get s, '', ''.
Yours would produce '', '', s.  On not found, you would need to use
start==end==len(s).

Further, findspan could be defined in terms of find...

def findspan(s, sep):
    if len(sep) == 0:
        raise ValueError("null separator strings are not allowed")
    x = s.find(sep)
    if x >= 0:
        return x, x+len(sep)
    return len(s),len(s)
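
As a quick check of the not-found convention, here is a sketch that builds
partition() on top of findspan() and compares it with the built-in.  It
assumes str.partition() as it exists in later Python releases; the name
partition_via_findspan is made up for the example.

```python
def findspan(s, sep):
    # Same logic as the function above.
    if len(sep) == 0:
        raise ValueError("null separator strings are not allowed")
    x = s.find(sep)
    if x >= 0:
        return x, x + len(sep)
    return len(s), len(s)

def partition_via_findspan(s, sep):
    # Three slices around the span; on "not found" the span is empty and
    # sits at the end of s, so the whole string lands in the first slot.
    start, end = findspan(s, sep)
    return s[:start], s[start:end], s[end:]

for s in ("a/b/c", "no separator here", "/leading", "trailing/", ""):
    assert partition_via_findspan(s, "/") == s.partition("/")
```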

Conceptually they are all the same.  The trick with partition is that in
the vast majority of use cases, one wants 2 or 3 of the resulting
strings, and constructing the strings in the C-level code is far faster
than manually slicing (which can be error prone).  I will say the same
thing that I've said at least three times already (with a bit of an
expansion):

    IF YOU ARE GOING TO PROPOSE AN ALTERNATIVE, SHOW SOME
    COMPARATIVE CODE SAMPLES WHERE YOUR PROPOSAL DEFINITELY
    WINS OVER BOTH str.find AND str.partition.  IF YOU CAN'T
    PROVIDE SUCH SAMPLES, THEN YOUR PROPOSAL ISN'T BETTER,
    AND YOU PROBABLY SHOULDN'T PROPOSE IT.  bonus points for
    those who take the time to compare all of those that
    Raymond provided.

 - Josiah


From greg.ewing at canterbury.ac.nz  Wed Aug 31 03:43:59 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Aug 2005 13:43:59 +1200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <431455EC.6050402@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F0742BC@au3010avexu1.global.avaya.com>
	<431455EC.6050402@gmail.com>
Message-ID: <43150B5F.5070102@canterbury.ac.nz>

Nick Coghlan wrote:

> Another option would be simply "str.part()" and "str.rpart()". Then you could 
> think of it as an abbreviation of either 'partition' or 'parts' depending on 
> your inclination.

Or simply as the verb 'part', which also makes sense!

Also it's short and snappy, whereas 'partition' seems
rather too long-winded for such a useful little function.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From tony at lownds.com  Wed Aug 31 03:58:00 2005
From: tony at lownds.com (tony@lownds.com)
Date: Tue, 30 Aug 2005 18:58:00 -0700 (PDT)
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: <20050830182304.8B11.JCARLSON@uci.edu>
References: <44572.67.127.59.200.1125450579.squirrel@lownds.com>
	<20050830182304.8B11.JCARLSON@uci.edu>
Message-ID: <44836.67.127.59.200.1125453480.squirrel@lownds.com>

> Actually no.  When str.partition() doesn't find the separator, you get s,
> '', ''.
> Yours would produce '', '', s.  On not found, you would need to use
> start==end==len(s).
>

You're right. Never mind, then.


> I will say the same
> thing that I've said at least three times already (with a bit of an
> expansion):
>

Thanks for the re-re-emphasis.

-Tony


From adurdin at gmail.com  Wed Aug 31 04:23:08 2005
From: adurdin at gmail.com (Andrew Durdin)
Date: Wed, 31 Aug 2005 12:23:08 +1000
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <004701c5ad90$f0faeec0$8832c797@oemcomputer>
References: <4f0b69dc0508301124423da48e@mail.gmail.com>
	<004701c5ad90$f0faeec0$8832c797@oemcomputer>
Message-ID: <59e9fd3a050830192344ffeafd@mail.gmail.com>

On 8/31/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> [Hye-Shik Chang]
> > What would be a result for rpartition(s, '?') ?
> > ('', '', 'http://www.python.org')
> > or
> > ('http://www.python.org', '', '')
> 
> The former.  The invariants for rpartition() are a mirror image of those
> for partition().

Just to put my spoke in the wheel, I find the difference in the
ordering of return values for partition() and rpartition() confusing:

head, sep, remainder = partition(s)
remainder, sep, head = rpartition(s)

My first expectation for rpartition() was that it would return exactly
the same values as partition(), but just work from the end of the
string.

IOW, I expected "www.python.org".partition("python") to return exactly
the same as "www.python.org".rpartition("python")

To try out partition(), I wrote a quick version of split() using
partition, and using partition() was obvious and easy:

def mysplit(s, sep):
    l = []
    while s:
        part, _, s = s.partition(sep)
        l.append(part)
    return l

I tripped up when trying to make an rsplit() (I'm using Python 2.3),
because the return values were in "reverse order"; I had expected the
only change to be using rpartition() instead of partition().
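
A sketch of the rsplit() counterpart, assuming rpartition() returns
(before, sep, after) in left-to-right order as str.rpartition() does in
later Python releases.  Like mysplit() above, it is only a quick
approximation and does not reproduce every edge case of the built-in.

```python
def myrsplit(s, sep):
    parts = []
    while s:
        # The remainder still to be processed is the *first* item here,
        # and the piece just split off is the *last* one.
        s, _, part = s.rpartition(sep)
        parts.append(part)
    parts.reverse()          # pieces were collected right to left
    return parts

assert myrsplit("a/b/c", "/") == ["a", "b", "c"]
assert myrsplit("www.python.org", ".") == ["www", "python", "org"]
```

So the change from mysplit() is not only the method name: the positions of
the loop variable and the collected piece swap as well.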

For a second example: one of the "fixed stdlib" examples that Raymond
posted actually uses rpartition and partition in two consecutive lines
-- I found this example not immediately obvious for the above reason:

      def run_cgi(self):
         """Execute a CGI script."""
         dir, rest = self.cgi_info
         rest, _, query = rest.rpartition('?')
         script, _, rest = rest.partition('/')
         scriptname = dir + '/' + script
         scriptfile = self.translate_path(scriptname)
         if not os.path.exists(scriptfile):

Anyway, I'm definitely +1 on partition(), but -1 on rpartition()
returning in "reverse order".

Andrew

From tdelaney at avaya.com  Wed Aug 31 04:27:57 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 31 Aug 2005 12:27:57 +1000
Subject: [Python-Dev] Remove str.find in 3.0?
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com>

Andrew Durdin wrote:

> Just to put my spoke in the wheel, I find the difference in the
> ordering of return values for partition() and rpartition() confusing:
> 
> head, sep, remainder = partition(s)
> remainder, sep, head = rpartition(s)

This is the confusion - you've got the terminology wrong.

before, sep, after = s.partition('?')
('http://www.python.org', '', '')

before, sep, after = s.rpartition('?')
('', '', 'http://www.python.org')

Tim Delaney

From adurdin at gmail.com  Wed Aug 31 05:23:25 2005
From: adurdin at gmail.com (Andrew Durdin)
Date: Wed, 31 Aug 2005 13:23:25 +1000
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com>
Message-ID: <59e9fd3a050830202334c69649@mail.gmail.com>

On 8/31/05, Delaney, Timothy (Tim) <tdelaney at avaya.com> wrote:
> Andrew Durdin wrote:
> 
> > Just to put my spoke in the wheel, I find the difference in the
> > ordering of return values for partition() and rpartition() confusing:
> >
> > head, sep, remainder = partition(s)
> > remainder, sep, head = rpartition(s)
> 
> This is the confusion - you've got the terminology wrong.
> 
> before, sep, after = s.partition('?')
> ('http://www.python.org', '', '')
> 
> before, sep, after = s.rpartition('?')
> ('', '', 'http://www.python.org')

That's still confusing (to me), though -- when the string is being
processed, what comes before the separator is the stuff at the end of
the string, and what comes after is the bit at the beginning of the
string.  It's not the terminology that's confusing me, though I find
it hard to describe exactly what is. Maybe it's just me -- does anyone
else have the same confusion?

From guido at python.org  Wed Aug 31 05:27:40 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 Aug 2005 20:27:40 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <59e9fd3a050830202334c69649@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com>
	<59e9fd3a050830202334c69649@mail.gmail.com>
Message-ID: <ca471dc20508302027314a7998@mail.gmail.com>

On 8/30/05, Andrew Durdin <adurdin at gmail.com> wrote:
> On 8/31/05, Delaney, Timothy (Tim) <tdelaney at avaya.com> wrote:
> > Andrew Durdin wrote:
> >
> > > Just to put my spoke in the wheel, I find the difference in the
> > > ordering of return values for partition() and rpartition() confusing:
> > >
> > > head, sep, remainder = partition(s)
> > > remainder, sep, head = rpartition(s)
> >
> > This is the confusion - you've got the terminology wrong.
> >
> > before, sep, after = s.partition('?')
> > ('http://www.python.org', '', '')
> >
> > before, sep, after = s.rpartition('?')
> > ('', '', 'http://www.python.org')
> 
> That's still confusing (to me), though -- when the string is being
> processed, what comes before the separator is the stuff at the end of
> the string, and what comes after is the bit at the beginning of the
> string.  It's not the terminology that's confusing me, though I find
> it hard to describe exactly what is. Maybe it's just me -- does anyone
> else have the same confusion?

Hm. The example is poorly chosen because it's an end case. The
invariant for both is (I'd hope!)

  "".join(s.partition()) == s == "".join(s.rpartition())

Thus,

  "a/b/c".partition("/") returns ("a", "/", "b/c")

  "a/b/c".rpartition("/") returns ("a/b", "/", "c")

That can't be confusing can it?

(Just think of it as rpartition() stopping at the last occurrence,
rather than searching from the right. :-)
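
The invariant and both examples can be checked directly.  This is a small
sketch that assumes the methods behave as str.partition() and
str.rpartition() do in later Python releases; the check() helper is just
for illustration.

```python
def check(s, sep):
    # Both methods return three strings that concatenate back to s.
    assert "".join(s.partition(sep)) == s == "".join(s.rpartition(sep))

for s in ("a/b/c", "abc", "", "/", "a//b"):
    check(s, "/")

assert "a/b/c".partition("/") == ("a", "/", "b/c")
assert "a/b/c".rpartition("/") == ("a/b", "/", "c")
# Separator not found: the whole string ends up in the first slot for
# partition() and in the last slot for rpartition().
assert "abc".partition("/") == ("abc", "", "")
assert "abc".rpartition("/") == ("", "", "abc")
```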

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From adurdin at gmail.com  Wed Aug 31 05:44:16 2005
From: adurdin at gmail.com (Andrew Durdin)
Date: Wed, 31 Aug 2005 13:44:16 +1000
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <ca471dc20508302027314a7998@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com>
	<59e9fd3a050830202334c69649@mail.gmail.com>
	<ca471dc20508302027314a7998@mail.gmail.com>
Message-ID: <59e9fd3a05083020444caecff4@mail.gmail.com>

On 8/31/05, Guido van Rossum <guido at python.org> wrote:
> 
> Hm. The example is poorly chosen because it's an end case. The
> invariant for both is (I'd hope!)
> 
>   "".join(s.partition()) == s == "".join(s.rpartition())

<snip>
 
> (Just think of it as rpartition() stopping at the last occurrence,
> rather than searching from the right. :-)

Ah, that makes a difference.  I could see that there was a different
way of looking at the function, I just couldn't see what it was... 
Now I understand the way it's been done.

Cheers,

Andrew.

From pje at telecommunity.com  Wed Aug 31 05:49:15 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 30 Aug 2005 23:49:15 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <df2ona$kmm$1@sea.gmane.org>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer>
	<4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com>
Message-ID: <5.1.1.6.0.20050830233356.01b34118@mail.telecommunity.com>

At 01:05 AM 8/31/2005 +0200, Fredrik Lundh wrote:
>Ron Adam wrote:
>
> > For cases where single values are desired, attributes could work.
> >
> > Slicing:
> >        line = line.partition(';').head
> >        line = line.partition('#').head
> >
> > But it gets awkward as soon as you want more than one.
> >
> >        sep, port = host.partition(':').head, host.partition(':').sep
>
>unless you go for the piece approach
>
>     host, port = host.piece(":", 1, 2)
>
>(which, of course, is short for
>
>     host, port = host.piece(":").group(1, 2)
>
>)
>
>and wait for Mr Eby to tell you that piece has nothing whatsoever
>to do with string splitting.

No, just to point out that you can make up whatever semantics you want, but 
the semantics you show above are *not* the same as what are shown at the 
page the person who posted about $PIECE cited, and on whose content I based 
my reply:

     http://www.jacquardsystems.com/Examples/function/piece.htm

If you were following those semantics, then the code you presented above is 
buggy, as host.piece(':',1,2) would return the original string!

Of course, since I know nothing of MUMPS besides what's on that page, it's 
entirely possible I've misinterpreted that page in some hideously subtle 
way -- as I pointed out in my original post regarding $PIECE.  I like to 
remind myself and others of the possibility that I *could* be wrong, even 
when I'm *certain* I'm right, because it helps keep me from appearing any 
more arrogant than I already do, and it also helps to keep me from looking 
too stupid in those cases where I turn out to be wrong.  Perhaps you might 
find that approach useful as well.

In any case, to avoid confusion, you should probably specify the semantics 
of your piece() proposal in Python terms, so that those of us who don't 
know MUMPS have some possibility of grasping the inner mysteries of your 
proposal.


From tjreedy at udel.edu  Wed Aug 31 05:58:41 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 30 Aug 2005 23:58:41 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
References: <dens12$5kg$1@sea.gmane.org>
	<20050826134317.7DFD.JCARLSON@uci.edu><deoed1$hce$1@sea.gmane.org>
	<43100E14.6080009@v.loewis.de>
Message-ID: <df39th$nue$1@sea.gmane.org>


"Martin v. Löwis" <martin at v.loewis.de> wrote in message
news:43100E14.6080009 at v.loewis.de...
> Terry Reedy wrote:
>> One (1a) is to give an inband signal that is like a normal
>> response except that it is not (str.find returning -1).
>>
>> Python as distributed usually chooses 1b or 2.
>>  I believe str.find and
>> .rfind are unique in the choice of 1a.
>
> That is not true. str.find's choice is not 1a,

It is the paradigm example of 1a as I meant my definition.

> -1 does *not* look like a normal response,
> since a normal response is non-negative.

Actually, the current doc does not clearly specify to some people that the
response is a count.  That is what led to the 'str.find is buggy' thread
on c.l.p, and something I will clarify when I propose a doc patch.

In any case, Python does not have a count type, though I sometimes wish it
did.  The return type is int and -1 is int, though it is not meant to be
used as an int and it is a bug to do so.

>It is *not* the only method with choice 1a):
> dict.get returns None if the key is not found,

None is only the default default, and whatever the default is, it is not 
necessarily an error return.  A dict accessed via .get can be regarded as 
an infinite association matching all but a finite set of keys with the 
default.  Example: a doubly infinite array of numbers with only a finite 
number of non-zero entries, implemented as a dict.  This is the view 
actually used if one does normal calculations with that default return. 
There is no need, at least for that access method, for any key to be 
explicitly associated with the default.

If the default *is* regarded as an error indicator, and is only used to 
guard normal processing of the value returned, then that default must not 
be associated with any key.  There is the problem that the domain of dict
values is normally considered to be any Python object and functions can 
only return Python objects and not any non-Python-object error return.
So the effective value domain for the particular dict must be the set 
'Python objects' minus the error indicator.  With discipline, None often 
works.  Or, to guarantee 1b-ness, one can create a new type that cannot be 
in the dict.
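
A minimal sketch of that last option, with hypothetical names (_Missing,
_MISSING, describe): a private marker type whose single instance is never
stored as a value, so it can be told apart from every legitimate entry,
including None.

```python
class _Missing(object):
    """Private marker type; no instance is ever stored in the dict."""

_MISSING = _Missing()

def describe(d, key):
    value = d.get(key, _MISSING)
    if value is _MISSING:
        return "no entry"
    return "entry: %r" % (value,)

d = {"a": None}
assert describe(d, "a") == "entry: None"
assert describe(d, "b") == "no entry"
# With the default default, the two cases cannot be distinguished:
assert d.get("b") is d.get("a") is None
```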

> For another example, file.read() returns an empty string at EOF.

If the request is 'give me the rest of the file as a string', then '' is 
the answer, not a 'cannot answer' indicator.  Similarly, if the request is 
'how many bytes are left to read', then zero is a numerical answer, not a 
non-numerical 'cannot answer' indicator.

Terry J. Reedy


From tjreedy at udel.edu  Wed Aug 31 06:08:25 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 31 Aug 2005 00:08:25 -0400
Subject: [Python-Dev] Remove str.find in 3.0?
References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com>
Message-ID: <df3afp$p51$1@sea.gmane.org>


"Delaney, Timothy (Tim)" <tdelaney at avaya.com> wrote in message
> before, sep, after = s.partition('?')
> ('http://www.python.org', '', '')
>
> before, sep, after = s.rpartition('?')
> ('', '', 'http://www.python.org')

I can also see this as left, sep, right, with the sep not found case 
putting all in left or right depending on whether one scanned to the right 
or left.  In other words, when the scanner runs out of chars to scan, 
everything is 'behind' the scan, where 'behind' depends on the direction of 
scanning.  That seems nicely symmetric.

Terry J. Reedy


From tjreedy at udel.edu  Wed Aug 31 06:13:40 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 31 Aug 2005 00:13:40 -0400
Subject: [Python-Dev] Revising RE docs (was: partition() (was:
	Remove str.find in 3.0?))
References: <20050830143542.niq7a9s8bsrkc8ok@login.werra.lunarpages.com>
	<200508301741.28656.fdrake@acm.org>
Message-ID: <df3apl$pmf$1@sea.gmane.org>


"Fred L. Drake, Jr." <fdrake at acm.org> wrote in message 
news:200508301741.28656.fdrake at acm.org...
> I'd rather see it reversed from what it is as well.  While I don't have 
> the
> time myself (and don't consider it a critical issue), I certainly won't
> revert a patch to make the change without good reason.  :-)

Do you mean 'not reject' rather than 'not revert'?

Terry J. Reedy


From rrr at ronadam.com  Wed Aug 31 06:27:23 2005
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 31 Aug 2005 00:27:23 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <df2ona$kmm$1@sea.gmane.org>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer><4314CA3E.3020606@benjiyork.com>	<4314E1A2.4060409@ronadam.com>
	<df2ona$kmm$1@sea.gmane.org>
Message-ID: <431531AB.4080305@ronadam.com>

Fredrik Lundh wrote:

> Ron Adam wrote:
> 
> 
>>For cases where single values are desired, attributes could work.
>>
>>Slicing:
>>       line = line.partition(';').head
>>       line = line.partition('#').head
>>
>>But it gets awkward as soon as you want more than one.
>>
>>       sep, port = host.partition(':').head, host.partition(':').sep
> 
> 
> unless you go for the piece approach
> 
>     host, port = host.piece(":", 1, 2)
> 
> (which, of course, is short for
> 
>     host, port = host.piece(":").group(1, 2)
> 
> )

I'm not familiar with piece, but it occurred to me it might be useful to 
get groups of attributes in some way.  My first (passing) thought was to do...

      host, port = host.partition(':').(head, sep)

Where that would be short for calling a method to return them:

      host, port = host.partition(':').getattribs('head','sep')

But with only three items, the '_' is in the category of "Looks kind of 
strange, but I can get used to it because it works well.".

Cheers,
Ron


From steve at holdenweb.com  Wed Aug 31 06:50:12 2005
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 30 Aug 2005 23:50:12 -0500
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <ca471dc20508302027314a7998@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB590@au3010avexu1.global.avaya.com>	<59e9fd3a050830202334c69649@mail.gmail.com>
	<ca471dc20508302027314a7998@mail.gmail.com>
Message-ID: <43153704.6080304@holdenweb.com>

Guido van Rossum wrote:
> On 8/30/05, Andrew Durdin <adurdin at gmail.com> wrote:
[confusion]
> 
> 
> Hm. The example is poorly chosen because it's an end case. The
> invariant for both is (I'd hope!)
> 
>   "".join(s.partition()) == s == "".join(s.rpartition())
> 
> Thus,
> 
>   "a/b/c".partition("/") returns ("a", "/", "b/c")
> 
>   "a/b/c".rpartition("/") returns ("a/b", "/", "c")
> 
> That can't be confusing can it?
> 
> (Just think of it as rpartition() stopping at the last occurrence,
> rather than searching from the right. :-)
> 
So we can check that a substring x appears precisely once in the string 
s using

s.partition(x) == s.rpartition(x)

Oops, it fails if s == "". I can usually find some way to go wrong ...
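
A small sketch of the invariant and of the "appears precisely once" check, assuming the behaviour str.partition() and str.rpartition() have in current Python. The helper names are hypothetical and only serve the illustration.

```python
def holds_invariant(s, x):
    # Guido's invariant: joining the three pieces rebuilds the string,
    # whether or not the separator was found.
    return "".join(s.partition(x)) == s == "".join(s.rpartition(x))


def appears_once_naive(s, x):
    # The tongue-in-cheek check: both scans split at the same place.
    return s.partition(x) == s.rpartition(x)


for s in ("a/b", "a/b/c", "abc", ""):
    print(repr(s), holds_invariant(s, "/"), appears_once_naive(s, "/"))
```

For the empty string both calls return three empty strings, so the naive check reports a match even though the separator never occurs.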

tongue-in-cheek-ly y'rs  - steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/


From tjreedy at udel.edu  Wed Aug 31 06:52:47 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 31 Aug 2005 00:52:47 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer>	<4314CA3E.3020606@benjiyork.com><4314E1A2.4060409@ronadam.com>
	<4314E51B.1050507@hathawaymix.org>
Message-ID: <df3d2v$u2f$1@sea.gmane.org>


"Shane Hathaway" <shane at hathawaymix.org> wrote in message 
news:4314E51B.1050507 at hathawaymix.org...
> You can do both: make partition() return a sequence with attributes,
> similar to os.stat().  However, I would call the attributes "before",
> "sep", and "after".

One could see that as a special-case back-compatibility kludge that maybe 
should disappear in 3.0.  My impression is that the attributes were added 
precisely because unpacking several related attributes into several 
disconnected vars was found to be often awkward.  The sequencing is 
arbitrary and one often needs less than all attributes.

Terry J. Reedy


From shane at hathawaymix.org  Wed Aug 31 07:29:11 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Tue, 30 Aug 2005 23:29:11 -0600
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <df3d2v$u2f$1@sea.gmane.org>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer>	<4314CA3E.3020606@benjiyork.com><4314E1A2.4060409@ronadam.com>	<4314E51B.1050507@hathawaymix.org>
	<df3d2v$u2f$1@sea.gmane.org>
Message-ID: <43154027.9020301@hathawaymix.org>

Terry Reedy wrote:
> "Shane Hathaway" <shane at hathawaymix.org> wrote in message 
> news:4314E51B.1050507 at hathawaymix.org...
> 
>>You can do both: make partition() return a sequence with attributes,
>>similar to os.stat().  However, I would call the attributes "before",
>>"sep", and "after".
> 
> 
> One could see that as a special-case back-compatibility kludge that maybe 
> should disappear in 3.0.  My impression is that the attributes were added 
> precisely because unpacking several related attributes into several 
> disconnected vars was found to be often awkward.  The sequencing is 
> arbitrary and one often needs less than all attributes.

Good point.  Unlike os.stat(), it's very easy to remember the order of 
the return values from partition().

I'll add my +1 vote for part() and +0.9 for partition().

As for the regex version of partition(), I wonder if a little cleanup 
effort is in order so that new regex features don't have to be added in 
two places.  I suggest a builtin for compiling regular expressions, 
perhaps called "regex".  It would be easier to use the builtin than to 
import the re module, so there would no longer be a reason for the re 
module to have functions that duplicate the regular expression methods.

Shane

From jcarlson at uci.edu  Wed Aug 31 07:30:58 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 30 Aug 2005 22:30:58 -0700
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <43153704.6080304@holdenweb.com>
References: <ca471dc20508302027314a7998@mail.gmail.com>
	<43153704.6080304@holdenweb.com>
Message-ID: <20050830222856.8B14.JCARLSON@uci.edu>


Steve Holden <steve at holdenweb.com> wrote:
> 
> Guido van Rossum wrote:
> > On 8/30/05, Andrew Durdin <adurdin at gmail.com> wrote:
> [confusion]
> > 
> > 
> > Hm. The example is poorly chosen because it's an end case. The
> > invariant for both is (I'd hope!)
> > 
> >   "".join(s.partition()) == s == "".join(s.rpartition())
> > 
> > Thus,
> > 
> >   "a/b/c".partition("/") returns ("a", "/", "b/c")
> > 
> >   "a/b/c".rpartition("/") returns ("a/b", "/", "c")
> > 
> > That can't be confusing can it?
> > 
> > (Just think of it as rpartition() stopping at the last occurrence,
> > rather than searching from the right. :-)
> > 
> So we can check that a substring x appears precisely once in the string 
> s using
> 
> s.partition(x) == s.rpartition(x)
> 
> Oops, it fails if s == "". I can usually find some way to go wrong ...

There was an example in the standard library that used "s.find(y) ==
s.rfind(y)" as a test for zero or 1 instances of the searched-for item.

Generally though, s.count(x)==1 is a better test.
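
A short sketch contrasting the two tests, using only existing str methods. The helper names are hypothetical.

```python
def zero_or_one(s, x):
    # find() and rfind() agree when x occurs exactly once (same index)
    # or not at all (both return -1).
    return s.find(x) == s.rfind(x)


def exactly_one(s, x):
    # count() counts non-overlapping occurrences.
    return s.count(x) == 1


for s, x in (("a/b", "/"), ("a/b/c", "/"), ("abc", "/"), ("aaa", "aa")):
    print(repr(s), repr(x), zero_or_one(s, x), exactly_one(s, x))
```

The last pair is a corner case: "aa" matches "aaa" at two overlapping positions, so the find/rfind test says "more than one" while count() reports a single non-overlapping occurrence.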

 - Josiah


From pierre.barbier at cirad.fr  Wed Aug 31 10:16:59 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Wed, 31 Aug 2005 10:16:59 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <20050830091228.8B00.JCARLSON@uci.edu>
References: <20050830011440.7E5E.JCARLSON@uci.edu> <43142093.4080104@cirad.fr>
	<20050830091228.8B00.JCARLSON@uci.edu>
Message-ID: <4315677B.1000004@cirad.fr>

Josiah Carlson wrote:
> Pierre Barbier de Reuille <pierre.barbier at cirad.fr> wrote:
> 
> 0.5
> 
> So, subtracting that .5 seconds from all the cases gives us...
> 
> 0.343 seconds for .find's comparison
> 0.313 seconds for .index's exception handling when an exception is not
> raised
> 3.797 seconds for .index's exception handling when an exception is
> raised.

Well, when I did benchmark that (two years ago) the difference was,
AFAIR, much greater! But well, I just have to adjust my internal data
sets ;)
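
A rough way to repeat this kind of measurement, sketched with the standard timeit module. The test string, the substrings and the iteration count are arbitrary assumptions, and the numbers will vary by machine and Python version.

```python
import timeit


def has_sub_find(s, sub):
    # find() reports a miss by returning -1.
    return s.find(sub) != -1


def has_sub_index(s, sub):
    # index() reports a miss by raising ValueError.
    try:
        s.index(sub)
    except ValueError:
        return False
    return True


text = "abracadabra"
for label, sub in (("hit", "cad"), ("miss", "xyz")):
    t_find = timeit.timeit(lambda: has_sub_find(text, sub), number=100_000)
    t_index = timeit.timeit(lambda: has_sub_index(text, sub), number=100_000)
    print(f"{label}: find {t_find:.3f}s, index {t_index:.3f}s")
```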

Pierre

> 
> In the case of a string being found, .index is about 10% faster than
> .find .  In the case of a string not being found, .index's exception
> handling mechanics are over 11 times slower than .find's comparison.
> 
> [...]
> 
>  - Josiah
> 

-- 
Pierre Barbier de Reuille

INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP
Botanique et Bio-informatique de l'Architecture des Plantes
TA40/PSII, Boulevard de la Lironde
34398 MONTPELLIER CEDEX 5, France

tel   : (33) 4 67 61 65 77    fax   : (33) 4 67 61 56 68

From ncoghlan at gmail.com  Wed Aug 31 11:10:48 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 31 Aug 2005 19:10:48 +1000
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <4314B56E.6070609@ronadam.com>
References: <431492D5.6090102@ronadam.com>	<5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com>
	<4314B56E.6070609@ronadam.com>
Message-ID: <43157418.5040603@gmail.com>

Ron Adam wrote:
> I don't feel there is a need to avoid numbers entirely. In this case I
> think it's the better way to find the n'th separator and since it's an
> optional value I feel it doesn't add a lot of complication.  Anyway...
> It's just a suggestion.

Avoid overengineering this without genuine use cases. Raymond's review of the 
standard library shows that the basic version of str.partition provides 
definite readability benefits and also makes it easier to write correct code - 
enhancements can wait until we have some real experience with how people use 
the method.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From fredrik at pythonware.com  Wed Aug 31 12:16:51 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 31 Aug 2005 12:16:51 +0200
Subject: [Python-Dev] Proof of the pudding:  str.partition()
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer><4314CA3E.3020606@benjiyork.com>	<4314E1A2.4060409@ronadam.com><df2ona$kmm$1@sea.gmane.org>
	<431531AB.4080305@ronadam.com>
Message-ID: <df402j$fk5$1@sea.gmane.org>

Ron Adam wrote:

> I'm not familiar with piece, but it occurred to me it might be useful to
> get groups of attributes in some way.  My first (passing) thought was to do...
>
>      host, port = host.partition(':').(head, sep)
>
> Where that would be short for calling a method to return them:
>
>      host, port = host.partition(':').getattribs('head','sep')

note, however, that your first syntax doesn't work in today's python
(bare names are always evaluated in the current scope, before any calls
are made)

given that you want both the pieces *and* a way to see if a split was
made, the only half-reasonable alternatives to "I can always ignore the
values I don't need" that I can think of are

    flag, part1, part2, ... = somestring.partition(sep, count=2)

or

    flag, part1, part2, ... = somestring.piec^H^H^Hartition(sep, group, group, ...)

where flag is true if the separator was found, and the number of parts
returned corresponds to either count or the number of group indices
(the latter is of course the external influence that cannot be named,
but with an API modelled after RE's group method).

</F> 


From gmccaughan at synaptics-uk.com  Wed Aug 31 12:38:01 2005
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Wed, 31 Aug 2005 11:38:01 +0100
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <59e9fd3a050830192344ffeafd@mail.gmail.com>
References: <4f0b69dc0508301124423da48e@mail.gmail.com>
	<004701c5ad90$f0faeec0$8832c797@oemcomputer>
	<59e9fd3a050830192344ffeafd@mail.gmail.com>
Message-ID: <200508311138.02330.gmccaughan@synaptics-uk.com>

> Just to put my spoke in the wheel, I find the difference in the
> ordering of return values for partition() and rpartition() confusing:
> 
> head, sep, remainder = partition(s)
> remainder, sep, head = rpartition(s)
> 
> My first expectation for rpartition() was that it would return exactly
> the same values as partition(), but just work from the end of the
> string.
> 
> IOW, I expected "www.python.org".partition("python") to return exactly
> the same as "www.python.org".rpartition("python")

Yow. Me too, and indeed I've been skimming this thread without
it ever occurring to me that it would be otherwise.

> Anyway, I'm definitely +1 on partition(), but -1 on rpartition()
> returning in "reverse order".

+1.

-- 
g


From gmccaughan at synaptics-uk.com  Wed Aug 31 12:43:11 2005
From: gmccaughan at synaptics-uk.com (Gareth McCaughan)
Date: Wed, 31 Aug 2005 11:43:11 +0100
Subject: [Python-Dev] Remove str.find in 3.0?
In-Reply-To: <200508311138.02330.gmccaughan@synaptics-uk.com>
References: <4f0b69dc0508301124423da48e@mail.gmail.com>
	<59e9fd3a050830192344ffeafd@mail.gmail.com>
	<200508311138.02330.gmccaughan@synaptics-uk.com>
Message-ID: <200508311143.11809.gmccaughan@synaptics-uk.com>

I wrote:

[Andrew Durdin:]
> > IOW, I expected "www.python.org".partition("python") to return exactly
> > the same as "www.python.org".rpartition("python")
> 
> Yow. Me too, and indeed I've been skimming this thread without
> it ever occurring to me that it would be otherwise.

And, on re-skimming the thread, I think that was always the plan.
So that's OK, then. :-)

-- 
g


From solipsis at pitrou.net  Wed Aug 31 13:41:20 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 31 Aug 2005 13:41:20 +0200
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <17172.38752.55260.62198@montanaro.dyndns.org>
References: <000a01c5ad01$dd2e51a0$8832c797@oemcomputer>
	<431414AB.4010005@cirad.fr> <20050830011440.7E5E.JCARLSON@uci.edu>
	<43142093.4080104@cirad.fr> <4314542C.7080000@gmail.com>
	<65c606d6ef54240378726f4e4ad91f3d@xs4all.nl> <431474D0.70300@cirad.fr>
	<f836662fc8ffa673250afeec9157b0d6@xs4all.nl>
	<1125416426.17470.22.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<17172.38752.55260.62198@montanaro.dyndns.org>
Message-ID: <1125488480.31857.1.camel@p-dvsi-418-1.rd.francetelecom.fr>


On Tuesday, 30 August 2005 at 12:29 -0500, skip at pobox.com wrote:
> Just group your re:
> 
>     >>> import re
>     >>>
>     >>> re.split("ab", "abracadabra")
>     ['', 'racad', 'ra']
>     >>> re.split("(ab)", "abracadabra")
>     ['', 'ab', 'racad', 'ab', 'ra']
> 
> and you get it in the return value.  In fact, re.split with a grouped re is
> very much like Raymond's str.partition method without the guarantee of
> returning a three-element list.

Thanks! I guess I should have read the documentation carefully instead
of assuming re.split() worked like in some other language (namely, PHP).
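
A brief sketch of how a grouped re.split() compares with str.partition(), assuming the partition() behaviour of current Python. The maxsplit=1 argument is added here so that only the first match is split on.

```python
import re

text = "abracadabra"

# A grouped pattern keeps the separator in the result; maxsplit=1 stops
# after the first match, which mirrors partition() when a match exists.
via_re = re.split("(ab)", text, maxsplit=1)
via_partition = text.partition("ab")

# Without a match, re.split() returns a one-element list, whereas
# partition() still returns three items.
miss_re = re.split("(zz)", text, maxsplit=1)
miss_partition = text.partition("zz")

print(via_re, via_partition)
print(miss_re, miss_partition)
```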

Regards

Antoine.


From mcherm at mcherm.com  Wed Aug 31 13:55:35 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 31 Aug 2005 04:55:35 -0700
Subject: [Python-Dev] Proof of the pudding:  str.partition()
Message-ID: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com>

Raymond's original definition for partition() did NOT support any
of the following:

   (*) Regular Expressions
   (*) Ways to generate just 1 or 2 of the 3 values if some are
       not going to be used
   (*) Clever use of indices to avoid copying strings
   (*) Behind-the-scenes tricks to allow repeated re-partitioning
       to be faster than O(n^2)

The absence of these "features" is a GOOD thing. It makes the
behavior of partition() so simple and straightforward that it is
easily documented and can be instantly understood by a competent
programmer. I *like* keeping it simple. In fact, if anything, I'd
give UP the one fancy feature he chose to include:

   (*) An rpartition() function that searches from the right

...except that I understand why he included it and am convinced
by the arguments (use cases can be demonstrated and people would
expect it to be there and complain if it weren't).

Simplicity and elegance are two of the reasons that this is such
an excellent proposal; let's not lose them. We have existing
tools (like split() and the re module) to handle the tricky
problems.

-- Michael Chermside


From amk at amk.ca  Wed Aug 31 14:23:34 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Wed, 31 Aug 2005 08:23:34 -0400
Subject: [Python-Dev] Switching re and sre
Message-ID: <20050831122334.GA4104@rogue.amk.ca>

FYI: In a discussion on the Python security response list, Guido
suggested that the sre.py and re.py modules should be switched.

Currently re.py just imports the contents of sre.py -- once it
supported both sre and the PCRE-based pre.py -- and sre.py contains
the actual code.

Now that pre.py is gone, we can move the actual code into re.py and
make sre.py just import re.py, so that any user code that actually
imports sre will still work.  I'll make this change today.

--amk

From skip at pobox.com  Wed Aug 31 15:02:40 2005
From: skip at pobox.com (skip@pobox.com)
Date: Wed, 31 Aug 2005 08:02:40 -0500
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <df3d2v$u2f$1@sea.gmane.org>
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer>
	<4314CA3E.3020606@benjiyork.com> <4314E1A2.4060409@ronadam.com>
	<4314E51B.1050507@hathawaymix.org> <df3d2v$u2f$1@sea.gmane.org>
Message-ID: <17173.43632.145313.858480@montanaro.dyndns.org>


    >> You can do both: make partition() return a sequence with attributes,
    >> similar to os.stat().  However, I would call the attributes "before",
    >> "sep", and "after".

    Terry> One could see that as a special-case back-compatibility kludge
    Terry> that maybe should disappear in 3.0.  

Back compatibility with what?  Since partition doesn't exist now there is
nothing to be backward compatible with is there?

I'm -1 on the notion of generating groups or attributes.  In other cases
(regular expressions, stat() results) there are good reasons to provide
them.  The results of a regular expression match are variable, depending on
how many groups the user defines in his pattern.  In the case of stat()
there is no reason other than historic for the results to be returned in any
particular order, so having named attributes makes the results easier to
work with.  The partition method has neither.  It always returns a fixed
tuple of three elements whose order is clearly based on the physical
relationship of the three pieces of the string that have been partitioned.

I think Raymond's original formulation is the correct one.  Always return a
three-element tuple of strings, nothing more.  Use '_' or 'dummy' if there
is some element you're not interested in.

Skip

From pje at telecommunity.com  Wed Aug 31 15:40:15 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 31 Aug 2005 09:40:15 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com>
Message-ID: <5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com>

At 04:55 AM 8/31/2005 -0700, Michael Chermside wrote:
>Raymond's original definition for partition() did NOT support any
>of the following:
>
>    (*) Regular Expressions

This can be orthogonally added to the 're' module, and definitely should 
not be part of the string method.


>    (*) Ways to generate just 1 or 2 of the 3 values if some are
>        not going to be used

Yep, subscripting and slicing are more than adequate to handle *all* of 
those use cases, even the ones that some people have been jumping through 
odd hoops to express:

     before = x.partition(sep)[0]
     found  = x.partition(sep)[1]
     after  = x.partition(sep)[2]

     before, found = x.partition("foo")[:2]
     found,  after = x.partition("foo")[1:]
     before, after = x.partition("foo")[::2]

Okay, that last one is maybe a little too clever.  I'd personally just use 
'__' or 'DONTCARE' or something like that for the value(s) I didn't care 
about, because it actually takes slightly less time to unpack a 3-tuple 
into three function-local variables than it does to pull out a single 
element of the tuple, and it's almost twice as fast as taking a slice and 
unpacking it into two variables.

So, using three variables is both faster *and* easier to read than any of 
the variations anybody has proposed, including the ones I just showed above.
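
A sketch for checking the timing claim on a given interpreter, using the standard timeit module. The sample string, the helper names and the iteration count are assumptions made for illustration, and no particular result is implied.

```python
import timeit

x = "key=value"


def unpack_three():
    # Unpack all three items and ignore the separator.
    before, _, after = x.partition("=")
    return before, after


def slice_then_unpack():
    # Take every second item of the tuple, then unpack the two that remain.
    before, after = x.partition("=")[::2]
    return before, after


for fn in (unpack_three, slice_then_unpack):
    seconds = timeit.timeit(fn, number=200_000)
    print(f"{fn.__name__}: {seconds:.3f}s")
```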


>    (*) Clever use of indices to avoid copying strings
>    (*) Behind-the-scenes tricks to allow repeated re-partitioning
>        to be faster than O(n^2)

Yep, -1 on these.


>The absence of these "features" is a GOOD thing. It makes the
>behavior of partition() so simple and straightforward that it is
>easily documented and can be instantly understood by a competent
>programmer. I *like* keeping it simple. In fact, if anything, I'd
>give UP the one fancy feature he chose to include:
>
>    (*) An rpartition() function that searches from the right
>
>...except that I understand why he included it and am convinced
>by the arguments (use cases can be demonstrated and people would
>expect it to be there and complain if it weren't).

I'd definitely like to keep rpartition.  For example, splitting an HTTP 
url's hostname from its port should be done with rpartition, since you can 
have a 'username:password@' part before the host, and because the host can 
be a bracketed IPv6 host address with colons in it.
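
A minimal sketch of that use case, assuming the rpartition() semantics of current Python and a made-up network location that does include a port:

```python
netloc = "user:secret@[2001:db8::1]:8080"

# partition() stops at the first colon, which sits inside the credentials.
wrong_host, _, wrong_port = netloc.partition(":")

# rpartition() stops at the last colon, which precedes the port.
host, _, port = netloc.rpartition(":")

print(wrong_host, wrong_port)
print(host, port)
```

This only works when a port is actually present; a bracketed IPv6 address without one would still be split at a colon inside the brackets, so real code would need an extra check.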


>Simplicity and elegance are two of the reasons that this is such
>an excellent proposal,

+1.


From rrr at ronadam.com  Wed Aug 31 15:41:38 2005
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 31 Aug 2005 09:41:38 -0400
Subject: [Python-Dev] partition() (was: Remove str.find in 3.0?)
In-Reply-To: <43157418.5040603@gmail.com>
References: <431492D5.6090102@ronadam.com>	<5.1.1.6.0.20050830143755.01fcc538@mail.telecommunity.com>	<4314B56E.6070609@ronadam.com>
	<43157418.5040603@gmail.com>
Message-ID: <4315B392.9040906@ronadam.com>

Nick Coghlan wrote:
> Ron Adam wrote:
> 
>>I don't feel there is a need to avoid numbers entirely. In this case I
>>think it's the better way to find the n'th separator and since it's an
>>optional value I feel it doesn't add a lot of complication.  Anyway...
>>It's just a suggestion.
> 
> 
> Avoid overengineering this without genuine use cases. Raymond's review of the 
> standard library shows that the basic version of str.partition provides 
> definite readability benefits and also makes it easier to write correct code - 
> enhancements can wait until we have some real experience with how people use 
> the method.
> 
> Cheers,
> Nick.

The use cases for nth items 1 and -1 are the same ones for partition() 
and rpartition.  It's only values greater or less than those that need 
use cases.  (I'll try to find some.)

True, a directional index enhancement could be added later, but not 
considering it now and then adding it later would mean rpartition() 
would become redundant and/or an argument against doing it later.

As it's been stated fairly often, it's hard to remove something once 
it's put in. So it's prudent to consider a few alternative forms and 
rule them out, rather than try to change things later.

Cheers,
Ron


From fredrik at pythonware.com  Wed Aug 31 16:03:29 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 31 Aug 2005 16:03:29 +0200
Subject: [Python-Dev] Proof of the pudding:  str.partition()
References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com>
	<5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com>
Message-ID: <df4dbh$rlt$1@sea.gmane.org>

Phillip J. Eby wrote:

> Yep, subscripting and slicing are more than adequate to handle *all* of
> those use cases, even the ones that some people have been jumping through
> odd hoops to express:
>
>     before = x.partition(sep)[0]
>     found  = x.partition(sep)[1]
>     after  = x.partition(sep)[2]
>
>     before, found = x.partition("foo")[:2]
>     found,  after = x.partition("foo")[1:]
>     before, after = x.partition("foo")[::2]
>
> Okay, that last one is maybe a little too clever.  I'd personally just use
> '__' or 'DONTCARE' or something like that for the value(s) I didn't care
> about, because it actually takes slightly less time to unpack a 3-tuple
> into three function-local variables than it does to pull out a single
> element of the tuple, and it's almost twice as fast as taking a slice and
> unpacking it into two variables.

you're completely missing the point.

the problem isn't the time it takes to unpack the return value, the problem is that
it takes time to create the substrings that you don't need.

for some use cases, a naive partition-based solution is going to be a lot slower
than the old find+slice approach, no matter how you slice, index, or unpack the
return value.
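
To make the point concrete, here is a sketch of the two ways to get only the text before a separator. The helper names are hypothetical and the sample line is made up.

```python
def head_via_partition(line, sep):
    # partition() also creates the tail substring, which is then discarded.
    return line.partition(sep)[0]


def head_via_find(line, sep):
    # find() plus one slice creates only the head; the tail is never copied.
    i = line.find(sep)
    return line if i < 0 else line[:i]


line = "value = 1  # " + "x" * 1000
print(head_via_partition(line, "#") == head_via_find(line, "#"))
```

Both return the same head; the difference is only in how much string data gets copied along the way, which matters most when the discarded tail is long.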

> So, using three variables is both faster *and* easier to read than any of
> the variations anybody has proposed, including the ones I just showed above.

try again.

</F> 


From python at discworld.dyndns.org  Wed Aug 31 16:30:46 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Wed, 31 Aug 2005 08:30:46 -0600
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com>
References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com>
Message-ID: <20050831143046.GE522@discworld.dyndns.org>

Michael Chermside <mcherm at mcherm.com> wrote:
> 
>    (*) An rpartition() function that searches from the right
> 
> ...except that I understand why he included it and am convinced
> by the arguments (use cases can be demonstrated and people would
> expect it to be there and complain if it weren't).

I would think that perhaps an optional second argument to the method that
controls whether it searches from the start (default) or end of the string
might be nicer than having two separate methods, even though that would lose
parallelism with the current .find/.index.

While I'm at it, why not propose for py3k that
.rfind/.rindex/.rjust/.rsplit disappear, and .find/.index/.just/.split grow an
optional "fromright" (or equivalent) keyword argument?

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python at discworld.dyndns.org>
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

From guido at python.org  Wed Aug 31 16:54:07 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 31 Aug 2005 07:54:07 -0700
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: <20050831143046.GE522@discworld.dyndns.org>
References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com>
	<20050831143046.GE522@discworld.dyndns.org>
Message-ID: <ca471dc20508310754592dd3df@mail.gmail.com>

On 8/31/05, Charles Cazabon <python at discworld.dyndns.org> wrote:

> I would think that perhaps an optional second argument to the method that
> controls whether it searches from the start (default) or end of the string
> might be nicer than having two separate methods, even though that would lose
> parallelism with the current .find/.index.
> 
> While I'm at it, why not propose for py3k that
> .rfind/.rindex/.rjust/.rsplit disappear, and .find/.index/.just/.split grow an
> optional "fromright" (or equivalent) keyword argument?

This violates one of my design principles: don't add boolean options
to an API that control the semantics in such a way that the option
value is (nearly) always a constant. Instead, provide two different
method names.

The motivation for this rule comes partly from performance: parameters
are relatively expensive, and you shouldn't make the method test
dynamically for a parameter value that is constant for the call site;
and partly from readability: don't bother the reader with having to
remember the full general functionality and how it is affected by the
various flags; also, a Boolean positional argument is a really poor
clue about its meaning, and it's easy to misremember the sense
reversed.

PS. This is a special case of a stronger design principle: don't let
the *type* of the return value depend on the *value* of the arguments.

PS2. As with all design principles, there are exceptions. But they
are, um, exceptional. index/rindex is not such an exception.
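
The two API shapes under discussion can be sketched side by side. All three functions below are hypothetical wrappers around the existing str.find() and str.rfind(); none of them is a real or proposed str method.

```python
# Shape the principle argues against: a flag whose value is a constant
# at nearly every call site.
def find_flagged(s, sub, fromright=False):
    return s.rfind(sub) if fromright else s.find(sub)


# Shape the principle favours: one name per behaviour, no runtime test.
def find_left(s, sub):
    return s.find(sub)


def find_right(s, sub):
    return s.rfind(sub)


path = "a/b/c"
print(find_flagged(path, "/", True))   # reader must recall what True means
print(find_right(path, "/"))           # the name carries the meaning
```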

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tjreedy at udel.edu  Wed Aug 31 18:51:17 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 31 Aug 2005 12:51:17 -0400
Subject: [Python-Dev] Proof of the pudding:  str.partition()
References: <005301c5ada1$4a52afc0$8832c797@oemcomputer><4314CA3E.3020606@benjiyork.com>
	<4314E1A2.4060409@ronadam.com><4314E51B.1050507@hathawaymix.org>
	<df3d2v$u2f$1@sea.gmane.org>
	<17173.43632.145313.858480@montanaro.dyndns.org>
Message-ID: <df4n64$vmu$1@sea.gmane.org>


<skip at pobox.com> wrote in message 
news:17173.43632.145313.858480 at montanaro.dyndns.org...
>
>    >> You can do both: make partition() return a sequence with 
> attributes,
>    >> similar to os.stat().  However, I would call the attributes 
> "before",
>    >> "sep", and "after".
>
>    Terry> One could see that as a special-case back-compatibility kludge
>    Terry> that maybe should disappear in 3.0.
>
> Back compatibility with what?

os.stat without attributes.  'that' referred to its current 'sequence with 
attributes' return.

> I'm -1 on the notion of generating groups or attributes.

We agree.  A back-compatibility kludge is not a precedent to be emulated.

>In the case of stat() there is no reason other than historic
> for the results to be returned in any particular order,

Which is why I wonder whether the sequence part should be dropped in 3.0.

Terry J. Reedy


From python at discworld.dyndns.org  Wed Aug 31 19:24:58 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Wed, 31 Aug 2005 11:24:58 -0600
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: <ca471dc20508310754592dd3df@mail.gmail.com>
References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com>
	<20050831143046.GE522@discworld.dyndns.org>
	<ca471dc20508310754592dd3df@mail.gmail.com>
Message-ID: <20050831172458.GB2476@discworld.dyndns.org>

Guido van Rossum <guido at python.org> wrote:
> On 8/31/05, Charles Cazabon <python at discworld.dyndns.org> wrote:
> 
> > While I'm at it, why not propose for py3k that
> > .rfind/.rindex/.rjust/.rsplit disappear, and .find/.index/.just/.split
> > grow an optional "fromright" (or equivalent) keyword argument?
> 
> This violates one of my design principles:

Ah, excellent response.  Are your design principles written down anywhere?  I
didn't see anything on your essays page about them, but I'd like to learn at
the feet of the BDFL.

> don't add boolean options to an API that control the semantics in such a way
> that the option value is (nearly) always a constant. Instead, provide two
> different method names.

Hmmm.  I really dislike the additional names, but ...

> The motivation for this rule comes partly from performance: parameters
> are relatively expensive, and you shouldn't make the method test
> dynamically for a parameter value that is constant for the call site;

I can see this. 

> and partly from readability: don't bother the reader with having to
> remember the full general functionality and how it is affected by the
> various flags;

This I don't think is so bad.  It's analogous to providing the "reverse"
parameter to sorted et al, and I don't think that's particularly hard to
remember.  It would also be rarely used; I use find/index tens of times more
often than I use rfind/rindex, and I presume it would be the same for a
hypothetical .part/.rpart.

> also, a Boolean positional argument is a really poor clue about its meaning,
> and it's easy to misremember the sense reversed.

I totally agree.  I therefore borrowed the time machine and modified my
proposal to suggest it should be a keyword argument, not a positional one :).

> PS. This is a special case of a stronger design principle: don't let
> the *type* of the return value depend on the *value* of the arguments.

Hmmm.  In all of these cases, the type of the return value is constant.  Only
the value would change based on the value of the arguments.   ... ?

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python at discworld.dyndns.org>
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

From jimjjewett at gmail.com  Wed Aug 31 20:43:02 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 31 Aug 2005 14:43:02 -0400
Subject: [Python-Dev]  Alternative name for str.partition()
Message-ID: <fb6fbf56050831114318fbf61@mail.gmail.com>

[In http://mail.python.org/pipermail/python-dev/2005-August/055880.html ]
Andrew Durdin wrote:

> one of the "fixed stdlib" examples that Raymond
> posted actually uses rpartition and partition in two consecutive lines

Even with that leadin, even right next to each other, it took me a bit of
time to see the difference between rest.rpartition and rest.partition.

>         rest, _, query = rest.rpartition('?')
>         script, _, rest = rest.partition('/')

Shortening the names helps, because a single letter matters more.

>         rest, _, query = rest.rpart('?')
>         script, _, rest = rest.part('/')

A different-looking word (such as Greg's suggestion) might be even
better, if the word also works on its own.

>         rest, _, query = rest.rsplit_at('?')   
>         script, _, rest = rest.split_at('/')
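
For reference, a minimal sketch of what the first pair of lines does,
assuming a Python where str.partition() and str.rpartition() exist as
proposed. The sample input string is made up for illustration; only the
two split calls come from the thread.

```python
# Hypothetical input; not taken from the stdlib example being discussed.
rest = "cgi-bin/script.py/extra/path?x=1&y=2"

# rpartition splits at the LAST '?': the query string ends up on the right.
rest, _, query = rest.rpartition('?')

# partition splits at the FIRST '/': the leading component ends up on the left.
script, _, rest = rest.partition('/')

print(script, rest, query)
```

The single leading "r" is the only visual cue that the two calls search
from opposite ends.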
 
-jJ

From jimjjewett at gmail.com  Wed Aug 31 20:56:44 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 31 Aug 2005 14:56:44 -0400
Subject: [Python-Dev]  Proof of the pudding: str.partition()
Message-ID: <fb6fbf56050831115642ba3bbb@mail.gmail.com>

Michael Chermside wrote (but I reordered):

>Simplicity and elegance are two of the reasons that this
> is such an excellent proposal, let's not lose them.

> Raymond's original definition for partition() did NOT support
> any of the following:

>   (*) Regular Expressions

While this is obviously more powerful, and an analogue should
probably go in re ... it doesn't belong in strings.  I don't want to
have to explain why

    "www.python.org".part('.') 

acts strangely (forgot to escape the period).
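
A small sketch of the surprise, using re.split() as a stand-in for a
regex-aware split and the str.partition() spelling of the proposed
method (the variable names are only for illustration):

```python
import re

host = "www.python.org"

# Plain string separator: the period is just a period.
plain = host.partition('.')

# As a regular expression, an unescaped '.' matches any character, so the
# split happens at the very first character rather than at the first period.
unescaped = re.split('.', host, maxsplit=1)

# Escaping the separator restores the literal meaning.
escaped = re.split(re.escape('.'), host, maxsplit=1)

print(plain)
print(unescaped)
print(escaped)
```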

>   (*) Ways to generate just 1 or 2 of the 3 values if some are
>       not going to be used
>   (*) Clever use of indices to avoid copying strings
>   (*) Behind-the-scenes tricks to allow repeated re-partitioning
>       to be faster than O(n^2)

I think these may be useful behind the scenes, but the API should
not expose them unless they are made more general.

For instance, the compiler could recognize that junk variables (or
variable names matching a certain pattern?) don't really have to be 
created -- and that would be useful for more than string splitting.

Doing it as a special case here just leads to a backwards compatibility
wart later.

-jJ

From raymond.hettinger at verizon.net  Wed Aug 31 21:37:18 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed, 31 Aug 2005 15:37:18 -0400
Subject: [Python-Dev] Design Principles
In-Reply-To: <ca471dc20508310754592dd3df@mail.gmail.com>
Message-ID: <003501c5ae63$664f5e40$4320c797@oemcomputer>

> > While I'm at it, why not propose that for py3k that
> > .rfind/.rindex/.rjust/.rsplit disappear, and .find/.index/.just/.split
> > grow an optional "fromright" (or equivalent) optional keyword argument?
> 
> This violates one of my design principles: don't add boolean options
> to an API that control the semantics in such a way that the option
> value is (nearly) always a constant. Instead, provide two different
> method names.
> 
> The motivation for this rule comes partly for performance: parameters
> are relatively expensive, and you shouldn't make the method test
> dynamically for a parameter value that is constant for the call site;
> and partly from readability: don't bother the reader with having to
> remember the full general functionality and how it is affected by the
> various flags; also, a Boolean positional argument is a really poor
> clue about its meaning, and it's easy to misremember the sense
> reversed.
> 
> PS. This is a special case of a stronger design principle: don't let
> the *type* of the return value depend on the *value* of the arguments.
> 
> PS2. As with all design principles, there are exceptions. But they
> are, um, exceptional. index/rindex is not such an exception.

FWIW, after this is over, I'll put together a draft list of these
principles.  The one listed above has served us well.  An early draft of
itertools.ifilter() had an invert flag.  The toolset improved when that
was split to a separate function, ifilterfalse().
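
To make the contrast concrete, here is a rough sketch. It is written
against Python 3, where ifilter() became the builtin filter() and
ifilterfalse() became itertools.filterfalse(). The filter_with_flag()
function is hypothetical; it only approximates what a flag-based design
might have looked like and is not the early draft itself.

```python
from itertools import filterfalse

def filter_with_flag(predicate, iterable, invert=False):
    """Hypothetical flag-based design: one function, two behaviours."""
    for item in iterable:
        # The flag is tested for every item even though its value is
        # fixed at each call site.
        if bool(predicate(item)) != invert:
            yield item

def is_even(n):
    return n % 2 == 0

numbers = range(6)

# Flag style: the reader has to remember which sense True has.
flag_style = list(filter_with_flag(is_even, numbers, invert=True))

# Two-name style: the function name says what happens.
two_names = list(filterfalse(is_even, numbers))

print(flag_style, two_names)
```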

Other thoughts:

Tim's rule on algorithm selection:  We read Knuth so you don't have to.

Raymond's rule on language proposals:  Assertions that construct X is
better than an existing construct Y should be backed up by a variety of
side-by-side comparisons using real-world code samples.

I'm sure there are plenty more of these in the archives.


Raymond


From janssen at parc.com  Wed Aug 31 22:04:55 2005
From: janssen at parc.com (Bill Janssen)
Date: Wed, 31 Aug 2005 13:04:55 PDT
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: Your message of "Wed, 31 Aug 2005 06:40:15 PDT."
	<5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com> 
Message-ID: <05Aug31.130458pdt."58617"@synergy1.parc.xerox.com>

> >    (*) Regular Expressions
> 
> This can be orthogonally added to the 're' module, and definitely should 
> not be part of the string method.

Sounds right to me, and it *should* be orthogonally added to the 're'
module coincidentally simultaneously with the change to the string
object :-).

I have to say, it would be nice if

       "foo bar".partition(re.compile('\s'))

would work.  That is, if the argument is an re pattern object instead
of a string, it would be nice if it were understood appropriately,
just for symmetry's sake.  But it's hardly necessary.

Presumably in the re module, there would be a function like

	re.partition("\s", "foo bar")

for one-shot usage, or the expression

        re.compile('\s').partition("foo bar")
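
A rough sketch of what such a function might do, built on re.search().
The name re_partition() and the choice to return (string, '', '') when
nothing matches are assumptions that mirror the proposed string method;
no such function exists in the re module.

```python
import re

def re_partition(pattern, string):
    """Hypothetical helper: split around the first match of pattern."""
    match = re.search(pattern, string)
    if match is None:
        # Assumed to mirror str.partition() when the separator is absent.
        return (string, '', '')
    return (string[:match.start()], match.group(0), string[match.end():])

print(re_partition(r"\s", "foo bar"))
```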

Bill

From oren.tirosh at gmail.com  Wed Aug 31 22:24:52 2005
From: oren.tirosh at gmail.com (Oren Tirosh)
Date: Wed, 31 Aug 2005 23:24:52 +0300
Subject: [Python-Dev] Python 3 design principles
Message-ID: <7168d65a050831132415118382@mail.gmail.com>

Most of the changes in PEP 3000 are tightening up of  "There should be
one obvious way to do it.":
* Remove multiple forms of raising exceptions, leaving just "raise instance" 
* Remove exec as statement, leaving the compatible tuple/call form.
* Remove <>, ``, leaving !=, repr
etc.

Other changes are to disallow things already considered poor style like:
* No assignment to True/False/None 
* No input() 
* No access to list comprehension variable 

And there is also completely new stuff like static type checking.

While a lot of existing code will break on 3.0 it is still generally
possible to write code that will run on both 2.x and 3.0: use only the
"proper" forms above, do not assume the result of zip or range is a
list, use absolute imports (and avoid static types, of course). I
already write all my new code this way.

Is this "common subset" a happy coincidence or a design principle? 

Not all proposed changes remove redundancy or add completely new
things. Some of them just change the way certain things must be done.
For example:
*  Moving compile, id, intern to sys
*  Replacing print with write/writeln
And possibly the biggest change:
*  Reorganize the standard library to not be as shallow

I'm between +0 and -1 on these. I don't find them enough of an
improvement to break this "common subset" behavior. It's not quite the
same as strict backward compatibility and I find it worthwhile to try
to keep it.

Writing programs that run on both 2.x and 3 may require ugly
version-dependent tricks like:

try:
    compile
except NameError:
    from sys import compile

or perhaps

try:
    import urllib
except ImportError:
    from www import urllib

Should the "common subset" be a design principle of Python 3? Do
compile and id really have to be moved from __builtins__ to sys? Could
the rearrangement of the standard library be a bit less aggressive and
try to leave commonly used modules in place?
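
For what it is worth, the same try/except pattern can be run today with
intern, which is a builtin in Python 2 and lives at sys.intern in
Python 3. A minimal sketch (the strings are arbitrary):

```python
try:
    intern                      # builtin name on Python 2
except NameError:
    from sys import intern      # found in sys on Python 3

a = intern("common subset")
b = intern(" ".join(["common", "subset"]))

# Interned strings with equal contents are the same object.
print(a is b)
```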

  Oren

From python at discworld.dyndns.org  Wed Aug 31 22:44:39 2005
From: python at discworld.dyndns.org (Charles Cazabon)
Date: Wed, 31 Aug 2005 14:44:39 -0600
Subject: [Python-Dev] Python 3 design principles
In-Reply-To: <7168d65a050831132415118382@mail.gmail.com>
References: <7168d65a050831132415118382@mail.gmail.com>
Message-ID: <20050831204439.GA3775@discworld.dyndns.org>

Oren Tirosh <oren.tirosh at gmail.com> wrote:
> 
> Not all proposed changes remove redundancy or add completely new
> things. Some of them just change the way certain things must be done.
> For example:
> *  Moving compile, id, intern to sys
> *  Replacing print with write/writeln
> And possibly the biggest change:
> *  Reorganize the standard library to not be as shallow
> 
> I'm between +0 and -1 on these. I don't find them enough of an
> improvement to break this "common subset" behavior. It's not quite the
> same as strict backward compatibility and I find it worthwhile to try
> to keep it.
> 
> Writing programs that run on both 2.x and 3 may require ugly
> version-dependent tricks like:
> 
> try:
>     compile
> except NameError:
>     from sys import compile

Perhaps py3k could have a py2compat module.  Importing it could have the
effect of (for instance) putting compile, id, and intern into the global
namespace, making print an alias for writeln, alias the standard library
namespace, ... ?
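
A very rough sketch of the idea, assuming Python 3 and using intern as
the only moved name. The install() function is hypothetical, and a real
py2compat module would have far more to cover (print, the library
layout, and so on).

```python
import builtins
import sys

def install():
    """Hypothetical py2compat hook: re-expose moved names as builtins."""
    # Only add the name if it is missing, so nothing existing is replaced.
    if not hasattr(builtins, "intern"):
        builtins.intern = sys.intern

install()

# After install(), the bare name is resolved through the builtins namespace.
token = intern("spam")
print(token)
```

Patching builtins affects every module in the process, which is
arguably the point of such a module but also its main risk.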

Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon                           <python at discworld.dyndns.org>
GPL'ed software available at:               http://pyropus.ca/software/
-----------------------------------------------------------------------

From reinhold-birkenfeld-nospam at wolke7.net  Wed Aug 31 22:49:23 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Wed, 31 Aug 2005 22:49:23 +0200
Subject: [Python-Dev] Proof of the pudding: str.partition()
In-Reply-To: <05Aug31.130458pdt."58617"@synergy1.parc.xerox.com>
References: <5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com>
	<05Aug31.130458pdt."58617"@synergy1.parc.xerox.com>
Message-ID: <df554j$e1r$1@sea.gmane.org>

Bill Janssen wrote:
>> >    (*) Regular Expressions
>> 
>> This can be orthogonally added to the 're' module, and definitely should 
>> not be part of the string method.
> 
> Sounds right to me, and it *should* be orthogonally added to the 're'
> module coincidentally simultaneously with the change to the string
> object :-).
> 
> I have to say, it would be nice if
> 
>        "foo bar".partition(re.compile('\s'))
> 
> would work.  That is, if the argument is an re pattern object instead
> of a string, it would be nice if it were understood appropriately,
> just for symmetry's sake.  But it's hardly necessary.

And it's horrible, for none of the other string methods accept a RE.

In Python, RE functionality is in the re module and nowhere else, and this
is a Good Thing. There are languages which give REs too much weight by philosophy
(hint, hint), but Python isn't one of them. Interestingly, Python programmers
suffer less from the "help me, my RE doesn't work" problem.

Reinhold

-- 
Mail address is perfectly valid!


From nnorwitz at gmail.com  Wed Aug 31 22:56:37 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 31 Aug 2005 13:56:37 -0700
Subject: [Python-Dev] Python 3 design principles
In-Reply-To: <7168d65a050831132415118382@mail.gmail.com>
References: <7168d65a050831132415118382@mail.gmail.com>
Message-ID: <ee2a432c05083113564b4bc521@mail.gmail.com>

On 8/31/05, Oren Tirosh <oren.tirosh at gmail.com> wrote:
> 
> Writing programs that run on both 2.x and 3 may require ugly
> version-dependent tricks like:
> 
> try:
>     compile
> except NameError:
>     from sys import compile

Note we can ease this process a little by making a copy without
removing, e.g., adding compile to sys now without removing it.  As
programs support only Python 2.5+, they could use sys.compile and
wouldn't need to resort to the try/except above.

I realize this is only a marginal improvement.  However, if we don't
start making changes, we will be stuck maintaining suboptimal behaviour
forever.

n

From collinw at gmail.com  Wed Aug 31 23:00:54 2005
From: collinw at gmail.com (Collin Winter)
Date: Wed, 31 Aug 2005 16:00:54 -0500
Subject: [Python-Dev] Python 3 design principles
In-Reply-To: <20050831204439.GA3775@discworld.dyndns.org>
References: <7168d65a050831132415118382@mail.gmail.com>
	<20050831204439.GA3775@discworld.dyndns.org>
Message-ID: <43aa6ff705083114004924f4a9@mail.gmail.com>

Am 31-Aug 05, Charles Cazabon <python at discworld.dyndns.org> schrieb:

> Perhaps py3k could have a py2compat module.  Importing it could have the
> effect of (for instance) putting compile, id, and intern into the global
> namespace, making print an alias for writeln, alias the standard library
> namespace, ... ?

from __past__ import python2

Grüße,
Collin Winter

From rkern at ucsd.edu  Wed Aug 31 23:00:08 2005
From: rkern at ucsd.edu (Robert Kern)
Date: Wed, 31 Aug 2005 14:00:08 -0700
Subject: [Python-Dev] Python 3 design principles
In-Reply-To: <7168d65a050831132415118382@mail.gmail.com>
References: <7168d65a050831132415118382@mail.gmail.com>
Message-ID: <df55oo$fpg$1@sea.gmane.org>

Oren Tirosh wrote:

> While a lot of existing code will break on 3.0 it is still generally
> possible to write code that will run on both 2.x and 3.0: use only the
> "proper" forms above, do not assume the result of zip or range is a
> list, use absolute imports (and avoid static types, of course). I
> already write all my new code this way.
> 
> Is this "common subset" a happy coincidence or a design principle? 

I think it's because those are the most obvious things right now. The
really radical stuff won't come up until active development on Python
3000 actually starts. And it will, so any "common subset" will probably
not be very large.

IMO, if we are going to restrict Python 3000 enough to protect that
"common subset," then there's not enough payoff to justify breaking
*any* backwards compatibility. If my current codebase[1] isn't going to
be supported in Python 3000, I'm going to want the Python developers to
use that opportunity to the fullest advantage to make a better language.

[1] By which I mean the sum total of the code that I use not just code
that I've personally written. I am a library-whore.

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter


From tjreedy at udel.edu  Wed Aug 31 23:05:17 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 31 Aug 2005 17:05:17 -0400
Subject: [Python-Dev] Design Principles
References: <ca471dc20508310754592dd3df@mail.gmail.com>
	<003501c5ae63$664f5e40$4320c797@oemcomputer>
Message-ID: <df562d$h5c$1@sea.gmane.org>


"Raymond Hettinger" <raymond.hettinger at verizon.net> wrote in message 
news:003501c5ae63$664f5e40$4320c797 at oemcomputer...
> FWIW, after this is over, I'll put together a draft list of these
> principles.  The one listed above has served us well.  An early draft of
> itertools.ifilter() had an invert flag.  The toolset improved when that
> was split to a separate function, ifilterfalse().
>
> Other thoughts:
>
> Tim's rule on algorithm selection:  We read Knuth so you don't have to.
>
> Raymond's rule on language proposals:  Assertions that construct X is
> better than an existing construct Y should be backed up by a variety of
> side-by-side comparisons using real-world code samples.
>
> I'm sure there are plenty more of these in the archives.

This would make a good information PEP to point people to when they ask 
'Why ...' and the answer goes back to one of these principles.

Terry J. Reedy


From foom at fuhm.net  Wed Aug 31 23:39:54 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed, 31 Aug 2005 17:39:54 -0400
Subject: [Python-Dev] Python 3 design principles
In-Reply-To: <df55oo$fpg$1@sea.gmane.org>
References: <7168d65a050831132415118382@mail.gmail.com>
	<df55oo$fpg$1@sea.gmane.org>
Message-ID: <F93A8EC2-BAF4-4179-BFC1-0AE0FE8CE105@fuhm.net>


On Aug 31, 2005, at 5:00 PM, Robert Kern wrote:
> IMO, if we are going to restrict Python 3000 enough to protect that
> "common subset," then there's not enough payoff to justify breaking
> *any* backwards compatibility. If my current codebase[1] isn't  
> going to
> be supported in Python 3000, I'm going to want the Python  
> developers to
> use that opportunity to the fullest advantage to make a better  
> language.

I disagree fully. As a maintainer in the Twisted project I very much  
hope that it is possible to adapt the code such that it will work on  
Python 3 while still maintaining compatibility with Python 2.X.  
Otherwise, it will be impossible to make the transition to Python 3  
without either maintaining two forks of the codebase (I doubt that'll  
happen) or abandoning all users still on Python 2. And that surely  
won't happen either, for a while. Maybe by the time Python 3.1 or 3.2  
comes out it'll be possible to completely abandon Python 2.

I'm perfectly happy to see backwards-incompatible changes in Python  
3, as long as they do not make it completely impossible to write code  
that can run on both Python 3 and Python 2.X. This suggests a few  
things to me:

a) new features should be added to the python 2.x series first  
wherever possible.
b) 3.0 should by and large be simply a feature-removal release,  
removing support for features already marked as going away by the end  
of the 2.x series and which have replacements.
c) don't make any radical syntax changes which make it impossible to  
write code that can even parse in both versions.
d) for all backwards-incompatible-change proposals, have a section  
dedicated to compatibility and migration of old code that explains  
both how to modify old code to do things purely the new way, _and_  
how to modify code to work under both the old and new ways. Strive to  
make this as simple as possible, but if totally necessary, it may be  
reasonable to suggest writing a wrapper function which changes  
behavior based on python version/existence of new methods.
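
As an illustration of the wrapper idea in (d), here is a sketch that
picks its behaviour by checking for a new method rather than for a
version number. The names compat_partition() and fallback_partition()
are hypothetical, and str.partition() is used only as an example of a
method that older versions lack.

```python
def fallback_partition(s, sep):
    """Old-style find+slice equivalent for versions without partition()."""
    i = s.find(sep)
    if i < 0:
        return (s, '', '')
    return (s[:i], sep, s[i + len(sep):])

def compat_partition(s, sep):
    """Use the new method where it exists, else the fallback."""
    if hasattr(s, 'partition'):
        return s.partition(sep)
    return fallback_partition(s, sep)

print(compat_partition("a=b=c", "="))
```

Testing for the feature instead of the version keeps the wrapper
working if the method is ever backported.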

James

From steve at holdenweb.com  Wed Aug 31 23:51:14 2005
From: steve at holdenweb.com (Steve Holden)
Date: Wed, 31 Aug 2005 16:51:14 -0500
Subject: [Python-Dev] Proof of the pudding:  str.partition()
In-Reply-To: <df4dbh$rlt$1@sea.gmane.org>
References: <20050831045535.4dovty96y0w0g4gg@login.werra.lunarpages.com>
	<5.1.1.6.0.20050831092223.01b56d98@mail.telecommunity.com>
	<df4dbh$rlt$1@sea.gmane.org>
Message-ID: <df58oi$p4p$1@sea.gmane.org>

Fredrik Lundh wrote:
> Phillip J. Eby wrote:
> 
> 
>>Yep, subscripting and slicing are more than adequate to handle *all* of
>>those use cases, even the ones that some people have been jumping through
>>odd hoops to express:
>>
>>    before = x.partition(sep)[0]
>>    found  = x.partition(sep)[1]
>>    after  = x.partition(sep)[2]
>>
>>    before, found = x.partition("foo")[:2]
>>    found,  after = x.partition("foo")[1:]
>>    before, after = x.partition("foo")[::2]
>>
>>Okay, that last one is maybe a little too clever.  I'd personally just use
>>'__' or 'DONTCARE' or something like that for the value(s) I didn't care
>>about, because it  actually takes slightly less time to unpack a 3-tuple
>>into three function-local variables than it does to pull out a single
>>element of the tuple, and it's almost twice as fast as taking a slice and
>>unpacking it into two variables.
> 
> 
> you're completely missing the point.
> 
> the problem isn't the time it takes to unpack the return value, the problem is that
> it takes time to create the substrings that you don't need.
> 
Indeed, and therefore the performance of rpartition is likely to get 
worse as the length of the input string increases. I don't like to think 
about all those strings being created just to be garbage-collected. Pity 
the poor CPU ... :-)

> for some use cases, a naive partition-based solution is going to be a lot slower
> than the old find+slice approach, no matter how you slice, index, or unpack the
> return value.
> 
Yup. Then it gets down to statistical arguments about the distribution 
of use cases and input lengths. If we had a type that represented a 
substring of an existing string it might avoid the stress, but I'm not 
sure I see that one flying.
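
A sketch of the two approaches when only the leading piece is wanted.
The function names and sample data are made up, and no timings are
claimed here. The memoryview part is only a loose analogy for bytes
data; it is not a substring type for str.

```python
def head_via_partition(s, sep):
    # Builds all three pieces, then throws two of them away.
    return s.partition(sep)[0]

def head_via_find(s, sep):
    # Builds only the piece that is actually needed.
    i = s.find(sep)
    return s if i < 0 else s[:i]

line = "key:" + "x" * 1000   # long tail that the caller does not want

print(head_via_partition(line, ":") == head_via_find(line, ":"))

# For bytes, a memoryview slice refers to the original buffer instead of
# copying it, which is roughly the "substring of an existing string" idea.
payload = b"key:" + b"x" * 8
tail = memoryview(payload)[4:]
print(bytes(tail))
```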

> 
>>So, using three variables is both faster *and* easier to read than any of
>>the variations anybody has proposed, including the ones I just showed above.
> 
> 
> try again.
> 
The collective brainpower that's been exercised on this one enhancement 
already must be phenomenal, but the proposal still isn't perfect.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/


From aahz at pythoncraft.com  Wed Aug 31 23:55:50 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 31 Aug 2005 14:55:50 -0700
Subject: [Python-Dev] Design Principles
In-Reply-To: <003501c5ae63$664f5e40$4320c797@oemcomputer>
References: <ca471dc20508310754592dd3df@mail.gmail.com>
	<003501c5ae63$664f5e40$4320c797@oemcomputer>
Message-ID: <20050831215550.GA437@panix.com>

On Wed, Aug 31, 2005, Raymond Hettinger wrote:
>
> FWIW, after this is over, I'll put together a draft list of these
> principles.  The one listed above has served us well.  An early draft of
> itertools.ifilter() had an invert flag.  The toolset improved when that
> was split to a separate function, ifilterfalse().
> 
> Other thoughts:
> 
> Tim's rule on algorithm selection:  We read Knuth so you don't have to.
> 
> Raymond's rule on language proposals:  Assertions that construct X is
> better than an existing construct Y should be backed up by a variety of
> side-by-side comparisons using real-world code samples.
> 
> I'm sure there are plenty more of these in the archives.

Nice!  Also a pointer to the Zen of Python.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

The way to build large Python applications is to componentize and
loosely-couple the hell out of everything.