From arnodel at googlemail.com  Sun Mar  1 11:34:59 2009
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Sun, 1 Mar 2009 10:34:59 +0000
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: <49A9A488.4070308@improva.dk>
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
	<499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
	<49A9A488.4070308@improva.dk>
Message-ID: 

On 28 Feb 2009, at 20:54, Jacob Holm wrote:

> Replying to myself here...
>
> Jacob Holm wrote:
>> I think I can actually see a way to do it that should be fast
>> enough, but I'd like to work out the details first. If it works it
>> will be O(1) with low constants as long as you don't build trees,
>> and similar to traversing a delegation chain in the worst case.
>>
>> All this depends on getting it working using delegation chains
>> first though, as most of the StopIteration and Exception handling
>> would be the same.
>
> I have now worked out the details, and it is indeed possible to get
> O(1) for simple cases and amortized O(logN) in general, all with
> fairly low constants.

I'm sorry if I'm missing something obvious, but there are two things I
can't work out:

* What you are measuring the time complexity of.
* What N stands for.

I suspect that N is the 'delegation depth': the number of yield-from
that have to be gone through.  I imagine that you are measuring the
time it takes to get the next element in the generator.  These are
guesses - can you correct me?

> I have implemented the tree structure as a python module and added a
> trampoline-based pure-python implementation of "yield-from" to try
> it out.
>
> It seems that this version beats a normal "for v in it: yield v"
> when the delegation chains get around 90 generators deep.

Can you give an example?
-- 
Arnaud


From jh at improva.dk  Sun Mar  1 13:30:36 2009
From: jh at improva.dk (Jacob Holm)
Date: Sun, 01 Mar 2009 13:30:36 +0100
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: 
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
	<499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
	<49A9A488.4070308@improva.dk>
Message-ID: <49AA7FEC.4090609@improva.dk>

Hi Arnaud

Arnaud Delobelle wrote:
>> I have now worked out the details, and it is indeed possible to get
>> O(1) for simple cases and amortized O(logN) in general, all with
>> fairly low constants.
>
> I'm sorry if I'm missing something obvious, but there are two things I
> can't work out:

I am glad you asked. Reading it again, I can see that this is
definitely not obvious.

> * What you are measuring the time complexity of.

The time for a single 'next', 'send', 'throw' or 'close' call to a
generator or a single "yield from" expression, excluding the time
spent running the user-defined code in it. (Does that make sense?)
In other words, the total overhead of finding the actual user code to
run and handling the return values/exceptions according to the PEP.

> * What N stands for.
>
> I suspect that N is the 'delegation depth': the number of yield-from
> that have to be gone through.  I imagine that you are measuring the
> time it takes to get the next element in the generator.  These are
> guesses - can you correct me?

N is the total number of suspended generators in the tree(s) of
generators involved in the operation. Remember that it is possible to
have multiple generators yield from the same 'parent' generator. The
'delegation depth' would be the height of that tree.

One interesting thing to note is that all non-contrived examples I
have seen only build simple chains of iterators, and only do that by
adding one at a time in the deepest nested one. This is the best
possible case for my algorithm. If you stick to that, the time per
operation is O(1).
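The "simple chain built one at a time" shape being described can be sketched with plain for-loop delegation, which is the pattern "yield from" is meant to replace (names here are invented for the sketch):

```python
def child(it):
    # plain delegation: every value is forwarded through this generator
    for v in it:
        yield v

# build a chain one generator at a time, always wrapping the whole
# chain in a new delegating generator -- the best case described above
it = iter(range(5))
for _ in range(3):
    it = child(it)

result = list(it)
print(result)  # -> [0, 1, 2, 3, 4]
```

Every next() call has to be forwarded through all three child generators, which is why naive delegation costs time linear in the chain length and why smarter bookkeeping for "yield from" is worth pursuing.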
If we decided to only allow that, the algorithm can be simplified
significantly.

>
>> I have implemented the tree structure as a python module and added a
>> trampoline-based pure-python implementation of "yield-from" to try it
>> out.
>>
>> It seems that this version beats a normal "for v in it: yield v" when
>> the delegation chains get around 90 generators deep.
>
> Can you give an example?
>

Sure, here is the simple code I used for timing the 'next' call:

import itertools, time
from yieldfrom import uses_from, from_   # my module...

@uses_from
def child(it):
    yield from_(it)

def child2(it):
    for i in it:
        yield i

def longchain(N):
    it = itertools.count()
    for i in xrange(N):
        it = child(it)   # replace this with child2 to test the current
                         # "for v in it: yield v" pattern.
    it.next()   # we are timing the 'next' calls (not the setup of
                # the chain) so skip the setup by calling next once.
    return it

it = longchain(90)

times = []
for i in xrange(10):
    t1 = time.time()
    for i in xrange(100000):
        it.next()
    t2 = time.time()
    times.append(t2 - t1)

print min(times)

This version takes about the same time whether you use child or child2
to build the chain. However, the version using yield-from takes the
same time no matter how long the chain is, while the time for the
"for v in it: yield v" version is linear in the length of the chain,
and so would lose big-time for longer chains.

I hope this answers your questions.

Regards
Jacob


From lists at cheimes.de  Sun Mar  1 14:49:03 2009
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 01 Mar 2009 14:49:03 +0100
Subject: [Python-ideas] with statement: multiple context manager
Message-ID: 

Hello fellow Pythonistas!

On a regular basis I'm bothered and annoyed by the fact that the with
statement takes only one context manager. Often I need to open two
files to read from one and write to the other. I propose to modify the
with statement to accept multiple context managers.
Example
=======

The nested block::

    with lock:
        with open(infile) as fin:
            with open(outfile, 'w') as fout:
                fout.write(fin.read())

could be written as::

    with lock, open(infile) as fin, open(outfile, 'w') as fout:
        fout.write(fin.read())

The context managers' __enter__() methods are called from left to
right, and their __exit__() methods in reverse order (last in, first
out). When an exception is raised by an __enter__() method, the
context managers to its right are skipped.

Grammar
=======

I'm not sure if I got the grammar right but I *think* the new grammar
should look like::

    with_stmt: 'with' with_vars ':' suite
    with_var: test ['as' expr]
    with_vars: with_var (',' with_var)* [',']

Christian


From grosser.meister.morti at gmx.net  Sun Mar  1 15:30:38 2009
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Sun, 01 Mar 2009 15:30:38 +0100
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: 
Message-ID: <49AA9C0E.4060401@gmx.net>

Why not use this?

    from contextlib import nested

    with nested(lock, open(infile), open(outfile, 'w')) as (_, fin, fout):
        fout.write(fin.read())

Ok, the _ is ugly, but is it ugly enough so we need this extension to
the with statement?

-panzi


From guido at python.org  Sun Mar  1 20:46:17 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Mar 2009 11:46:17 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: 
Message-ID: 

On Sun, Mar 1, 2009 at 5:49 AM, Christian Heimes wrote:
> On a regular basis I'm bothered and annoyed by the fact that the with
> statement takes only one context manager. Often I need to open two files
> to read from one and write to the other. I propose to modify the with
> statement to accept multiple context managers.
>
> Example
> =======
>
> The nested block::
>
>     with lock:
>         with open(infile) as fin:
>             with open(outfile, 'w') as fout:
>                 fout.write(fin.read())
>
> could be written as::
>
>     with lock, open(infile) as fin, open(outfile, 'w') as fout:
>         fout.write(fin.read())
>
> The context managers' __enter__() methods are called from left to
> right, and their __exit__() methods in reverse order (last in, first
> out). When an exception is raised by an __enter__() method, the
> context managers to its right are skipped.
>
> Grammar
> =======
>
> I'm not sure if I got the grammar right but I *think* the new grammar
> should look like::
>
> with_stmt: 'with' with_vars ':' suite
> with_var: test ['as' expr]
> with_vars: with_var (',' with_var)* [',']

I am sympathetic to this desire -- I think we almost added this to the
original PEP but decided to hold off until a clear need was found.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From eli at courtwright.org  Sun Mar  1 20:53:28 2009
From: eli at courtwright.org (Eli Courtwright)
Date: Sun, 1 Mar 2009 14:53:28 -0500
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: 
Message-ID: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>

On Sun, Mar 1, 2009 at 2:46 PM, Guido van Rossum wrote:
> I am sympathetic to this desire -- I think we almost added this to the
> original PEP but decided to hold off until a clear need was found.

I second the motion to have this syntax added to the language. I've
often had to write nested with blocks to open one file for reading and
another for writing.
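Concretely, the pattern in question as it must be written with today's syntax (helper name invented for the sketch):

```python
def copy_file(src, dst):
    # one 'with' per file; the proposal would collapse the two
    # statements into a single "with ... as fin, ... as fout:" line
    with open(src) as fin:
        with open(dst, 'w') as fout:
            fout.write(fin.read())
```

Either way, both files are closed when the block exits, even if the read or write raises.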
- Eli


From pyideas at rebertia.com  Sun Mar  1 21:22:53 2009
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 1 Mar 2009 12:22:53 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
Message-ID: <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>

On Sun, Mar 1, 2009 at 11:53 AM, Eli Courtwright wrote:
> On Sun, Mar 1, 2009 at 2:46 PM, Guido van Rossum wrote:
>> I am sympathetic to this desire -- I think we almost added this to the
>> original PEP but decided to hold off until a clear need was found.
>
> I second the motion to have this syntax added to the language. I've
> often had to write nested with blocks to open one file for reading and
> another for writing.

It does seem slightly incongruous though, given that the for-statement,
which is quite similar to the with-statement in that they both bind new
variables in a subsidiary block of code, does not directly support
multiple simultaneous bindings.

To put it more concretely, currently one must write:

    for a, b, c in zip(seq1, seq2, seq3):
        #body

Rather than:

    for a in seq1, b in seq2, c in seq3:
        #body

But for some reason we're proposing to, in a way, make nested() built
into `with` but not make zip() likewise built into `for`.

While I still mostly like the idea, it does seem to undermine Python's
uniformity a bit.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


From daniel at stutzbachenterprises.com  Sun Mar  1 21:49:06 2009
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Sun, 1 Mar 2009 14:49:06 -0600
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
Message-ID: 

On Fri, Feb 27, 2009 at 1:20 PM, Guido van Rossum wrote:
> On Fri, Feb 27, 2009 at 4:49 AM, Curt Hagenlocher wrote:
> > That way lies madness.
> > What distinguishes "with" from other compound
> > statements is that it's already about resource management in the face
> > of possible exceptions.
>
> Still, a firm -1 from me. Once we have "try with" I'm sure people are
> going to clamor for "try if", "try while", "try for", even (oh horror
> :-) "try try". I don't think we should complicate the syntax just to
> save one level of indentation occasionally.

In addition to reasons outlined by Curt, "with" is unique because it's
short-hand for a "try" block with a "finally" clause. Unfortunately,
"with" doesn't allow for other clauses and so I often end up using both
"try" and "with".

Also, "try if", "try while", and "try for" wouldn't work because they
already have a meaning for the "else" clause. "with" does not.

-- 
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 


From lists at cheimes.de  Sun Mar  1 21:57:50 2009
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 01 Mar 2009 21:57:50 +0100
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
	<50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
Message-ID: 

Chris Rebert wrote:
> While I still mostly like the idea, it does seem to undermine Python's
> uniformity a bit.

I played with both possible versions before I wrote the proposal. Both
ways have their pros and cons. I'm preferring the proposed way::

    with a, b as x, d as y:
        ...

over the other possibility::

    with a, b, c as _, x, y:
        ...

for two reasons. For one I dislike the temporary variable that is
required for some cases, e.g. the case I used in my initial proposal.
It doesn't feel quite right to use a useless placeholder.
The proposed way follows the example of the import statement, too::

    from module import a, b as x, d as y

Christian


From greg at krypto.org  Sun Mar  1 22:01:47 2009
From: greg at krypto.org (Gregory P. Smith)
Date: Sun, 1 Mar 2009 13:01:47 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: 
Message-ID: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>

On Sun, Mar 1, 2009 at 11:46 AM, Guido van Rossum wrote:
> On Sun, Mar 1, 2009 at 5:49 AM, Christian Heimes wrote:
> > On a regular basis I'm bothered and annoyed by the fact that the with
> > statement takes only one context manager. Often I need to open two files
> > to read from one and write to the other. I propose to modify the with
> > statement to accept multiple context managers.
> >
> > Example
> > =======
> >
> > The nested block::
> >
> >     with lock:
> >         with open(infile) as fin:
> >             with open(outfile, 'w') as fout:
> >                 fout.write(fin.read())
> >
> > could be written as::
> >
> >     with lock, open(infile) as fin, open(outfile, 'w') as fout:
> >         fout.write(fin.read())

Alternatively if closer conformity with for loop syntax is desirable
consider this:

    with lock, open(infile), open(outfile) as lock, fin, fout:
        fout.write(fin.read())

> > The context managers' __enter__() methods are called from left to
> > right, and their __exit__() methods in reverse order (last in, first
> > out). When an exception is raised by an __enter__() method, the
> > context managers to its right are skipped.
> >
> > Grammar
> > =======
> >
> > I'm not sure if I got the grammar right but I *think* the new grammar
> > should look like::
> >
> > with_stmt: 'with' with_vars ':' suite
> > with_var: test ['as' expr]
> > with_vars: with_var (',' with_var)* [',']

> I am sympathetic to this desire -- I think we almost added this to the
> original PEP but decided to hold off until a clear need was found.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 


From ironfroggy at gmail.com  Sun Mar  1 22:15:45 2009
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 1 Mar 2009 16:15:45 -0500
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>
References: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>
Message-ID: <76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>

On Sun, Mar 1, 2009 at 4:01 PM, Gregory P. Smith wrote:
> Alternatively if closer conformity with for loop syntax is desirable
> consider this:
>
>     with lock, open(infile), open(outfile) as lock, fin, fout:
>         fout.write(fin.read())

+1

We don't have multi-assignment statements in favor of the unpacking
concept, and I think it carries over here. Also, as mentioned, this
goes along with the lack of any multi-for statement. The `x as y` part
of the with statement is basically an assignment with extras, and the
original suggestion then combines multiple assignments on one line.
This option, I think, is more concise and readable.

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy


From pyideas at rebertia.com  Sun Mar  1 22:35:07 2009
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 1 Mar 2009 13:35:07 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
	<50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
Message-ID: <50697b2c0903011335s38083491t80726e803df171bf@mail.gmail.com>

On Sun, Mar 1, 2009 at 12:57 PM, Christian Heimes wrote:
> Chris Rebert wrote:
>> While I still mostly like the idea, it does seem to undermine Python's
>> uniformity a bit.
>
> I played with both possible versions before I wrote the proposal. Both
> ways have their pros and cons. I'm preferring the proposed way::
>
>     with a, b as x, d as y:
>         ...
>
> over the other possibility::
>
>     with a, b, c as _, x, y:
>         ...

You misunderstand me. My quibble isn't over the exact syntax (in fact,
I completely agree about the superiority of the proposed ordering),
but rather that we're introducing syntax to do something that can
already be done with a function (nested()) and is /extremely/ similar
to another case (parallel for-loop) where we are opting to still
require the use of a function (zip()). This proposed asymmetry
concerns me.

> The proposed way
> follows the example of the import statement, too::
>
>     from module import a, b as x, d as y

This parallel does quell my concern somewhat.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


From tjreedy at udel.edu  Sun Mar  1 22:45:07 2009
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 01 Mar 2009 16:45:07 -0500
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>
References: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>
	<76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>
Message-ID: 

Calvin Spealman wrote:
> On Sun, Mar 1, 2009 at 4:01 PM, Gregory P. Smith wrote:
>> Alternatively if closer conformity with for loop syntax is desirable
>> consider this:
>>
>>     with lock, open(infile), open(outfile) as lock, fin, fout:
>>         fout.write(fin.read())
>
> +1
>
> We don't have multi-assignment statements in favor of the unpacking
> concept, and I think it carries over here. Also, as mentioned, this
> goes along with the lack of any multi-for statement. The `x as y` part
> of the with statement is basically an assignment with extras, and the
> original suggestion then combines multiple assignments on one line.
> This option, I think, is more concise and readable.

I prefer this also, for the same reasons.

tjr


From guido at python.org  Sun Mar  1 23:01:47 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Mar 2009 14:01:47 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>
References: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>
	<76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>
Message-ID: 

On Sun, Mar 1, 2009 at 1:15 PM, Calvin Spealman wrote:
> On Sun, Mar 1, 2009 at 4:01 PM, Gregory P. Smith wrote:
>> Alternatively if closer conformity with for loop syntax is desirable
>> consider this:
>>
>>     with lock, open(infile), open(outfile) as lock, fin, fout:
>>         fout.write(fin.read())
>
> +1
>
> We don't have multi-assignment statements in favor of the unpacking
> concept, and I think it carries over here.
> Also, as mentioned, this
> goes along with the lack of any multi-for statement. The `x as y` part
> of the with statement is basically an assignment with extras, and the
> original suggestion then combines multiple assignments on one line.
> This option, I think, is more concise and readable.

-1 for this variant. The syntactic model is import: import foo as bar,
bletch, quuz as frobl. If we're doing this it should be like this.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at python.org  Sun Mar  1 23:14:30 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Mar 2009 14:14:30 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <50697b2c0903011335s38083491t80726e803df171bf@mail.gmail.com>
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
	<50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
	<50697b2c0903011335s38083491t80726e803df171bf@mail.gmail.com>
Message-ID: 

On Sun, Mar 1, 2009 at 1:35 PM, Chris Rebert wrote:
> On Sun, Mar 1, 2009 at 12:57 PM, Christian Heimes wrote:
>> Chris Rebert wrote:
>>> While I still mostly like the idea, it does seem to undermine Python's
>>> uniformity a bit.
>>
>> I played with both possible versions before I wrote the proposal. Both
>> ways have their pros and cons. I'm preferring the proposed way::
>>
>>     with a, b as x, d as y:
>>         ...
>>
>> over the other possibility::
>>
>>     with a, b, c as _, x, y:
>>         ...
>
> You misunderstand me. My quibble isn't over the exact syntax (in fact,
> I completely agree about the superiority of the proposed ordering),
> but rather that we're introducing syntax to do something that can
> already be done with a function (nested()) and is /extremely/ similar
> to another case (parallel for-loop) where we are opting to still
> require the use of a function (zip()). This proposed asymmetry
> concerns me.

Hm.
While we can indeed write the equivalent of the proposed "with a, b:"
today as "with nested(a, b):", I don't think that the situation is
quite comparable to a for-loop over a zip() call. The nested() context
manager isn't particularly intuitive to me (and Nick just found a
problem in a corner case of its semantics). Compared to nested(), I
find "with a, b:" very obvious as a shorthand for nested
with-statements:

    with a:
        with b:
            ...

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From pyideas at rebertia.com  Sun Mar  1 23:22:56 2009
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 1 Mar 2009 14:22:56 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
	<50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
	<50697b2c0903011335s38083491t80726e803df171bf@mail.gmail.com>
Message-ID: <50697b2c0903011422l1f2dad1dv2ed566093dd1addf@mail.gmail.com>

On Sun, Mar 1, 2009 at 2:14 PM, Guido van Rossum wrote:
> On Sun, Mar 1, 2009 at 1:35 PM, Chris Rebert wrote:
>> On Sun, Mar 1, 2009 at 12:57 PM, Christian Heimes wrote:
>>> Chris Rebert wrote:
>>>> While I still mostly like the idea, it does seem to undermine Python's
>>>> uniformity a bit.
>>>
>>> I played with both possible versions before I wrote the proposal. Both
>>> ways have their pros and cons. I'm preferring the proposed way::
>>>
>>>     with a, b as x, d as y:
>>>         ...
>>>
>>> over the other possibility::
>>>
>>>     with a, b, c as _, x, y:
>>>         ...
>>
>> You misunderstand me. My quibble isn't over the exact syntax (in fact,
>> I completely agree about the superiority of the proposed ordering),
>> but rather that we're introducing syntax to do something that can
>> already be done with a function (nested()) and is /extremely/ similar
>> to another case (parallel for-loop) where we are opting to still
>> require the use of a function (zip()). This proposed asymmetry
>> concerns me.
>
> Hm. While we can indeed write the equivalent of the proposed "with a,
> b:" today as "with nested(a, b):", I don't think that the situation is
> quite comparable to a for-loop over a zip() call. The nested() context
> manager isn't particularly intuitive to me (and Nick just found a
> problem in a corner case of its semantics). Compared to nested(), I
> find "with a, b:" very obvious as a shorthand for nested
> with-statements:
>
>     with a:
>         with b:
>             ...
>
> -- 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

Ok, fine by me. Just wanted to ensure the point was brought up and
adequately responded to.

On another note, someone ought to draft a revision to PEP 343 to
document this proposal.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


From greg.ewing at canterbury.ac.nz  Sun Mar  1 23:32:21 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Mar 2009 11:32:21 +1300
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: <49AA7FEC.4090609@improva.dk>
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
	<499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
	<49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk>
Message-ID: <49AB0CF5.6020900@canterbury.ac.nz>

Jacob Holm wrote:
>
> Arnaud Delobelle wrote:
>
>> * What you are measuring the time complexity of.
>
> The time for a single 'next', 'send', 'throw' or 'close' call to a
> generator or a single "yield from" expression,

How are you measuring 'time', though? Usually when discussing
time complexities we're talking about the number of some
fundamental operation being performed, and assuming that
all such operations take the same time. What operations
are you counting here?

> One interesting thing to note is that all non-contrived examples
> I have seen only build simple chains of iterators, and only do
> that by adding one at a time in the deepest nested one.
> This is the best possible case for my algorithm. If you stick to
> that, the time per operation is O(1).

I don't understand that limitation. If you keep a stack of
active generators, you always have constant-time access to the
one to be resumed next. There's some overhead for pushing and
popping the stack, but it follows exactly the same pattern
as the recursive calls you'd be making if you weren't using
some kind of yield-from, so it's irrelevant when comparing the
two.

-- 
Greg


From guido at python.org  Sun Mar  1 23:31:58 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Mar 2009 14:31:58 -0800
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
Message-ID: 

On Sun, Mar 1, 2009 at 12:49 PM, Daniel Stutzbach wrote:
> On Fri, Feb 27, 2009 at 1:20 PM, Guido van Rossum wrote:
>> On Fri, Feb 27, 2009 at 4:49 AM, Curt Hagenlocher wrote:
>> > That way lies madness.  What distinguishes "with" from other compound
>> > statements is that it's already about resource management in the face
>> > of possible exceptions.
>>
>> Still, a firm -1 from me. Once we have "try with" I'm sure people are
>> going to clamor for "try if", "try while", "try for", even (oh horror
>> :-) "try try". I don't think we should complicate the syntax just to
>> save one level of indentation occasionally.
>
> In addition to reasons outlined by Curt, "with" is unique because it's
> short-hand for a "try" block with a "finally" clause. Unfortunately,
> "with" doesn't allow for other clauses and so I often end up using both
> "try" and "with".
>
> Also, "try if", "try while", and "try for" wouldn't work because they
> already have a meaning for the "else" clause. "with" does not.

Sorry, but my gut keeps telling me that "try with" is not taking the
language into a direction I am comfortable with. Programming language
design is not a rational science.
Most reasoning about it is at best rationalization of gut feelings, and
at worst plain wrong. So, sorry, but I'm going with my gut feelings, so
it's still -1. (Or if you wish, -1000.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From daniel at stutzbachenterprises.com  Sun Mar  1 23:53:09 2009
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Sun, 1 Mar 2009 16:53:09 -0600
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
Message-ID: 

On Sun, Mar 1, 2009 at 4:31 PM, Guido van Rossum wrote:
> Sorry, but my gut keeps telling me that "try with" is not taking the
> language into a direction I am comfortable with. Programming language
> design is not a rational science. Most reasoning about it is at best
> rationalization of gut feelings, and at worst plain wrong. So, sorry,
> but I'm going with my gut feelings, so it's still -1. (Or if you wish,
> -1000.)

I thought that might be the case based on your first response, but
figured I'd give it one more shot. ;-) I respect your opinion.
Consider it dropped.

-- 
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 


From greg.ewing at canterbury.ac.nz  Mon Mar  2 00:51:44 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Mar 2009 12:51:44 +1300
Subject: [Python-ideas] Revised**7 PEP on Yield-From
Message-ID: <49AB1F90.7070201@canterbury.ac.nz>

I've made another couple of tweaks to the formal semantics (so as not
to over-specify when the iterator methods are looked up).
Latest version of the PEP, together with the prototype implementation
and other related material, is available here:

http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/

-- 
Greg


From jh at improva.dk  Mon Mar  2 01:09:14 2009
From: jh at improva.dk (Jacob Holm)
Date: Mon, 02 Mar 2009 01:09:14 +0100
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: <49AB0CF5.6020900@canterbury.ac.nz>
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
	<499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
	<49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk>
	<49AB0CF5.6020900@canterbury.ac.nz>
Message-ID: <49AB23AA.4060106@improva.dk>

Greg Ewing wrote:
> Jacob Holm wrote:
>>
>> Arnaud Delobelle wrote:
>>
>>> * What you are measuring the time complexity of.
>>
>> The time for a single 'next', 'send', 'throw' or 'close' call to a
>> generator or a single "yield from" expression,
>
> How are you measuring 'time', though? Usually when discussing
> time complexities we're talking about the number of some
> fundamental operation being performed, and assuming that
> all such operations take the same time. What operations
> are you counting here?

All operations are simple attribute lookups, assignments, addition and
the like. No hashing, no dynamic arrays, no magic, just plain ol'
fundamental operations as they would be in a C program :)

>> One interesting thing to note is that all non-contrived examples
>> I have seen only build simple chains of iterators, and only do
>> that by adding one at a time in the deepest nested one.
>
> I don't understand that limitation. If you keep a stack of
> active generators, you always have constant-time access to the
> one to be resumed next.

But that is exactly the point. This is not a simple stack.
After doing "yield from R" in generator A, you can go ahead and do
"yield from R" in generator B as well. If R is also using "yield from"
you will have the following situation:

    A
     \
      R --- (whatever R is waiting for)
     /
    B

Yes, A sees it as a stack [A,R,...], and B sees it as a stack
[B,R,...], but the part of the stack following R is the same in the
two. If R does a "yield from", the new iterator must appear at the top
of both "stacks". Efficiently getting to the top of this "shared
stack", and doing the equivalent of push, pop, ... is what I am trying
to achieve. And I think I have succeeded.

> There's some overhead for pushing and
> popping the stack, but it follows exactly the same pattern
> as the recursive calls you'd be making if you weren't using
> some kind of yield-from, so it's irrelevant when comparing the
> two.

This would be correct if it was somehow forbidden to create the
scenario I sketched above. As long as that scenario is possible I can
construct an example where treating it as a simple stack will either
do the wrong thing, or do the right thing but slower than a standard
"for v in it: yield v". Of course, you could argue that these examples
are contrived and we don't have to care about them. I just think that
we should do better if we can.

FWIW, here is the example from above in more detail...

def R():
    yield 'A'
    yield 'B'
    yield from xrange(3)  # we could do anything here really...
r = R()

def A():
    yield from r
a = A()

def B():
    yield from r
b = B()

a.next()  # returns 'A'
b.next()  # returns 'B'
r.next()  # returns 0
a.next()  # returns 1
b.next()  # returns 2

This is clearly legal based on the PEP, and it generalizes to a way of
building an arbitrary tree with suspended generators as nodes. What my
algorithm does is break this tree down into what is essentially a set
of stacks.
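Transcribed into the Python 3 spelling that "yield from" eventually became (range for xrange, next(g) for g.next()), the example runs exactly as annotated:

```python
def R():
    yield 'A'
    yield 'B'
    yield from range(3)   # we could do anything here really...

r = R()

def A():
    yield from r

def B():
    yield from r

a, b = A(), B()

# both a and b delegate to the same suspended generator r;
# whichever of a, b, or r is resumed, the next value comes from r
results = [next(a), next(b), next(r), next(a), next(b)]
print(results)  # -> ['A', 'B', 0, 1, 2]
```

Note how the values interleave across the three entry points, which is exactly the "shared stack" situation a simple per-generator stack cannot model.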
These stacks are managed such that the common case has just a single stack, and no matter how twisted your program is, the path from any generator to the root is split across at most O(logN) stacks (where N is the size of the tree). I hope this helps to explain what I think I have done... Best regards Jacob From guido at python.org Mon Mar 2 01:28:48 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Mar 2009 16:28:48 -0800 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> Message-ID: On Sun, Mar 1, 2009 at 2:53 PM, Daniel Stutzbach wrote: > On Sun, Mar 1, 2009 at 4:31 PM, Guido van Rossum wrote: >> >> Sorry, but my gut keeps telling me that "try with" is not taking the >> language into a direction I am comfortable with. Programming language >> design is not a rational science. Most reasoning about it is at best >> rationalization of gut feelings, and at worst plain wrong. So, sorry, >> but I'm going with my gut feelings, so it's still -1. (Or if you wish, >> -1000.) > > I thought that might be the case based on your first response, but figured > I'd give it one more shot. ;-) I respect your opinion. Consider it > dropped. Thanks. I've learned the hard way to trust my gut instincts in this area. If anyone can figure out how to explain my gut's responses to people wanting rational answers I'd love to hear about it.
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jh at improva.dk Mon Mar 2 01:35:10 2009 From: jh at improva.dk (Jacob Holm) Date: Mon, 02 Mar 2009 01:35:10 +0100 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <9A1B0516-AF3E-4BB2-99DA-16EA9F6E7F56@googlemail.com> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0522.1010106@improva.dk> <9A1B0516-AF3E-4BB2-99DA-16EA9F6E7F56@googlemail.com> Message-ID: <49AB29BE.1070508@improva.dk> Arnaud Delobelle wrote: >> * I don't think you can handle 'close' in accordance with the PEP, >> and fixing this does not look easy. > > I haven't thought about this - unfortunately probably won't have time > till next weekend :( :( >> * It seems like in your version, you only gain when only the >> outermost generator is decorated. (You conveniently modified the >> testing code to only decorate with co *after* building the long >> chain). You need to have *all* generators using FROM decorated, or >> disaster will strike as soon as someone calls next on one of the >> intermediate generators. >> > > I don't think this is really a problem. If yield-from was in the > language, this 'decoration' could be added automatically at > compile-time. I haven't really thought about it in detail however :) Agreed, but the purpose of the test is to see how it *would* behave if the 'decoration' happened at compile time. Stripping it off will only hide problems, such as the fact that there is no speed benefit at all... > >>> So the turning point is a depth of 10, after which my implementation >>> wins. At a depth of 90, cochild is about 10 times faster than child. >> As could be expected, since you also have O(1) in this case. However, >> if you decorate all the generators in the chain you lose no matter >> what N is... 
Most of the speed difference between our versions is >> probably due to the fact that I am using classes instead of >> generators to allow me to have everything decorated, and to handle >> some more convoluted cases as well. >> > > But decorating each generator would defeat the purpose! I guess one > way to find out would be to reimplement it with a class - see below! And what purpose is that, if I might ask? My two purposes were to a) test my algorithm for speeding up arbitrarily complex cases, and b) have a way to play with the "yield from" concept as specified in the PEP. It seems that your original purpose was similar to b). My claim is that using your 'FROM' outside a generator decorated with your 'co' has no equivalent in the PEP and might be a bad idea, and I gave an example where this gives a bad/unexpected result. >>> >>> I don't know if my implementation behaves well with other examples >>> you have in mind, I would like to know! >>> >> I'll write a few more tests tonight so you can see the kind of >> problems I am looking at. > > That would be good. If I think my method can handle them, then I might > reimplement it with a class - it wouldn't take very long but I have > very little time at the moment! > > Thanks, > It looks like it might not be tonight after all. I'll try to get to it some time during this week. Best regards Jacob From greg.ewing at canterbury.ac.nz Mon Mar 2 01:39:00 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 13:39:00 +1300 Subject: [Python-ideas] Yield-from patches on Rietveld In-Reply-To: References: <49A13D45.7080608@canterbury.ac.nz> Message-ID: <49AB2AA4.903@canterbury.ac.nz> I have uploaded the patches for my yield-from implementation to Rietveld if anyone wants to take a look at them: http://codereview.appspot.com/20101/show This is the first time I've attempted to use Rietveld, so I hope I've got things right. 
-- Greg From greg.ewing at canterbury.ac.nz Mon Mar 2 02:09:20 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 14:09:20 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB23AA.4060106@improva.dk> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> Message-ID: <49AB31C0.5000005@canterbury.ac.nz> Jacob Holm wrote: > After doing > "yield from R" in generator A, you can go ahead and do "yield from R" in > generator B as well. If R is also using "yield from" you will have the > following situation: > > A > \ > R --- (whatever R is waiting for) > / > B Unless you're doing something extremely unusual, this situation wouldn't arise. The yield-from statement is intended to run the iterator to exhaustion, and normally you'd create a fresh iterator for each yield-from that you want to do. So A and B would really be yielding from different iterators, even if they were both iterating over the same underlying object. If you did try to share iterators between yield-froms like that, you would have to arrange things so that at least all but one of them broke out of the loop early, otherwise something is going to get an exception due to trying to resume an exhausted iterator. But in any case, I think you can still model this as two separate stacks, with R appearing in both stacks: [A, R] and [B, R]. Whichever one of them finishes yielding from R first pops it from its stack, and when the other one tries to resume R it gets an exception. Either that or it breaks out of its yield-from early and discards its version of R. 
> As long as that scenario is possible I can construct > an example where treating it as a simple stack will either do the wrong > thing, or do the right thing but slower than a standard "for v in it: > yield v". That depends on what you think the "right thing" is. If you think that somehow A needs to notice that B has finished yielding from R and gracefully stop doing so itself, then that's not something I intended and it's not the way the current prototype implementation would behave. So IMO you're worrying about a problem that doesn't exist. -- Greg From jh at improva.dk Mon Mar 2 03:11:04 2009 From: jh at improva.dk (Jacob Holm) Date: Mon, 02 Mar 2009 03:11:04 +0100 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB31C0.5000005@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> Message-ID: <49AB4038.6070205@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> After doing >> "yield from R" in generator A, you can go ahead and do "yield from R" >> in generator B as well. If R is also using "yield from" you will have >> the following situation: >> >> A >> \ >> R --- (whatever R is waiting for) >> / >> B > > Unless you're doing something extremely unusual, this situation > wouldn't arise. The yield-from statement is intended to run the > iterator to exhaustion, and normally you'd create a fresh iterator > for each yield-from that you want to do. So A and B would really > be yielding from different iterators, even if they were both > iterating over the same underlying object. Unusual, yes. Extremely? I'm not sure. If you/we allow this, someone will find a use for it. 
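For the record, the spelled-out example from the earlier message runs unchanged on Python 3.3+, where PEP 380's "yield from" exists, once `xrange` becomes `range` and `g.next()` becomes `next(g)`; it produces exactly the sequence claimed:

```python
def R():
    yield 'A'
    yield 'B'
    yield from range(3)  # we could do anything here really...

r = R()

def A():
    yield from r

def B():
    yield from r

a, b = A(), B()

# Both A and B delegate to the *same* suspended generator r.
results = [next(a), next(b), next(r), next(a), next(b)]
print(results)  # ['A', 'B', 0, 1, 2]
```

So the shared-subgenerator behavior being debated here is exactly what CPython 3.3+ ended up doing.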
> > If you did try to share iterators between yield-froms like that, > you would have to arrange things so that at least all but one > of them broke out of the loop early, otherwise something is > going to get an exception due to trying to resume an exhausted > iterator. Did you read the spelled-out version at the bottom? No need to "break out" of anything. That happens automatically because of the "yield from". Just a few well-placed calls to next... > > But in any case, I think you can still model this as two separate > stacks, with R appearing in both stacks: [A, R] and [B, R]. Whichever > one of them finishes yielding from R first pops it from its stack, > and when the other one tries to resume R it gets an exception. Either > that or it breaks out of its yield-from early and discards its > version of R. I am not worried about R running out, each of A and B would find out about that next time they tried to get a value. I *am* worried about R doing a yield-from to X (the xrange in this example) which then needs to appear in both stacks to get the expected behavior from the PEP. > > > As long as that scenario is possible I can construct >> an example where treating it as a simple stack will either do the >> wrong thing, or do the right thing but slower than a standard "for v >> in it: yield v". > > That depends on what you think the "right thing" is. If you > think that somehow A needs to notice that B has finished yielding > from R and gracefully stop doing so itself, then that's not something > I intended and it's not the way the current prototype implementation > would behave. The right thing is whatever the PEP specifies :) You are the author, so you get to decide... I am saying that what the PEP currently specifies is not quite so simple to speed up as you and Arnaud seem to think. (Even with a simple stack, handling 'close' and 'StopIteration' correctly is not exactly trivial) > > So IMO you're worrying about a problem that doesn't exist. 
No, I am worrying about a problem that so far has only appeared in contrived examples designed to expose it. Any real-life examples I have seen of the "yield from" feature would work perfectly well with a simple stack-based approach. However, I have seen several ideas for speeding up long chains of "yield from"s beyond the current C implementation, and most of them fail either by giving wrong results (bad) or by slowing things down in admittedly unusual cases (not so bad, but not good). Anyway... it is 3 in the morning. As I told Arnaud, I will try to find some time this week to write some more of these crazy examples. Regards Jacob From jimjjewett at gmail.com Mon Mar 2 04:05:16 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Sun, 1 Mar 2009 22:05:16 -0500 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB31C0.5000005@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> Message-ID: On 3/1/09, Greg Ewing wrote: > Jacob Holm wrote: >> A >> \ >> R --- (whatever R is waiting for) >> / >> B >... normally you'd create a fresh iterator > for each yield-from that you want to do. So A and B would really > be yielding from different iterators, even if they were both > iterating over the same underlying object. I think the problem might come up with objects that are using the iterator protocol destructively. For example, imagine A and B as worker threads, and R as a work queue. > If you did try to share iterators between yield-froms like that, >... something is > going to get an exception due to trying to resume an exhausted > iterator. I would expect that to be interpreted as a StopIteration and handled gracefully.
If that doesn't seem reasonable, then I wonder if the whole protocol is still too fragile. -jJ From jimjjewett at gmail.com Mon Mar 2 04:34:48 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Sun, 1 Mar 2009 22:34:48 -0500 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB4038.6070205@improva.dk> References: <499DDA4C.8090906@canterbury.ac.nz> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> Message-ID: On 3/1/09, Jacob Holm wrote: > Greg Ewing wrote: >> Jacob Holm wrote: [naming that for which R waits] >>> A >>> \ >>> R --- X (=whatever R is waiting for) >>> / >>> B >> But in any case, I think you can still model this as two separate >> stacks, with R appearing in both stacks: [A, R] and [B, R]. [A, R, X] and [B, R, X] >> Whichever >> one of them finishes yielding from R first pops it from its stack, >> and when the other one tries to resume R it gets an exception. Either >> that or it breaks out of its yield-from early and discards its >> version of R. > I am not worried about R running out, each of A and B would find out > about that next time they tried to get a value. I *am* worried about R > doing a yield-from to X (the xrange in this example) which then needs to > appear in both stacks to get the expected behavior from the PEP. I would assume that the second one tries to resume, gets the StopIteration from X, retreats to R, gets the StopIteration there as well, and then continues after its yield-from. If that didn't happen, I would wonder whether (theoretical speed) optimization was leading to suboptimal semantics. 
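This expectation matches the documented iterator contract: an exhausted generator keeps raising StopIteration, and a delegating generator treats that as the normal end of its "yield from". A minimal sketch on Python 3.3+ (the names are made up for illustration):

```python
def X():
    yield 0

x = X()
assert list(x) == [0]  # exhaust the shared iterator
assert list(x) == []   # resuming it again just raises StopIteration

def delegator(it):
    yield from it  # an already-exhausted iterator ends this immediately...
    yield 'after'  # ...and execution continues past the yield from

out = list(delegator(x))
print(out)  # ['after']
```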
>> > As long as that scenario is possible I can construct >>> an example where treating it as a simple stack will either do the >>> wrong thing, or do the right thing but slower than a standard "for v >>> in it: yield v". I have to wonder whether any optimization will be a mistake. At the moment, I can't think of any way to do it without adding at least an extra pointer and an extra if-test. That isn't much, but ... how often will there be long chains, vs how often are generators used without getting any benefit from this sort of delegation? -jJ From greg.ewing at canterbury.ac.nz Mon Mar 2 05:09:24 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 17:09:24 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB4038.6070205@improva.dk> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> Message-ID: <49AB5BF4.6020203@canterbury.ac.nz> Jacob Holm wrote: > Did you read the spelled-out version at the bottom? No need to "break > out" of anything. That happens automatically because of the "yield > from". Just a few well-placed calls to next... Yes, and I tried running it on my prototype implementation. It gives exactly the results you suggested, and any further next() calls on a or b raise StopIteration. You're right that there's no need to break out early in the case of generators, since it seems they just continue to raise StopIteration if you call next() on them after they've finished. Other kinds of iterators might not be so forgiving. > I am not worried about R running out, each of A and B would find out > about that next time they tried to get a value. 
I *am* worried about R > doing a yield-from to X (the xrange in this example) which then needs to > appear in both stacks to get the expected behavior from the PEP. What's wrong with it appearing in both stacks, though, as long as it gives the right result? > I am saying that what the PEP currently specifies is not quite so simple > to speed up as you and Arnaud seem to think. > (Even with a simple stack, handling 'close' and 'StopIteration' > correctly is not exactly trivial) It's a bit tricky to handle all the cases correctly, but that has nothing to do with speed. I should perhaps point out that neither of my suggested implementations actually use separate stacks like this. The data structure is really more like the shared stack you suggest, except that it's accessed from the "bottoms" rather than the "top". This is only a speed issue if the time taken to find the "top" starting from one of the "bottoms" is a significant component of the total running time. My conjecture is that it won't be, especially if you do it iteratively in a tight C loop. Some timing experiments I did suggest that the current implementation (which finds the "top" using recursion in C rather than iteration) is at least 20 times faster at delegating a next() call than using a for-loop, which is already a useful improvement, and the iterative method can only make it better. So until someone demonstrates that the simple algorithm I'm using is too slow in practice, I don't see much point in trying to find a smarter one. 
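The kind of measurement described here can be approximated on a modern Python (3.3+, where "yield from" is built in) with a timeit sketch like the one below. `make_chain` and the sizes are illustrative assumptions, not the original test harness, and absolute numbers are machine-dependent; the point is only how delegation cost grows with chain depth:

```python
import timeit

def make_chain(depth, use_yield_from):
    """Wrap iter(range(200)) in `depth` delegating generators."""
    it = iter(range(200))
    for _ in range(depth):
        if use_yield_from:
            def wrap(inner):
                yield from inner
        else:
            def wrap(inner):
                for v in inner:
                    yield v
        it = wrap(it)
    return it

# Compare draining a 20-deep chain built each way.
for label, flag in (('for-loop', False), ('yield from', True)):
    t = timeit.timeit(lambda: sum(make_chain(20, flag)), number=100)
    print('%-10s %.4f' % (label, t))
```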
-- Greg From greg.ewing at canterbury.ac.nz Mon Mar 2 05:21:51 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 17:21:51 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> Message-ID: <49AB5EDF.6070903@canterbury.ac.nz> Jim Jewett wrote: > I would expect that to be interpreted as a StopIteration and handled > gracefully. If that doesn't seem reasonable, then I wonder if the > whole protocol is still too fragile. I've just done what I should have done in the first place and checked the documentation. From the Library Reference, section 3.5, Iterator Types: "The intention of the protocol is that once an iterator's next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken. (This constraint was added in Python 2.3; in Python 2.2, various iterators are broken according to this rule.)" So according to modern-day rules at least, Jacob's example will work fine. -- Greg From greg.ewing at canterbury.ac.nz Mon Mar 2 05:37:24 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 17:37:24 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: References: <499DDA4C.8090906@canterbury.ac.nz> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> Message-ID: <49AB6284.7020202@canterbury.ac.nz> Jim Jewett wrote: > how often will there be long chains, My suspicion is not very often. 
The timing tests I did suggest that the biggest benefit will be from simply removing most of the delegation overhead from a single level of delegation, and you don't need any fancy algorithm for that. My experiments with traversing binary trees suggest that the delegation overhead isn't all that great a problem until the tree is about 20 levels deep, or 1e6 nodes. Even then it doesn't exactly kill you, but even so, my naive implementation shows a clear improvement. -- Greg From denis.spir at free.fr Mon Mar 2 08:16:28 2009 From: denis.spir at free.fr (spir) Date: Mon, 2 Mar 2009 08:16:28 +0100 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> Message-ID: <20090302081628.502bc1ee@o> Le Sun, 1 Mar 2009 14:31:58 -0800, Guido van Rossum s'exprima ainsi: > Programming language > design is not a rational science. Most reasoning about it is at best > rationalization of gut feelings, and at worst plain wrong. ! ;-) Denis ------ la vita e estrany PS: Time for starting a "Quotes" section at http://en.wikipedia.org/wiki/Guido_van_Rossum ? From pyideas at rebertia.com Mon Mar 2 08:18:48 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 1 Mar 2009 23:18:48 -0800 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: <20090302081628.502bc1ee@o> References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> <20090302081628.502bc1ee@o> Message-ID: <50697b2c0903012318t2fc80318vf6d14bfefa5dc125@mail.gmail.com> On Sun, Mar 1, 2009 at 11:16 PM, spir wrote: > Le Sun, 1 Mar 2009 14:31:58 -0800, > Guido van Rossum s'exprima ainsi: > >> Programming language >> design is not a rational science. Most reasoning about it is at best >> rationalization of gut feelings, and at worst plain wrong. +1 QOTW!
- Chris -- Shameless self-promotion: http://blog.rebertia.com From g.brandl at gmx.net Mon Mar 2 09:10:41 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 02 Mar 2009 09:10:41 +0100 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: References: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com> <76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com> Message-ID: Guido van Rossum schrieb: > On Sun, Mar 1, 2009 at 1:15 PM, Calvin Spealman wrote: >> On Sun, Mar 1, 2009 at 4:01 PM, Gregory P. Smith wrote: >>> Alternatively if closer conformity with for loop syntax is desirable >>> consider this: >>> with lock, open(infile), open(outfile) as lock, fin, fout: >>> fout.fwrite(fin.read()) >> >> +1 >> >> We don't have multi-assignment statements in favor of the unpacking >> concept, and I think it carries over here. Also, as mentioned, this >> goes along with the lack of any multi-for statement. The `x as y` part >> of the with statement is basically an assignment with extras, and the >> original suggestion then combines multiple assignments on one line. >> This option, I think, is more concise and readable. > > -1 for this variant. The syntactic model is import: import foo as bar, > bletch, quuz as frobl. If we're doing this it should be like this. Also, the "a as b, c as d" syntax better conveys the fact that the managers are called sequentially, and not somehow "in parallel". Georg From clockworksaint at gmail.com Mon Mar 2 10:22:10 2009 From: clockworksaint at gmail.com (Weeble) Date: Mon, 2 Mar 2009 09:22:10 +0000 Subject: [Python-ideas] Syntax for curried functions Message-ID: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Would this be crazy? Make this: def spam(a,b,c)(d,e,f): [BODY] a synonym for: def spam(a,b,c): def spam2(d,e,f): [BODY] return spam2 As far as I can tell, it would always be illegal syntax right now, so it seems a safe change. 
And it would make it cleaner to write higher order functions. While I had the idea bouncing round my head, I realised I would quite like the syntax for writing methods too. E.g.: class Eggs(object): def spam(self)(a,b,c): [BODY] Of course, that wouldn't work unless you wrote and used a special meta-class to change how bound methods work (instead of creating a bound method with the instance and function, you'd call the "method function" with the instance and get back the "bound method"), and I have no idea, but I guess it might slow things down. But I thought it was worth mentioning because it looked cool. Anyway, that's tangential to the idea. The idea is that any (positive) number of parameter lists could follow the method name in its declaration. The function takes as input its first parameter list and returns a function that takes the second parameter list, etc., until there are no more method lists and the last level of function executes and returns the result of the function body. From steve at pearwood.info Mon Mar 2 13:09:57 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 02 Mar 2009 23:09:57 +1100 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <49ABCC95.2050503@pearwood.info> Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 > > As far as I can tell, it would always be illegal syntax right now, so > it seems a safe change. And it would make it cleaner to write higher > order functions. I don't think it would. Your proposed syntax is only useful in the case that the outer function has no doc string and there's no pre-processing or post-processing of the inner function, including decorators. 
In practice, when I write higher order functions, I often do something like this example:

# untested
import functools

def spam_factory(n):
    """Factory returning decorators that print 'spam' n times."""
    msg = "spam " * n
    if n > 5:
        msg += 'WONDERFUL SPAM!!!'
    def decorator(func):
        """Print 'spam' %d times."""
        @functools.wraps(func)
        def f(*args, **kwargs):
            print msg
            return func(*args, **kwargs)
        return f
    decorator.__doc__ = decorator.__doc__ % n
    return decorator

Although the function as given is contrived, the basic structure (including doc strings, decorators, pre-processing and post-processing) is realistic, and your suggested syntax wouldn't be useful here. -- Steven From sturla at molden.no Mon Mar 2 13:21:09 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 02 Mar 2009 13:21:09 +0100 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> Message-ID: <49ABCF35.5030002@molden.no> On 3/1/2009 9:57 PM, Christian Heimes wrote: > with a, b as x, d as y: I'd like to add that parentheses improve readability here:

with a, (b as x), (d as y):

I am worried the proposed syntax could be a source of confusion and errors. E.g. when looking at

with a,b as c,d:

my eyes read

with nested(a,b) as c,d:

when Python would read

with a,(b as c),d:

It may actually be better to keep the current implementation with contextlib.nested. If contextlib.nested is not well known (I only learned of its existence recently), maybe it should be better documented? Tutorial examples of the with statement should cover contextlib.nested as well.
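As a historical footnote: the multi-manager "with" discussed in this thread did land in Python 2.7/3.1, contextlib.nested was deprecated (it cannot clean up correctly if a later manager's constructor raises), and contextlib.ExitStack (3.3) became the tool for a variable number of managers. A sketch with made-up temp-file names:

```python
import contextlib
import os
import tempfile

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, 'in.txt')
dst = os.path.join(tmp, 'out.txt')
with open(src, 'w') as f:
    f.write('spam')

# The multi-manager form this thread anticipated (2.7/3.1+):
with open(src) as fin, open(dst, 'w') as fout:
    fout.write(fin.read())

# ExitStack handles a variable number of managers safely:
with contextlib.ExitStack() as stack:
    files = [stack.enter_context(open(p)) for p in (src, dst)]
    print([f.read() for f in files])  # ['spam', 'spam']
```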
Sturla Molden From dangyogi at gmail.com Mon Mar 2 15:45:56 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Mon, 02 Mar 2009 09:45:56 -0500 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <49ABF124.2000907@gmail.com> Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 > Why not:

import functools

def curry(n):
    def decorator(fn):
        @functools.wraps(fn)
        def surrogate(*args):
            if len(args) != n:
                raise TypeError("%s() takes exactly %d arguments (%d given)"
                                % (fn.__name__, n, len(args)))
            def curried(*rest, **kws):
                return fn(*(args + rest), **kws)
            return curried
        return surrogate
    return decorator

@curry(3)
def spam(a,b,c,d,e,f):
    print a,b,c,d,e,f

spam(1,2,3)(4,5,6)

-bruce frederiksen From guido at python.org Mon Mar 2 16:25:03 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Mar 2009 07:25:03 -0800 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: <50697b2c0903012318t2fc80318vf6d14bfefa5dc125@mail.gmail.com> References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> <20090302081628.502bc1ee@o> <50697b2c0903012318t2fc80318vf6d14bfefa5dc125@mail.gmail.com> Message-ID: On Sun, Mar 1, 2009 at 11:18 PM, Chris Rebert wrote: > On Sun, Mar 1, 2009 at 11:16 PM, spir wrote: >> Le Sun, 1 Mar 2009 14:31:58 -0800, >> Guido van Rossum s'exprima ainsi: >> >>> Programming language >>> design is not a rational science. Most reasoning about it is at best >>> rationalization of gut feelings, and at worst plain wrong. > > +1 QOTW! Let me add that this formulation was at least in part inspired by reading "How we decide" by Jonah Lehrer.
http://www.amazon.com/How-We-Decide-Jonah-Lehrer/dp/0618620117 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From Scott.Daniels at Acm.Org Mon Mar 2 18:05:54 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Mon, 02 Mar 2009 09:05:54 -0800 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 What advantage does this style have over: def spam(a, b, c, d, e, f): [BODY] spam3 = functools.partial(spam, 'initial', 'three', 'args') --Scott David Daniels Scott.Daniels at Acm.Org From sturla at molden.no Mon Mar 2 18:06:55 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 02 Mar 2009 18:06:55 +0100 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <49AC122F.4090403@molden.no> On 3/2/2009 10:22 AM, Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 This is atrocious. From clockworksaint at gmail.com Mon Mar 2 20:07:14 2009 From: clockworksaint at gmail.com (Weeble) Date: Mon, 2 Mar 2009 19:07:14 +0000 Subject: [Python-ideas] Syntax for curried functions Message-ID: <13e3f9930903021107n5deac6e8lfcf3574e9c9e3a4f@mail.gmail.com> Steven D'Aprano wrote: > Weeble wrote: > > Would this be crazy? 
Make this: > > def spam(a,b,c)(d,e,f): > > [BODY] > > a synonym for: > > def spam(a,b,c): > > def spam2(d,e,f): > > [BODY] > > return spam2 > > > > As far as I can tell, it would always be illegal syntax right now, so > > it seems a safe change. And it would make it cleaner to write higher > > order functions. > > I don't think it would. Your proposed syntax is only useful in the case > that the outer function has no doc string and there's no pre-processing > or post-processing of the inner function, including decorators. Fair enough. I knew I must be overlooking something. Sturla Molden wrote: > This is atrocious. :'( Sorry. I thought it might be a bit of a crazy idea. I didn't think it was that bad! From adam at atlas.st Tue Mar 3 00:05:25 2009 From: adam at atlas.st (Adam Atlas) Date: Mon, 2 Mar 2009 18:05:25 -0500 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: On 2 Mar 2009, at 04:22, Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 I once proposed (as far as I can tell) the exact same thing. Here's the discussion that took place -- http://markmail.org/message/aa22tnx2vog3rnin I still like the idea, but it doesn't appear to be very popular. From clockworksaint at gmail.com Tue Mar 3 01:22:40 2009 From: clockworksaint at gmail.com (Weeble) Date: Mon, 2 Mar 2009 16:22:40 -0800 (PST) Subject: [Python-ideas] Syntax for curried functions In-Reply-To: References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: On Mar 2, 11:05 pm, Adam Atlas wrote: > I once proposed (as far as I can tell) the exact same thing. Here's
> the discussion that took place --http://markmail.org/message/aa22tnx2vog3rnin > > I still like the idea, but it doesn't appear to be very popular. Thank you, at least I feel a little less foolish now. I did try to search for such a proposal, but it's hard to know what search terms to use. To be honest I don't think it's generally useful enough to merit a change to the language, but somehow the idea floated into my head and it just seemed so *neat* that I had to tell somebody. It may be that it just appeals to my sense of aesthetics. From greg.ewing at canterbury.ac.nz Tue Mar 3 21:56:56 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 04 Mar 2009 09:56:56 +1300 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <49AD9998.7050009@canterbury.ac.nz> Weeble wrote: > I don't think it's generally useful enough to merit > a change to the language, but somehow the idea floated into my head > and it just seemed so *neat* that I had to tell somebody. The designers of at least one other language seem to think it's neat, too. Scheme has an exactly analogous construct: (define ((f x) y) ...) which is shorthand for (define f (lambda (x) (lambda (y) ...))) [The *really* neat thing about the Scheme version is the way it just naturally falls out of the macro expansion of 'define' in terms of 'lambda' -- so it's not really a separate language feature at all!] -- Greg From guido at python.org Tue Mar 3 23:22:25 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 3 Mar 2009 14:22:25 -0800 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <49AD9998.7050009@canterbury.ac.nz> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> <49AD9998.7050009@canterbury.ac.nz> Message-ID: Haskell has this too, perhaps even more extreme: there's not really such a thing in Haskell as a function of N arguments (N > 1). "f a b = ..." 
defines a function f of one argument a which returns another function ("f a") of one argument b. And so on. That doesn't mean we need to copy this idea in Python. On Tue, Mar 3, 2009 at 12:56 PM, Greg Ewing wrote: > Weeble wrote: > >> I don't think it's generally useful enough to merit >> a change to the language, but somehow the idea floated into my head >> and it just seemed so *neat* that I had to tell somebody. > > The designers of at least one other language seem to > think it's neat, too. Scheme has an exactly analogous > construct: > > ?(define ((f x) y) > ? ?...) > > which is shorthand for > > ?(define f > ? ?(lambda (x) > ? ? ?(lambda (y) ...))) > > [The *really* neat thing about the Scheme version is > the way it just naturally falls out of the macro expansion > of 'define' in terms of 'lambda' -- so it's not really > a separate language feature at all!] > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Wed Mar 4 00:06:50 2009 From: aahz at pythoncraft.com (Aahz) Date: Tue, 3 Mar 2009 15:06:50 -0800 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <20090303230650.GA10398@panix.com> On Mon, Mar 02, 2009, Weeble wrote: > On Mar 2, 11:05?pm, Adam Atlas wrote: >> >> I once proposed (as far as I can tell) the exact same thing. Here's ? >> the discussion that took place --http://markmail.org/message/aa22tnx2vog3rnin >> >> I still like the idea, but it doesn't appear to be very popular. > > Thank you, at least I feel a little less foolish now. I did try to > search for such a proposal, but it's hard to know what search terms to > use. 
To be honest I don't think it's generally useful enough to merit > a change to the language, but somehow the idea floated into my head > and it just seemed so *neat* that I had to tell somebody. It may be > that it just appeals to my sense of aesthetics. Don't worry too much about it -- part of the point of python-ideas is to provide an appropriate forum for trial balloons like this. It's only foolish when you haven't done your homework, especially for issues that have previously been brought up ad nauseam. It's also a good idea to keep in mind that even when people respond with "ewww, yuck!" they're attacking your idea, not you (yes, I know how difficult that can be ;-). -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From leif.walsh at gmail.com Wed Mar 4 01:37:33 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Tue, 3 Mar 2009 19:37:33 -0500 (EST) Subject: [Python-ideas] Syntax for curried functions In-Reply-To: Message-ID: On Mon, Mar 2, 2009 at 7:22 PM, Weeble wrote: > Thank you, at least I feel a little less foolish now. I did try to > search for such a proposal, but it's hard to know what search terms to > use. To be honest I don't think it's generally useful enough to merit > a change to the language, but somehow the idea floated into my head > and it just seemed so *neat* that I had to tell somebody. It may be > that it just appeals to my sense of aesthetics. If it makes you feel better, you can still get almost all of the way there with decorators, and it's even "officially" documented: http://wiki.python.org/moin/PythonDecoratorLibrary#Pseudo-currying -- Cheers, Leif -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 270 bytes Desc: OpenPGP digital signature URL: From kornelpal at gmail.com Wed Mar 4 10:21:33 2009 From: kornelpal at gmail.com (=?UTF-8?B?S29ybsOpbCBQw6Fs?=) Date: Wed, 4 Mar 2009 10:21:33 +0100 Subject: [Python-ideas] Python Bytecode Verifier Message-ID: <9440ace50903040121o22e86ad4sd354af4030d2d922@mail.gmail.com> Hi, I've created a Python Bytecode Verifier in Python. I'm not a Python guru so I borrowed coding patterns from C/C++. I also created this with C portability in mind. The only reason I used Python was to experiment with Python, and it was easier to morph code during development. If this program were ported to C it would only need 8 bytes per opcode (some additional storage to track blocks) and a single pass. I haven't found backward jumps to previously unused code in any compiled Python code, but they can easily be supported. In that case some more partial passes are required. I was also able to successfully verify all Python files in the Python-2.5.2 source package. The verification algorithm should be quite complete, but I may have missed some limitations of the interpreter that could be checked by the verifier as well. The ability to create this verifier proved that although Python bytecode is designed for a dynamically typed interpreter, it is still easily verifiable. I am willing to port this code to C, but only if there is a chance of it being included in Python. I believe that Python in general would benefit from having the ability to safely load .pyc files and create code objects on the fly. Both Java and .NET have the ability to safely load compiled byte code. The .NET Framework, just like Python, also has the ability to create and execute new code at run-time. You may feel that enabling closed-source applications and/or creating a multi-language runtime would hurt Python, but both of these have contributed to the success of Java (both the language and the runtime). 
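[As a rough illustration of the kind of checks a bytecode verifier performs, here is a small sketch using the stdlib ``dis`` module in present-day Python. This is not Kornél's verifier.py -- the function and the two checks shown (unknown opcodes, out-of-range jump targets) are illustrative only.]

```python
import dis

def check_code(code):
    """Return (offset, problem) pairs for two checks a bytecode
    verifier would perform: unknown opcodes, and jump targets that
    do not land on an instruction boundary."""
    instructions = list(dis.get_instructions(code))
    offsets = {ins.offset for ins in instructions}
    problems = []
    for ins in instructions:
        # dis renders opcodes it does not recognize as '<NNN>'
        if ins.opname.startswith('<'):
            problems.append((ins.offset, 'unknown opcode'))
        # for jump instructions, argval is the resolved target offset
        elif 'JUMP' in ins.opname and ins.argval not in offsets:
            problems.append((ins.offset, 'bad jump target'))
    return problems

def example(x):
    if x > 0:
        return x
    return -x

print(check_code(example.__code__))  # well-formed bytecode: []
```

A real verifier, like the one attached above, additionally has to model the value stack and block stack per opcode; this sketch only shows the control-flow half of the idea.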
Kornél -------------- next part -------------- A non-text attachment was scrubbed... Name: verifier.py Type: application/octet-stream Size: 22908 bytes Desc: not available URL: From aahz at pythoncraft.com Wed Mar 4 15:49:22 2009 From: aahz at pythoncraft.com (Aahz) Date: Wed, 4 Mar 2009 06:49:22 -0800 Subject: [Python-ideas] Python Bytecode Verifier In-Reply-To: <9440ace50903040121o22e86ad4sd354af4030d2d922@mail.gmail.com> References: <9440ace50903040121o22e86ad4sd354af4030d2d922@mail.gmail.com> Message-ID: <20090304144922.GA2645@panix.com> On Wed, Mar 04, 2009, Kornél Pál wrote: > > I've created a Python Bytecode Verifier in Python. I'm not a Python > guru so I borrowed coding patterns from C/C++. I also created this > with C portability in mind. The only reason I used Python was to > experiment with Python and was easier to morph code during > development. You should upload this to PyPI. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From rrr at ronadam.com Thu Mar 5 19:31:38 2009 From: rrr at ronadam.com (Ron Adam) Date: Thu, 05 Mar 2009 12:31:38 -0600 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB5BF4.6020203@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> Message-ID: <49B01A8A.1010907@ronadam.com> Greg Ewing wrote: > This is only a speed issue if the time taken to find the > "top" starting from one of the "bottoms" is a significant > component of the total running time. 
My conjecture is that > it won't be, especially if you do it iteratively in a > tight C loop. Could it be possible to design it so that the yield path of a generator is passed down or forwarded to sub-generators when they are called? If that can be done then no matter how deep they get, it works as if it is always only one generator deep. So... result = yield from sub_generator Might be sugar for ... (very rough example) sub_generator.forward(this_generator_yield_path) # set yield path sub_generator.run() # pass control to sub-generator result = sub_generator.return_value # get return value if set Of course the plumbing in this may take some creative rewriting of generators so the yield path can be passed around. Being able to get the yield path might also be useful in other ways, such as error reporting. Cheers, Ron From greg.ewing at canterbury.ac.nz Thu Mar 5 21:34:32 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Mar 2009 09:34:32 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49B01A8A.1010907@ronadam.com> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> <49B01A8A.1010907@ronadam.com> Message-ID: <49B03758.6090703@canterbury.ac.nz> Ron Adam wrote: > Could it be possible to design it so that the yield path of generators > are passed down or forwarded to sub-generators when they are called? It's really the other way around -- you would need some way of passing the yield path *up* to a generator's caller. E.g. if A is yielding from B is yielding from C, then A needs a direct path to C. 
Possibly some scheme could be devised to do this, but I don't feel like spending any brain cycles on it until the current scheme is shown to be too slow in practice. Premature optimisation and all that. -- Greg From rrr at ronadam.com Fri Mar 6 03:58:17 2009 From: rrr at ronadam.com (Ron Adam) Date: Thu, 05 Mar 2009 20:58:17 -0600 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49B03758.6090703@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> <49B01A8A.1010907@ronadam.com> <49B03758.6090703@canterbury.ac.nz> Message-ID: <49B09149.5020001@ronadam.com> Greg Ewing wrote: > Ron Adam wrote: > >> Could it be possible to design it so that the yield path of generators >> are passed down or forwarded to sub-generators when they are called? > > It's really the other way around -- you would need some way > of passing the yield path *up* to a generator's caller. > > E.g. if A is yielding from B is yielding from C, then A needs > a direct path to C. So when A.next() is called it in effect does a C.next() instead. Is that correct? And when C's yield statement is executed, it needs to return the value to A's caller. So the .next() methods need to be passed up, while the yield return path needs to be passed down? OK, I guess I need to look at some byte/source code. ;-) > Possibly some scheme could be devised to do this, but I don't > feel like spending any brain cycles on it until the current > scheme is shown to be too slow in practice. Premature > optimisation and all that. Right I agree. I have an intuitive feeling that being able to expose and redirect caller and return values may useful in more ways than just generators. 
ie.. general functions, translators, encoders, decorators, and possibly method resolution. then again, there may not be any easy obvious way to do it. Ron From arnodel at googlemail.com Fri Mar 6 11:11:24 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 6 Mar 2009 10:11:24 +0000 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49B09149.5020001@ronadam.com> References: <499DDA4C.8090906@canterbury.ac.nz> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> <49B01A8A.1010907@ronadam.com> <49B03758.6090703@canterbury.ac.nz> <49B09149.5020001@ronadam.com> Message-ID: <9bfc700a0903060211l64930745t21d130beb7b5a13f@mail.gmail.com> 2009/3/6 Ron Adam : > > > Greg Ewing wrote: [...] > So when A.next() is called it in effect does a C.next() instead. Is that > correct? ?And when C's yield statement is executed, it needs to return the > value to A's caller. This is what the toy python implementation that I posted last month does. I think Jacob Holm did something more sophisticated that does this as well (I haven't seen it!). > So the .next() methods need to be passed up, while the yield return path > needs to be passed down? Can you draw a picture? 
:) -- Arnaud From ben+python at benfinney.id.au Fri Mar 6 11:37:23 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 06 Mar 2009 21:37:23 +1100 Subject: [Python-ideas] Cross-platform file locking, PID files , and the "daemon" PEP References: <21108.1233038783@pippin.parc.xerox.com> <87wschyykf.fsf@benfinney.id.au> <20090127090058.GA302@phd.pp.ru> <20090127160643.GC37589@wind.teleri.net> <20090127161933.GB28125@phd.pp.ru> <497F351D.30700@timgolden.me.uk> <4222a8490901270830l275a7e55t1514e51944d86089@mail.gmail.com> <87eiyozbus.fsf@benfinney.id.au> <20090128012906.GH57568@wind.teleri.net> <87ab9cyy42.fsf@benfinney.id.au> <20090128030351.GD64950@wind.teleri.net> <871vuoyuxg.fsf@benfinney.id.au> <87k58gdpgl.fsf@xemacs.org> <874ozhtq3c.fsf_-_@benfinney.id.au> <87ocxpcpxz.fsf@xemacs.org> Message-ID: <87fxhr544c.fsf@benfinney.id.au> "Stephen J. Turnbull" writes: > Ben Finney writes: > > > Skip Montanaro has a PyPI package, ?lockfile? (currently marked > > "Beta") that has such > > ambitions. > > I would send your proposal for PIDlockfile to Skip. [?] For those people who have asked about the status of this: Yes, I'm currently working with Skip on the ?lockfile? package so that it can be used for the ?daemon? implementation. -- \ ?Following fashion and the status quo is easy. Thinking about | `\ your users' lives and creating something practical is much | _o__) harder.? 
?Ryan Singer, 2008-07-09 | Ben Finney From jh at improva.dk Fri Mar 6 13:20:25 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 06 Mar 2009 13:20:25 +0100 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <9bfc700a0903060211l64930745t21d130beb7b5a13f@mail.gmail.com> References: <499DDA4C.8090906@canterbury.ac.nz> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> <49B01A8A.1010907@ronadam.com> <49B03758.6090703@canterbury.ac.nz> <49B09149.5020001@ronadam.com> <9bfc700a0903060211l64930745t21d130beb7b5a13f@mail.gmail.com> Message-ID: <49B11509.10408@improva.dk> Arnaud Delobelle wrote: > [...] > > This is what the toy python implementation that I posted last month > does. I think Jacob Holm did something more sophisticated that does > this as well (I haven't seen it!). > > [...] I do, and here is the most readable version so far. If you look very closely, you will notice a certain similarity to the version you posted. It turned out there was an easy fix for the 'closing' problem i mentioned, and a hack based on gi_frame that allowed me to use a generator function for the wrapper after all, so the _generator_iterator_wrapper below is heavily based on your code. The rest is there to implement the tree structure I mentioned, instead of just a simple stack. The speed of this version on simple long chains is still about 6 times slower than your code. This can be improved to about 1.5 times by inlining most of the function calls and eliminating the RootPath class, but the resulting code is close to unreadable and therefore not well suited to explain the idea. Most of the remaining slowdown compared to "for v in it: yield v" is due to attribute lookups and simple integer computations. In fact, I would be surprised if a C version of this wasn't going to be faster even for just a single "yield from". 
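[For contrast with the tree-based code that follows, the simple chain-based approach the thread keeps comparing against can be sketched as a trampoline that keeps the delegation chain on an explicit stack, so each next() does O(1) amortized work however deep the chain of delegating generators grows. This is illustrative code, not Arnaud's or Jacob's actual implementation; ``YieldFrom`` and ``flatten`` are made-up names, and the hard parts discussed in the thread -- send(), throw(), and StopIteration return values -- are omitted.]

```python
class YieldFrom:
    """Marker yielded by a generator to request delegation
    (the real proposal is syntax, not a wrapper class)."""
    def __init__(self, gen):
        self.gen = gen

def flatten(gen):
    """Trampoline driver: descend into sub-generators via an explicit
    stack instead of re-yielding values up through every level."""
    stack = [gen]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()  # sub-generator exhausted, resume its caller
            continue
        if isinstance(item, YieldFrom):
            stack.append(item.gen)  # delegate to the sub-generator
        else:
            yield item

def inner():
    yield 1
    yield 2

def outer():
    yield 0
    yield YieldFrom(inner())
    yield 3

print(list(flatten(outer())))  # [0, 1, 2, 3]
```

The stack is exactly the "delegation chain" of the thread; the tree structure below generalizes it to the case where several generators delegate to the same parent.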
Anyway, here it is Jacob """Tree structure implementing the operations needed by yield-from. """ class Node(object): """A Node object represents a single node in the tree. """ __slots__ = ('parent', # the parent of this node (if any) 'chain', # the chain this node belongs to (if any) 'child', # the child of this node in the chain (if any) 'size', # The number of descendants of this node # (including the node itself) that are not # descendants of 'child', plus the number of # times any of these nodes have been accessed. ) def __init__(self): self.parent = None # We are not actually using this one, but # it is cheap to maintain. self.chain = None self.child = None self.size = 0 class Chain(object): """A Chain object represents a fragment of the path from some node towards the root. Chains are long-lived, and are used to make shortcuts in the tree enabling operations to take O(logN) rather than O(N) time. (And even enabling O(1) for certain usage patterns). """ __slots__ = ('top', # topmost node in the chain 'size', # The sum of sizes of all nodes in the chain 'parent', # the parent node of top in the tree 'rp_child', # the child of this chain in the current # root path ) def __init__(self, *nodes): """Construct a chain from the given nodes. The nodes must have a size>0 assigned before constructing the chain, and must have their 'chain' pointer set to None. The nodes are linked together using their 'child' pointers, and get their 'chain' pointers set to the new chain. First node in the list will be at the bottom of the chain, last node in the list becomes the value of the 'top' field. The size of the new chain is the sum of the sizes of the nodes. 
""" top = None size = 0 for node in nodes: assert node.chain is None assert node.child is None assert node.size > 0 node.chain = self node.child = top size += node.size top = node parent = None for node in reversed(nodes): node.parent = parent parent = node self.top = top self.size = size self.parent = None self.rp_child = None class RootPath(object): """A RootPath represents the whole path from the root to some given node. RootPaths need to be 'acquired' before use and 'released' after. Acquiring the path establishes the necessary down-links, ensures the path has length O(logN), and gives an efficient way of detecting and avoiding loops. """ __slots__ = ('base', # The node defining the path 'root', # The root node of the tree ) def __init__(self, base): """Construct the RootPath representing the path from base to its root. """ assert isinstance(base, Node) self.base = base self.root = None def acquire(self, sizedelta=0): """Find the root and take ownership of the tree containing self.base. Create a linked list from the root to base using the Chain.rp_child pointers. (Using self as sentinel in self.base.rp_child). Optionally adds 'sizedelta' to the size of the base node for the path and updates the rest of the sizes accordingly. If the tree was already marked (one of the chains had a non-None rp_child pointer), back out all changes and return None, else return the root node of the tree. """ assert self.root is None node = self.base assert node is not None rp_child = self while True: chain = node.chain assert chain is not None if chain.rp_child is not None: # Some other rootpath already owns this tree. 
Undo the # changes so far and raise a RuntimeError if rp_child is not self: while True: chain = rp_child rp_child = chain.rp_child chain.rp_child = None if rp_child is self: break node = rp_child.parent node.size -= sizedelta chain.size -= sizedelta return None assert chain.rp_child is None node.size += sizedelta chain.size += sizedelta chain.rp_child = rp_child rp_child = chain node = rp_child.parent if node is None: break self.root = root = rp_child.top assert root.chain is not None assert root.chain.parent is None # Tricky, this rebalancing is needed because cut_root # cannot do a full rebalancing fast enough without maintaining # a lot more information. We may actually have traversed a # (slightly) unbalanced path above. Rebalancing here makes # sure that it is balanced when we return, and the cost of # traversing the unbalanced path can be included in the cost # of rebalancing it. The amortized cost is still O(logn) per # operation as it should be. self._rebalance(self, False) return root def release(self): """Release the tree containing this RootPath. """ assert self.root is not None chain = self.root.chain assert chain is not None while chain is not self: child = chain.rp_child chain.rp_child = None chain = child self.root = None def cut_root(self): """Cut the link between the root and its child on the root path. Release the tree containing the root. If the root was the only node on the path, return None. Else return the new root. """ root = self.root assert root is not None chain = root.chain assert chain.parent is None assert root is chain.top child = chain.rp_child assert child is not None if child is self: # only one chain if root is self.base: # only one node, release tree chain.rp_child = None self.root = None return None else: # multiple chains if root is child.parent: # only one node from this chain on the path. 
root.size -= child.size chain.size -= child.size child.parent = None chain.rp_child = None self.root = newroot = child.top newroot.parent = None return newroot # Multiple nodes on the chain. Cut the topmost off and put it # into its own chain if necessary. (This is needed when the # node has other children) newroot = root.child assert newroot newroot.parent = None self.root = chain.top = newroot chain.size -= root.size root.child = None root.chain = None Chain(root) self._rebalance(self, True) return newroot def link(self, node): """Make the root of this tree a child of a node from another tree. Return the root of the resulting tree on succes, or None if the tree for the parent node couldn't be acquired. """ root = self.root assert root is not None chain = root.chain assert chain.parent is None assert isinstance(node, Node) rp = RootPath(node) newroot = rp.acquire(chain.size) if newroot is None: return None self.root = newroot node.chain.rp_child = chain root.parent = chain.parent = node self._rebalance(chain.rp_child, False) return newroot def _rebalance(self, stop, quick): # check and rebalance all the chains starting from the root. # If 'quick' is True, stop the first time no rebalancing took # place, else continue until the child is 'stop'. gpchain = None pchain = self.root.chain chain = pchain.rp_child while chain is not stop: parent = chain.parent if 2*(pchain.size-parent.size) <= chain.size: # Unbalanced chains. Move all ancestors to parent from # pchain into this chain. This may look like an # expensive operation, but the balancing criterion is # chosen such that every time a node is moved from one # chain to another, the sum of the sizes of everything # *after* the node in its chain is at least # doubled. This number is only decreased by cut_root # (where it may be reset to zero), so a node can move # between chains at most log_2(N) times before if # becomes the root in a cut_root. The amortized cost # of keeping the tree balanced is thus O(logN). 
The # purpose of the balancing is of course to keep the # height of the tree down. Any node that is balanced # according to this criterion has # # 2*(pchain.size-self.size) > 2*(pchain.size-self.parent.size) # > self.size # # and so # # pchain.size > 1.5*self.size # # Therefore, the number of chains in any balanced # RootPath is at most log_1.5(N), and so the cost per # operation is O(logN). # Compute size of elements to move and set their 'chain' # pointers. p = pchain.top movedsize = p.size p.chain = chain while p is not parent: p = p.child movedsize += p.size p.chain = chain # update sizes parent.size -= chain.size chain.size = pchain.size pchain.size -= movedsize parent.size += pchain.size # update child, top and parent links pchain.top, parent.child, chain.top \ = parent.child, chain.top, pchain.top chain.parent = pchain.parent pchain.parent = parent # update rp_child links pchain.rp_child = None # pchain is no longer on the path if gpchain is not None: assert gpchain is not None assert gpchain.rp_child is pchain gpchain.rp_child = chain assert (pchain.top is None)==(pchain.size==0) # if pchain.top is None: # # # pchain has become empty. If coding this in C, # # remember to free the memory. elif quick: break else: gpchain = pchain pchain = chain chain = pchain.rp_child ############################################################################### """Yield-from implementation """ import sys # _getframe, used only in an assert import types # GeneratorType import functools # wraps class _iterator_node(Node): # Wrapper for turning an unknown iterator into a node in the tree. __slots__ = ('iterator', # the wrapped iterator ) locals().update((k, Node.__dict__[k]) for k in ('parent', 'chain', 'child', 'size')) def __init__(self, iterator): self.iterator = iterator Node.__init__(self) self.size = 1 Chain(self) class _generator_iterator_wrapper(_iterator_node): # Wrapper for turning a generator using "yield from_(expr)" into a # node in the tree. 
__slots__ = () locals().update((k, _iterator_node.__dict__[k]) for k in ('parent', 'chain', 'child', 'size', 'iterator')) def __new__(cls, iterator): self = _iterator_node.__new__(cls) _iterator_node.__init__(self, iterator) return self.__iter__() # we don't hold on to a reference to # this generator, because a) we don't # need to, and b) when the last # user-code reference to it goes away, # the generator is automatically closed # and we get a chance to clean up the # rest of the cycles created by this # structure. def __iter__(self): val = exc = None rp = RootPath(self) while True: try: if rp.acquire(1) is None: raise ValueError('generator already executing') while True: try: gen = rp.root.iterator try: if exc is not None: throw = getattr(gen, 'throw', None) try: if throw is None: raise exc ret = throw(exc) finally: del throw exc = None elif val is None: ret = gen.next() else: ret = gen.send(val) except: close = getattr(gen, 'close', None) try: if close is not None: close() raise finally: del close finally: del gen except Exception, e: if rp.cut_root() is None: raise if isinstance(e, StopIteration): val, exc = getattr(e, 'val', None), None else: val, exc = None, e else: if type(ret) is from_: if rp.link(ret.node) is not None: val = None else: exc = ValueError('generator already executing') continue break finally: if rp.root is not None: rp.release() try: val = yield ret except Exception, e: exc = e class from_(object): """This class is used together with the uses_from decorator to simulate the proposed 'yield from' statement using existing python features. Use: @uses_from def mygenerator(): ... result = yield _from(expr) ... raise StopIteration(result) To get the equivalent effect of the proposed: def mygenerator(): ... result = yield from expr ... return result Any use other than directly in a yield expression within a generator function decorated with 'uses_from' is unsupported, and could eat your harddrive for all I care. 
Unsupported uses include, but are not limited to: subclassing, calling methods, reading or writing attributes, storing in a variable, and passing as argument to a builtin or other function. """ __slots__ = ('node',) def __init__(self, iterable): # get the code object for the wrapper, for comparison func_code = _generator_iterator_wrapper.__iter__.func_code # verify that from_ is only used in a wrapped generator function if __debug__: frame = sys._getframe(2) assert frame is not None and frame.f_code is func_code, ( "from_ called from outside a @uses_from generator function.") if type(iterable) is types.GeneratorType: frame = iterable.gi_frame if frame is not None and frame.f_code is func_code: # this is a wrapped generator, extract the node for it # by peeking at it's locals. self.node = frame.f_locals['self'] else: # an unwrapped generator, create a new node. self.node = _iterator_node(iterable) else: # some other iterable, create a new node. self.node = _iterator_node(iter(iterable)) def uses_from(func): """Decorator for generator functions/methods that use "yield from_(expr)" This class is used together with the from_ class to simulate the proposed 'yield from' statement using existing python features. Use: @uses_from def mygenerator(): ... result = yield _from(expr) ... raise StopIteration(result) To get the equivalent effect of the proposed: def mygenerator(): ... result = yield from expr ... return result Any other use than as a decorator for a normal generator function/method is at your own risk. I wouldn't do it if I were you. Seriously. 
""" @functools.wraps(func) def wrapper(*args, **kwargs): return _generator_iterator_wrapper(func(*args, **kwargs)) return wrapper From skip at pobox.com Fri Mar 6 14:55:00 2009 From: skip at pobox.com (skip at pobox.com) Date: Fri, 6 Mar 2009 07:55:00 -0600 Subject: [Python-ideas] Cross-platform file locking, PID files , and the "daemon" PEP In-Reply-To: <87fxhr544c.fsf@benfinney.id.au> References: <87fxhr544c.fsf@benfinney.id.au> Message-ID: <18865.11060.602298.698687@montanaro.dyndns.org> Ben> For those people who have asked about the status of this: Yes, I'm Ben> currently working with Skip on the ?lockfile? package so that it Ben> can be used for the ?daemon? implementation. Indeed. I should have my Mercurial repository in a more globally visible place later today so Ben and anyone else interested can bash bits. Skip From eric at trueblade.com Mon Mar 9 13:20:51 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 09 Mar 2009 08:20:51 -0400 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <499AAF4E.3020506@trueblade.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <4999D184.3080105@trueblade.com> <499AAF4E.3020506@trueblade.com> Message-ID: <49B509A3.1080404@trueblade.com> I've added a patch to http://bugs.python.org/issue5237 that implements the basic '{}' functionality in str.format. Read the note in the issue; this patch is not ready for production. But it will let you play with the feature. I like it, mainly because it's so much quicker to type '{}' than '{0}' because you don't have to shift-unshift-shift (on my US English keyboard). If we decide to add this feature I hope I can finish it before PyCon, or worst case finish it during the sprints. 
From tav at espians.com Mon Mar 9 14:04:23 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 13:04:23 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
Message-ID:

Hey all,

I've come up with a way to do Ruby-style blocks in what I feel to be a Pythonic way:

    using employees.select do (employee):
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)

I originally overloaded the ``with`` keyword, but on Guido's guidance and responses from many others, switched to the ``using`` keyword.

I explain the idea in detail in this blog article: http://tav.espians.com/ruby-style-blocks-in-python.html

It covers everything from why these are useful to a proposal of how the new ``do`` statement and __do__ function could work.

Please note that the intention of this proposal is not to encourage iteration with blocks -- Python already does this rather elegantly -- but blocks are very useful for more than just iteration as various Ruby applications have shown.

Let me know what you think. My apologies if I've missed something obvious. Thanks!

--
love, tav

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

From sturla at molden.no Mon Mar 9 16:24:23 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 09 Mar 2009 16:24:23 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References:
Message-ID: <49B534A7.2010001@molden.no>

On 3/9/2009 2:04 PM, tav wrote:
> Hey all,
>
> I've come up with a way to do Ruby-style blocks in what I feel to be a
> Pythonic way:
>
>     using employees.select do (employee):
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)

I believe this is just an extension to the lambda keyword. If lambdas could define a block, not just a statement, this would e.g.
be

    employees.select( lambda employee:
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)
    )

or

    tmp = lambda employee:
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)

    employees.select(tmp)

I see no reason for introducing two new keywords to do this, as you are really just enhancing the current lambda keyword.

On the other hand, turning blocks into anonymous functions would be very useful for functional programming. As such, I like your suggestion.

This also has a great potential for abuse (as in writing unreadable code), just consider how anonymous classes are used in Java's GUI toolkits. I really don't want to see

    self.Bind(wx.BUTTON, lambda: evt, mybutton)

in wxPython code. But Java programmers coming to Python would jump to this, as they have been brainwashed to use anonymous classes for everything (no pun intended).

Sturla Molden

From tav at espians.com Mon Mar 9 16:33:30 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 15:33:30 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <49B534A7.2010001@molden.no>
References: <49B534A7.2010001@molden.no>
Message-ID:

Hey Sturla,

> I believe this is just an extension to the lambda keyword. If lambdas could
> define a block, not just a statement, this would e.g. be

Multi-line lambdas would be nice, but I struggle to find a way to do so in a Pythonic manner. See: http://unlimitednovelty.com/2009/03/indentation-sensitivity-post-mortem.html

It would be nice to find a way though...

>     tmp = lambda employee:
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)
>
>     employees.select(tmp)

You might as well just use ``def`` above...

> I see no reason for introducing two new keywords to do this, as you are
> really just enhancing the current lambda keyword.

I agree that two new keywords is a bit much.
I tried to re-use ``with`` initially -- but I guess people would be confused by the conflicting semantics. > On the other hand, turning blocks into anonymous functions would be very > useful for functional programming. As such, I like your suggestion. Thanks =) -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From aahz at pythoncraft.com Mon Mar 9 16:45:05 2009 From: aahz at pythoncraft.com (Aahz) Date: Mon, 9 Mar 2009 08:45:05 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <49B534A7.2010001@molden.no> References: <49B534A7.2010001@molden.no> Message-ID: <20090309154505.GA18115@panix.com> On Mon, Mar 09, 2009, Sturla Molden wrote: > > I see no reason for introducing two new keywords to do this, as you are > really just enhancing the current lambda keyword. > > On the other hand, turning blocks into anonymous functions would be very > useful for functional programming. As such, I like your suggestion. There's a substantial minority (possibly even a majority) in the Python community that abhors functional programming. Even among those who like functional programming, there's a substantial population that dislikes extensive use of anonymous functions. The trick to getting features for functional programming accepted is to make them look as Pythonic as possible. Right now, I'm somewhere between -0 and -1 on this proposal, because all the motivation I see looks like it's perfectly satisfied by using ``def`` instead of lambda. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "All problems in computer science can be solved by another level of indirection." 
--Butler Lampson

From tav at espians.com Mon Mar 9 16:53:35 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 15:53:35 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <20090309154505.GA18115@panix.com>
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID:

Hey Aahz,

> The trick to getting features for functional programming accepted
> is to make them look as Pythonic as possible.

I spent considerable effort to make the using/do statement as Pythonic as possible.

Could you please elaborate on what you don't like about it?

Please note that the lambda thing was Sturla's follow-up comment...

--
Thanks, tav

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

From sturla at molden.no Mon Mar 9 17:09:02 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 09 Mar 2009 17:09:02 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID: <49B53F1E.30009@molden.no>

On 3/9/2009 4:53 PM, tav wrote:
> I spent considerable effort to make the using/do statement as Pythonic
> as possible.
>
> Could you please elaborate on what you don't like about it?
>
> Please note that the lambda thing was Sturla's follow-up comment...

If I can elaborate as well. There are three things I don't like:

1. You are introducing two new keywords. Solving problems by constantly adding new syntax is how programming languages are designed in Redmond, WA. I don't exactly know what Pythonic means, but bloating the syntax is not.

2. Most of this is covered by 'def'. Python allows functions to be nested. Python does support closures.

3. Anonymous classes in Java have more cases for abuse than use. Just see how they are abused to write callbacks/handlers. They are a notorious source of unreadable and unmaintainable spaghetti code.

S.M.
From aahz at pythoncraft.com Mon Mar 9 17:16:06 2009
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 9 Mar 2009 09:16:06 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID: <20090309161606.GA19375@panix.com>

On Mon, Mar 09, 2009, tav wrote:
>
> Hey Aahz,
>
>> The trick to getting features for functional programming accepted
>> is to make them look as Pythonic as possible.
>
> I spent considerable effort to make the using/do statement as Pythonic
> as possible.
>
> Could you please elaborate on what you don't like about it?

That was a general point. The specific point was what you cut: your proposal seems to offer little advantage over a ``def``. You need to justify yourself more thoroughly.

Also, because this list is archived, you should probably include your entire argument here rather than referring to an external web page.

--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of indirection."  --Butler Lampson

From bronger at physik.rwth-aachen.de Mon Mar 09 17:22:18 2009
From: bronger at physik.rwth-aachen.de (Torsten Bronger)
Date: Mon, 09 Mar 2009 17:22:18 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID: <87y6veeked.fsf@physik.rwth-aachen.de>

Hallöchen!

tav writes:

> Hey Aahz,
>
>> The trick to getting features for functional programming accepted
>> is to make them look as Pythonic as possible.
>
> I spent considerable effort to make the using/do statement as
> Pythonic as possible.
>
> Could you please elaborate on what you don't like about it?

Two things for me: The "using ..." is not well-readable. The "(employee)" sits clumsily at the end of the line without any connection to the rest. Even worse, the "do" anticipates the ":", which in Python already means "do".
And secondly, I'm not comfortable with the fact that the return value is the first (or last? or all?) expression the interpreter stumbles over. Because Python distinguishes between expressions and statements, you have to look twice to see what actually happens in the block. In other words, expressions work differently in these almost-functions, so we end up with two kinds of functions that have different semantic rules. This makes reading more difficult, as well as code-reuse.

Tschö, Torsten.

--
Torsten Bronger, aquisgrana, europa vetus
Jabber ID: torsten.bronger at jabber.rwth-aachen.de

From sturla at molden.no Mon Mar 9 17:24:36 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 09 Mar 2009 17:24:36 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <20090309154505.GA18115@panix.com>
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID: <49B542C4.6090702@molden.no>

On 3/9/2009 4:45 PM, Aahz wrote:
> There's a substantial minority (possibly even a majority) in the Python
> community that abhors functional programming.

There is a substantial minority that uses Python for scientific computing (cf. numpy and scipy, the Hubble space telescope, the NEURON simulator, Sage, etc.) For numerical computing, functional programming often leads to code that is shorter and easier to read. That is, equations look like functions, not like classes.

S.M.
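[Editor's note: to make the contrast Sturla is gesturing at concrete, here is a hedged sketch; the Gaussian density is an arbitrary example equation, not something from the thread:]

```python
import math

# Functional style: the equation reads like the formula it implements.
def gaussian(x, mu=0.0, sigma=1.0):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Class-based style: the same equation wrapped in object boilerplate.
class Gaussian:
    def __init__(self, mu=0.0, sigma=1.0):
        self.mu, self.sigma = mu, sigma

    def __call__(self, x):
        return gaussian(x, self.mu, self.sigma)

# Both spellings compute the same value; one of them looks like the math.
assert abs(gaussian(0.0) - Gaussian()(0.0)) < 1e-12
```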
From aahz at pythoncraft.com Mon Mar 9 17:29:43 2009
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 9 Mar 2009 09:29:43 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <49B542C4.6090702@molden.no>
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <49B542C4.6090702@molden.no>
Message-ID: <20090309162943.GB19375@panix.com>

On Mon, Mar 09, 2009, Sturla Molden wrote:
> On 3/9/2009 4:45 PM, Aahz wrote:
>>
>> There's a substantial minority (possibly even a majority) in the Python
>> community that abhors functional programming.
>
> There are a substantial minority that use Python for scientific
> computing (cf. numpy and scipy, the Hubble space telescope, the NEURON
> simulator, Sage, etc.) For numerical computing, functional programming
> often leads to code that are shorter and easier to read. That is,
> equations look like functions, not like classes.

Yes, I know; I'm just pointing out that Python is not a pure functional language and that there's a tension within the community about how far Python should go in the functional direction.

--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of indirection."  --Butler Lampson

From guido at python.org Mon Mar 9 17:39:30 2009
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Mar 2009 09:39:30 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References:
Message-ID:

On Mon, Mar 9, 2009 at 6:04 AM, tav wrote:
> I've come up with a way to do Ruby-style blocks in what I feel to be a
> Pythonic way:
>
>     using employees.select do (employee):
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)

Sounds like you might as well write a decorator named @using:

    @using(employees.select)
    def _(employee):
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dreamingforward at gmail.com Mon Mar 9 17:43:03 2009
From: dreamingforward at gmail.com (average)
Date: Mon, 9 Mar 2009 09:43:03 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
Message-ID: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com>

While there are several complaining that all this can be done with a def, there's a critical distinction being overlooked. Passing around code blocks is a very different style of programming than Python or most languages like it have ever experimented with. The programming art itself hasn't really even explored the different range of thinking this style of programming opens up. Whereas most function definitions are verb-like, this would be a noun-like definition. Akin perhaps to the difference between a hormone in the body and a neurotransmitter, respectively. Setting it off with a new keyword, or expanding the use of lambda is really a mandatory way of signifying this INTENT.

marcos

> I've come up with a way to do Ruby-style blocks in what I feel to be a
> Pythonic way:
>
>     using employees.select do (employee):
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)
>
> I originally overloaded the ``with`` keyword, but on Guido's guidance
> and responses from many others, switched to the ``using`` keyword.
>
> It covers everything from why these are useful to a proposal of how
> the new ``do`` statement and __do__ function could work.
From guido at python.org Mon Mar 9 17:45:07 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Mar 2009 09:45:07 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> Message-ID: On Mon, Mar 9, 2009 at 9:43 AM, average wrote: > While there are several complaining that all this can be done with a > def, there's an critical distinction being overlooked. ?Passing around > code blocks is a very different style of programming that Python or > most languages like it have ever experimented with. ?The programming > art itself hasn't really even explored the different range of thinking > this style of programming opens up. ?Where as most function > definitions are verb-like, this would be a noun-like definition. ?Akin > perhaps to the difference between a hormone in the body and a > neurotransmitter, respectively.Setting it off with a new keyword, or > expanding the use of lambda is really a mandatory way of signifying > this INTENT. Your claim that this is somehow something new seems to be overlooking Lisp and Smalltalk, as well as Ruby which was mentioned in the quoted blog post. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at pearwood.info Mon Mar 9 17:46:46 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 Mar 2009 03:46:46 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <49B542C4.6090702@molden.no> References: <20090309154505.GA18115@panix.com> <49B542C4.6090702@molden.no> Message-ID: <200903100346.46452.steve@pearwood.info> On Tue, 10 Mar 2009 03:24:36 am Sturla Molden wrote: > On 3/9/2009 4:45 PM, Aahz wrote: > > There's a substantial minority (possibly even a majority) in the > > Python community that abhors functional programming. > > There are a substantial minority that use Python for scientific > computing (cf. 
numpy and scipy, the Hubble space telescope, the > NEURON simulator, Sage, etc.) For numerical computing, functional > programming often leads to code that are shorter and easier to read. > That is, equations look like functions, not like classes. I don't understand what you mean. As far as I can see, equations never look like classes in Python, regardless of whether you are using functional programming, object-oriented programming or procedural programming. Can you give me an example of what you mean? Secondly, the proposal relates to *anonymous* functions, which is a small part of functional programming. Perhaps they are necessary in a purely functional programming language, but Python is not such a language. Python never *requires* anonymous functions, they are a convenience and that's all. -- Steven D'Aprano From steve at pearwood.info Mon Mar 9 18:03:04 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 Mar 2009 04:03:04 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> Message-ID: <200903100403.04733.steve@pearwood.info> On Tue, 10 Mar 2009 03:43:03 am average wrote: > While there are several complaining that all this can be done with a > def, there's an critical distinction being overlooked. Passing > around code blocks is a very different style of programming that > Python or most languages like it have ever experimented with. The > programming art itself hasn't really even explored the different > range of thinking this style of programming opens up. Where as most > function > definitions are verb-like, this would be a noun-like definition. > Akin perhaps to the difference between a hormone in the body and a > neurotransmitter, respectively.Setting it off with a new keyword, or > expanding the use of lambda is really a mandatory way of signifying > this INTENT. 
Marcos,

I'm afraid that I don't understand your analogy, or your argument. An anonymous code block is just like a named function, except it doesn't have a name. Can you explain why:

    func(named_function)

is radically different from:

    func(multi-line-code-block-without-the-name)

please?

We can already do this in Python, using functions made up of a single expression:

    func(lambda: expr)

I use this frequently, because it is sometimes convenient, but there is nothing I can do with a lambda that I can't do with a named function. I don't see that introducing multi-lined lambdas will change that.

I'm trying to keep an open-mind here, but I also don't understand your analogy. In what way are named functions like neurotransmitters, or hormones? Which one is supposed to be verb-like and which one is noun-like? What does that even mean?

--
Steven D'Aprano

From grosser.meister.morti at gmx.net Mon Mar 9 18:11:20 2009
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Mon, 09 Mar 2009 18:11:20 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References:
Message-ID: <49B54DB8.3090101@gmx.net>

tav wrote:
> Hey all,
>
> I've come up with a way to do Ruby-style blocks in what I feel to be a
> Pythonic way:
>
>     using employees.select do (employee):
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)
>

Maybe if you come up with an example that isn't written with already existing python syntax as easy (or even more easily):

    for employee in employees:  # or employees.select() if you like
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)

-panzi

From tav at espians.com Mon Mar 9 18:15:13 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 17:15:13 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <200903100403.04733.steve@pearwood.info>
References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com>
<200903100403.04733.steve@pearwood.info>
Message-ID:

Hey Steven,

> Can you explain why:
>     func(named_function)
> is radically different from:
>     func(multi-line-code-block-without-the-name)

Hmz, the intention isn't to support multi-line lambdas. It's to make passing in anonymous functions easier.

For precedent, let's take a look at decorators. Fundamentally, decorators save a user nothing more than a single line of code. Why do @foo, when you could just do:

    func = foo(func)

? But saving developers that extra line of typing has obviously been useful -- you can find decorators used pretty heavily in many of the major Python frameworks nowadays... By easing up some of the hassle, we can encourage certain forms of development.

--
love, tav

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

From guido at python.org Mon Mar 9 18:26:14 2009
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Mar 2009 10:26:14 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info>
Message-ID:

On Mon, Mar 9, 2009 at 10:15 AM, tav wrote:
>> Can you explain why:
>>     func(named_function)
>> is radically different from:
>>     func(multi-line-code-block-without-the-name)
>
> Hmz, the intention isn't to support multi-line lambdas. It's to make
> passing in anonymous functions easier.

Well, that works only as long as there is only a single anonymous function to pass in and it's the last argument. Plus it doesn't work with an existing function that takes a function argument -- your function (if I understand your proposed __do__ implementation correctly) must really be a generator.
If it was about saving a line of code it would have been booed out of the room. (Especially since the line count is actually the same with or without using the decorator syntax!) The big improvement that decorators offer is to move the "decoration" from the end of the function body, where it is easily missed, to the front of the declaration, where it changes the emphasis for the reader. I don't see a similar advantage in your example; it looks more like "Ruby-envy" to me.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dreamingforward at gmail.com Mon Mar 9 18:43:43 2009
From: dreamingforward at gmail.com (average)
Date: Mon, 9 Mar 2009 10:43:43 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com>
Message-ID: <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com>

>> [RE: "using" keyword] Setting it off with a new keyword, or
>> expanding the use of lambda is really a mandatory way of signifying
>> this INTENT.
>
> Your claim that this is somehow something new seems to be overlooking
> Lisp and Smalltalk, as well as Ruby which was mentioned in the quoted
> blog post.

Acknowledged. However, the power of such a language as Lisp is under-appreciated (as I think all who know it can agree), and the problem with it (and the challenge of language design in general) is how best to *organize* that power; i.e. that generality. In my view, Lisp is like assembly language for the mind. It's powerful, but not easily organized or visualized into a fashion where one can see and evaluate the building up of and into higher-level constructs. My point was really about how the programming art *itself* hasn't fully explored this concept to even be able to *evaluate* the power and usefulness of employing techniques such as code blocks.
To me, Python and Ruby are both exciting and interesting examples of how language design is evolving to find ways to express and evolve that power. In my mind, there is no doubt that languages will have to find elegant ways to express that power. What's cool about Python and Ruby is that it's taking that vast general space of the "mind's assembly language" and distilling it down into nicely manageable and elegant chunks of language syntax. The concept of distinct, passable code blocks is a nice example of that compression, one that certainly has correspondence within our biology.

Thanks for the dialog though...

marcos

From steve at pearwood.info Mon Mar 9 18:50:56 2009
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 10 Mar 2009 04:50:56 +1100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info>
Message-ID: <200903100450.56903.steve@pearwood.info>

On Tue, 10 Mar 2009 04:15:13 am you wrote:
> Hey Steven,
>
>> Can you explain why:
>>     func(named_function)
>> is radically different from:
>>     func(multi-line-code-block-without-the-name)
>
> Hmz, the intention isn't to support multi-line lambdas. It's to make
> passing in anonymous functions easier.

Lambdas are single-line (technically, single-statement) anonymous functions, and it's already easy to pass them in:

    caller(lambda args: statement)

A multi-line lambda (technically, multi-statement) would also be an anonymous function. Your syntax:

    using caller (args) do:
        multiple
        lines
        making an anonymous function

seems to me to be defining a multi-line lambda. The three lines of the indented block are exactly equivalent to the statement of a lambda. What is the difference that you see?
--
Steven D'Aprano

From tav at espians.com Mon Mar 9 19:47:18 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 18:47:18 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples]
Message-ID:

Dear all,

Here's another stab at making a case for it. I'll avoid referring to Ruby this time round -- I was merely using it as an example of where this approach has been successful. Believe me, there's no Ruby-envy here ;p

The motivation:

1. Having to name a one-off function adds additional cognitive overload to a developer. It doesn't make the code any cleaner and by taking away the burden, we'd have happier developers and cleaner code.

2. This approach is more descriptive and in line with the code flow. With blocks the first line says "I'm about to define a function for use with X" instead of the existing way which says "I'm defining a function. Now I'm using that function with X."

3. DSLs -- whether we like them or not, they are in mainstream use. Python already has beautiful syntax. We should be leveraging that for DSLs instead of forcing framework developers to create their own ugly/buggy mini-DSLs. This will enable that.

The proposed syntax:

    using EXPR do PARAM_LIST:
        FUNCTION_BODY

Examples:

# Django/App Engine Query

Frameworks like Django or App Engine define DSLs to enable easy querying of datastores by users. Wouldn't it be better if this could be done in pure Python syntax?

Compare the current Django:

    q = Entry.objects.filter(headline__startswith="What").filter(pub_date__lte=datetime.now())

with a hypothetical:

    using Entry.filter do (entry):
        if entry.headline.startswith('What') and entry.pub_date <= datetime.now():
            return entry

Wouldn't the latter be easier for a developer to read/maintain?
Let's compare this App Engine:

    composer = "Lennon, John"
    query = GqlQuery("SELECT * FROM Song WHERE composer = :1", composer)

with:

    composer = "Lennon, John"
    using Song.query do (item):
        if item.composer == composer:
            return item

Again, being able to do it in Python syntax will save developers the hassles of having to learn non-Python DSLs.

# Event-driven Programming

Right now, event-driven programming like it's done in Twisted is rather painful for many developers. It's filled with callbacks and the order in which code is written is completely inverted as far as the average developer is concerned.

Let's take Eventlet -- a nice coroutines-based networking library in Python. Their example webcrawler.py currently does:

    urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
            "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

    def fetch(url):
        print "%s fetching %s" % (time.asctime(), url)
        httpc.get(url)
        print "%s fetched %s" % (time.asctime(), url)

    pool = coros.CoroutinePool(max_size=4)
    waiters = []

    for url in urls:
        waiters.append(pool.execute(fetch, url))

Wouldn't it be nicer to do this instead:

    pool = coros.CoroutinePool(max_size=4)

    for url in urls:
        using pool.execute do:
            print "%s fetching %s" % (time.asctime(), url)
            httpc.get(url)
            print "%s fetched %s" % (time.asctime(), url)

I'd argue that it is -- but then I have bias =)

# SCons

SCons is a make-esque build tool. In the SConstruct (makefile) for Google Chrome, we find:

    def WantSystemLib(env, lib):
        if lib not in env['all_system_libs']:
            env['all_system_libs'].append(lib)
        return (lib in env['req_system_libs'])
    root_env.AddMethod(WantSystemLib, "WantSystemLib")

Which we could hypothetically do as:

    with root_env.WantSystemLib do (env, lib):
        if lib not in env['all_system_libs']:
            env['all_system_libs'].append(lib)
        return (lib in env['req_system_libs'])

As someone who's used both make and SCons, I found SCons terribly verbose and painful to use.
By using the proposed do statement, SCons could be made extremely pleasant!

# Webapp Configuration

Configuration in web applications is generally a real pain:

    application = webapp([('/profile', ProfileHandler), ('/', MainHandler)],
                         debug=True)
    run(application)

Compare to:

    using webapp.runner do (config, routes):
        routes['/profiles'] = ProfileHandler
        routes['/'] = MainHandler
        config.debug = True

I think the latter is more readable and maintainable.

Please let me know if more examples would help... I really do believe that a block syntax would make developers more productive and lead to cleaner code.

--
love, tav

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

From guido at python.org Mon Mar 9 20:10:11 2009
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Mar 2009 12:10:11 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples]
In-Reply-To:
References:
Message-ID:

On Mon, Mar 9, 2009 at 11:47 AM, tav wrote:
> Here's another stab at making a case for it. I'll avoid referring to
> Ruby this time round -- I was merely using it as an example of where
> this approach has been successful. Believe me, there's no Ruby-envy
> here ;p
>
> The motivation:
>
> 1. Having to name a one-off function adds additional cognitive
> overload to a developer. It doesn't make the code any cleaner and by
> taking away the burden, we'd have happier developers and cleaner code.

I showed an example using "def _(...)".

> 2. This approach is more descriptive and in line with the code flow.
> With blocks the first line says "I'm about to define a function for
> use with X" instead of the existing way which says "I'm defining a
> function. Now I'm using that function with X."

Marginal. The decorator name could clarify this too.

> 3. DSLs -- whether we like them or not, they are in mainstream use.
> Python already has beautiful syntax.
We should be leveraging that for > DSLs instead of forcing framework developers to create their own > ugly/buggy mini-DSLs. This will enable that. This is quite the non-sequitur. Given that what you propose is trivially done using a decorator, what's ugly/buggy about the existing approach? > The proposed syntax: > > ?using EXPR do PARAM_LIST: > ? ?FUNCTION_BODY > > Examples: > > # Django/App Engine Query > > Frameworks like Django or App Engine define DSLs to enable easy > querying of datastores by users. Wouldn't it better if this could be > done in pure Python syntax? > > Compare the current Django: > > ?q = Entry.objects.filter(headline__startswith="What").filter(pub_date__lte=datetime.now()) > > with a hypothetical: > > ?using Entry.filter do (entry): > ? ? ?if entry.headline.startswith('What') and entry.pub_date <= datetime.now(): > ? ? ? ? ?return entry Hmm... where does the 'return' return to? The "current Django" has an assignment to q. How do you set q in this example? > Wouldn't the latter be easier for a developer to read/maintain? But it's not the same. If they had wanted you to be able to write that they could easily have provided an iterator (and actually of course they do -- it's the iterator over all records). But the (admittedly awkward) current syntax *executes the query in the database engine*. There's no way (without proposing a lot of other changes to Python anyway) to translate the Python code in the body of your example to SQL, because Python (the language, anyway -- not all implementations support access to the bytecode) doesn't let you recover the source code of a block at run time. > Let's compare this App Engine: > > ?composer = "Lennon, John" > ?query = GqlQuery("SELECT * FROM Song WHERE composer = :1", composer) > > with: > > ?composer = "Lennon, John" > ?using Song.query do (item): > ? ? ?if item.composer == composer: > ? ? ? ? 
return item > > Again, being able to do it in Python syntax will save developers the > hassles of having to learn non-Python DSLs. Again, it's broken the same way. > # Event-driven Programming > > Right now, event-driven programming like it's done in Twisted is > rather painful for many developers. It's filled with callbacks and the > order in which code is written is completely inverted as far as the > average developer is concerned. > > Let's take Eventlet -- a nice coroutines-based networking library in > Python. Their example webcrawler.py currently does: > > urls = ["http://www.google.com/intl/en_ALL/images/logo.gif", > "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"] > > def fetch(url): > print "%s fetching %s" % (time.asctime(), url) > httpc.get(url) > print "%s fetched %s" % (time.asctime(), url) > > pool = coros.CoroutinePool(max_size=4) > waiters = [] > > for url in urls: > waiters.append(pool.execute(fetch, url)) > > Wouldn't it be nicer to do this instead: > > pool = coros.CoroutinePool(max_size=4) > > for url in urls: > using pool.execute do: > print "%s fetching %s" % (time.asctime(), url) > httpc.get(url) > print "%s fetched %s" % (time.asctime(), url) > > I'd argue that it is -- but then I have bias =) It's also broken -- unless you also have some kind of alternative semantics in mind that does *not* map to existing Python functions and scopes, all callbacks will reference the last url in the list. Compare this classic stumbling block: addN = [(lambda x: x+i) for i in range(10)] add1 = addN[1] print add1(10) # prints 19 > # SCons > > SCons is a make-esque build tool. In the SConstruct (makefile) for > Google Chrome, we find: > > def WantSystemLib(env, lib): > if lib not in env['all_system_libs']: > env['all_system_libs'].append(lib) >
return (lib in env['req_system_libs']) > root_env.AddMethod(WantSystemLib, "WantSystemLib") > > Which we could hypothetically do as: > > with root_env.WantSystemLib do (env, lib): s/with/using/ > if lib not in env['all_system_libs']: > env['all_system_libs'].append(lib) > return (lib in env['req_system_libs']) That's just a matter of API design. They could easily have provided a decorator to register the callback. The name of the decorated function could even serve as the key. > As someone who's used both make and SCons, I found SCons terribly > verbose and painful to use. By using the proposed do statement, SCons > could be made extremely pleasant! Quite the exaggeration. > # Webapp Configuration > > Configuration in web applications is generally a real pain: > > application = webapp([('/profile', ProfileHandler), ('/', MainHandler)], > debug=True) > run(application) > > Compare to: > > using webapp.runner do (config, routes): > routes['/profiles'] = ProfileHandler > routes['/'] = MainHandler > config.debug = True > > I think the latter is more readable and maintainable. Again, there's no need to add new syntax if you wanted to make this API easier on the eyes. > Please let me know if more examples would help... Well, examples that (a) aren't broken and (b) aren't trivially written using decorators might help... > I really do believe that a block syntax would make developers more > productive and lead to cleaner code. I got that.
:-) > -- > love, tav > > plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 > http://tav.espians.com | http://twitter.com/tav | skype:tavespian > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bruce at leapyear.org Mon Mar 9 20:19:50 2009 From: bruce at leapyear.org (Bruce Leban) Date: Mon, 9 Mar 2009 12:19:50 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples] In-Reply-To: References: Message-ID: > Again, being able to do it in Python syntax will save developers the > hassles of having to learn non-Python DSLs. Hmm. For your GqlQuery example, the way I see it doing it in Python syntax will save the hassle of GqlQuery optimizing the database access. The truth is that you don't know what happens to the query string but you do know that it can't take your function apart and have the database process it. So this example is useless. I also can't tell if using/do is supposed to be a loop or not. Some of your examples are loopy and some aren't. And besides that your examples are apples and oranges: composer = "Lennon, John" query = GqlQuery("SELECT * FROM Song WHERE composer = :1", composer) with: composer = "Lennon, John" using Song.query do (item): if item.composer == composer: return item The first one produces a result: something bound to a variable named query. What does the second one produce? I see no advantage in a new syntax that is this confusing. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From denis.spir at free.fr Mon Mar 9 21:22:25 2009 From: denis.spir at free.fr (spir) Date: Mon, 9 Mar 2009 21:22:25 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> Message-ID: <20090309212225.56d18547@o> On Mon, 9 Mar 2009 09:43:03 -0700, average wrote: > While there are several complaining that all this can be done with a > def, there's a critical distinction being overlooked. Passing around > code blocks is a very different style of programming than Python or > most languages like it have ever experimented with. This style of programming is very common in stack-based languages, or rather in concatenative languages in general. :square dup * # def square(x): return x*x :squares [square] map # def squares(l): return [square(x) for x in l] # using equivalent of first class func :squares [dup *] map # using anonymous func def [1 2 3] squares ==> [1 4 9] In the latter form, the func literal expression -- often called 'quotation' because it 'quotes' code without executing it -- can be whatever and as long as needed, which is easy due to the linear style of stack-based programming. "Higher order" functions like map, called combinators, are very frequent and (unlike in other paradigms) make the code clearer. But they are not really higher order functions, rather they take several kinds of data as input, one of which happens to be code that will be 'unquoted', i.e. run -- closer to Lisp's eval.
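[Editorial note: the Joy-style examples above translate almost line-for-line into Python's first-class functions. A minimal sketch, not from the original mail:]

```python
def square(x):
    # Joy:  :square  dup *
    return x * x

def squares(lst):
    # Joy:  :squares  [square] map  -- passing a named function
    return [square(x) for x in lst]

# The anonymous quotation [dup *] becomes an inline lambda.
squares_anon = lambda lst: [(lambda x: x * x)(x) for x in lst]

print(squares([1, 2, 3]))       # [1, 4, 9]
print(squares_anon([1, 2, 3]))  # [1, 4, 9]
```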
see http://www.latrobe.edu.au/philosophy/phimvt/joy/faq.html for a nice introduction (but rather distinctive to FP) esp section #10 denis ------ la vita e estrany From dreamingforward at gmail.com Mon Mar 9 21:22:53 2009 From: dreamingforward at gmail.com (average) Date: Mon, 9 Mar 2009 13:22:53 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <7afdee2f0903091126jba6a941v597b84d9ab40742b@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <7afdee2f0903091126jba6a941v597b84d9ab40742b@mail.gmail.com> Message-ID: <913f9f570903091322s2eb078b2jc0bb93d19d8b3872@mail.gmail.com> > average wrote: > Perhaps I've misunderstood, but both Perl and Javascript (highly > popular languages by any standard) support "passing around code > blocks" by defining anonymous functions. How can you say that most > languages like Python have never experimented with this, when of the > more popular programming languages, Javascript and Perl are the most > obviously similar to Python (besides Ruby)? You're likely right. And I'm probably being sloppier than I should. My point was really more about how the art of programming has yet to really explore the concept and power of code-blocks adequately. Most of us are comfortably stuck in our decades of procedural programming experience. What's misleading about framing this discussion is that all of us are [over]used to the "flatland" of the program editor. Code blocks appear on the screen like any other code, but the *critical* point is that LOGICALLY they are ORTHOGONAL to it. Where most of your code could be organized in a tree-like fashion rooted in your program's "main" node, code blocks are orthogonal and are really like leaves spanning back into the screen to a *different* tree's root (the *application's* surface) hence its subtlety. Hoping that analogy is more useful.... 
marcos From guido at python.org Mon Mar 9 21:29:16 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Mar 2009 13:29:16 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903091322s2eb078b2jc0bb93d19d8b3872@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <7afdee2f0903091126jba6a941v597b84d9ab40742b@mail.gmail.com> <913f9f570903091322s2eb078b2jc0bb93d19d8b3872@mail.gmail.com> Message-ID: On Mon, Mar 9, 2009 at 1:22 PM, average wrote: > What's misleading about framing this discussion is that all of us are > [over]used to the "flatland" of the program editor. Code blocks > appear on the screen like any other code, but the *critical* point is > that LOGICALLY they are ORTHOGONAL to it. Where most of your code > could be organized in a tree-like fashion rooted in your program's > "main" node, code blocks are orthogonal and are really like leaves > spanning back into the screen to a *different* tree's root (the > *application's* surface) hence its subtlety. That's only one use of callbacks. One could claim that a confusing part of anonymous blocks (as used in SmallTalk and Ruby) is that they use the same syntax for both use cases: you can't tell from the syntax whether the block is executed in the place where you see it (perhaps in a loop or with an error handler wrapped around it, or conditionally like in SmallTalk's "if" construct), or squirreled away for later use (once or many times). The good thing about function syntax (as used in JavaScript's anonymous blocks) is that it leaves no doubt about these two different uses: at least by convention, anonymous functions are used for asynchronous programming, while in-line code uses regular block syntax.
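[Editorial note: Guido's distinction between code that runs where you see it and code that is squirreled away for later can be made concrete with a small sketch; the names are illustrative, not from the original mail:]

```python
# Runs in place: ordinary block syntax executes immediately, where you see it.
log = []
for i in range(3):
    log.append(i)

# Squirreled away: function syntax signals deferred execution.
callbacks = []

def register(fn):
    # Store the callback for later; nothing runs at registration time.
    callbacks.append(fn)

register(lambda: log.append(99))   # nothing happens yet
assert log == [0, 1, 2]

for cb in callbacks:               # the deferred code runs only now
    cb()
assert log == [0, 1, 2, 99]
```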
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Mon Mar 9 21:58:43 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Mar 2009 09:58:43 +1300 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> Message-ID: <49B58303.5080802@canterbury.ac.nz> average wrote: > MY point was really about how the programming art *itself* hasn't > fully explored this concept to even be able to *evaluate* the power > and usefulness of employing techniques such as code blocks. On the contrary, I think Smalltalk has explored it very well. Smalltalk implements *all* control structures in terms of code blocks, and does so in a very readable way, without any of the brain-exploding characteristics of Lisp. It works well in Smalltalk because the whole language syntax is designed from the ground up to accommodate it. Ruby inherits the idea, but struggles to fit it into its syntax, leading to a much-weakened form (you can only pass one code block to a given method at a time). It's even harder to fit the idea into Python's syntax. This isn't just because of the indentation issue, but also because Pythonistas tend to have a higher standard of aesthetics when comes to syntax design. Ruby can get away with looking a bit messy and haphazard, but that's not acceptable in the Python community. There are also semantic problems with the idea in Python. Once you're allowed to write the code block in-line, it becomes expected that you can write things like: while some_condition: with flapple() do (arg): if some_other_condition: break and have the 'break' exit from the while-loop. But if the body is actually a separate function, this is not easy to arrange. 
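[Editorial note: the difficulty Greg describes can be sketched directly. If the block body becomes a function, a plain 'break' inside it is a SyntaxError, so it would have to be emulated with a special control-flow exception. 'flapple' here is a hypothetical stand-in for any block-taking API:]

```python
class BreakLoop(Exception):
    """Hypothetical control-flow exception emulating 'break' from a body-as-function."""

def flapple(body):
    # Stand-in for a 'using ... do' construct: calls the body with one argument.
    body(42)

seen = []
while True:
    try:
        def body(arg):
            seen.append(arg)
            raise BreakLoop  # a literal 'break' here would be a SyntaxError
        flapple(body)
    except BreakLoop:
        break

assert seen == [42]
```

This is exactly the "messy and complicated" machinery the with-statement designers rejected.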
Back when the existing with-statement was being designed, there was serious thought put towards implementing it by passing the body as a function. But handling 'break', 'continue', 'return' and 'yield' inside the body would have required raising special control-flow exceptions, and it all got very messy and complicated. In the end it was decided not to be worth the hassle, and the existing generator-based implementation was settled on. -- Greg From tav at espians.com Mon Mar 9 22:20:07 2009 From: tav at espians.com (tav) Date: Mon, 9 Mar 2009 21:20:07 +0000 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <49B58303.5080802@canterbury.ac.nz> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> <49B58303.5080802@canterbury.ac.nz> Message-ID: Hey Greg, > It becomes expected that you can write things like: > > while some_condition: > with flapple() do (arg): > if some_other_condition: > break > > and have the 'break' exit from the while-loop. Thanks for this!! It's the only counter-argument that I've seen which demonstrates that my proposal was unpythonic. As such, I'd like to withdraw my proposal -- sorry for taking up everybody's time =( But, hey, live and learn =) -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From tjreedy at udel.edu Mon Mar 9 22:26:11 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Mar 2009 17:26:11 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> Message-ID: Guido van Rossum wrote: >> For precedence let's take a look at decorators. Fundamentally, >> decorators save a user nothing more than a single line of code. > > I guess you weren't there at the time.
If it was about saving a line > of code it would have been boohed out of the room. (Especially since > the line count is actually the same with or without using the > decorator syntax!) The big improvement that decorators offer is to > move the "decoration" from the end of the function body, where it is > easily missed, to the front of the declaration, where it changes the > emphasis for the reader. I don't see a similar advantage in your > example; it looks more like "Ruby-envy" to me. Plus decorators save 2 retypings of the function name, which was especially important for people who use very_long_multi_word_func_names. From greg.ewing at canterbury.ac.nz Mon Mar 9 22:33:04 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Mar 2009 10:33:04 +1300 Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples] In-Reply-To: References: Message-ID: <49B58B10.4090806@canterbury.ac.nz> tav wrote: > q = Entry.objects.filter(headline__startswith="What").filter(pub_date__lte=datetime.now()) > > using Entry.filter do (entry): > if entry.headline.startswith('What') and entry.pub_date <= datetime.now(): > return entry How is this any better than [entry for entry in Entry.filter if entry.headline.startswith('What') and entry.pub_date <= datetime.now()] But note that neither of these is an adequate replacement for the Django expression, because Django can generate an SQL query incorporating the filter criteria. Neither a list comprehension nor the proposed using-statement are capable of doing that. The same thing applies to the App Engine example, or any other relational database wrapper. By the way, having the 'return' in there doing something other than return from the function containing the 'using' statement would be confusing and inconsistent with the rest of the language. 
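[Editorial note: the decorator-registration pattern that keeps coming up in this thread -- registering a named function instead of passing an anonymous block, with the function's name serving as the key -- can be sketched as follows. The registry and names are hypothetical, not any real SCons API:]

```python
callbacks = {}

def register(fn):
    # The decorated function's own name serves as the registration key.
    callbacks[fn.__name__] = fn
    return fn

@register
def want_system_lib(env, lib):
    if lib not in env['all_system_libs']:
        env['all_system_libs'].append(lib)
    return lib in env['req_system_libs']

env = {'all_system_libs': [], 'req_system_libs': ['zlib']}
assert callbacks['want_system_lib'](env, 'zlib') is True
assert env['all_system_libs'] == ['zlib']
```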
> # Event-driven Programming I've been exploring this a bit myself in relation to my yield-from proposal, and doing this sort of thing using generators is a much better idea, I think, especially given something like a yield-from construct. -- Greg From g.brandl at gmx.net Mon Mar 9 22:51:50 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 09 Mar 2009 22:51:50 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <200903100450.56903.steve@pearwood.info> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> Message-ID: Steven D'Aprano schrieb: > On Tue, 10 Mar 2009 04:15:13 am you wrote: >> Hey Steven, >> >> > Can you explain why: >> > func(named_function) >> > is radically different from: >> > func(multi-line-code-block-without-the-name) >> >> Hmz, the intention isn't to support multi-line lambdas. It's to make >> passing in anonymous functions easier. > > Lambdas are single-line (technically, single-statement) anonymous If you really want to get technical, it's single-expression. nit-pickingly-yrs, Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. 
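[Editorial note: Georg's nit-pick is worth spelling out -- a lambda body is exactly one expression, never a statement, though the expression itself may branch:]

```python
# Fine: the body is a single expression (here, a conditional expression).
classify = lambda x: 'even' if x % 2 == 0 else 'odd'
assert classify(4) == 'even'
assert classify(7) == 'odd'

# Not allowed: statements such as assignment cannot appear in a lambda.
#   broken = lambda x: y = x * 2    # SyntaxError
```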
From tjreedy at udel.edu Mon Mar 9 23:04:32 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Mar 2009 18:04:32 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <200903100450.56903.steve@pearwood.info> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> Message-ID: Steven D'Aprano wrote: > Lambdas are single-line (technically, single-statement) anonymous > functions, Lambdas are function-defining expressions used *within* statements that give the resulting function object a stock .__name__ of '<lambda>'. The syntax could have been augmented to include a real name, so the stock-name anonymity is a side-effect of the chosen syntax. Possibilities include lambda name(args): expression lambda args: expression The latter, assuming it is LL(1) parse-able, would even be compatible with existing code and could still be added. Contrarywise, function-defining def statements could have been allowed to omit the name. To be useful, the object (with a .__name__ such as '<lambda>') would have to get a default namespace binding such as to '_', even in batch mode. > caller(lambda args: statement) Change 'statement' to 'expression'. > A multi-line lambda (technically, multi-statement) The problem is that 'multi-statement expression' is an oxymoron in Pythonland. > would also be an anonymous function. Not necessarily, and irrelevant to the essence of lambda expressions, which is that they are expressions that can be used within statements.
Terry Jan Reedy From tjreedy at udel.edu Mon Mar 9 23:31:36 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Mar 2009 18:31:36 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> Message-ID: average wrote: >>> [RE:"using" keyword].Setting it off with a new keyword, or > MY point was really about how the programming art *itself* hasn't > fully explored this concept to even be able to *evaluate* the power > and usefulness of employing techniques such as code blocks. To me, > Python and Ruby are both exciting and interesting examples of how > language design is evolving to find ways to express and evolve that > power. In my mind, there is no doubt that languages will have to find > elegant ways to express that power. What's cool about Python and Ruby > is that it's taking that vast general space of the "mind's assembly > language" and distilling it down into nicely manageble and elegant > chunks of language syntax. The concept of distinct, passable code > blocks is a nice example of that compression, one that certainly has > correspondence within our biology. A function is a possibly parameterized code block that can be passed around and called. Anonymity is a defect, not an advantage. So your attempted differentiation looks a bit like mystical gibberish to me. Sorry. 
From tjreedy at udel.edu Tue Mar 10 00:09:52 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Mar 2009 19:09:52 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <20090309154505.GA18115@panix.com> References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: Aahz wrote: > On Mon, Mar 09, 2009, Sturla Molden wrote: >> I see no reason for introducing two new keywords to do this, as you are >> really just enhancing the current lambda keyword. >> >> On the other hand, turning blocks into anonymous functions would be very >> useful for functional programming. As such, I like your suggestion. > > There's a substantial minority (possibly even a majority) in the Python > community that abhors functional programming. I am not one of them, if there really are such. > Even among those who like functional programming, > there's a substantial population that dislikes > extensive use of anonymous functions. Like many other Pythonistas I recognize that an uninformative stock name of '<lambda>' is defective relative to an informative name that points back to readable code. What I dislike is the anonymity-cult claim that the defect is a virtue. Since I routinely use standard names 'f' and 'g' (from math) to name functions whose name I do not care about, I am baffled (and annoyed) by (repeated) claims such as "Having to name a one-off function adds additional cognitive overload to a developer." (Tav). Golly gee, if one cannot decide on standard one-char name, how can he manage the rest of Python? (I also, like others, routinely use 'C' for class and 'c' for C instance. What next? A demand for anonymous classes? Whoops, I just learned that Java has those.) But I have no problem with the use of lambda expressions as a convenience, where appropriate.
Terry Jan Reedy From steve at pearwood.info Tue Mar 10 00:17:00 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 Mar 2009 10:17:00 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100450.56903.steve@pearwood.info> Message-ID: <200903101017.00551.steve@pearwood.info> On Tue, 10 Mar 2009 09:04:32 am Terry Reedy wrote: > Steven D'Aprano wrote: > > Lambdas are single-line (technically, single-statement) anonymous > > functions, > > Lambdas are function-defining expressions used *within* statements [...] Ah, sorry for the brain-o, I was thinking "expression" and typing "statement". But thanks for the detailed explanation anyway, it is helpful. -- Steven D'Aprano From guido at python.org Tue Mar 10 00:55:18 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Mar 2009 16:55:18 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: > Like many other Pythonistas I recognize that an uninformative stock > name of '<lambda>' is defective relative to an informative name that points > back to readable code. What I dislike is the anonymity-cult claim that the > defect is a virtue. > > Since I routinely use standard names 'f' and 'g' (from math) to name > functions whose name I do not care about, I am baffled (and annoyed) by > (repeated) claims such as "Having to name a one-off function adds additional > cognitive overload to a developer." (Tav). Golly gee, if one cannot decide > on standard one-char name, how can he manage the rest of Python? > > (I also, like others, routinely use 'C' for class and 'c' for C instance. > What next? A demand for anonymous classes? Whoops, I just learned that > Java has those.)
> > But I have no problem with the use of lambda expressions as a convenience, > where appropriate. Andrew Koenig once gave me a good use case where lambdas are really a lot more convenient than named functions. He was initializing a large data structure that was used by an interpreter for some language. It was a single expression (probably a list of tuples or a dict). Each record contained various bits of information (e.g. the operator symbol and its precedence and associativity) as well as a function (almost always a very simple lambda) that implemented it. Since this table was 100s of records long, it would have been pretty inconvenient to first have to define 100s of small one-line functions and give them names, only to reference them once in the initializer. This use case doesn't have a nice equivalent without anonymous functions (though I'm sure that if there really was no other way it could be done, e.g. using registration-style decorators). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jan.kanis at phil.uu.nl Tue Mar 10 01:13:42 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Tue, 10 Mar 2009 01:13:42 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) Message-ID: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> This being python-ideas, I'll also have a go at it. Being someone who does like functional programming when used in limited quantities, I also think multi-line lambdas (or blocks, whatever you call them) are a good thing if a good way could be found to embed them into Python. But I don't like the part of tav's proposal of handling them with a magic __do__ function.
So what about this slightly modified syntax and semantics: def NAME(ARGS) [as OTHERNAME] with EXPRESSION_CONTAINING_NAME: BODY eg: def callback(param) as result with do_something(with_our(callback), other, args): print("called back with "+param) return foobar(param) this would be equivalent to def callback(param): print("called back with "+param) return foobar(param) result = do_something(with_our(callback), other_args) This example tav gave for url in urls: using pool.execute do: print "%s fetching %s" % (time.asctime(), url) httpc.get(url) print "%s fetched %s" % (time.asctime(), url) would be written as for url in urls: def fetch(url) with pool.execute(fetch): print "%s fetching %s" % (time.asctime(), url) httpc.get(url) print "%s fetched %s" % (time.asctime(), url) Compared to tavs proposal this would: - allow for use in expressions where the block is not the only argument, the block could even be passed in multiple parameter positions - make it clear that a function is being defined, and this syntax even allows for re-using the function later on, including testing etc. - not use any new keywords, as it is both syntactically and semantically an extension of the 'def' keyword. I think it also caters to those circumstances where you'd want to use a multiline lambda, without having the awkward reordering of first having to define the function and then passing it by name where the emphasis has to be on what you do with the function (analogous to decorators). It also does everything (I think) that Rubys blocks do. It does not solve the case where you'd want to pass multiple blocks/multiline lambdas to a function, but hey, you can't solve everything. There are several variations of this syntax that could also be considered, eg having the 'with' clause before the 'as' clause, or making the 'with' clause optional as well. 
Or doing 'using foo(callback, other, args) as answer with callback(param):', that would require a new keyword, but would allow the expression using the new function to be first, as it is presumably the most important part. my €0,02, - Jan From santagada at gmail.com Tue Mar 10 02:04:52 2009 From: santagada at gmail.com (Leonardo Santagada) Date: Mon, 9 Mar 2009 22:04:52 -0300 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> Message-ID: <73022CA5-ED35-4FF9-A4B8-1013816F5288@gmail.com> On Mar 9, 2009, at 9:13 PM, Jan Kanis wrote: > def callback(param) as result with do_something(with_our(callback), > other, args): > print("called back with "+param) > return foobar(param) > > > this would be equivalent to > > def callback(param): > print("called back with "+param) > return foobar(param) > > result = do_something(with_our(callback), other_args) Not only does the equivalent code look much cleaner, the only good thing the proposal actually does (not having to first define a function to then use it) can be accomplished with a decorator. Thanks GvR and all that finally shed some light on Ruby blocks. I never understood what was so special about them, now I know it is nothing really.
:) -- Leonardo Santagada santagada at gmail.com From grosser.meister.morti at gmx.net Tue Mar 10 03:16:48 2009 From: grosser.meister.morti at gmx.net (Mathias Panzenböck) Date: Tue, 10 Mar 2009 03:16:48 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: <49B5CD90.1060205@gmx.net> Guido van Rossum wrote: > On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: >> Like many other Pythonistas I recognize that an uninformative stock >> name of '<lambda>' is defective relative to an informative name that points >> back to readable code. What I dislike is the anonymity-cult claim that the >> defect is a virtue. >> >> Since I routinely use standard names 'f' and 'g' (from math) to name >> functions whose name I do not care about, I am baffled (and annoyed) by >> (repeated) claims such as "Having to name a one-off function adds additional >> cognitive overload to a developer." (Tav). Golly gee, if one cannot decide >> on standard one-char name, how can he manage the rest of Python? >> >> (I also, like others, routinely use 'C' for class and 'c' for C instance. >> What next? A demand for anonymous classes? Whoops, I just learned that >> Java has those.) >> >> But I have no problem with the use of lambda expressions as a convenience, >> where appropriate. > > Andrew Koenig once gave me a good use case where lambdas are really a > lot more convenient than named functions. He was initializing a large > data structure that was used by an interpreter for some language. It > was a single expression (probably a list of tuples or a dict). Each > record contained various bits of information (e.g. the operator symbol > and its precedence and associativity) as well as a function (almost > always a very simple lambda) that implemented it.
Since this table was > 100s of records long, it would have been pretty inconvenient to first > have to define 100s of small one-line functions and give them names, > only to reference them once in the initializer. > > This use case doesn't have a nice equivalent without anonymous > functions (though I'm sure that if there really was no other way it > could be done, e.g. using registration-style decorators). > With a small trick you don't need lambdas for this, if the keys are python identifiers. If they aren't you can add the real key to the info tuple and then generate a new dict in a oneliner. But yeah, it's a bit ugly/abusive. But it is possible without defining parts of the tuple at different places! :) def info(*args): def wrapper(f): return args + (f,) return wrapper def mkdict(): @info("a",12,9.9,None) def foo(): print "this is foo" @info("b",23,0.0,foo) def bar(): print "this is bar" @info("c",42,3.1415,None) def baz(): print "this is baz" return locals() print mkdict() -panzi From stephen at xemacs.org Tue Mar 10 05:14:10 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 10 Mar 2009 13:14:10 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: <87iqmi5819.fsf@xemacs.org> Terry Reedy writes: > Like many other Pythonistas I recognize that an uninformative stock > name of '<lambda>' is defective relative to an informative name that > points back to readable code. What I dislike is the anonymity-cult > claim that the defect is a virtue. That's unfair. Python has "anonymous blocks" all over the place, since every control structure controls one or more of them. It simply requires that they be forgotten at the next DEDENT. Surely you don't advocate that each of them should get a name! I think this is a difference of cognition.
Specifically, people who don't want to name blocks as functions may not abstract processes to signatures as easily, and reify whole processes (including all free identifiers!) as objects more easily, as those who don't think naming is a problem. > Since I routinely use standard names 'f' and 'g' (from math) to name > functions whose name I do not care about, I am baffled (and annoyed) by If the cognition hypothesis is correct, of course you're baffled. You "just don't" think that way, while he really does. The annoyance can probably be relieved by s/a developer/some developers/ here: > (repeated) claims such as "Having to name a one-off function adds > additional cognitive overload to a developer." (Tav). I suspect that "overload" is a pun, here. Your rhetorical question > Golly gee, if one cannot decide on standard one-char name, how can > he manage the rest of Python? has an unexpected answer: in the rest of Python name overloading is carefully controlled and scoped into namespaces. If my cognition hypothesis is correct, then a standard one-character name really does bother/confuse his cognition, where maintaining the whole structure of the block from one use to the next somehow does not. (This baffles me, too!) The question then becomes "can Python become more usable to developers unlike you and me without losing some Pythonicity?" Guido seems to think not (I read him as pessimistic on both grounds: the proposed syntax is neither as useful nor as Pythonic as Tav thinks it is). 
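The operator-table use case Guido describes above can be sketched roughly like this — a hypothetical illustration, not Andrew Koenig's actual table (the symbols, precedences, and field layout here are made up):

```python
# Each record: (precedence, associativity, implementation). Writing the
# implementations as lambdas keeps the whole table a single expression;
# with named functions, every one-liner would need a separate def.
OPERATORS = {
    "+":  (1, "left",  lambda a, b: a + b),
    "-":  (1, "left",  lambda a, b: a - b),
    "*":  (2, "left",  lambda a, b: a * b),
    "**": (3, "right", lambda a, b: a ** b),
}

def apply_op(symbol, a, b):
    precedence, associativity, func = OPERATORS[symbol]
    return func(a, b)
```

At hundreds of records, the lambda version stays one initializer; the registration-style alternative Guido mentions would spread the same table over hundreds of small defs.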
From cmjohnson.mailinglist at gmail.com Tue Mar 10 06:59:39 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Mon, 9 Mar 2009 19:59:39 -1000 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: Message-ID: <3bdda690903092259h241916a5p862cdf5cf1b00de6@mail.gmail.com>

> Sounds like you might as well write a decorator named @using:
>
> @using(employees.select)
> def _(employee):
>     if employee.salary > developer.salary:
>         fireEmployee(employee)
>     else:
>         extendContract(employee)

I like that (which is why I proposed allowing lambda decorators last month), but I'm uncomfortable with how after all is said and done, _ will either be set to something that's not a callable or to a callable that no one is ever supposed to call. Perhaps if we allowed for this:

@using(employees.select) as results
def (employee):  # It is mandatory that no name be used here and
                 # that parentheses are included
    if employee.salary > developer.salary:
        fireEmployee(employee)
    else:
        extendContract(employee)
# results = [ list of employees ]

to be syntactic sugar for this:

def callback(employee):
    if employee.salary > developer.salary:
        fireEmployee(employee)
    else:
        extendContract(employee)
results = using(employees.select)(callback)
# results = [ list of employees ]

Similarly, the horrible Java-style callback mentioned earlier in the thread, self.Bind(wx.BUTTON, lambda: evt, mybutton), might with a change in API become something like

@Bind(wx.BUTTON, mybutton) as connected_button_object
def (event):
    # do event handling stuff?
    return

Ruby blocks let them write,

>> (1..10).map { |x| 2 * x }
=> [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

which for us would be a simple [2 * x for x in range(1, 11)], but their version has the advantage of being able to be expanded to a series of expressions and statements instead of a single expression if need be. With my proposal, someone could write:

>>> def map_dec(l):
...     def f(callback):
...         return [callback(item) for item in l]
...     return f
...
>>> @map_dec(range(10)) as results
... def (item):
...     return 2 * item
...
>>> results
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Basically, the "as" would be used to indicate "no one cares about the following function by itself, they just want to use it as a callback to get some result". My-doomed-proposally-yours, -- Carl From stephen at xemacs.org Tue Mar 10 08:38:18 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 10 Mar 2009 16:38:18 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87iqmi5819.fsf@xemacs.org> References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: <87d4cp6d5h.fsf@xemacs.org> Stephen J. Turnbull writes: > If my cognition hypothesis is correct, then a standard > one-character name really does bother/confuse his cognition, where > maintaining the whole structure of the block from one use to the > next somehow does not. (This baffles me, too!) This is mis-written. Since the block is "one-off", there is no "next use". So I guess the thought process is "I'm in a context, and I need to operate on it, so now I define the process: ." Not baffling, but still foreign to me personally. From denis.spir at free.fr Tue Mar 10 09:19:36 2009 From: denis.spir at free.fr (spir) Date: Tue, 10 Mar 2009 09:19:36 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> Message-ID: <20090310091936.6e782ca3@o> Le Mon, 09 Mar 2009 18:04:32 -0400, Terry Reedy s'exprima ainsi: > Lambdas are function-defining expressions used *within* statements that > give the resulting function object a stock .__name__ of '<lambda>'. The > syntax could have been augmented to include a real name, so the > stock-name anonymity is a side-effect of the chosen syntax.
> Possibilities include > lambda name(args): expression > lambda args: expression > The latter, assuming it is LL(1) parse-able, would even be compatible > with existing code and could still be added. > > Contrarywise, function-defining def statements could have been allowed > to omit the name. To be useful, the object (with a .__name__ such as > '', would have to get a default namespace binding such as to '_', > even in batch mode. I do not agree with that. It is missing the point of lambdas. Lambdas are snippets of code equivalent to expressions to be used in place. Lambdas are *not* called, need not beeing callable, rather they are *used* by higher order functions like map. The fact that they do not have any name in syntax thus properly matches their semantic "anonymousity" ;-) > > A multi-line lambda (technically, multi-statement) > > The problem is that 'multi-statement expression' is an oxymoron in > Pythonland. > > > would also be an anonymous function. > > Not necessarily, and irrelevant to the essence of lambda expressions, > which is that they are expressions that can be used within statements. I do not see any contradiction with the "essence of lambda expressions" here. We could have a syntax for multi-statement lambdas without any semantic contradiction. The issue is more probably that it does not fit well python's style and syntax (esp. indent) and would hinder legibility and simplicity. print map(lambda x: {fact = None if x<0 else factorial(x); return "%s: %s" %(x,fact)}, seq) It's ugly, sure. Still, I do not see the "essence of lambda expressions" twisted. I vote -1 to "block-lambdas" for the sake of clarity only. 
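Denis's one-liner above works out cleanly if the two statements are moved into a small named helper — a sketch for comparison (`factorial` comes from the stdlib `math` module; `seq` is a made-up sample sequence):

```python
from math import factorial

def describe(x):
    # The two statements the hypothetical block-lambda would hold.
    fact = None if x < 0 else factorial(x)
    return "%s: %s" % (x, fact)

seq = [-1, 0, 3]
print(list(map(describe, seq)))
```

The helper costs one name, but each statement gets its own line and the `map` call stays readable.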
Denis ------ la vita e estrany From denis.spir at free.fr Tue Mar 10 09:58:22 2009 From: denis.spir at free.fr (spir) Date: Tue, 10 Mar 2009 09:58:22 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> Message-ID: <20090310095822.0f957a4d@o> Le Tue, 10 Mar 2009 01:13:42 +0100, Jan Kanis s'exprima ainsi: > This being python-ideas, I'll also have a go at it. > > Being someone who does like functional programing when used in limited > quantities, I also think multi line lambdas (or blocks, whatever you > call them) are a good thing if a good way could be found to embed them > into Python. But I don't like the part of tavs proposal of handling > them with a magic __do__ function. So what about this slightly > modified syntax and semantics: > > def NAME(ARGS) [as OTHERNAME] with EXPRESSION_CONTAINING_NAME: > BODY > > eg: > > def callback(param) as result with do_something(with_our(callback), > other, args): > print("called back with "+param) > return foobar(param) I like this proposal much more than all previous ones. Still, how would you (or anybody else) introduce the purpose, meaning, use of this construct, and its language-level semantics? [This is not disguised critics, neither rethoric question: I'm really interested in answers.] Denis ------ la vita e estrany From sturla at molden.no Tue Mar 10 15:51:24 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 10 Mar 2009 15:51:24 +0100 Subject: [Python-ideas] cd statement? Message-ID: <49B67E6C.6020206@molden.no> When working with Python interactive shell (particularly IDLE started from Windows start menu), one thing I miss is a cd statement. 
Ok, I can do

>>> import os
>>> os.chdir('e:\\work')

But I keep feeling that Matlab's cd statement is more handy:

>> cd e:\work

One other feature that makes Matlab's shell more handy is the whos statement. It lists all variables created from the shell, types, etc. Yes, it is possible to get all local and global names in Python/IDLE, but that is not the same. The variables created interactively get hidden in the clutter. IDLE also lacks a command history. If I e.g. make a typo, why do I have to copy and paste, instead of just hitting the arrow button? Although cosmetically, these three small things keep annoying me. :-( Sturla Molden From phd at phd.pp.ru Tue Mar 10 15:58:59 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Tue, 10 Mar 2009 17:58:59 +0300 Subject: [Python-ideas] cd statement? In-Reply-To: <49B67E6C.6020206@molden.no> References: <49B67E6C.6020206@molden.no> Message-ID: <20090310145859.GA20242@phd.pp.ru> On Tue, Mar 10, 2009 at 03:51:24PM +0100, Sturla Molden wrote: > >> cd e:\work

See DirChanger at http://phd.pp.ru/Software/dotfiles/init.py.html . With it you can do

>>> cd('work')
>>> cd
/home/phd/work

Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From 8mayday at gmail.com Tue Mar 10 16:07:01 2009 From: 8mayday at gmail.com (Andrey Popp) Date: Tue, 10 Mar 2009 18:07:01 +0300 Subject: [Python-ideas] cd statement? In-Reply-To: <20090310145859.GA20242@phd.pp.ru> References: <49B67E6C.6020206@molden.no> <20090310145859.GA20242@phd.pp.ru> Message-ID: Why not use IPython? On Tue, Mar 10, 2009 at 5:58 PM, Oleg Broytmann wrote:
> On Tue, Mar 10, 2009 at 03:51:24PM +0100, Sturla Molden wrote:
>> >> cd e:\work
>
> See DirChanger at http://phd.pp.ru/Software/dotfiles/init.py.html . With
> it you can do
>
> >>> cd('work')
> >>> cd
> /home/phd/work
>
> Oleg.
> --
> Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru
> 
Programmers don't die, they just GOSUB without RETURN.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

-- +7 911 740 24 91 From malaclypse2 at gmail.com Tue Mar 10 16:08:27 2009 From: malaclypse2 at gmail.com (Jerry Hill) Date: Tue, 10 Mar 2009 11:08:27 -0400 Subject: [Python-ideas] cd statement? In-Reply-To: <49B67E6C.6020206@molden.no> References: <49B67E6C.6020206@molden.no> Message-ID: <16651e80903100808t6abc47at329fec2df7ee4ae@mail.gmail.com> On Tue, Mar 10, 2009 at 10:51 AM, Sturla Molden wrote: > IDLE also lacks a command history. If I e.g. make a typo, why do I have to > copy and paste, instead of just hitting the arrow button? Command history in IDLE is bound to alt-n (next) and alt-p (previous) by default. -- Jerry From jjb5 at cornell.edu Tue Mar 10 15:56:01 2009 From: jjb5 at cornell.edu (Joel Bender) Date: Tue, 10 Mar 2009 10:56:01 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: Message-ID: <49B67F81.5030305@cornell.edu> Guido van Rossum wrote:

> @using(employees.select)
> def _(employee):
>     if employee.salary > developer.salary:
>         fireEmployee(employee)
>     else:
>         extendContract(employee)

I personally don't mind anonymous functions, I use them when I can fit everything on mostly one line and they don't have any side effects. None of my decorators that I write actually call the function they are passed either. So there could be a lambda statement...

@using(employees.select)
lambda employee:
    if employee.developer and employee.not_using_python:
        fireDeveloper(employee)

...which would make a function that isn't named, purely for its side effects. -$0.02 Joel From arnodel at googlemail.com Tue Mar 10 17:17:25 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 10 Mar 2009 16:17:25 +0000 Subject: [Python-ideas] cd statement?
In-Reply-To: <49B67E6C.6020206@molden.no> References: <49B67E6C.6020206@molden.no> Message-ID: <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> 2009/3/10 Sturla Molden :
>
> When working with Python interactive shell (particularly IDLE started from
> Windows start menu), one thing I miss is a cd statement. Ok, I can do
>
> >>> import os
> >>> os.chdir('e:\\work')
>
> But I keep feeling that Matlab's cd statement is more handy:
>
> >> cd e:\work
>
> One other feature that makes Matlab's shell more handy is the whos
> statement. It lists all variables created from the shell, types, etc. Yes, it
> is possible to get all local and global names in Python/IDLE, but that is
> not the same. The variables created interactively get hidden in the clutter.
>
> IDLE also lacks a command history. If I e.g. make a typo, why do I have to
> copy and paste, instead of just hitting the arrow button?
>
> Although cosmetically, these three small things keep annoying me. :-(

Have you tried IPython? -- Arnaud From sturla at molden.no Tue Mar 10 17:29:12 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 10 Mar 2009 17:29:12 +0100 Subject: [Python-ideas] cd statement? In-Reply-To: <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> References: <49B67E6C.6020206@molden.no> <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> Message-ID: <49B69558.3090000@molden.no> Arnaud Delobelle wrote: > Have you tried IPython? Yes, it has all that I miss, but it's ugly (at least on Windows, where it runs in a DOS shell). S.M.
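Oleg's DirChanger (linked above) can be approximated like this — a sketch reconstructed from the behaviour he describes, not his actual code. The trick is that the interactive prompt calls `repr()` to display a bare name, so `cd` with no arguments can also do something useful:

```python
import os

class DirChanger(object):
    """Interactive chdir helper: cd('path') changes directory;
    a bare `cd` at the prompt jumps to the home directory."""

    def __call__(self, path):
        os.chdir(os.path.expanduser(path))
        return os.getcwd()

    def __repr__(self):
        # The REPL calls repr() to display a bare `cd`, so use that
        # hook to mimic the shell's argument-less cd.
        os.chdir(os.path.expanduser("~"))
        return os.getcwd()

cd = DirChanger()
```

Dropping something like this into the file named by the PYTHONSTARTUP environment variable makes it available in every interactive session.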
From guido at python.org Tue Mar 10 17:33:27 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 09:33:27 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <20090310095822.0f957a4d@o> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <20090310095822.0f957a4d@o> Message-ID: On Tue, Mar 10, 2009 at 1:58 AM, spir wrote: > Le Tue, 10 Mar 2009 01:13:42 +0100, > Jan Kanis s'exprima ainsi: > >> This being python-ideas, I'll also have a go at it. >> >> Being someone who does like functional programing when used in limited >> quantities, I also think multi line lambdas (or blocks, whatever you >> call them) are a good thing if a good way could be found to embed them >> into Python. But I don't like the part of tavs proposal of handling >> them with a magic __do__ function. So what about this slightly >> modified syntax and semantics: >> >> def NAME(ARGS) [as OTHERNAME] with EXPRESSION_CONTAINING_NAME: >> ? ? ? BODY >> >> eg: >> >> ?def callback(param) as result with do_something(with_our(callback), >> other, args): >> ? ? ? print("called back with "+param) >> ? ? ? return foobar(param) > > I like this proposal much more than all previous ones. Just to avoid getting your hopes up too high, this gets a solid -1 from me, since it just introduces unwieldy new additions to the currently clean 'def' syntax, to accomplish something you can already do just as easily with a decorator. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From taleinat at gmail.com Tue Mar 10 17:52:48 2009 From: taleinat at gmail.com (Tal Einat) Date: Tue, 10 Mar 2009 18:52:48 +0200 Subject: [Python-ideas] cd statement? 
In-Reply-To: <49B69558.3090000@molden.no> References: <49B67E6C.6020206@molden.no> <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> <49B69558.3090000@molden.no> Message-ID: <7afdee2f0903100952s15e0a836l366bd2158b13c125@mail.gmail.com> Sturla Molden wrote: > Arnaud Delobelle wrote: >> >> Have you tried IPython? > > Yes, it has all that I miss, but it's ugly (at least on Windows, where it > runs in a DOS shell). > > S.M. > Hear, hear! GUI interactive prompts FTW! In IDLE you can also just move the cursor to a previous line of code (or code block), hit Return and you'll have that code on your current command line, ready to be edited and executed. As for changing directories, I find "from os import chdir as cd, getcwd as cwd" satisfactory. I have it in the python file referenced by the PYTHONSTARTUP environment variable, and I've changed all the relevant shortcuts (on Windows) to run IDLE with the -s flag (which causes the PYTHONSTARTUP file to be imported before anything else). - Tal From guido at python.org Tue Mar 10 18:05:30 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 10:05:30 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <20090310091936.6e782ca3@o> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> <20090310091936.6e782ca3@o> Message-ID: On Tue, Mar 10, 2009 at 1:19 AM, spir wrote: > I do not agree with that. It is missing the point of lambdas. Lambdas are snippets of code equivalent to expressions to be used in place. Lambdas are *not* called, need not beeing callable, rather they are *used* by higher order functions like map. The fact that they do not have any name in syntax thus properly matches their semantic "anonymousity" ;-) Eh? On what planet do you live? What use is a lambda if it is never called? 
It will *eventually* be called -- if it is never called you might as well substitute None and your program would run the same. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Mar 10 18:07:58 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 10:07:58 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <3bdda690903092259h241916a5p862cdf5cf1b00de6@mail.gmail.com> References: <3bdda690903092259h241916a5p862cdf5cf1b00de6@mail.gmail.com> Message-ID: On Mon, Mar 9, 2009 at 10:59 PM, Carl Johnson wrote: [Guido] >> Sounds like you might as well write a decorator named @using: >> >> ?@using(employees.select) >> ?def _(employee): >> ? ? if employee.salary > developer.salary: >> ? ? ? ? fireEmployee(employee) >> ? ? else: >> ? ? ? ? extendContract(employee) > > I like that (which is why I proposed allowing lambda decorators last > month), but I'm uncomfortable with how after all is said and done, _ > will either be set to something that's not a callable or to a callable > that no one is ever supposed to call. Well, _ is by convention often used as a "throw-away" result. So I am totally comfortable with this. Or, at least, I am as comfortable with it as I am with writing name, phone, _, _ = record to unpack a 4-tuple when the last two elements are not needed. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ironfroggy at gmail.com Tue Mar 10 18:13:09 2009 From: ironfroggy at gmail.com (Calvin Spealman) Date: Tue, 10 Mar 2009 13:13:09 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: Message-ID: <76fd5acf0903101013p1d8dd0f6m441d4277f3626523@mail.gmail.com> On Mon, Mar 9, 2009 at 12:39 PM, Guido van Rossum wrote: > On Mon, Mar 9, 2009 at 6:04 AM, tav wrote: >> I've come up with a way to do Ruby-style blocks in what I feel to be a >> Pythonic way: >> >> ?using employees.select do (employee): >> ? ? 
?if employee.salary > developer.salary: >> ? ? ? ? ?fireEmployee(employee) >> ? ? ?else: >> ? ? ? ? ?extendContract(employee) > > Sounds like you might as well write a decorator named @using: > > ?@using(employees.select) > ?def _(employee): > ? ? if employee.salary > developer.salary: > ? ? ? ? fireEmployee(employee) > ? ? else: > ? ? ? ? extendContract(employee) What would `using` here do that decorating the temp function with employees.select itself wouldn't do? All you are doing is saying "Pass this function to this function" which is exactly what decorators already do. > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From guido at python.org Tue Mar 10 18:24:59 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 10:24:59 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <76fd5acf0903101013p1d8dd0f6m441d4277f3626523@mail.gmail.com> References: <76fd5acf0903101013p1d8dd0f6m441d4277f3626523@mail.gmail.com> Message-ID: On Tue, Mar 10, 2009 at 10:13 AM, Calvin Spealman wrote: > On Mon, Mar 9, 2009 at 12:39 PM, Guido van Rossum wrote: >> On Mon, Mar 9, 2009 at 6:04 AM, tav wrote: >>> I've come up with a way to do Ruby-style blocks in what I feel to be a >>> Pythonic way: >>> >>> ?using employees.select do (employee): >>> ? ? ?if employee.salary > developer.salary: >>> ? ? ? ? ?fireEmployee(employee) >>> ? ? ?else: >>> ? ? ? ? ?extendContract(employee) >> >> Sounds like you might as well write a decorator named @using: >> >> ?@using(employees.select) >> ?def _(employee): >> ? ? if employee.salary > developer.salary: >> ? ? ? ? fireEmployee(employee) >> ? 
? else: >> ? ? ? ? extendContract(employee) > > What would `using` here do that decorating the temp function with > employees.select itself wouldn't do? All you are doing is saying "Pass > this function to this function" which is exactly what decorators > already do. I think the original proposal was implying some kind of loop over the values returned by employees.select(). But it's really irrelevant for the equivalency. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From denis.spir at free.fr Tue Mar 10 19:38:15 2009 From: denis.spir at free.fr (spir) Date: Tue, 10 Mar 2009 19:38:15 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> <20090310091936.6e782ca3@o> Message-ID: <20090310193815.454f1975@o> Le Tue, 10 Mar 2009 10:05:30 -0700, Guido van Rossum s'exprima ainsi: > On Tue, Mar 10, 2009 at 1:19 AM, spir wrote: > > I do not agree with that. It is missing the point of lambdas. Lambdas are > > snippets of code equivalent to expressions to be used in place. Lambdas > > are *not* called, need not beeing callable, rather they are *used* by > > higher order functions like map. The fact that they do not have any name > > in syntax thus properly matches their semantic "anonymousity" ;-) > > Eh? On what planet do you live? What use is a lambda if it is never > called? It will *eventually* be called -- if it is never called you > might as well substitute None and your program would run the same. > Should have written: "called in code" (!). Thought it was obvious, sorry. It's executed indeed through the func it is passed to, but not (explicitely) called. The point I meant is: it is locally used -- not called from anywhere else -- the reason why it needs no name. 
denis ------ la vita e estrany From guido at python.org Tue Mar 10 19:47:21 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 11:47:21 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <20090310193815.454f1975@o> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> <20090310091936.6e782ca3@o> <20090310193815.454f1975@o> Message-ID: On Tue, Mar 10, 2009 at 11:38 AM, spir wrote: > Le Tue, 10 Mar 2009 10:05:30 -0700, > Guido van Rossum s'exprima ainsi: > >> On Tue, Mar 10, 2009 at 1:19 AM, spir wrote: >> > I do not agree with that. It is missing the point of lambdas. Lambdas are >> > snippets of code equivalent to expressions to be used in place. Lambdas >> > are *not* called, need not beeing callable, rather they are *used* by >> > higher order functions like map. The fact that they do not have any name >> > in syntax thus properly matches their semantic "anonymousity" ;-) >> >> Eh? On what planet do you live? What use is a lambda if it is never >> called? It will *eventually* be called -- if it is never called you >> might as well substitute None and your program would run the same. >> > > Should have written: "called in code" (!). Thought it was obvious, sorry. It's executed indeed through the func it is passed to, but not (explicitely) called. The point I meant is: it is locally used -- not called from anywhere else -- the reason why it needs no name. OK, I understand what you're saying now. But I don't agree with your position that if it isn't used locally it doesn't deserve a name. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Tue Mar 10 20:40:33 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 15:40:33 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <20090310091936.6e782ca3@o> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> <20090310091936.6e782ca3@o> Message-ID: spir wrote:
> Le Mon, 09 Mar 2009 18:04:32 -0400, Terry Reedy
> s'exprima ainsi:
>
>> Lambdas are function-defining expressions used *within* statements
>> that give the resulting function object a stock .__name__ of
>> '<lambda>'. The syntax could have been augmented to include a real
>> name, so the stock-name anonymity is a side-effect of the chosen
>> syntax. Possibilities include
>> lambda name(args): expression
>> lambda args: expression
>> The latter, assuming it is LL(1) parse-able, would even be compatible
>> with existing code and could still be added.
>>
>> Contrarywise, function-defining def statements could have been
>> allowed to omit the name. To be useful, the object (with a
>> .__name__ such as '', would have to get a default namespace
>> binding such as to '_', even in batch mode.
>
> I do not agree with that. It is missing the point of lambdas. Lambdas
> are snippets of code equivalent to expressions to be used in place.
> Lambdas are *not* called, need not beeing callable, rather they are
> *used* by higher order functions like map. The fact that they do not
> have any name in syntax thus properly matches their semantic
> "anonymousity" ;-)

??? A lambda expression and a def statement both produce function objects. The *only* difference between the objects produced by a lambda expression and the equivalent def statement is that the former gets the stock .__name__ of '<lambda>', which is less useful than a specific name should there be a traceback.
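Terry's point is easy to check directly — the two objects differ only in their `__name__` (and hence in tracebacks):

```python
f = lambda x: x + 1

def g(x):
    return x + 1

# Both are ordinary function objects of the same type...
assert type(f) is type(g)
# ...but only the def carries an informative name for tracebacks.
assert f.__name__ == "<lambda>"
assert g.__name__ == "g"
```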
The important difference between a function-defining expression and statement is that the former can be used in expression context within statements and the latter cannot. tjr From tjreedy at udel.edu Tue Mar 10 21:00:50 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 16:00:50 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87iqmi5819.fsf@xemacs.org> References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: Stephen J. Turnbull wrote:
> Terry Reedy writes:
>
> > Like many other Pythonistas I recognize that that an uninformative stock
> > name of '<lambda>' is defective relative to an informative name that
> > points back to readable code. What I dislike is the anonymity-cult
> > claim that the defect is a virtue.
>
> That's unfair.

It is unfair to dislike false statements? I think that *that* is unfair ;-)

> Python has "anonymous blocks" all over the place,
> since every control structure controls one or more of them. It simply
> requires that they be forgotten at the next DEDENT. Surely you don't
> advocate that each of them should get a name!

Surely, I did not. And surely you cannot really think I suggested such. Every expression and every statement or group of statements defines a function on the current namespaces, but I was talking about Python function objects. And I never said that they necessarily should get an individual name (and indeed I went on to say that I too use 'f' and 'g' as stock, don't-care names) but only that I dislike the silly claim that being named '<lambda>' is a virtue. And this was in the context you snipped of Aahz saying that some disliked the *use* of lambda expressions (as opposed to the promotion of their result as superior).

> I think this is a difference of cognition.
I do not think it a 'difference of cognition', in the usual sense of the term, to think that a more informative traceback is a teeny bit superior, and certainly not inferior, to a less informative traceback. Unless of course you mean that all disagreements are such. Terry Jan Reedy From tjreedy at udel.edu Tue Mar 10 21:15:01 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 16:15:01 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: Guido van Rossum wrote: > On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: >> Like many other Pythonistas I recognize that that an uninformative stock >> name of '' is defective relative to an informative name that points >> back to readable code. What I dislike is the anonymity-cult claim that the >> defect is a virtue. >> >> Since I routinely use standard names 'f' and 'g' (from math) to name >> functions whose name I do not care about, I am baffled (and annoyed) by >> (repeated) claims such as "Having to name a one-off function adds additional >> cognitive overload to a developer." (Tav). Golly gee, if one cannot decide >> on standard one-char name, how can he manage the rest of Python? >> >> (I also, like others, routinely use 'C' for class and 'c' for C instance. >> What next? A demand for anonymous classes? Whoops, I just learned that >> Java has those.) >> >> But I have no problem with the use of lambda expressions as a convenience, >> where appropriate. > > Andrew Koening once gave me a good use case where lambdas are really a > lot more convenient than named functions. He was initializing a large > data structure that was used by an interpreter for some language. It > was a single expression (probably a list of tuples or a dict). Each > record contained various bits of information (e.g. 
the operator symbol > and its precedence and associativity) as well as a function (almost > always a very simple lambda) that implemented it. Since this table was > 100s of records long, it would have been pretty inconvenient to first > have to define 100s of small one-line functions and give them names, > only to reference them once in the initializer. Initializing such structures is one of the use cases I intended under 'where appropriate'. Adding more powerful expressions, like comprehensions (and g.e's) that do not break Python's basic syntactic model of mixed expressions and indented statements has added to the convenience. > This use case doesn't have a nice equivalent without anonymous > functions (though I'm sure that if there really was no other way it > could be done, e.g. using registration=style decorators). The convenience is from having function expressions. If the expression syntax allowed the optional attachment of a name, it would be just as convenient. In some cases, I am sure people would find it even more convenient if they could add in a name, especially when there is nothing else in the structure to serve as a substitute. 'Anonymous' is a different concept from 'expression-defined' despite the tendency to conflate the two. Terry Jan Reedy From guido at python.org Tue Mar 10 21:25:33 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 13:25:33 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On Tue, Mar 10, 2009 at 1:15 PM, Terry Reedy wrote: > Guido van Rossum wrote: >> >> On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: >>> >>> Like many other Pythonistas I recognize that that an uninformative stock >>> name of '' is defective relative to an informative name that >>> points >>> back to readable code. ?What I dislike is the anonymity-cult claim that >>> the >>> defect is a virtue. 
>>> >>> Since I routinely use standard names 'f' and 'g' (from math) to name >>> functions whose name I do not care about, I am baffled (and annoyed) by >>> (repeated) claims such as "Having to name a one-off function adds >>> additional >>> cognitive overload to a developer." (Tav). Golly gee, if one cannot >>> decide >>> on a standard one-char name, how can he manage the rest of Python? >>> >>> (I also, like others, routinely use 'C' for class and 'c' for C instance. >>> What next? A demand for anonymous classes? Whoops, I just learned that >>> Java has those.) >>> >>> But I have no problem with the use of lambda expressions as a >>> convenience, >>> where appropriate. >> >> Andrew Koenig once gave me a good use case where lambdas are really a >> lot more convenient than named functions. He was initializing a large >> data structure that was used by an interpreter for some language. It >> was a single expression (probably a list of tuples or a dict). Each >> record contained various bits of information (e.g. the operator symbol >> and its precedence and associativity) as well as a function (almost >> always a very simple lambda) that implemented it. Since this table was >> 100s of records long, it would have been pretty inconvenient to first >> have to define 100s of small one-line functions and give them names, >> only to reference them once in the initializer. > > Initializing such structures is one of the use cases I intended under 'where > appropriate'. Adding more powerful expressions, like comprehensions (and > g.e's) that do not break Python's basic syntactic model of mixed expressions > and indented statements has added to the convenience. > >> This use case doesn't have a nice equivalent without anonymous >> functions (though I'm sure that if there really was no other way it >> could be done, e.g. using registration-style decorators). > > The convenience is from having function expressions.
If the expression > syntax allowed the optional attachment of a name, it would be just as > convenient. In some cases, I am sure people would find it even more > convenient if they could add in a name, especially when there is nothing > else in the structure to serve as a substitute. > > 'Anonymous' is a different concept from 'expression-defined' despite the > tendency to conflate the two. If I read you correctly you're saying that having an expression that returns a function (other than referencing it by name) is not the same as having anonymous functions. This sounds like quite the hairsplitting argument. Why is it important to you to split this particular hair? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Tue Mar 10 21:29:42 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 16:29:42 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <73022CA5-ED35-4FF9-A4B8-1013816F5288@gmail.com> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <73022CA5-ED35-4FF9-A4B8-1013816F5288@gmail.com> Message-ID: Leonardo Santagada wrote: > > On Mar 9, 2009, at 9:13 PM, Jan Kanis wrote: > >> def callback(param) as result with do_something(with_our(callback), >> other, args): >> print("called back with "+param) >> return foobar(param) >> >> >> this would be equivalent to >> >> def callback(param): >> print("called back with "+param) >> return foobar(param) >> >> result = do_something(with_our(callback), other_args) > > > Not only the equivalent code looks much cleaner, I completely agree. > the only good thing it > actually does (not having to first define a function to then use it) can > be accomplished with a decorator. If a decolib were ever assembled, a callback(receiving_func, args) would be a good one to include.
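Such a registration-style callback decorator is easy to sketch. The following is a minimal illustration of the idea, not an existing library; the names `callback`, `do_something`, and `handler` are all hypothetical:

```python
# Sketch of a registration-style "callback" decorator: it hands the
# decorated function to a receiving callable right after the def, and
# stashes whatever that call returns on the function object.
# All names here are illustrative, not an existing library API.
def callback(receiving_func, *args):
    def decorator(func):
        func.result = receiving_func(func, *args)
        return func
    return decorator

# A stand-in for the code that consumes the callback.
def do_something(cb, extra):
    return cb("hello") + " / " + extra

@callback(do_something, "other")
def handler(param):
    return "called back with " + param

print(handler.result)  # called back with hello / other
```

This keeps the definition cleanly separate from its use while still stating up front what will be done with the function.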
I think I understand now that one of the reasons to use a decorator is to say what you are going to do with a function before you define it so that the person reading the definition can read it in that light. What I like is that the decorator form still leaves the definition cleanly separate from the context. tjr From tjreedy at udel.edu Tue Mar 10 21:34:06 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 16:34:06 -0400 Subject: [Python-ideas] cd statement? In-Reply-To: <49B67E6C.6020206@molden.no> References: <49B67E6C.6020206@molden.no> Message-ID: Sturla Molden wrote: > > IDLE also lacks a command history. If I e.g. make a typo, why do I have > to copy and paste, instead of just hitting the arrow button? There is already a patch on the tracker to make IDLE's history work the same (with arrow keys) as the command window. It is one of about 70 open issues which are languishing due to the lack of a maintainer/committer. Someone perhaps volunteered a few days ago. From jimjjewett at gmail.com Tue Mar 10 23:05:02 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 10 Mar 2009 18:05:02 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: On 3/10/09, Terry Reedy wrote: > ... I dislike the silly claim that > being named '<lambda>' is a virtue. And this was in the context you > snipped of Aahz saying that some disliked the *use* of lambda > expressions (as opposed to the promotion of their result as superior). > Stephen J. Turnbull wrote: >> I think this is a difference of cognition. > I do not think it a 'difference of cognition', in the usual sense of the > term, to think that a more informative traceback is a teeny bit > superior, and certainly not inferior, to a less informative traceback. > Unless of course you mean that all disagreements are such. The question is which traceback will be more informative.
A 50-Meg memory dump will be even more informative, but few people will want to sift through it. I *think* at least some of the lambda lovers are saying something akin to: ''' This little piece of logic isn't worth naming as a section; it is just something that I would do interactively once I got to this point. I don't *want* a debugging pointer right to this line, I *want* to go to the enclosing function to get my bearings. ''' I'm not convinced, because I've seen so many times when a lambda actually is crucial to the bug. That said, I'm the sort of person who will break up and name subunits even if I have to resort to names like fn_XXX_slave_1. And I will certainly admit that there are times when it would be more useful if the traceback showed the source code of the lambda instead of showing or skipping the frame. -jJ From jimjjewett at gmail.com Tue Mar 10 23:22:03 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 10 Mar 2009 18:22:03 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On 3/10/09, Guido van Rossum wrote: > On Tue, Mar 10, 2009 at 1:15 PM, Terry Reedy wrote: >>> On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: >>>> ... What I dislike is the anonymity-cult claim that the defect is a virtue. >> The convenience is from having function expressions. If the expression >> syntax allowed the optional attachment of a name, it would be just as >> convenient. In some cases, I am sure people would find it even more >> convenient if they could add in a name, especially when there is nothing >> else in the structure to serve as a substitute. >> 'Anonymous' is a different concept from 'expression-defined' despite the >> tendency to conflate the two. > If I read you correctly you're saying that having an expression that > returns a function (other than referencing it by name) is not the same > as having anonymous functions.
This sounds like quite the > hairsplitting argument. Why is it important to you to split this > particular hair? An expression that *creates* and returns a function is useful. A way to create unnamed functions may or may not be useful. Right now, the two are tied together, as lambda is the best way to do either. Mentally untangling them might lead to better code. If the name in a def were optional, it would meet the perceived need for anonymity, but still wouldn't meet the need for creating and returning a function within a single expression. # Would this really ever be useful? # Not to me, but the anon-lovers suggest yes. # Cognition difference, or just confounding the two uses of lambda? def (a): return a+3 On the other hand, if def became an expression, it would meet the need for function-creating expressions (and would have at least reduced the need for decorators). add_callback(button1, def add3(a): return a+3) (And yes, I understand that there are reasons why class, def, and import do not return values, even if I sometimes wish they did.) -jJ From guido at python.org Tue Mar 10 23:27:28 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 15:27:28 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On Tue, Mar 10, 2009 at 3:22 PM, Jim Jewett wrote: > On 3/10/09, Guido van Rossum wrote: >> On Tue, Mar 10, 2009 at 1:15 PM, Terry Reedy wrote: >>>> On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: > >>>>> ... What I dislike is the anonymity-cult claim that the defect is a virtue. > >>> The convenience is from having function expressions. If the expression >>> syntax allowed the optional attachment of a name, it would be just as >>> convenient. In some cases, I am sure people would find it even more >>> convenient if they could add in a name, especially when there is nothing >>> else in the structure to serve as a substitute.
> >>> 'Anonymous' is a different concept from 'expression-defined' despite the >>> tendency to conflate the two. > >> If I read you correctly you're saying that having an expression that >> returns a function (other than referencing it by name) is not the same >> as having anonymous functions. This sounds like quite the >> hairsplitting argument. Why is it important to you to split this >> particular hair? > > An expression that *creates* and returns a function is useful. > > A way to create unnamed functions may or may not be useful. > > Right now, the two are tied together, as lambda is the best way to do > either. Mentally untangling them might lead to better code. I'm feeling really dense right now -- I still don't see the difference between the two. Are you saying that you would prefer an expression that creates a *named* function? That seems to be really bizarre -- like claiming that you don't like expressions that return anonymous numbers. > If the name in a def were optional, it would meet the perceived need > for anonymity, but still wouldn't meet the need for creating and > returning a function within a single expression. Moreover, unless you used a decorator, there would be no way to do anything with the anonymous function, so it would be useless. >    # Would this really ever be useful? >    # Not to me, but the anon-lovers suggest yes. >    # Cognition difference, or just confounding the two uses of lambda? >    def (a): return a+3 > > On the other hand, if def became an expression, it would meet the need for function-creating expressions (and would have at least reduced the > need for decorators). I don't see the conceptual difference between a "def-expression" (if it were syntactically possible) and a lambda-expression. What is the difference in your view? Are you sure that difference exists? (It wouldn't be the first time that people ascribe powers to lambda that it doesn't have. :-) >
add_callback(button1, def add3(a): return a+3) Two questions about this example: (1) Do you expect the name 'add3' to be bound in the surrounding scope? (2) What is the purpose of the name other than documenting the obvious? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dreamingforward at gmail.com Wed Mar 11 00:50:13 2009 From: dreamingforward at gmail.com (average) Date: Tue, 10 Mar 2009 16:50:13 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 28, Issue 19 In-Reply-To: References: Message-ID: <913f9f570903101650g3d21e500q524312f7d60dd368@mail.gmail.com> From: "Stephen J. Turnbull" Terry Reedy writes: > > Like many other Pythonistas I recognize that an uninformative stock > > name of '<lambda>' is defective relative to an informative name that > > points back to readable code. What I dislike is the anonymity-cult > > claim that the defect is a virtue. > > That's unfair. Python has "anonymous blocks" all over the place, > since every control structure controls one or more of them. It simply > requires that they be forgotten at the next DEDENT. Surely you don't > advocate that each of them should get a name! > > I think this is a difference of cognition. Specifically, people who > don't want to name blocks as functions may not abstract processes to > signatures as easily, and reify whole processes (including all free > identifiers!) as objects more easily, as those who don't think naming > is a problem. > > > Since I routinely use standard names 'f' and 'g' (from math) to name > > functions whose name I do not care about, I am baffled (and annoyed) by... > > If the cognition hypothesis is correct, of course you're baffled. You > "just don't" think that way, while he really does.
The annoyance can > probably be relieved by s/a developer/some developers/ here: I'm glad you're bringing out the cognitive aspect of this, because to me, though it may seem "gratuitously mystical or mystifying", there is an essential epistemological component to this issue related to the [bidirectional] cognitive mapping of and between mathematical <-- and --> psychological identity and the confusion stems from the inability to frame the issue around a single linear "classical" construct as generally imposed by typed text (did you follow that?). Even writing this sentence confounds my multidimensional meaning which I'm trying to compress into standard language constructs. Normally, I'd use hand gestures and inflection to split out the two different orthogonal aspects of what I'm trying to convey which simply cannot occupy the same space at the same time, but (obviously) that luxury is unavailable here. So in the end, unless there's sufficient purchase in the listener, I have to be somewhat content sounding like a moron spouting gibberish. What Tav's proposal, in my mind, is aiming to do is provide greater syntactic support within Python so as to minimize cognitive gibberish when the code is reified in the mind of the viewer. Of course, it doesn't help that we're culturally trained into von Neumann architecture-thinking where such conflation of dimensionality is built into the hardware itself. Really, like Stephen is pointing out, "re-ification" *IS* the best analogy to help elucidate this issue (better in German: Verdinglichung). See wikipedia's "Reification (Marxism)" (--though be prepared that, depending on your state of mind, it will either make sense or sound like its logic is [perfectly] backward, like some flipped bit because it borders that special interplay between subject-object.)
These kinds of [Anonymous] functions/code blocks explicitly tell the user that "This is NOT part of my program", yet (due to the classical, flat nature of standard computer programming) I must "include" (in a constrained way since I'm not able to include the context or externalized identity in which this code will be run) it here [in my editor window text] even though its logical geometry is orthogonal to my program. It's like a vortex out of flatland--an interface into a different dimension, hence its difficulty in explaining it to the natives of flatlandia. To put a name on it puts an identity label upon something pointing in the wrong direction (i.e. to the surrounding code) which isn't *meant* to be an independent block of usable code or be part of the social context of its surroundings. It's like seeing your own body's innards mapped inside-out into a computer program and calling it "marcos" while I continue to function normally in some other dimensionality in some mysterious way to magically maintain my normal cognition elsewhere. Better to see those innards as anonymous data (that for whatever reason I'm needing to interface to) even though they are perfectly functioning blocks with an identity elsewhere (i.e.: me). So, yes, "anonymity" can be a virtue from a given perspective. ...Seems to be a parallel to meta-programming but on the other side of the scale--instead of abstracting "upwards" into greater levels of abstraction, it abstracts sideways and downwards into levels of concreteness. Naming in both cases is problematic if you want to avoid the categorical errors easily made by the flatland of the typed text. gibberish?
marcos From dreamingforward at gmail.com Wed Mar 11 01:01:41 2009 From: dreamingforward at gmail.com (average) Date: Tue, 10 Mar 2009 17:01:41 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea Message-ID: <913f9f570903101701m344c702cu317460fe42df6950@mail.gmail.com> FW: Sorry, forgot to change the subject line for the sake of threaded mail readers and archives....
marcos From jan.kanis at phil.uu.nl Wed Mar 11 01:11:34 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Wed, 11 Mar 2009 01:11:34 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <20090310095822.0f957a4d@o> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <20090310095822.0f957a4d@o> Message-ID: <59a221a0903101711o5fb1e32ey4684d8d2f2325902@mail.gmail.com> On Tue, Mar 10, 2009 at 09:58, spir wrote: > Le Tue, 10 Mar 2009 01:13:42 +0100, > Jan Kanis s'exprima ainsi: > >> This being python-ideas, I'll also have a go at it. >> >> Being someone who does like functional programming when used in limited >> quantities, I also think multi line lambdas (or blocks, whatever you >> call them) are a good thing if a good way could be found to embed them >> into Python. But I don't like the part of tav's proposal of handling >> them with a magic __do__ function. So what about this slightly >> modified syntax and semantics: >> >> def NAME(ARGS) [as OTHERNAME] with EXPRESSION_CONTAINING_NAME: >>     BODY >> >> eg: >> >> def callback(param) as result with do_something(with_our(callback), >> other, args): >>     print("called back with "+param) >>     return foobar(param) > > I like this proposal much more than all previous ones. > Still, how would you (or anybody else) introduce the purpose, meaning, use of this construct, and its language-level semantics? [This is not disguised criticism, nor a rhetorical question: I'm really interested in answers.] From my side, this is more of an idea I had than something I've thought all the way through, so while I think it looks nice it's not something I'm 100% committed to. It sprung up in my mind as a slightly better alternative to tav's blocks proposal. The language-level semantics are easily explained by showing the equivalent code I gave in my first mail. The purpose and use (and meaning?)
are the same as with tav's original proposal for blocks, and with those of multistatement lambdas. So I guess a major consideration is whether someone thinks multiline lambdas are in principle a good idea (setting aside that they can't be implemented in full generality non-ugly in the Python grammar). On Tue, Mar 10, 2009 at 17:33, Guido van Rossum wrote: > ... to accomplish something you can already > do just as easily with a decorator. Almost the same thing could be done with decorators, the difference being that the final name the result is assigned to occurs in a weird place, and that decorators are restricted to one-argument functions. The first could be solved by allowing an 'as name' clause on decorators (like Carl Johnson is proposing in the other thread), the second by e.g. allowing lambdas as decorators so one could do @lambda f: do_something(with_our(f)) def result(param): print("called back with "+param) return foobar(param) But I don't really have a specific actual use case for this, so I think I'll just have to go with Guido's gut feeling[1]; they seem to work out, usually. [1] http://www.python.org/dev/peps/pep-0318/#id32 From jimjjewett at gmail.com Wed Mar 11 03:44:43 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 10 Mar 2009 22:44:43 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On 3/10/09, Guido van Rossum wrote: > On Tue, Mar 10, 2009 at 3:22 PM, Jim Jewett wrote: >> On 3/10/09, Guido van Rossum wrote: >>> On Tue, Mar 10, 2009 at 1:15 PM, Terry Reedy wrote: >>>> 'Anonymous' is a different concept from 'expression-defined' despite >>>> the tendency to conflate the two. >>> ... Why is it important to you to split this particular hair? >> An expression that *creates* and returns a function is useful. >> A way to create unnamed functions may or may not be useful.
>> Right now, the two are tied together, as lambda is the best way to do >> either. Mentally untangling them might lead to better code. > I'm feeling really dense right now -- I still don't see the difference > between the two. Are you saying that you would prefer an expression > that creates a *named* function? Yes. The __name__ attribute might never be used, but I personally still prefer that it be meaningful. When someone says they need anonymous functions, I hear: "I really, really *need* the __name__ to be useless!" In the past, I had sometimes just assumed they were wrong. Stephen has given me a glimpse of a mindset which really might need the name to be useless. But neither he nor I normally think that way, so it might still be YAGNI. Terry has pointed out that they may actually mean "I need def to be an expression", with the reference to anonymity being a red herring, because the two concepts are currently confounded. >> If the name in a def were optional, it would meet the perceived need >> for anonymity, but still wouldn't meet the need for creating and >> returning a function within a single expression. > Moreover, unless you used a decorator, there would be no way to do > anything with the anonymous function, so it would be useless. Thus my troubles seeing the point of the people who care about "anonymous functions." >> On the other hand, if def became an expression, it would meet the need >> for function-creating expressions (and would have at least reduced the >> need for decorators). > I don't see the conceptual difference between a "def-expression" (if > it were syntactically possible) and a lambda-expression. What is the > difference in your view? The only differences *I* see are syntactical warts in lambda. That said, I may be missing something myself, as lambda has had passionate defenders. >> add_callback(button1, def add3(a): return a+3) > (1) Do you expect the name 'add3' to be bound in the surrounding scope? No. 
But I agree that expectations would differ, so that either would be acceptable, but either would feel like a wart to at least some people. > (2) What is the purpose of the name other than documenting the obvious? It isn't always quite this obvious. It is more useful in tracebacks. (Though perhaps printing source instead of name would be even better, for functions short enough to be reasonable expressions.) The __name__ is available in case you want to use it for a dispatch table, or to populate fields in an alternative User Interface. (For example, accessibility APIs) -jJ From guido at python.org Wed Mar 11 03:52:26 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 19:52:26 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On Tue, Mar 10, 2009 at 7:44 PM, Jim Jewett wrote: > When someone says they need anonymous functions, I hear: > >    "I really, really *need* the __name__ to be useless!" I think you need your hearing tuned. By convention anonymous functions and expressions creating functions are almost always synonymous, in almost all languages. So people use the shorter term "anonymous functions" when what they really care about is "expression syntax for creating new functions on the fly". >> I don't see the conceptual difference between a "def-expression" (if >> it were syntactically possible) and a lambda-expression. What is the >> difference in your view? > > The only differences *I* see are syntactical warts in lambda. Well, those have been discussed at length and depth, and nobody has come up with an acceptable syntax to embed a block of statements in the midst of an expression, in Python. That's why they are separate. > The __name__ is available in case you want to use it for a dispatch > table, or to populate fields in an alternative User Interface.
(For > example, accessibility APIs) When people want the name, they can give it a name using a def statement. I don't accept your argument against that which seems to go along the lines of "but maybe they might want the name later". You can write unreadable code without lambda too. And yes, lambda can be abused; for a really evil example see this recipe: http://code.activestate.com/recipes/148061/ (note the mis-use of the term "one liner" :-). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tav at espians.com Wed Mar 11 04:34:57 2009 From: tav at espians.com (tav) Date: Wed, 11 Mar 2009 03:34:57 +0000 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <20090309154505.GA18115@panix.com> Message-ID: Dearest all, Forgive me if I seem frustrated, but it's been a very long day. After catching up on a lot of the discussion here, I am feeling something between astonishment and disappointment. Forgive me as I am new here, but I had expected python-ideas to be filled with remarkable people. Instead I find that many people don't even understand *why* Python is the way it is!! Never mind understanding what it is that they are talking about. It seems as if jumping in with an opinion counts for more than doing some basic research. Forgive me, but the likes of equating expressions with statements or the desire for anonymous functions with some bizarre dislike of the __name__ attribute just shouts out sheer ignorance/stupidity. I had expected more. From everyone. As for the various lambda proposals I've seen, none of them are anywhere near Pythonic!! Some of you, could you please google "lambda site:mail.python.org" and then read for a little while? For some bizarre reason, I had expected those on this list to be Masters of Python. If not why would they care to improve the language? But some of you clearly are not and should spend a lot of time *reading*.
And, please, get hold of as many Python libraries as you can and read the code until you feel you truly do get the essence of what is Pythonic. I am really sorry if I've offended anyone. Frustrations aside, I do mean this in a constructive way. If we all individually would apply a little bit more effort, then we'd collectively benefit and have a much better language. Thanks for bearing with me. -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From stephen at xemacs.org Wed Mar 11 04:43:29 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 11 Mar 2009 12:43:29 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: <871vt467xa.fsf@xemacs.org> Terry Reedy writes: > Stephen J. Turnbull wrote: > > Terry Reedy writes: > > > Like many other Pythonistas I recognize that an > > > uninformative stock name of '<lambda>' is defective relative > > > to an informative name that points back to readable code. > > > What I dislike is the anonymity-cult claim that the defect is > > > a virtue. > > > > That's unfair. > > It is unfair to dislike false statements? No. It is unfair to use factives: the proponents of code blocks don't claim that the uninformative name is a virtue. They claim that the effort required to deal with an unnecessary name is a defect, and so removing that effort is a virtue. You evidently have no answer for that argument, so you reinterpret in precisely the kind of word-twisting way that bothers you when I do it: > > Python has "anonymous blocks" all over the place, since every > > control structure controls one or more of them. It simply > > requires that they be forgotten at the next DEDENT. Surely you > > don't advocate that each of them should get a name! > > Surely, I did not.
And surely you cannot really think I suggested such. That was a rhetorical question, properly marked as such with a exclamation point rather than a question mark. Not to mention being immediately preceded by the obviously correct rationale for the obviously correct answer. How did you miss it? I did have a real point, which was that if the only use of an anonymous block is immediately juxtaposed to that DEDENT, is there really any harm to the lack of the name? In fact, in debugging you have the name of the using function (which should be short and readable, or you're not going to have fun anyway), and a line number, so there is no trouble identifying the problematic code, nor the execution history that led to it. A good debugger might even provide the arguments to the code block as part of the stack trace, which you would have to go to extra effort to get if it were presented merely as a suite. Ie, in this kind of use case a code block could be considered a kind of meta-syntax that tells debuggers "these variables are of interest, so present me, and them, as a pseudo-stack frame". The sticking point, AIUI, is that the code block proponents have not identified a use case where the anonymous block is immediately used, while there is no Pythonic equivalent. So that "harmless" (YMMV) extension is unnecessary, and the extension violates TOOWTDI if that's all it's good for. The real power of "code blocks" (that Python doesn't have) comes when they can be passed around as objects ... but there doesn't seem to be a way to define them so as to exclude the obnoxious uses such as in callbacks, or indeed such a use case other than callbacks. And that violates the Pythonista's sense of good style. So what I want to know (and my question is directed to the code block proponents, not to you) is where is the Pythonic use case? All those Ruby programmers can't be wrong ... can they? Of course they can! 
But even so, I'd like to understand what the code block proponents think they're seeing that isn't there, at least not for you and me. *We* (including the BDFL!?) could be wrong, too, maybe there is something special about code blocks that Python could benefit from incorporating. Or maybe there's a better way to teach Python, so that people will use Pythonic idioms instead of reaching for "code blocks." From ben+python at benfinney.id.au Wed Mar 11 04:52:42 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 11 Mar 2009 14:52:42 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea References: <20090309154505.GA18115@panix.com> Message-ID: <87sklkzpf9.fsf@benfinney.id.au> tav writes: > After catching up on a lot of the discussion here, I am feeling > something between astonishment and disappointment. Forgive me as I > am new here, but I had expected python-ideas to be filled with > remarkable people. I'm glad your expectations have been re-adjusted early on. I don't know what would have led you to such an expectation. > Instead I find that many people don't even understand *why* Python > is the way it is!! That's because there are many people who have yet to find these things out. Such people are not excluded from this list. > Never mind understanding what it is that they are talking about. It > seems as if jumping in with an opinion counts for more than doing > some basic research. Again, that state may be unfortunate, but I don't know why you would have expectations that it would be otherwise. > For some bizarre reason, I had expected those on this list to be > Masters of Python. That is rather bizarre. If you find something that led you to expect that, please let us know so the fallacy can be corrected. > If not why would they care to improve the language? Because they are the Users of Python who are interested in improving the language. 
-- \ “During my service in the United States Congress, I took the | `\ initiative in creating the Internet.” --Al Gore | _o__) | Ben Finney From tav at espians.com Wed Mar 11 06:08:30 2009 From: tav at espians.com (tav) Date: Wed, 11 Mar 2009 05:08:30 +0000 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87sklkzpf9.fsf@benfinney.id.au> References: <87sklkzpf9.fsf@benfinney.id.au> Message-ID: Hey Ben and others, Sorry if my previous message came across as being rude. It really wasn't meant so. And it definitely wasn't meant as a personal attack on anyone. I just wish that a little bit of time would be taken before jumping in with comments. As for Python-ideas, I had taken as granted that people posting would have more than a passing interest in language design and the nature of Python. Obviously a false assumption. I apologise for this. I think there is a *lot* of value in many of the ideas that float around. On this list, python-dev, irc channels and even in the blogosphere. The problem is that few of these ideas get the real time and attention that they deserve. And seeing as all of our time is limited. And that the resources for the development of Python itself is limited. Ideally we would focus them more constructively and selectively. But then, this is the internet -- I should stop being an idealist ;p Thanks again for bearing with me. -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From stephen at xemacs.org Wed Mar 11 06:05:28 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 11 Mar 2009 14:05:28 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: <87zlfs4pk7.fsf@xemacs.org> Guido van Rossum writes: > I'm feeling really dense right now -- I still don't see the difference
Are you saying that you would prefer an expression > that creates a *named* function? That seems to be really bizarre -- > like claiming that you don't like expressions that return anonymous > numbers. Here's a use-case from Emacs. Various modes have callbacks so that users can customize them. A typical case is that a text-mode hook will turn on auto-fill-mode, which is documented as an example to be done like this: (add-hook 'text-mode-hook (lambda () (auto-fill-mode 1))) where *-mode functions called with nil toggle the mode, positive numbers turn it on, and non-positive numbers turn it off. The add-hook function is supposed to be idempotent: it won't add the same hook function if it is already in the hook. The problem is if you change the lambda and execute that form, the changed lambda is now not identical to the lambda on the hook, so the old version won't be removed. This (add-hook 'text-mode-hook (defun turn-on-auto-fill () (auto-fill-mode 1))) neatly avoids the problem by returning the name of the function, the symbol `turn-on-auto-fill', which is callable and so suitable for hanging on the hook. If you change the definition and execute the above form, add-hook *mutates nothing* (the symbol is already present), but because the hook is indirect through the function's symbol and the defun *is* executed, the definition changes ... which is exactly what you want.[1] AMK's use-case could be post-processed as something like (let ((i 0)) (mapcar (lambda () (let ((name (intern (format "foo-%d-callback" i)))) (define-function name (aref slots i)) (aset slots i name) (setq i (1+ i)) name)) slots)) where slots is the vector of anonymous functions. Providing names in this way costs one indirection per callback invocation in Emacs Lisp, but the benefits in readability of tracebacks are large, especially for compiled code. > I don't see the conceptual difference between a "def-expression" (if > it were syntactically possible) and a lambda-expression. 
What is the > difference in your view? Are you sure that difference exists? (It > wouldn't be the first time that people ascribe powers to lambda that > it doesn't have. :-) AIUI, a def-expression binds a callable to an object, while a lambda expression returns a callable. An anonymous def is just lambda by a different name (and I think the code block proponents agree, based on their willingness to accept syntax using def instead of lambda). I don't see how the kind of thing exemplified above would be useful in Python, and from a parallel reply I just saw, I gather Jim agrees. The point is to show how a function-defining expression can be useful in some contexts. This works in Emacs Lisp because in (setq foo (lambda ...)) tools (including the Lisp interpreter) will not recognize foo as a function identifier although its value is a function, while (defun foo ...) marks foo as a function identifier. But in Python (like Scheme) they're basically the same operation, with a little syntactic sugar. Anything that is based on the separation of variable namespace from function namespace is DOA, right? The renaming mapper is a different issue, I think; it depends on computing object names at runtime (ie, the Lisp `intern' operation), not on separate namespaces. I'm not sure offhand how to do that in Python, or even it it's possible; I've never wanted it. Footnotes: [1] N.B. Of course modern Emacsen define turn-on-auto-fill as a standard function. But this is ugly (because of the single flat namespace of Emacs Lisp), and not all modes have their turn-on, turn-off variants. auto-fill-mode was chosen because (a) the semantics are easy to imagine and (b) the use of lambda in a hook is explained by exactly this example in the Emacs Lisp Manual. From tav at espians.com Wed Mar 11 06:13:11 2009 From: tav at espians.com (tav) Date: Wed, 11 Mar 2009 05:13:11 +0000 Subject: [Python-ideas] cd statement? 
In-Reply-To: <49B69558.3090000@molden.no> References: <49B67E6C.6020206@molden.no> <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> <49B69558.3090000@molden.no> Message-ID: Hey Sturla, >> Have you tried IPython? > > Yes, it has all that I miss, but it's ugly (at least on Windows, where it > runs in a DOS shell). Have you tried running it with http://sourceforge.net/projects/console/ ? I found that to be a lot prettier than the standard DOS prompt. And IPython really is great -- it increases your productivity in Python dramatically. Especially with its ? and ?? commands. I would heartily recommend it. -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From ben+python at benfinney.id.au Wed Mar 11 06:29:05 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 11 Mar 2009 16:29:05 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea References: <20090309154505.GA18115@panix.com> <87sklkzpf9.fsf@benfinney.id.au> Message-ID: <87ocw8zkym.fsf@benfinney.id.au> Ben Finney writes: > tav writes: > > For some bizarre reason, I had expected those on this list to be > > Masters of Python. > > That is rather bizarre. If you find something that led you to expect > that, please let us know so the fallacy can be corrected. My message had rather an imperious tone that was not intended. I hasten to note that I'm not claiming any special status for myself with regard to this list, or Python's community. I'm a mere interested party, and my response was not intended to speak authoritatively about How Things Are™. My apologies for any mistaken impressions I might have given. -- \ “Every valuable human being must be a radical and a rebel, for | `\ what he must aim at is to make things better than they are.”
| _o__) --Niels Bohr | Ben Finney From leif.walsh at gmail.com Wed Mar 11 08:25:16 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Wed, 11 Mar 2009 03:25:16 -0400 (EDT) Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples] In-Reply-To: Message-ID: Apart from the fact that I don't think blocks are needed for any of the above, I feel compelled to poke a hole in one of your examples: On Mon, Mar 9, 2009 at 2:47 PM, tav wrote: > # Django/App Engine Query > > Frameworks like Django or App Engine define DSLs to enable easy > querying of datastores by users. Wouldn't it be better if this could be > done in pure Python syntax? > > Compare the current Django: > > q = Entry.objects.filter(headline__startswith="What").filter(pub_date__lte=datetime.now()) > > with a hypothetical: > > using Entry.filter do (entry): > if entry.headline.startswith('What') and entry.pub_date <= datetime.now(): > return entry > > Wouldn't the latter be easier for a developer to read/maintain? Probably, but it doesn't matter, since Django lazily evaluates chained queries, and doesn't actually evaluate anything until you tell it you want elements. I don't believe there's any way to do this with blocks. -- Cheers, Leif -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: OpenPGP digital signature URL: From cmjohnson.mailinglist at gmail.com Wed Mar 11 08:36:29 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Tue, 10 Mar 2009 21:36:29 -1000 Subject: [Python-ideas] [Python-Dev] Deprecated __cmp__ and total ordering In-Reply-To: References: <49B671BB.1030207@voidspace.org.uk> Message-ID: <3bdda690903110036q34529d25va8671874bab9cbb@mail.gmail.com> > The basic idea isn't controversial, but there probably would > be a lengthy discussion on what to call it (total_ordering is one > possibility) and where to put it (functools is a possibility).
It's not really a *func* tool though. Maybe there should be a classtools module? Can anyone think of other things to put into such a module if it were to exist? If nothing else, the existing classmethod, etc. family could be mirrored out there. -- Carl Johnson From denis.spir at free.fr Wed Mar 11 12:28:28 2009 From: denis.spir at free.fr (spir) Date: Wed, 11 Mar 2009 12:28:28 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 28, Issue 19 In-Reply-To: <913f9f570903101650g3d21e500q524312f7d60dd368@mail.gmail.com> References: <913f9f570903101650g3d21e500q524312f7d60dd368@mail.gmail.com> Message-ID: <20090311122828.37e871ac@o> On Tue, 10 Mar 2009 16:50:13 -0700, average wrote: > What Tav's proposal, in my mind, is aiming to do is provide greater > syntactic support within Python so as to minimize cognitive gibberish > when the code is reified in the mind of the viewer. Of course, it > doesn't help that we're culturally trained into von Neumann > architecture-thinking where such conflation of dimensionality is built > into the hardware itself. Really, like Stephan is pointing out, > "re-ification" *IS* the best analogy to help elucidate this issue > (better in German: Verdinglichung). See wikipedia's "Reification > (Marxism)" (--though be prepared that, depending on your state of > mind, it will either make sense or sound like its logic is [perfectly] > backward, like some flipped bit because it borders that special > interplay between subject-object.) > > These kinds of [Anonymous] functions/code blocks explicitly tell the > user that "This is NOT part of my program", yet (due to the classical, > flat nature of standard computer programming) I must "include" (in a > constrained way since I'm not able to include the context or > externalized identity in which this code will be run) it here [in my > editor window text] even though its logical geometry is orthogonal to > my program.
It's like a vortex out of flatland--an interface into a > different dimension, hence it's difficulty in explaining it to the > natives of flatlandia. To put a name on it puts an identity label > upon something pointing in the wrong direction (i.e. to the > surrounding code) which isn't *meant* to be an an independent block of > usable code or be part of the social context of its surroundings. > It's like seeing your own body's innards mapped inside-out into a > computer program and calling it "marcos" while I continue to function > normally in some other dimensionality in some mysterious way to > magically maintain my normal cognition elsewhere. Better to see those > innards as anonymous data (that for whatever reason I'm needing to > interface to) even though they are perfectly functioning blocks with > an identity elsewhere (i.e.: me). So, yes, "anonymity" can be a > virtue from a given perspective. > > ...Seems to be a parallel to meta-programming [...] Indeed. In the concatenative jargon such code-data constructs are called "quotations". :squares [dup *] map ... [1 2 3] squares yields [1 4 9] [dup *] (dup means duplicate) is sensibly called a quotation, I guess, by analogy to meta-linguistic expressions that "objectify" a snippet of speech. [dup *] holds the literal expression of a valid func def, as illustrated by: :square dup * It is pushed on the data stack that should already hold a sequence ([1 2 3]), then both are data items used by map. Read: the higher order func map takes a func def and a sequence as arguments. Now this is alien (fremd ;-) to anybody used to languages in which code is not, conceptually, *really* data -- even if it has a type and can be denoted, in python, simply by letting down the (). 
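To ground the analogy in Python terms (a sketch of mine, not from any library; the names `square`/`squares` are illustrative): the function object itself plays the role of the quotation, and map consumes it as ordinary data.

```python
# Python sketch of the concatenative "quotation" idiom described above.
# The quotation [dup *] corresponds to a function object passed as data.

def square(x):
    # ":square dup *" -- duplicate the value on the stack, then multiply
    return x * x

# Writing `square` without parentheses passes the code itself around as
# data, much like pushing [dup *] on the stack before calling map.
print(list(map(square, [1, 2, 3])))           # [1, 4, 9]

# The anonymous form is closer in spirit to an unnamed quotation:
print(list(map(lambda x: x * x, [1, 2, 3])))  # [1, 4, 9]
```

In both cases the "code" is an ordinary value until map applies it, which is the point being made about code-as-data.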
denis ------ la vita e estrany From steve at pearwood.info Wed Mar 11 13:03:58 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 11 Mar 2009 23:03:58 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903101701m344c702cu317460fe42df6950@mail.gmail.com> References: <913f9f570903101701m344c702cu317460fe42df6950@mail.gmail.com> Message-ID: <200903112303.58784.steve@pearwood.info> On Wed, 11 Mar 2009 11:01:41 am average wrote: > These kind of [Anonymous] functions/code blocks explicitly tell the > user that "This is NOT part of my program" I'm a user, and they don't tell me any such thing. By the way, that word, "explicitly" -- I think you are using "explicitly" as if it actually meant "implicitly". If the function were to explicitly tell the user, it would look something like this: caller( function_that_is_not_part_of_the_program x: x+1 ) which of course is ridiculously verbose. What you say might be true if lambda had that extended meaning, but it doesn't. In Python, lambda merely means "create a nameless function object from a single expression". It's also nonsense, because there the function is, inside the program as clear as day. Arguing that a function that appears inside your program is not part of your program is rather like saying that your stomach is not part of your body. Perhaps what you are getting at is that anonymous functions blur the difference between code and data, and that if you consider them as data, then they are outside the program in some sense? That's not entirely unreasonable: it's common to distinguish between code and data, even though fundamentally they're all just bits and, really, the distinction is all in our mind. (That doesn't mean the distinction isn't important.) But even accepting that an anonymous function used as data is outside of the program in some sense, anonymity is strictly irrelevant. 
If you consider the lambda function here: caller(lambda x: x+1) to be data, then so is the named function foo here: def foo(x): return x+1 caller(foo) [...] > gibberish? I'm trying to give you the benefit of the doubt. -- Steven D'Aprano From sturla at molden.no Wed Mar 11 13:26:51 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Mar 2009 13:26:51 +0100 Subject: [Python-ideas] math module and complex numbers Message-ID: <49B7AE0B.5040004@molden.no> >>> import math >>> math.sqrt(-1) Traceback (most recent call last): File "", line 1, in math.sqrt(-1) ValueError: math domain error I'd say math.sqrt(-1) should return 1j. Sturla Molden From veloso at verylowsodium.com Wed Mar 11 13:34:48 2009 From: veloso at verylowsodium.com (Greg Falcon) Date: Wed, 11 Mar 2009 08:34:48 -0400 Subject: [Python-ideas] math module and complex numbers In-Reply-To: <49B7AE0B.5040004@molden.no> References: <49B7AE0B.5040004@molden.no> Message-ID: <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> On Wed, Mar 11, 2009 at 8:26 AM, Sturla Molden wrote: >>>> import math >>>> math.sqrt(-1) > > Traceback (most recent call last): > ?File "", line 1, in > ? math.sqrt(-1) > ValueError: math domain error > > > I'd say math.sqrt(-1) should return 1j. >>> import cmath >>> cmath.sqrt(-1) 1j Greg F From sturla at molden.no Wed Mar 11 14:12:10 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Mar 2009 14:12:10 +0100 Subject: [Python-ideas] math module and complex numbers In-Reply-To: <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> References: <49B7AE0B.5040004@molden.no> <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> Message-ID: <49B7B8AA.9050305@molden.no> Greg Falcon wrote: >>>> import cmath >>>> cmath.sqrt(-1) >>>> > 1j > What is the point of having two math modules? It just adds confusion, like pickle and cPickle. S.M. 
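For concreteness, the split behaviour being questioned looks like this (a small demonstration, not an argument either way):

```python
# The difference between the two modules, side by side.
import math
import cmath

try:
    math.sqrt(-1)
except ValueError as exc:
    # math stays in the reals and signals the domain error early
    print("math.sqrt(-1) raised ValueError:", exc)

# cmath is the opt-in: callers who want complex results ask for them
print("cmath.sqrt(-1) =", cmath.sqrt(-1))  # 1j
```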
From phd at phd.pp.ru Wed Mar 11 14:31:30 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 11 Mar 2009 16:31:30 +0300 Subject: [Python-ideas] math module and complex numbers In-Reply-To: <49B7B8AA.9050305@molden.no> References: <49B7AE0B.5040004@molden.no> <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> <49B7B8AA.9050305@molden.no> Message-ID: <20090311133130.GA21137@phd.pp.ru> On Wed, Mar 11, 2009 at 02:12:10PM +0100, Sturla Molden wrote: > Greg Falcon wrote: >>>>> import cmath >>>>> cmath.sqrt(-1) >>>>> >> 1j >> > > What is the point of having two math modules? It just adds confusion, > like pickle and cPickle. 'c' in 'cmath' stands for 'complex'. There is a difference between float and complex math. See http://docs.python.org/library/math.html : "These functions cannot be used with complex numbers; use the functions of the same name from the cmath module if you require support for complex numbers. The distinction between functions which support complex numbers and those which don't is made since most users do not want to learn quite as much mathematics as required to understand complex numbers. Receiving an exception instead of a complex result allows earlier detection" Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From guido at python.org Wed Mar 11 15:26:29 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Mar 2009 07:26:29 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87zlfs4pk7.fsf@xemacs.org> References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87zlfs4pk7.fsf@xemacs.org> Message-ID: On Tue, Mar 10, 2009 at 10:05 PM, Stephen J. Turnbull wrote: > Guido van Rossum writes: > > ?> I'm feeling really dense right now -- I still don't see the difference > ?> between the two. Are you saying that you would prefer an expression > ?> that creates a *named* function? 
That seems to be really bizarre -- > ?> like claiming that you don't like expressions that return anonymous > ?> numbers. > > Here's a use-case from Emacs. ?Various modes have callbacks so that > users can customize them. ?A typical case is that a text-mode hook > will turn on auto-fill-mode, which is documented as an example to be > done like this: > > (add-hook 'text-mode-hook (lambda () (auto-fill-mode 1))) > > where *-mode functions called with nil toggle the mode, positive > numbers turn it on, and non-positive numbers turn it off. ?The > add-hook function is supposed to be idempotent: it won't add the same > hook function if it is already in the hook. ?The problem is if you > change the lambda and execute that form, the changed lambda is now not > identical to the lambda on the hook, so the old version won't be > removed. ?This > > (add-hook 'text-mode-hook (defun turn-on-auto-fill () (auto-fill-mode 1))) > > neatly avoids the problem by returning the name of the function, the > symbol `turn-on-auto-fill', which is callable and so suitable for > hanging on the hook. ?If you change the definition and execute the > above form, add-hook *mutates nothing* (the symbol is already > present), but because the hook is indirect through the function's > symbol and the defun *is* executed, the definition changes ... which > is exactly what you want.[1] Got it -- sort of like using assignment in an expression in C, to set a variable and return the valule (but not quite, don't worry :-). > AMK's use-case could be post-processed as something like I assume you're talking about Andrew Koenig's use case -- ANK is Andrew Kuchling, who AFAIK didn't participate in this thread. :-) > (let ((i 0)) > ?(mapcar (lambda () > ? ? ? ? ? ?(let ((name (intern (format "foo-%d-callback" i)))) > ? ? ? ? ? ? ?(define-function name (aref slots i)) > ? ? ? ? ? ? ?(aset slots i name) > ? ? ? ? ? ? ?(setq i (1+ i)) > ? ? ? ? ? ? ?name)) > ? ? ? ? 
?slots)) IIUC (my Lisp is very rusty) this just assigns unique names to the functions right? You're saying this to satisfy the people who insist that __name__ is always useful right? But it seems to be marginally useful here since the names don't occur in the source. (?) > where slots is the vector of anonymous functions. ?Providing names in > this way costs one indirection per callback invocation in Emacs Lisp, > but the benefits in readability of tracebacks are large, especially > for compiled code. > > ?> I don't see the conceptual difference between a "def-expression" (if > ?> it were syntactically possible) and a lambda-expression. What is the > ?> difference in your view? Are you sure that difference exists? (It > ?> wouldn't be the first time that people ascribe powers to lambda that > ?> it doesn't have. :-) > > AIUI, a def-expression binds a callable to an object, while a lambda > expression returns a callable. ?An anonymous def is just lambda by a > different name (and I think the code block proponents agree, based on > their willingness to accept syntax using def instead of lambda). > > I don't see how the kind of thing exemplified above would be useful in > Python, and from a parallel reply I just saw, I gather Jim agrees. > The point is to show how a function-defining expression can be useful > in some contexts. ?This works in Emacs Lisp because in > > (setq foo (lambda ...)) > > tools (including the Lisp interpreter) will not recognize foo as a > function identifier although its value is a function, while > > (defun foo ...) > > marks foo as a function identifier. ?But in Python (like Scheme) > they're basically the same operation, with a little syntactic sugar. > Anything that is based on the separation of variable namespace from > function namespace is DOA, right? Right, and so are separations between value and type namespaces (as other languages use, e.g. C++ and Haskell). 
> The renaming mapper is a different issue, I think; it depends on > computing object names at runtime (ie, the Lisp `intern' operation), > not on separate namespaces. ?I'm not sure offhand how to do that in > Python, or even it it's possible; I've never wanted it. You can assign new values to to f.__name__. > Footnotes: > [1] ?N.B. ?Of course modern Emacsen define turn-on-auto-fill as a > standard function. ?But this is ugly (because of the single flat > namespace of Emacs Lisp), and not all modes have their turn-on, > turn-off variants. ?auto-fill-mode was chosen because (a) the > semantics are easy to imagine and (b) the use of lambda in a hook is > explained by exactly this example in the Emacs Lisp Manual. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Mar 11 15:28:22 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Mar 2009 07:28:22 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87ocw8zkym.fsf@benfinney.id.au> References: <87sklkzpf9.fsf@benfinney.id.au> <87ocw8zkym.fsf@benfinney.id.au> Message-ID: On Tue, Mar 10, 2009 at 10:29 PM, Ben Finney wrote: > Ben Finney writes: > >> tav writes: >> > For some bizarre reason, I had expected those on this list to be >> > Masters of Python. >> >> That is rather bizarre. If you find something that led you to expect >> that, please let us know so the fallacy can be corrected. > > My message had rather an imperious tone that was not intended. > > I hasten to note that I'm not claiming any special status for myself > with regard to this list, or Python's community. I'm a mere interested > party, and my response was not intended to speak authoritatively about > How Things Are?. My apologies for any mistaken impressions I might > have given. Don't worry, Tav's wording was rather offensive so strong reactions are understandable. I read your message as a totally fine tit-for-tat reply, and in fact it made me discard my own similar draft. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From dickinsm at gmail.com Wed Mar 11 15:28:45 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Wed, 11 Mar 2009 14:28:45 +0000 Subject: [Python-ideas] math module and complex numbers In-Reply-To: <20090311133130.GA21137@phd.pp.ru> References: <49B7AE0B.5040004@molden.no> <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> <49B7B8AA.9050305@molden.no> <20090311133130.GA21137@phd.pp.ru> Message-ID: <5c6f2a5d0903110728j3223dbftb0178b669db2085e@mail.gmail.com> On Wed, Mar 11, 2009 at 1:31 PM, Oleg Broytmann wrote: > numbers. The distinction between functions which support complex numbers > and those which don't is made since most users do not want to learn quite > as much mathematics as required to understand complex numbers. Receiving an > exception instead of a complex result allows earlier detection" Furthermore, even those users who *do* understand complex numbers don't always want sqrt(-1) to return 1j. I find the math/cmath duality useful. Mark >>> from cmath import sqrt >>> sqrt(-complex(1)) -1j >>> sqrt(complex(-1)) 1j From bruce at leapyear.org Wed Mar 11 19:06:55 2009 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 11 Mar 2009 11:06:55 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <87sklkzpf9.fsf@benfinney.id.au> <87ocw8zkym.fsf@benfinney.id.au> Message-ID: (1) I can easily write an expression that returns a function of arbitrary complexity. (2) What I can't easily write is an expression that is arbitrarily complex and returns a function with that complexity. Sorry if that's unclear. My point is that if I know what the complexity is in advance I can write it down. If I can't, do I really want to embed this in the middle of some other function? For example: >>> def f(name, x): def _(i, x=x): # do complicated stuff return i+x _.__name__ = name return _ Then f('bar', 3) returns a function equivalent to lambda i: i+3, etc.
and of course the definition of f can be arbitrarily complicated although that complexity must be planned in advance. I can stick f('bar', 3) anywhere I want a function. What I have yet to hear is an explanation of the advantages of providing (2) since we already have (1) and lambda. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilk at flibuste.net Wed Mar 11 20:25:38 2009 From: wilk at flibuste.net (William Dode) Date: Wed, 11 Mar 2009 19:25:38 +0000 (UTC) Subject: [Python-ideas] float vs decimal Message-ID: Hi, I just read the blog post of gvr : http://python-history.blogspot.com/2009/03/problem-with-integer-division.html And I wonder why >>> .33 0.33000000000000002 is still possible in a "very-high-level language" like python3 ? Why .33 could not be a Decimal directly ? bye -- William Dodé - http://flibuste.net Informaticien Indépendant From pyideas at rebertia.com Wed Mar 11 20:48:17 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 11 Mar 2009 12:48:17 -0700 Subject: [Python-ideas] float vs decimal In-Reply-To: References: Message-ID: <50697b2c0903111248o1f1c62afi10ecfb08f78f0bef@mail.gmail.com> On Wed, Mar 11, 2009 at 12:25 PM, William Dode wrote: > Hi, > > I just read the blog post of gvr : > http://python-history.blogspot.com/2009/03/problem-with-integer-division.html > > And I wonder why >>>> .33 > 0.33000000000000002 > is still possible in a "very-high-level language" like python3 ? > > Why .33 could not be a Decimal directly ? I proposed something like this earlier, see: http://mail.python.org/pipermail/python-ideas/2008-December/002379.html Obviously, the proposal didn't go anywhere, the reason being that Decimal is currently implemented in Python and is thus much too inefficient to be the default (efficiency/practicality beating correctness/purity here apparently). There are non-Python implementations of the decimal standard in C, but no one could locate one with a Python-compatible license.
The closest was the IBM implementation whose spec the decimal PEP was based off of, but unfortunately it uses the ICU License which has a classic-BSD-like attribution clause. Cheers, Chris -- I have a blog: http://blog.rebertia.com From stefan_ml at behnel.de Wed Mar 11 20:49:18 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 11 Mar 2009 20:49:18 +0100 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: Guido van Rossum wrote: > On Sun, Feb 1, 2009 at 2:29 PM, Christian Heimes wrote: >> I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple >> because there are enough frameworks for elaborate unit testing. >> >> Such a tool should >> >> - find all modules and packages named 'tests' for a given package name > > I predict that this part is where you'll have a hard time getting > consensus. There are lots of different naming conventions. It would be > nice if people could use the new discovery feature without having to > move all their tests around. Still, there should be one way to do it, so that future projects can start to use a common pattern. I actually think the selection of such a pattern can be completely arbitrary, as it will be impossible to get a clear vote on this. Obviously, the OWTDI does not lift the requirement that the test finder must support alternate patterns to make it work smoothly with existing test suites. It's just meant to avoid the configuration overhead if you do it 'the right way'. 
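[A minimal sketch of the walk-and-match discovery being discussed here — the function name, the default naming pattern, and the layout are purely illustrative, not a proposed stdlib API:]

```python
import os

def find_test_modules(root, pattern='test'):
    """Walk *root* and yield dotted module names whose file name
    starts with *pattern* (e.g. test_foo.py -> 'test_foo')."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            base, ext = os.path.splitext(filename)
            if ext == '.py' and base.startswith(pattern):
                relative = os.path.relpath(os.path.join(dirpath, base), root)
                yield relative.replace(os.sep, '.')
```

[As Guido notes above, the contentious part is exactly the `pattern` default — `test_*.py`, `*_test.py`, and `tests` packages are all in the wild — which is why the pattern is a parameter in this sketch.]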
Stefan From jjb5 at cornell.edu Wed Mar 11 21:32:12 2009 From: jjb5 at cornell.edu (Joel Bender) Date: Wed, 11 Mar 2009 16:32:12 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <59a221a0903101711o5fb1e32ey4684d8d2f2325902@mail.gmail.com> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <20090310095822.0f957a4d@o> <59a221a0903101711o5fb1e32ey4684d8d2f2325902@mail.gmail.com> Message-ID: <49B81FCC.40207@cornell.edu> Jan Kanis wrote: > @lambda f: do_something(with_our(f)) > def result(param): > print("called back with "+param) > return foobar(param) To keep result from stomping on the name, I would expect result to actually be a result rather than a function :-): @lambda f: do_something(with_our(f)) lambda param: print("called back with "+param) return foobar(param) > But I don't really have a specific actual use case for this... Looks interesting anyway. Joel From wilk at flibuste.net Wed Mar 11 22:32:10 2009 From: wilk at flibuste.net (William Dode) Date: Wed, 11 Mar 2009 21:32:10 +0000 (UTC) Subject: [Python-ideas] float vs decimal References: <50697b2c0903111248o1f1c62afi10ecfb08f78f0bef@mail.gmail.com> Message-ID: On 11-03-2009, Chris Rebert wrote: > On Wed, Mar 11, 2009 at 12:25 PM, William Dode wrote: >> Hi, >> >> I just read the blog post of gvr : >> http://python-history.blogspot.com/2009/03/problem-with-integer-division.html >> >> And i wonder why >>>>> .33 >> 0.33000000000000002 >> is still possible in a "very-high-level langage" like python3 ? >> >> Why .33 could not be a Decimal directly ? > > I proposed something like this earlier, see: > http://mail.python.org/pipermail/python-ideas/2008-December/002379.html > Obviously, the proposal didn't go anywhere, the reason being that > Decimal is currently implemented in Python and is thus much too > inefficient to be the default (efficiency/practicality beating > correctness/purity here apparently). 
There are non-Python > implementations of the decimal standard in C, but no one could locate > one with a Python-compatible license. The closest was the IBM > implementation whose spec the decimal PEP was based off of, but > unfortunately it uses the ICU License which has a classic-BSD-like > attribution clause. Thanks to resume the situation. I think of another question. Why it's so difficult to mix float and decimal ? For example we cannot do Decimal(float) or float * Decimal. And i'm afraid that with python 3 it will more often a pain because operation with two integers can sometimes return integer (and accept an operation with decimal) and sometimes not when the operation will return a float.

>>> a = Decimal('5.3')
>>> i = 4
>>> j = 3
>>> i*a/j
Decimal('7.066666666666666666666666667')
>>> i/j*a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for *: 'float' and 'Decimal'

I mean, why if a Decimal is in the middle of an operation, float didn't silently become a Decimal ? -- William Dodé - http://flibuste.net Informaticien Indépendant From pyideas at rebertia.com Wed Mar 11 22:39:31 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 11 Mar 2009 14:39:31 -0700 Subject: [Python-ideas] float vs decimal In-Reply-To: References: <50697b2c0903111248o1f1c62afi10ecfb08f78f0bef@mail.gmail.com> Message-ID: <50697b2c0903111439k21171e3t19d18bff1601d0fd@mail.gmail.com> On Wed, Mar 11, 2009 at 2:32 PM, William Dode wrote: > On 11-03-2009, Chris Rebert wrote: >> On Wed, Mar 11, 2009 at 12:25 PM, William Dode wrote: >>> Hi, >>> >>> I just read the blog post of gvr : >>> http://python-history.blogspot.com/2009/03/problem-with-integer-division.html >>> >>> And i wonder why >>>>>> .33 >>> 0.33000000000000002 >>> is still possible in a "very-high-level langage" like python3 ? >>> >>> Why .33 could not be a Decimal directly ?
>> >> I proposed something like this earlier, see: >> http://mail.python.org/pipermail/python-ideas/2008-December/002379.html >> Obviously, the proposal didn't go anywhere, the reason being that >> Decimal is currently implemented in Python and is thus much too >> inefficient to be the default (efficiency/practicality beating >> correctness/purity here apparently). There are non-Python >> implementations of the decimal standard in C, but no one could locate >> one with a Python-compatible license. The closest was the IBM >> implementation whose spec the decimal PEP was based off of, but >> unfortunately it uses the ICU License which has a classic-BSD-like >> attribution clause. > > Thanks to resume the situation. > > I think of another question. Why it's so difficult to mix float and > decimal ? > > For example we cannot do Decimal(float) or float * Decimal. > > And i'm afraid that with python 3 it will more often a pain because > operation with two integers can sometimes return integer (and accept an > operation with decimal) and sometimes not when the operation will return > a float. > >>>> a = Decimal('5.3') >>>> i = 4 >>>> j = 3 >>>> i*a/j > Decimal('7.066666666666666666666666667') >>>> i/j*a > Traceback (most recent call last): >   File "<stdin>", line 1, in <module> > TypeError: unsupported operand type(s) for *: 'float' and 'Decimal' > > I mean, why if a Decimal is in the middle of an operation, float didn't > silently become a Decimal ? It's in the FAQ section of the decimal module - http://docs.python.org/library/decimal.html : 17. Is there a way to convert a regular float to a Decimal? A. Yes, all binary floating point numbers can be exactly expressed as a Decimal. An exact conversion may take more precision than intuition would suggest, so we trap Inexact to signal a need for more precision: def float_to_decimal(f): [definition snipped] 17. Why isn't the float_to_decimal() routine included in the module? A.
There is some question about whether it is advisable to mix binary and decimal floating point. Also, its use requires some care to avoid the representation issues associated with binary floating point: >>> float_to_decimal(1.1) Decimal('1.100000000000000088817841970012523233890533447265625') Cheers, Chris -- I have a blog: http://blog.rebertia.com From greg.ewing at canterbury.ac.nz Wed Mar 11 22:44:47 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 12 Mar 2009 10:44:47 +1300 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: <49B830CF.8070800@canterbury.ac.nz> Jim Jewett wrote: > I'm not convinced, because I've seen so many times when a lambda > actually is crucial to the bug. I think this is just a special case of a more general problem, that a line number is not always a sufficiently fine-grained piece of information when you're trying to pinpoint an error. You can get the same thing even when lambdas are not involved. It's particularly bad when an expression spans more than one line, because CPython currently doesn't even tell you the line containing the error, but the one where the whole statement started. Ideally, the traceback would show you not just the exact line, but the exact *token* where the error occurred. The technology exists to do this, it's just a matter of deciding to incorporate it into Python. -- Greg From ggpolo at gmail.com Wed Mar 11 23:24:03 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Wed, 11 Mar 2009 19:24:03 -0300 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: On Wed, Mar 11, 2009 at 4:49 PM, Stefan Behnel wrote: > Guido van Rossum wrote: >> On Sun, Feb 1, 2009 at 2:29 PM, Christian Heimes wrote: >>> I'm +1 for a simple (!) test discovery system. 
I'm emphasizing on simple >>> because there are enough frameworks for elaborate unit testing. >>> >>> Such a tool should >>> >>> - find all modules and packages named 'tests' for a given package name >> >> I predict that this part is where you'll have a hard time getting >> consensus. There are lots of different naming conventions. It would be >> nice if people could use the new discovery feature without having to >> move all their tests around. > > Still, there should be one way to do it, so that future projects can start > to use a common pattern. I actually think the selection of such a pattern > can be completely arbitrary, as it will be impossible to get a clear vote > on this. > > Obviously, the OWTDI does not lift the requirement that the test finder > must support alternate patterns to make it work smoothly with existing test > suites. It's just meant to avoid the configuration overhead if you do it > 'the right way'. > A little unrelated to your reply but thanks for "reviving" the thread. I still have the intention to do the proposed idea, I just happened to have very busy weeks, month, etc.. new house and others. > Stefan > Regards, -- -- Guilherme H. Polo Goncalves From python at rcn.com Wed Mar 11 23:37:46 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 11 Mar 2009 15:37:46 -0700 Subject: [Python-ideas] Adding a test discovery into Python References: Message-ID: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> [Christian Heimes] >>> I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple >>> because there are enough frameworks for elaborate unit testing. Test discovery is not the interesting part of the problem. I'm strongly for offering tools that make it easier to write the tests in the first place. The syntax used by py.test and nose is vastly superior to the one used by unittest.py, a module that is more Javathonic than Pythonic. 
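[The difference in weight Raymond is pointing at is easy to see side by side; a small sketch, where the `add` function under test is invented for illustration:]

```python
import unittest

def add(a, b):
    return a + b

# unittest style: a subclass, a camelCase assertion method per check
class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(1, 2), 3)

# py.test / nose style: a bare function and a plain assert statement
def test_add():
    assert add(1, 2) == 3
```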
Even if we never adopt that syntax for our own test suite (because we like to run tests with and without -O), it would still be a good service to our users to offer a tool with a lighter weight syntax for writing tests. Raymond P.S. I'm not a partisan on this one. I've been a *heavy* user of unittest.py, doctest.py, py.test, and some personal tools that I wrote long ago in awk. Extensive use of each makes merits of the py.test and nose approaches self-evident. Axiom: The more work involved in writing tests, the fewer tests that will get written. Factoid of the Day: In Py2.7's test_datetime module, the phrase self.assertEqual occurs 578 times. From steve at pearwood.info Wed Mar 11 23:41:03 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 12 Mar 2009 09:41:03 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <87zlfs4pk7.fsf@xemacs.org> Message-ID: <200903120941.03721.steve@pearwood.info> On Thu, 12 Mar 2009 01:26:29 am Guido van Rossum wrote: > You can assign new values to f.__name__. But it is only used in function repr, not tracebacks. From Python 2.6:

>>> f = lambda x: x+1  # __name__ is '<lambda>'
>>> f.__name__ = 'f'
>>> f
<function f at 0x...>
>>> f(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

-- Steven D'Aprano From steve at pearwood.info Wed Mar 11 23:48:27 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 12 Mar 2009 09:48:27 +1100 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: References: <49B7C6BA.8010100@cheimes.de> Message-ID: <200903120948.28250.steve@pearwood.info> On Thu, 12 Mar 2009 01:21:25 am Antoine Pitrou wrote: > Christian Heimes cheimes.de> writes: > > In my initial proposal one and a half hour earlier I suggested 'sync()' as the name of the method and 'synced' as the name of the flag that forces a fsync() call during the close operation.
> > I think your "synced" flag is too vague. Some applications may need > the file to be synced on close(), but some others may need it to be > synced at regular intervals, or after each write(), etc. > > Calling the flag "sync_on_close" would be much more explicit. Also, > given the current API I think it should be an argument to open() > rather than a writable attribute. Perhaps we should have a module containing rich file tools, e.g. classes FileSyncOnWrite, FileSyncOnClose, functions for common file-related operations, etc. This will make it easy for conscientious programmers to do the right thing for their app without needing to re-invent the wheel all the time, but without handcuffing them into a single "one size fits all" solution. File operations are *hard*, because many error conditions are uncommon, and consequently many (possibly even the majority) of programmers never learn that something like this: f = open('myfile', 'w') f.write(data) f.close() (or the equivalent in whatever language they use) may cause data loss. Worse, we train users to accept that data loss as normal instead of reporting it as a bug -- possibly because it is unclear whether it is a bug in the application, the OS, the file system, or all three. (It's impossible to avoid *all* risk of data loss, of course -- what if the computer loses power in the middle of a write? But we can minimize that risk significantly.) Even when programmers try to do the right thing, it is hard to know what the right thing is: there are trade-offs to be made, and having made a trade-off, the programmer then has to re-invent what usually turns out to be a quite complicated wheel. To do the right thing in Python often means delving into the world of os.O_* constants and file descriptors, which is intimidating and unpythonic. They're great for those who want/need them, but perhaps we should expose a Python interface to the more common operations? To my mind, that means classes instead of magic constants. 
Would there be interest in a filetools module? Replies and discussion to python-ideas please. -- Steven D'Aprano From robert.kern at gmail.com Thu Mar 12 00:00:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 11 Mar 2009 18:00:14 -0500 Subject: [Python-ideas] Ext4 data loss In-Reply-To: <200903120948.28250.steve@pearwood.info> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> Message-ID: On 2009-03-11 17:48, Steven D'Aprano wrote: > Would there be interest in a filetools module? Replies and discussion to > python-ideas please. Yes, please. I am of the opinion that, wherever possible, these kinds of patterns should be codified in reusable libraries. For something as fundamental as writing files, something aimed towards standard library acceptance seems like a very good idea to me. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ggpolo at gmail.com Thu Mar 12 00:05:46 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Wed, 11 Mar 2009 20:05:46 -0300 Subject: [Python-ideas] Fwd: Adding a test discovery into Python In-Reply-To: References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: ---------- Forwarded message ---------- From: Guilherme Polo Date: Wed, Mar 11, 2009 at 8:04 PM Subject: Re: [Python-ideas] Adding a test discovery into Python To: Raymond Hettinger , python-dev at python.org On Wed, Mar 11, 2009 at 7:37 PM, Raymond Hettinger wrote: > [Christian Heimes] >>>> >>>> I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple >>>> because there are enough frameworks for elaborate unit testing. > > Test discovery is not the interesting part of the problem. Interesting or not, it is a problem that is asking for a solution, this kind of code is being duplicated in several places for no good reason. 
> > Axiom: The more work involved in writing tests, the fewer > tests that will get written. At some point you will have to run them too, I don't think you want to reimplement the discovery part yet another time. -- -- Guilherme H. Polo Goncalves From jan.kanis at phil.uu.nl Thu Mar 12 00:08:38 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Thu, 12 Mar 2009 00:08:38 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <49B81FCC.40207@cornell.edu> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <20090310095822.0f957a4d@o> <59a221a0903101711o5fb1e32ey4684d8d2f2325902@mail.gmail.com> <49B81FCC.40207@cornell.edu> Message-ID: <59a221a0903111608i2a69c6f5ke8eac8e73bdc073@mail.gmail.com> On Wed, Mar 11, 2009 at 21:32, Joel Bender wrote: > Jan Kanis wrote: >
>> @lambda f: do_something(with_our(f))
>> def result(param):
>>     print("called back with "+param)
>>     return foobar(param)
> > To keep result from stomping on the name, I would expect result to actually > be a result rather than a function :-): 'result' is the actual result. To try it out in current python:

def do_something(func):
    print("doing something")
    return func(41)**2

def id(x):
    return x

@id(lambda f: do_something(f))
def result(param):
    print("called back with", param)
    return param + 1

print("result is", result, "should be", 42**2)

-->
doing something
called back with 41
result is 1764 should be 1764

Or did I misinterpret what you were saying? From zooko at zooko.com Thu Mar 12 02:26:40 2009 From: zooko at zooko.com (zooko) Date: Wed, 11 Mar 2009 19:26:40 -0600 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: <200903120948.28250.steve@pearwood.info> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> Message-ID: > Would there be interest in a filetools module? Replies and > discussion to python-ideas please.
I've been using and maintaining a few filesystem hacks for, let's see, almost nine years now: http://allmydata.org/trac/pyutil/browser/pyutil/pyutil/fileutil.py (The first version of that was probably written by Greg Smith in about 1999.) I'm sure there are many other such packages. A couple of quick searches of pypi turned up these two: http://pypi.python.org/pypi/Pythonutils http://pypi.python.org/pypi/fs I wonder if any of them have the sort of functionality you're thinking of. Regards, Zooko From stephen at xemacs.org Thu Mar 12 02:55:52 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 12 Mar 2009 10:55:52 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87zlfs4pk7.fsf@xemacs.org> Message-ID: <87ocw74i8n.fsf@xemacs.org> Guido van Rossum writes: > > (add-hook 'text-mode-hook (defun turn-on-auto-fill () (auto-fill-mode 1))) > > > > neatly avoids the problem by returning the name of the function, the > > symbol `turn-on-auto-fill', which is callable and so suitable for > > hanging on the hook. > > Got it -- sort of like using assignment in an expression in C, to set > a variable and return the valule (but not quite, don't worry :-). Yes. And no, I don't worry about you, but I do worry about what else may be lurking in a language whose designer(s) chose to return the function definition (rather than the name) from define-function. ;-) > I assume you're talking about Andrew Koenig's use case -- ANK is > Andrew Kuchling, who AFAIK didn't participate in this thread. :-) Oops, my bad. Very sorry to all concerned. > IIUC (my Lisp is very rusty) this just assigns unique names to the > functions right? Yes. > You're saying this to satisfy the people who insist that __name__ > is always useful right? But it seems to be marginally useful here > since the names don't occur in the source. (?) 
But they are at least cosmetically useful to the runtime system (eg, they will be used in reporting tracebacks -- bytecode in the backtrace is hard to read) and accessible to the user (for redefining a callback on-the-fly). The user can't necessarily access the array of callbacks directly (eg, it might be in C) or conveniently (it may be buried deep in a complex structure). It seems plausible to me that the user is most likely to want to redefine a callback that just blew up, too, and this would give you the necessary "handle" in the backtrace. Also, I haven't thought this through, but use of numbers to differentiate the names was just an easy example. An appropriate naming scheme might make it easy to find a skeleton in the source for the generated callback. Eg, if instead of numbers the identifiers were "foo-abort", "foo-retry", and "foo-fail". I don't know if that would be useful in Andrew's use-case. So, yes, marginal, in the sense that I doubt the use cases are common, but I suspect in a few it could be a great convenience. How useful in Python, I don't know ... Emacs Lisp is full of "seemed like the thing to do at the time" design, so the more handles I have the happier I am. From santagada at gmail.com Thu Mar 12 04:04:40 2009 From: santagada at gmail.com (Leonardo Santagada) Date: Thu, 12 Mar 2009 00:04:40 -0300 Subject: [Python-ideas] Fwd: Adding a test discovery into Python In-Reply-To: References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: On Mar 11, 2009, at 8:05 PM, Guilherme Polo wrote: > On Wed, Mar 11, 2009 at 7:37 PM, Raymond Hettinger > wrote: >> [Christian Heimes] >>>>> >>>>> I'm +1 for a simple (!) test discovery system. I'm emphasizing >>>>> on simple >>>>> because there are enough frameworks for elaborate unit testing. >> >> Test discovery is not the interesting part of the problem. 
> > Interesting or not, it is a problem that is asking for a solution, > this kind of code is being duplicated in several places for no good > reason. > >> >> Axiom: The more work involved in writing tests, the fewer >> tests that will get written. > > At some point you will have to run them too, I don't think you want to > reimplement the discovery part yet another time. What I think he was getting at is that 20-30 lines of test discovery have to be written once for each project (or none if using py.test/ nose), but self.assertequals and all of the other quirks of unittest are all over a test suite and you need to write all of it each time you have to make a test. Not that what you are trying to do is pointless, but fixing this other problem is so much more interesting... -- Leonardo Santagada santagada at gmail.com From python at rcn.com Thu Mar 12 04:45:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 11 Mar 2009 20:45:24 -0700 Subject: [Python-ideas] [Python-Dev] Formatting mini-language suggestion References: Message-ID: <7A6631E172AC4D12956AEE832E28F52F@RaymondLaptop1> [Guido van Rossum] > I suggest moving this to python-ideas and > writing a proper PEP. Okay, it's moved. Will write up a PEP, do research on what other languages do and collect everyone's ideas on what to put in the shed. (hundreds and ten thousands grouping, various choices of decimal points, mayan number systems and whatnot). Will start with Nick's simple proposal as a starting point. [Nick Coghlan] > [[fill]align][sign][#][0][minimumwidth][,][.precision][type] Other suggestions and comments welcome. 
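[For reference, the `,` grouping option in Nick's sketch is the part of this discussion that later shipped, as PEP 378 in Python 2.7/3.1; its behaviour:]

```python
# The ',' option from the format mini-language sketch above (PEP 378):
print(format(1234567, ',d'))        # 1,234,567
print(format(1234567.891, ',.2f'))  # 1,234,567.89
print('{:,}'.format(1234567))       # 1,234,567
```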
Raymond From ben+python at benfinney.id.au Thu Mar 12 04:57:11 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 12 Mar 2009 14:57:11 +1100 Subject: [Python-ideas] Draft PEP: Standard daemon process library References: <87wscj11fl.fsf@benfinney.id.au> Message-ID: <874oxzxujs.fsf@benfinney.id.au> Howdy all, Significant changes in this release: * Name the daemon process context class `DaemonContext`, since it doesn't actually represent a separate daemon. (The reference implementation will also have a `DaemonRunner` class, but that's outside the scope of this PEP.) * Implement the context manager protocol, allowing use as a ‘with’ context manager or via explicit ‘open’ and ‘close’ calls. * Delegate PID file handling to a `pidfile` object handed to the `DaemonContext` instance, and used simply as a context manager. * Simplify the set of options by using a mapping for signal handlers. * Target Python 3.2, since the reference implementation will very likely not be complete in time for anything earlier. :PEP: XXX :Title: Standard daemon process library :Version: 0.5 :Last-Modified: 2009-03-12 14:50 :Author: Ben Finney :Status: Draft :Type: Standards Track :Content-Type: text/x-rst :Created: 2009-01-26 :Python-Version: 3.2 :Post-History: ======== Abstract ======== Writing a program to become a well-behaved Unix daemon is somewhat complex and tricky to get right, yet the steps are largely similar for any daemon regardless of what else the program may need to do. This PEP introduces a package to the Python standard library that provides a simple interface to the task of becoming a daemon process. .. contents:: ..
Table of Contents: Abstract Specification Example usage Interface ``DaemonContext`` objects ``DaemonError`` objects Motivation Rationale Correct daemon behaviour A daemon is not a service Reference Implementation Other daemon implementations References Copyright ============= Specification ============= Example usage ============= Simple example of direct `DaemonContext` usage:: import daemon from spam import do_main_program with daemon.DaemonContext() as daemon_context: do_main_program() More complex example usage:: import os import grp import signal import daemon import lockfile from spam import ( initial_program_setup, do_main_program, program_cleanup, reload_program_config, ) context = daemon.DaemonContext( working_directory='/var/lib/foo', umask=0o002, pidfile=lockfile.FileLock('/var/run/spam.pid'), ) context.signal_map = { signal.SIGTERM: program_cleanup, signal.SIGHUP: 'close', signal.SIGUSR1: reload_program_config, } mail_gid = grp.getgrnam('mail').gr_gid context.gid = mail_gid important_file = open('spam.data', 'w') interesting_file = open('eggs.data', 'w') context.files_preserve = [important_file, interesting_file] initial_program_setup() with context: do_main_program() Interface ========= A new package, `daemon`, is added to the standard library. An exception class, `DaemonError`, is defined for exceptions raised from the package. A class, `DaemonContext`, is defined to represent the settings and process context for the program running as a daemon process. ``DaemonContext`` objects ========================= A `DaemonContext` instance represents the behaviour settings and process context for the program when it becomes a daemon. The behaviour and environment is customised by setting options on the instance, before calling the `open` method. Each option can be passed as a keyword argument to the `DaemonContext` constructor, or subsequently altered by assigning to an attribute on the instance at any time prior to calling `open`. 
That is, for options named `wibble` and `wubble`, the following invocation::

    foo = daemon.DaemonContext(wibble=bar, wubble=baz)
    foo.open()

is equivalent to::

    foo = daemon.DaemonContext()
    foo.wibble = bar
    foo.wubble = baz
    foo.open()

The following options are defined. `files_preserve` :Default: ``None`` List of files that should *not* be closed when starting the daemon. If ``None``, all open file descriptors will be closed. Elements of the list are file descriptors (as returned by a file object's `fileno()` method) or Python `file` objects. Each specifies a file that is not to be closed during daemon start. `chroot_directory` :Default: ``None`` Full path to a directory to set as the effective root directory of the process. If ``None``, specifies that the root directory is not to be changed. `working_directory` :Default: ``'/'`` Full path of the working directory to which the process should change on daemon start. Since a filesystem cannot be unmounted if a process has its current working directory on that filesystem, this should either be left at default or set to a directory that is a sensible “home directory” for the daemon while it is running. `umask` :Default: ``0`` File access creation mask (“umask”) to set for the process on daemon start. Since a process inherits its umask from its parent process, starting the daemon will reset the umask to this value so that files are created by the daemon with access modes as it expects. `pidfile` :Default: ``None`` Context manager for a PID lock file. When the daemon context opens and closes, it enters and exits the `pidfile` context manager. `signal_map` :Default: ``{signal.SIGTTOU: None, signal.SIGTTIN: None, signal.SIGTSTP: None, signal.SIGTERM: 'close'}`` Mapping from operating system signals to callback actions. The mapping is used when the daemon context opens, and determines the action for each signal's signal handler: * A value of ``None`` will ignore the signal (by setting the signal action to ``signal.SIG_IGN``).
* A string value will be used as the name of an attribute on the ``DaemonContext`` instance. The attribute's value will be used as the action for the signal handler. * Any other value will be used as the action for the signal handler. `uid` :Default: ``None`` The user ID (“uid”) value to switch the process to on daemon start. `gid` :Default: ``None`` The group ID (“gid”) value to switch the process to on daemon start. `prevent_core` :Default: ``True`` If true, prevents the generation of core files, in order to avoid leaking sensitive information from daemons run as `root`. `stdin` :Default: ``None`` `stdout` :Default: ``None`` `stderr` :Default: ``None`` Each of `stdin`, `stdout`, and `stderr` is a file-like object which will be used as the new file for the standard I/O stream `sys.stdin`, `sys.stdout`, and `sys.stderr` respectively. The file should therefore be open, with a minimum of mode 'r' in the case of `stdin`, and mode 'w+' in the case of `stdout` and `stderr`. If the object has a `fileno()` method that returns a file descriptor, the corresponding file will be excluded from being closed during daemon start (that is, it will be treated as though it were listed in `files_preserve`). If ``None``, the corresponding system stream is re-bound to the file named by `os.devnull`. The following methods are defined. `open()` :Return: ``None`` Open the daemon context, turning the current program into a daemon process. This performs the following steps: * If the `chroot_directory` attribute is not ``None``, set the effective root directory of the process to that directory (via `os.chroot`). This allows running the daemon process inside a “chroot gaol” as a means of limiting the system's exposure to rogue behaviour by the process. * Close all open file descriptors. This excludes those listed in the `files_preserve` attribute, and those that correspond to the `stdin`, `stdout`, or `stderr` attributes.
* Change current working directory to the path specified by the `working_directory` attribute. * Reset the file access creation mask to the value specified by the `umask` attribute. * Detach the current process into its own process group, and disassociate from any controlling terminal. This step is skipped if it is determined to be redundant: if the process was started by `init`, by `initd`, or by `inetd`. * Set signal handlers as specified by the `signal_map` attribute. * If the `prevent_core` attribute is true, set the resource limits for the process to prevent any core dump from the process. * Set the process uid and gid to the true uid and gid of the process, to relinquish any elevated privilege. * If the `pidfile` attribute is not ``None``, enter its context manager. * If either of the attributes `uid` or `gid` are not ``None``, set the process uid and/or gid to the specified values. * If any of the attributes `stdin`, `stdout`, `stderr` are not ``None``, bind the system streams `sys.stdin`, `sys.stdout`, and/or `sys.stderr` to the files represented by the corresponding attributes. Where the attribute has a file descriptor, the descriptor is duplicated (instead of re-binding the name). When the function returns, the running program is a daemon process. `close()` :Return: ``None`` Terminate the daemon context. This performs the following step: * If the `pidfile` attribute is not ``None``, exit its context manager. The class also implements the context manager protocol via ``__enter__`` and ``__exit__`` methods. `__enter__()` :Return: The ``DaemonContext`` instance Call the instance's `open()` method, then return the instance. `__exit__(exc_type, exc_value, exc_traceback)` :Return: ``True`` or ``False`` as defined by the context manager protocol Call the instance's `close()` method, then return ``True`` if the exception was handled or ``False`` if it was not. ``DaemonError`` objects ======================= The `DaemonError` class inherits from `Exception`. 
The `daemon` package implementation will raise an instance of `DaemonError` when an error occurs in processing daemon behaviour. ========== Motivation ========== The majority of programs written to be Unix daemons either implement behaviour very similar to that in the `specification`_, or are poorly-behaved daemons by the `correct daemon behaviour`_. Since these steps should be much the same in most implementations but are very particular and easy to omit or implement incorrectly, they are a prime target for a standard well-tested implementation in the standard library. ========= Rationale ========= Correct daemon behaviour ======================== According to Stevens in [stevens]_ §2.6, a program should perform the following steps to become a Unix daemon process. * Close all open file descriptors. * Change current working directory. * Reset the file access creation mask. * Run in the background. * Disassociate from process group. * Ignore terminal I/O signals. * Disassociate from control terminal. * Don't reacquire a control terminal. * Correctly handle the following circumstances: * Started by System V `init` process. * Daemon termination by ``SIGTERM`` signal. * Children generate ``SIGCLD`` signal. The `daemon` tool [slack-daemon]_ lists (in its summary of features) behaviour that should be performed when turning a program into a well-behaved Unix daemon process. It differs from this PEP's intent in that it invokes a *separate* program as a daemon process. The following features are appropriate for a daemon that starts itself once the program is already running: * Sets up the correct process context for a daemon. * Behaves sensibly when started by `initd(8)` or `inetd(8)`. * Revokes any suid or sgid privileges to reduce security risks in case daemon is incorrectly installed with special privileges. * Prevents the generation of core files to prevent leaking sensitive information from daemons run as root (optional).
* Names the daemon by creating and locking a PID file to guarantee that only one daemon with the given name can execute at any given time (optional). * Sets the user and group under which to run the daemon (optional, root only). * Creates a chroot gaol (optional, root only). * Captures the daemon's stdout and stderr and directs them to syslog (optional). A daemon is not a service ========================= This PEP addresses only Unix-style daemons, for which the above correct behaviour is relevant, as opposed to comparable behaviours on other operating systems. There is a related concept in many systems, called a "service". A service differs from the model in this PEP, in that rather than having the *current* program continue to run as a daemon process, a service starts an *additional* process to run in the background, and the current process communicates with that additional process via some defined channels. The Unix-style daemon model in this PEP can be used, among other things, to implement the background-process part of a service; but this PEP does not address the other aspects of setting up and managing a service. ======================== Reference Implementation ======================== The `python-daemon` package [python-daemon]_. As of `python-daemon` version 1.3 (2009-03-12), the package is under active development and is not yet a full implementation of this PEP. Other daemon implementations ============================ Prior to this PEP, several existing third-party Python libraries or tools implemented some of this PEP's `correct daemon behaviour`_. The `reference implementation`_ is a fairly direct successor from the following implementations: * Many good ideas were contributed by the community to Python cookbook recipes #66012 [cookbook-66012]_ and #278731 [cookbook-278731]_. * The `bda.daemon` library [bda.daemon]_ is an implementation of [cookbook-66012]_. It is the predecessor of [python-daemon]_.
Other Python daemon implementations that differ from this PEP: * The `zdaemon` tool [zdaemon]_ was written for the Zope project. Like [slack-daemon]_, it differs from this specification because it is used to run another program as a daemon process. * The Python library `daemon` [clapper-daemon]_ is (according to its homepage) no longer maintained. As of version 1.0.1, it implements the basic steps from [stevens]_. * The `daemonize` library [seutter-daemonize]_ also implements the basic steps from [stevens]_. * Ray Burr's `daemon.py` module [burr-daemon]_ provides the [stevens]_ procedure as well as PID file handling and redirection of output to syslog. * Twisted [twisted]_ includes, perhaps unsurprisingly, an implementation of a process daemonisation API that is integrated with the rest of the Twisted framework; it differs significantly from the API in this PEP. * The Python `initd` library [dagitses-initd]_, which uses [clapper-daemon]_, implements an equivalent of Unix `initd(8)` for controlling a daemon process. ========== References ========== .. [stevens] `Unix Network Programming`, W. Richard Stevens, 1994 Prentice Hall. .. [slack-daemon] The (non-Python) "libslack" implementation of a `daemon` tool ``_ by "raf". .. [python-daemon] The `python-daemon` library ``_ by Ben Finney et al. .. [cookbook-66012] Python Cookbook recipe 66012, "Fork a daemon process on Unix" ``_. .. [cookbook-278731] Python Cookbook recipe 278731, "Creating a daemon the Python way" ``_. .. [bda.daemon] The `bda.daemon` library ``_ by Robert Niederreiter et al. .. [zdaemon] The `zdaemon` tool ``_ by Guido van Rossum et al. .. [clapper-daemon] The `daemon` library ``_ by Brian Clapper. .. [seutter-daemonize] The `daemonize` library ``_ by Jerry Seutter. .. [burr-daemon] The `daemon.py` module ``_ by Ray Burr. .. [twisted] The `Twisted` application framework ``_ by Glyph Lefkowitz et al. .. [dagitses-initd] The Python `initd` library ``_ by Michael Andreas Dagitses.
========= Copyright ========= This work is hereby placed in the public domain. To the extent that placing a work in the public domain is not legally possible, the copyright holder hereby grants to all recipients of this work all rights and freedoms that would otherwise be restricted by copyright. .. Local variables: mode: rst coding: utf-8 time-stamp-start: "^:Last-Modified:[ ]+" time-stamp-end: "$" time-stamp-line-limit: 20 time-stamp-format: "%:y-%02m-%02d %02H:%02M" End: vim: filetype=rst fileencoding=utf-8 : From stephen at xemacs.org Thu Mar 12 07:02:32 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 12 Mar 2009 15:02:32 +0900 Subject: [Python-ideas] Formatting mini-language suggestion In-Reply-To: <7A6631E172AC4D12956AEE832E28F52F@RaymondLaptop1> References: <7A6631E172AC4D12956AEE832E28F52F@RaymondLaptop1> Message-ID: <87k56v46tj.fsf@xemacs.org> Raymond Hettinger writes: > Will start with Nick's simple proposal as a starting point. > > [Nick Coghlan] > > [[fill]align][sign][#][0][minimumwidth][,][.precision][type] +1 for making that the stopping point, too. I can't speak for the Chinese, but the Japanese also use the Chinese numbering system where the verbal expression of large numbers is grouped by 10000s. However, in tables of government expenditure and the like, the commas usually occur every three places. 
E.g., the official GDP figures from the Japanese Ministry of Economy and Trade: http://www.mext.go.jp/b_menu/toukei/001/08030520/013.htm From lists at janc.be Thu Mar 12 07:36:26 2009 From: lists at janc.be (Jan Claeys) Date: Thu, 12 Mar 2009 07:36:26 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <87sklkzpf9.fsf@benfinney.id.au> Message-ID: <1236839786.16233.40.camel@saeko.local> On Wednesday 11-03-2009 at 05:08 [timezone +0000], tav wrote: > As for Python-ideas, I had taken as granted that people posting would > have more than a passing interest in language design and the nature of > Python. Obviously a false assumption. I apologise for this. But you were right: most of them have a serious interest in such things. That doesn't mean they are all thinking in the same direction though... -- Jan Claeys From stefan_ml at behnel.de Thu Mar 12 07:47:31 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 12 Mar 2009 07:47:31 +0100 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > I'm strongly for offering tools that make it easier to write > the tests in the first place. The syntax used by py.test > and nose is vastly superior to the one used by unittest.py, > a module that is more Javathonic than Pythonic. > [...] > Factoid of the Day: In Py2.7's test_datetime module, > the phrase self.assertEqual occurs 578 times. Doesn't that just scream for using a doctest instead? The interpreter driven type-think-copy-paste pattern works pretty well for these things.
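For concreteness (this example is not part of the thread), the copy-paste doctest pattern being described could look like the following; the `mean` function is invented purely for illustration:

```python
def mean(values):
    """Return the arithmetic mean of a sequence of numbers.

    The expected outputs below are pasted from an interpreter
    session, which is exactly the workflow being described.

    >>> mean([1, 2, 3, 4])
    2.5
    >>> mean([10])
    10.0
    """
    return sum(values) / float(len(values))

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # silent when every docstring example passes
```

Running the module replays the pasted session and compares outputs, replacing a stack of `self.assertEqual` calls.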
Stefan From denis.spir at free.fr Thu Mar 12 08:13:13 2009 From: denis.spir at free.fr (spir) Date: Thu, 12 Mar 2009 08:13:13 +0100 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: <200903120948.28250.steve@pearwood.info> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> Message-ID: <20090312081313.40f8b68f@o> On Thu, 12 Mar 2009 09:48:27 +1100, Steven D'Aprano wrote: > Even when programmers try to do the right thing, it is hard to know what > the right thing is: there are trade-offs to be made, and having made a > trade-off, the programmer then has to re-invent what usually turns out > to be a quite complicated wheel. To do the right thing in Python often > means delving into the world of os.O_* constants and file descriptors, > which is intimidating and unpythonic. They're great for those who > want/need them, but perhaps we should expose a Python interface to the > more common operations? To my mind, that means classes instead of magic > constants. > > Would there be interest in a filetools module? Replies and discussion to > python-ideas please. Sure. +1 Also: a programmer is not (always) a filesystem expert. denis ------ la vita e estrany From python at rcn.com Thu Mar 12 08:17:02 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 00:17:02 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) Message-ID: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Motivation: Provide a simple, non-locale aware way to format a number with a thousands separator. Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output exposed to end users. In the finance world, output with commas is the norm. Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious.
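To make the motivating task concrete, a rough pure-Python equivalent of comma grouping, the behaviour the proposals below build into `format()`, might look like this (the `thousands` helper is invented for illustration and is not part of any proposal):

```python
def thousands(n, sep=","):
    """Group the digits of an integer by threes, e.g. 1234567 -> '1,234,567'."""
    sign, digits = ("-", str(abs(n))) if n < 0 else ("", str(n))
    groups = []
    while digits:
        groups.append(digits[-3:])  # peel off the last three digits
        digits = digits[:-3]
    return sign + sep.join(reversed(groups))

print(thousands(1234567))       # 1,234,567
print(thousands(1234567, " "))  # 1 234 567
```

The `sep` parameter mirrors the substitution trick discussed later in the thread, where a comma is swapped for another separator after formatting.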
It is not the goal to replace locale or to accommodate every possible convention. The goal is to make a common task easier for many users. Research so far: Scanning the web, I've found that thousands separators are usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The COMMA is used when a PERIOD is the decimal separator. James Knight observed that Indian/Pakistani numbering systems group by hundreds. Ben Finney noted that Chinese group by ten-thousands. Visual Basic and its brethren (like MS Excel) use a completely different style and have ultra-flexible custom format specifiers like: "_($* #,##0_)". Proposal I (from Nick Coghlan): A comma will be added to the format() specifier mini-language: [[fill]align][sign][#][0][minimumwidth][,][.precision][type] The ',' option indicates that commas should be included in the output as a thousands separator. As with locales which do not use a period as the decimal point, locales which use a different convention for digit separation will need to use the locale module to obtain appropriate formatting. The proposal works well with floats, ints, and decimals. It also allows easy substitution for other separators. For example: format(n, "6,f").replace(",", "_") This technique is completely general but it is awkward in the one case where the commas and periods need to be swapped. format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".") Proposal II (to meet Antoine Pitrou's request): Make both the thousands separator and decimal separator user specifiable but not locale aware. For simplicity, limit the choices to a comma, period, space, or underscore.
[[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type] Examples: format(1234, "8.1f") --> ' 1234.0' format(1234, "8,1f") --> ' 1234,0' format(1234, "8T.,1f") --> ' 1.234,0' format(1234, "8T .f") --> ' 1 234,0' format(1234, "8d") --> ' 1234' format(1234, "8T,d") --> ' 1,234' This proposal meets most needs (except for people wanting grouping for hundreds or ten-thousands), but iIt comes at the expense of being a little more complicated to learn and remember. Also, it makes it more challenging to write custom __format__ methods that follow the format specification mini-language. For the locale module, just the "T" is necessary in a formatting string since the tool already has procedures for figuring out the actual separators from the local context. Comments and suggestions are welcome but I draw the line at Mayan numbering conventions ;-) Raymond From denis.spir at free.fr Thu Mar 12 08:24:29 2009 From: denis.spir at free.fr (spir) Date: Thu, 12 Mar 2009 08:24:29 +0100 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: <200903120948.28250.steve@pearwood.info> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> Message-ID: <20090312082429.47dc09c8@o> On Thu, 12 Mar 2009 09:48:27 +1100, Steven D'Aprano wrote: > Even when programmers try to do the right thing, it is hard to know what > the right thing is: there are trade-offs to be made, and having made a > trade-off, the programmer then has to re-invent what usually turns out > to be a quite complicated wheel. To do the right thing in Python often > means delving into the world of os.O_* constants and file descriptors, > which is intimidating and unpythonic. They're great for those who > want/need them, but perhaps we should expose a Python interface to the > more common operations? To my mind, that means classes instead of magic > constants. > > Would there be interest in a filetools module? Replies and discussion to > python-ideas please. Sure.
+1 Also: a programmer is not (always) a filesystem expert. PS: What I meant is: the point of view from the filesystem is very different. A proper interface will have to take the programmer's point of view while exposing the filesystem issues. I think (like always at the interface of two worlds -- cf specification talks between developer and client ;-) *terminology* choices will be very important. denis ------ la vita e estrany From lie.1296 at gmail.com Thu Mar 12 08:47:12 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Thu, 12 Mar 2009 18:47:12 +1100 Subject: [Python-ideas] Formatting mini-language suggestion In-Reply-To: <87k56v46tj.fsf@xemacs.org> References: <7A6631E172AC4D12956AEE832E28F52F@RaymondLaptop1> <87k56v46tj.fsf@xemacs.org> Message-ID: Stephen J. Turnbull wrote: > Raymond Hettinger writes: > > > Will start with Nick's simple proposal as a starting point. > > > > [Nick Coghlan] > > > [[fill]align][sign][#][0][minimumwidth][,][.precision][type] could maximumwidth be possible? It's useful if we rather break the display of the numbers than breaking the display of the table (and possibly add a special sign if width overflow occurs, like <62432) > +1 for making that the stopping point, too. > > I can't speak for the Chinese, but the Japanese also use the Chinese > numbering system where the verbal expression of large numbers is > grouped by 10000s. However, in tables of government expenditure and > the like, the commas usually occur every three places. Eg, the > official GDP figures from the Japanese Ministry of Economy and Trade: > > http://www.mext.go.jp/b_menu/toukei/001/08030520/013.htm Should there be a convenience function that will help construct the format string.
Some kind of: create_format(self, type='i', base=16, seppos=4, sep=':', charset='0123456789abcdef', maxwidth=32, minwidth=32, pad='0') -- (cookies for you if you noticed that it is ipv6 number format) From python at rcn.com Thu Mar 12 08:49:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 00:49:24 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: [spir] > Probably you know that already, but it doesn't hurt anyway. > In French and most Romance languages comma is the standard decimal sep; and either space or dot is used, when necessary, to sep > thousands. (It's veeery difficult for me to read even short numbers with commas used as thousand separator.) > > en: 1,234,567.89 > fr: 1.234.567,89 > or: 1 234 567,89 Thanks for the informative comment. It looks like your needs are best met by Proposal II where those would be written as: en_num = format(x, "12T, 2f") fr_num = format(x, "12T.,2f") or_num = format(x, "12T ,2f") Raymond From leif.walsh at gmail.com Thu Mar 12 09:29:40 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Thu, 12 Mar 2009 04:29:40 -0400 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: <20090312082429.47dc09c8@o> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> <20090312082429.47dc09c8@o> Message-ID: <1236846580.5214.4.camel@swarley> On Thu, 2009-03-12 at 08:24 +0100, spir wrote: > > Would there be interest in a filetools module? Replies and discussion to > > python-ideas please. > > Sure. +1 > Also: a programmer is not (always) a filesystem expert. > > PS: What I meant is: the point of view from the filesystem is very different. A proper interface will have to take the programmer's point of view while exposing the filesystem issues.
> I think (like always at the interface of two worlds -- cf specification talks between developer and client ;-) *terminology* choices will be very important. Dealing with different types of OSes and filesystems in a generic way is difficult. I would urge everyone to err on the side of less generality, because I think it would be better for a programmer to write bad code, and be able to figure out why, than to write code that looks perfectly fine, and have a harder time discovering the problem. -- Cheers, Leif From stephen at xemacs.org Thu Mar 12 10:18:28 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 12 Mar 2009 18:18:28 +0900 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: <87ab7r3xqz.fsf@xemacs.org> Raymond Hettinger writes: > Thanks for the informative comment. It looks like your needs are > best met by Proposal II where those would be written as: > > en_num = format(x, "12T, 2f") > fr_num = format(x, "12T.,2f") > or_num = format(x, "12T ,2f") That is way unreadable to me, especially the difference between en_num and or_num. Also, I wonder if en_num = format(x, "12T,.2f") isn't more explicit. From eric at trueblade.com Thu Mar 12 10:33:54 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 05:33:54 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Message-ID: <49B8D702.4040004@trueblade.com> Thanks for doing this, Raymond.
I don't have any comments on the specific proposals, yet. I'm still thinking it over. But here are a few comments. Raymond Hettinger wrote: > Motivation: You might want to mention the existing 'n' format type. I don't think it's widely known. It handles the odd cases of locales that have odd groupings, such as James Knight's example from India (1,00,00,00,000). James: If you know the locale name for that, I'd like to know it. It would be handy for testing. floats are not terribly useful for 'n', however: >>> format(1000000, 'n') '1,000,000' >>> format(1000000.111111, 'n') '1e+06' >>> format(100000.111111, 'n') '100,000' > Proposal I (from Nick Coghlan]: > A comma will be added to the format() specifier mini-language: > > [[fill]align][sign][#][0][minimumwidth][,][.precision][type] Could you add the existing PEP-3101 specifier, just so we know what we're changing (and so that I don't have to look it up constantly!)? [[fill]align][sign][#][0][width][.precision][type] (As an aside, I copied this from http://docs.python.org/library/string.html#formatstrings, I just noticed that PEP 3101 differs in the name of the width/minwidth field.) > for hundreds or ten-thousands), but iIt comes at the expense of Typo (iIt). > Also, it makes it > more challenging to write custom __format__ methods that follow the > format specification mini-language. For this exact reason, I've always wanted to add a method somewhere that parses the mini-language. The code exists in the C implementation, it would just need to be exposed, probably returning a namedtuple with the various fields. > For the locale module, just the "T" is necessary in a formatting string > since the tool already has procedures for figuring out the actual > separators from the local context. Is this needed at all? That is, having just the "T"? How is this different from using type=n? Having asked the question, I guess the answer is it lets you use it with the more useful float type=f. 
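The spec-parsing helper Eric wishes for might be sketched as below; the function name, the namedtuple fields, and the regular expression are invented here, and the pattern covers only the existing PEP 3101 grammar quoted above (without the proposed comma):

```python
import re
from collections import namedtuple

# Field names mirror [[fill]align][sign][#][0][width][.precision][type]
FormatSpec = namedtuple(
    "FormatSpec", "fill align sign alt zero width precision type")

_SPEC = re.compile(r"""
    (?:(?P<fill>.)?(?P<align>[<>=^]))?
    (?P<sign>[-+ ])?
    (?P<alt>\#)?
    (?P<zero>0)?
    (?P<width>\d+)?
    (?:\.(?P<precision>\d+))?
    (?P<type>[bcdeEfFgGnosxX%])?
    $""", re.VERBOSE)

def parse_format_spec(spec):
    """Split a PEP 3101 format spec into its fields (None when absent)."""
    m = _SPEC.match(spec)
    if m is None:
        raise ValueError("invalid format spec: %r" % spec)
    return FormatSpec(**m.groupdict())

print(parse_format_spec("8.2f"))
print(parse_format_spec("0>10d"))
```

A custom `__format__` method could then dispatch on the returned fields instead of re-implementing the parse.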
> Comments and suggestions are welcome but I draw the line at Mayan > numbering conventions ;-) That's only a problem until December 21, 2012 anyway! Eric. From solipsis at pitrou.net Thu Mar 12 12:03:18 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Mar 2009 11:03:18 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: Raymond Hettinger writes: > or_num = format(x, "12T ,2f") In many cases, the space would have to be a non-breaking space, but it's probably too complicated for your PEP. Regards Antoine. From ggpolo at gmail.com Thu Mar 12 12:33:18 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Thu, 12 Mar 2009 08:33:18 -0300 Subject: [Python-ideas] Fwd: Adding a test discovery into Python In-Reply-To: References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: On Thu, Mar 12, 2009 at 12:04 AM, Leonardo Santagada wrote: > > On Mar 11, 2009, at 8:05 PM, Guilherme Polo wrote: >> >> On Wed, Mar 11, 2009 at 7:37 PM, Raymond Hettinger wrote: >>> >>> [Christian Heimes] >>>>>> >>>>>> I'm +1 for a simple (!) test discovery system. I'm emphasizing on >>>>>> simple >>>>>> because there are enough frameworks for elaborate unit testing. >>> >>> Test discovery is not the interesting part of the problem. >> >> Interesting or not, it is a problem that is asking for a solution, >> this kind of code is being duplicated in several places for no good >> reason. >> >>> >>> Axiom: The more work involved in writing tests, the fewer >>> tests that will get written. >> >> At some point you will have to run them too, I don't think you want to >> reimplement the discovery part yet another time.
> > What I think he was getting at is that 20-30 lines of test discovery have to > be written once for each project (or none if using py.test/nose), but > self.assertequals and all of the other quirks of unittest are all over a > test suite and you need to write all of it each time you have to make a > test. > > Not that what you are trying to do is pointless, but fixing this other > problem is so much more interesting... > This is incredibly pointless if you think it this way, "only 20-30 lines". I really don't believe you will come up with something decent in 20-30 lines if you intend this to be reusable for nose and maybe py.test (although I haven't looked much into py.test), it is not just about finding files, have you read the previous emails in the discussion? > > -- > Leonardo Santagada > santagada at gmail.com -- -- Guilherme H. Polo Goncalves From wilk at flibuste.net Thu Mar 12 13:25:36 2009 From: wilk at flibuste.net (William Dode) Date: Thu, 12 Mar 2009 12:25:36 +0000 (UTC) Subject: [Python-ideas] float vs decimal References: <50697b2c0903111248o1f1c62afi10ecfb08f78f0bef@mail.gmail.com> <50697b2c0903111439k21171e3t19d18bff1601d0fd@mail.gmail.com> Message-ID: On 11-03-2009, Chris Rebert wrote: [...] > It's in the FAQ section of the decimal module - > http://docs.python.org/library/decimal.html : > > 17. Is there a way to convert a regular float to a Decimal? > A. Yes, all binary floating point numbers can be exactly expressed as > a Decimal. An exact conversion may take more precision than intuition > would suggest, so we trap Inexact to signal a need for more precision: > def float_to_decimal(f): > [definition snipped] > > 17. Why isn't the float_to_decimal() routine included in the module? > A. There is some question about whether it is advisable to mix binary > and decimal floating point.
> Also, its use requires some care to avoid > the representation issues associated with binary floating point: >>>> float_to_decimal(1.1) > Decimal('1.100000000000000088817841970012523233890533447265625') I understand. It means that explicit is better and that we should know what we do with decimal numbers... thanks -- William Dodé - http://flibuste.net Informaticien Indépendant From solipsis at pitrou.net Thu Mar 12 13:28:59 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Mar 2009 12:28:59 +0000 (UTC) Subject: [Python-ideas] Fwd: Adding a test discovery into Python References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: Guilherme Polo writes: > > This is incredible pointless if you think it this way, "only 20-30 > lines". I really don't believe you will come up with something decent > in 20-30 lines if you intend this to be reusable for nose and maybe > py.test (although I haven't looked much into py.test), it is not just > about finding files, have you read the previous emails in the > discussion ? +1. self.assertEqual is just a typing annoyance compared to the burden of getting the aforementioned "20-30 lines" right. (you can probably even rip off the nose.tools module and use its convenience functions within unittest-driven tests: ok_, eq_, etc.) From suraj at barkale.com Thu Mar 12 14:59:23 2009 From: suraj at barkale.com (Suraj Barkale) Date: Thu, 12 Mar 2009 13:59:23 +0000 (UTC) Subject: [Python-ideas] cd statement? References: <49B67E6C.6020206@molden.no> <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> <49B69558.3090000@molden.no> Message-ID: Sturla Molden writes: > > Arnaud Delobelle wrote: > > Have you tried IPython? > Yes, it has all that I miss, but it's ugly (at least on Windows, where > it runs in a DOS shell). It is getting there. The 0.9 release had wx interface with minimal functionality. I have crossed my fingers for 0.10 release.
Regards, Suraj From dangyogi at gmail.com Thu Mar 12 16:40:53 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Thu, 12 Mar 2009 11:40:53 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Message-ID: <49B92D05.4010807@gmail.com> Raymond Hettinger wrote: > James Knight observed that Indian/Pakistani numbering systems > group by hundreds. I'm not 100% sure here, but I believe that in India, they insert a separator after the first 3 digits, then another after 2 more digits, then every 3 digits after that (not sure if they use commas or periods, I think commas): 1,000,000,00,000 -bruce From bruce at leapyear.org Thu Mar 12 19:17:06 2009 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 12 Mar 2009 11:17:06 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B92D05.4010807@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> Message-ID: > Make both the thousands separator and decimal separator user specifiable > but not locale aware. -1.0 as it stands (or -1,0 if you prefer) When you say 'user' you mean 'developer'. Having the developer choose the separators means it *won't* be what the user wants. Why would you stick in separators if not to display to a user? If I'm French then all decimal points should be ',' not '.' regardless of what language the developer speaks, right? A format specifier that says "please use the locale-specific separators when formatting this number" would be fine. We already have 'n' for this but suppose we choose ';' as the character for this (chosen because it looks like a '.' or a ',' which are two of the three most common choices).
For example format(x, '6;d') == format(x, '6n') and you can use ';' with any number type: format(x, '6;.3f') or format(x, '10;g'). I'd be inclined to always group in units of four digits if someone writes format(x, '6;x'). --- Bruce From guido at python.org Thu Mar 12 19:21:05 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Mar 2009 11:21:05 -0700 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: On Wed, Mar 11, 2009 at 11:47 PM, Stefan Behnel wrote: > Raymond Hettinger wrote: >> I'm strongly for offering tools that make it easier to write >> the tests in the first place. The syntax used by py.test >> and nose is vastly superior to the one used by unittest.py, >> a module that is more Javathonic than Pythonic. >> [...] >> Factoid of the Day: In Py2.7's test_datetime module, >> the phrase self.assertEqual occurs 578 times. > > Doesn't that just scream for using a doctest instead? > > The interpreter driven type-think-copy-paste pattern works pretty well for > these things. That depends on how well all the other tests (those that *don't* use assertEquals) fit in doctest's mold. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Thu Mar 12 19:28:38 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 11:28:38 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> Message-ID: <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> [Bruce Leban] > If I'm French then all decimal points should be ',' not '.' regardless > of what language the developer speaks, right? We already have a locale aware solution and that should be used for internationalized apps.
The locale module is not going away. This proposal is for everyday programs for local consumption (most scripts never get internationalized). I would even venture that most Python scripts are not written by professional programmers. If an accountant needs to knock out a quick report, he/she should have a simple means of basic formatting without invoking all of the locale machinery. Raymond From guido at python.org Thu Mar 12 19:42:21 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Mar 2009 11:42:21 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: Raymond, aren't you equating "local" with the US? The locale module lets you take the locale as a separate parameter. I agree we should not try to duplicate it (though it's a bad API since it relies on global state -- that doesn't work very well in multi-threaded or web apps). But it does make sense for an accountant in France or Holland to hardcode her desire for a decimal comma and thousand-separating periods, as otherwise their boss won't be able to interpret the output. On Thu, Mar 12, 2009 at 11:28 AM, Raymond Hettinger wrote: > > [Bruce Leban] >> >> If I'm French then all decimal points should be ',' not '.' regardless >> of what language the developer speaks, right? > > We already have a locale aware solution and that should > be used for internationalized apps. The locale module > is not going away. > > This proposal is for everyday programs for local consumption > (most scripts never get internationalized). I would even > venture that most Python scripts are not written by > professional programmers.
If an accountant needs to knock out > a quick report, he/she should have a simple means of basic formatting > without invoking all of the locale machinery. > > > Raymond > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pyideas at rebertia.com Thu Mar 12 19:44:13 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 12 Mar 2009 11:44:13 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B92D05.4010807@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> Message-ID: <50697b2c0903121144l48a89a08p35bb0fb902896808@mail.gmail.com> On Thu, Mar 12, 2009 at 8:40 AM, Bruce Frederiksen wrote: > Raymond Hettinger wrote: >> >> James Knight observed that Indian/Pakistani numbering systems >> group by hundreds. > > I'm not 100% sure here, but I believe that in India, they insert a separator > after the first 3 digits, then another after 2 more digits, then every 3 > digits after that (not sure if they use commas or periods, I think commas): > > 1,000,000,00,000 Not quite.
I'm not Indian, but based off Wikipedia (http://en.wikipedia.org/wiki/Lakh): "after the first three digits, a comma divides every two rather than every three digits, thus: Indian system: 12,12,12,123 5,05,000 7,00,00,00,000" Cheers, Chris -- I have a blog: http://blog.rebertia.com From python at rcn.com Thu Mar 12 20:02:07 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 12:02:07 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: <7120107372EF49CC98939762F1C6D318@RaymondLaptop1> [GvR] > Raymond, aren't you equating "local" with the US? Not at all. "For local consumption" meant anything that isn't distributed as a fully internationalized app. Right now, all our reprs and string interpolations are not locale-aware (i.e. float reprs are hardwired to use periods for the decimal separator). Those tools are pretty useful to us in day-to-day work. I'm just proposing to extend those non-locale-aware capabilities to include a thousands separator. For a fully internationalized app, I would use something like Babel which addresses the challenge in a comprehensive and uniform manner. > The local module lets you take the locale as a separate parameter. I > agree we should not try to duplicate it (though it's a bad API since > it relies on global state -- that doesn't work very well in > multi-threaded or web apps). > > But it does make sense for an accountant in France or Holland to > hardcode her desire for a decimal comma and thousand-separating > periods, as otherwise their boss won't be able to interpret the > output. Well said. 
Raymond From jimjjewett at gmail.com Thu Mar 12 20:29:15 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 12 Mar 2009 15:29:15 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: On 3/12/09, Raymond Hettinger wrote: > If an accountant needs to knock out > a quick report, he/she should have a simple means of basic > formatting without invoking all of the locale machinery. Fair enough. But what does a thousands separator provide that the "n" type doesn't already provide? (Well, except that n isn't as well known -- but initially this won't be either.) Do you want to avoid using locale even in the background? Do you want to avoid having to set a locale in the program startup? Do you want a better default for locale? Do you really want a different type, such as "m" for money? (That sounds sensible to me, except that there are so many different standard ways to format money, even within the US, so I'm not sure a single format would do it.) -jJ From python at rcn.com Thu Mar 12 20:51:07 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 12:51:07 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: [Jim Jewett] > Fair enough. But what does a thousands separator provide that the "n" > type doesn't already provide? (Well, except that n isn't as well > known -- but initially this won't be either.) It's nice to have a non-locale aware alternative so you can say explicitly what you want.
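The explicit alternative is the spelling PEP 378 ultimately standardized: the ',' option, which landed in Python 2.7 and 3.1. A quick sketch of the two routes on a current interpreter:

```python
# Explicit, locale-independent grouping: no global state involved.
print(format(1234567.891, ',.2f'))   # -> 1,234,567.89

# The locale-aware alternative via the 'n' type needs process-wide
# state set up first (and the locale must be installed on the machine):
# import locale
# locale.setlocale(locale.LC_ALL, '')
# print(format(1234567, 'n'))
```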
This is especially helpful in Guido's example where you need to format for a different locale than the one that is currently on your machine (i.e. the global state doesn't match the target). FWIW, C-Sharp provides both ways, a locale aware "n" format and a hard-wired explicit thousands separator. See the updated PEP for examples and a link.

> Do you want to avoid using locale even in the background?

I thought locale was always there.

> Do you want to avoid having to set a locale in the program startup?

Yes. I don't think most casual users should have to figure that out. It's a little too magical and arcane:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')

> Do you want a better default for locale?

The default does suck:

>>> format(1237, "n")
'1237'

> Do you really want a different type, such as "m" for money?

I don't but I'm sure someone does. I did write a money formatter sample recipe for the decimal docs so people would have something to model from. FWIW, I've always thought it weird that the currency symbol could shift with a locale setting. ISTM that if you change the symbol, you also have to change the amount that goes with it :-)

Raymond

From mrs at mythic-beasts.com Thu Mar 12 21:24:10 2009 From: mrs at mythic-beasts.com (Mark Seaborn) Date: Thu, 12 Mar 2009 20:24:10 +0000 (GMT) Subject: [Python-ideas] CapPython's use of unbound methods Message-ID: <20090312.202410.846948621.mrs@localhost.localdomain> Guido asked me to explain why the removal of unbound methods in Python 3.0 causes a problem for enforcing encapsulation in CapPython (an object-capability subset of Python), which I talked about in a blog post [1]. It also came up on python-dev [2]. Let me try a slightly different example to answer Guido's immediate question. Suppose we have an object x with a private attribute, "_field", defined by a class Foo:

class Foo(object):

    def __init__(self):
        self._field = "secret"

x = Foo()

Suppose CapPython code is handed x.
It should not be able to read x._field, and the expression x._field will be rejected by CapPython's static verifier. However, in Python 3.0, the CapPython code can do this:

class C(object):

    def f(self):
        return self._field

C.f(x) # returns "secret"

Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is not being called on an instance of C. Guido said, "I don't understand where the function object f gets its magic powers". The answer is that function definitions directly inside class statements are treated specially by the verifier. If you wrote the same function definition at the top level:

def f(var):
    return var._field # rejected

the attribute access would be rejected by the verifier, because "var" is not a self variable, and private attributes may only be accessed through self variables. I renamed the variable in the example, but the name of the variable makes no difference to whether it is considered to be a self variable. Self variables are defined as follows: If a function definition "def f(v1, ...)" appears immediately within a "class" statement, the function's first argument, v1, is a self variable, provided that:

 * the "def" is not preceded by any decorators, and
 * "f" is not read anywhere in class scope and is not declared as global.

The reason for these two restrictions is to prevent the function object from escaping and being used directly.
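Mark's example runs as described on Python 3, where C.f is an ordinary function and the Python 2 instance check is gone (a self-contained demonstration of the behaviour, Python 3 only):

```python
class Foo(object):
    def __init__(self):
        self._field = "secret"

x = Foo()

class C(object):
    def f(self):
        return self._field

# Python 3: C.f is a plain function, so nothing checks that its first
# argument is actually a C instance.
print(C.f(x))  # prints: secret

# Python 2.x raised instead:
#   TypeError: unbound method f() must be called with C instance
#   as first argument (got Foo instance instead)
```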
Mark [1] http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-python-30.html [2] http://mail.python.org/pipermail/python-dev/2008-September/082499.html From solipsis at pitrou.net Thu Mar 12 21:33:03 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Mar 2009 20:33:03 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: Jim Jewett writes: > > Do you want to avoid using locale even in the background? > Do you want to avoid having to set a locale in the program startup? > Do you want a better default for locale? As Guido said, a problem is that locale relies on shared state. It makes it very painful to use (any module setting the locale to a value which suits its semantics can negatively impact other modules or libraries in your application). But even worse is that the desired locale is not necessarily installed. For example if I develop an app for French users but it is hosted on a US server, perhaps the 'fr_FR' locale won't be available at all. From eric at trueblade.com Thu Mar 12 22:24:01 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 17:24:01 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: <49B97D71.5070107@trueblade.com> Raymond Hettinger wrote: > > [Jim Jewett] >> Fair enough. But what does a thousands separator provide that the "n" >> type doesn't already provide? (Well, except that n isn't as well >> known -- but initially this won't be either.)
> > It's nice to have a non-locale aware alternative so you can say > explicitly what you want. This is especially helpful in Guido's example > where you need to format for a different locale than the one that is > currently on your machine (i.e. the global state doesn't match the target).

I've always thought that we should have a utility function which formats a number based on the same settings that are in the locale, but not actually use the locale. Something like:

format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', grouping=[4, 3, 2])
>>> '12 34 56 78 765 4321,123'

That would get rid of threading issues, and you wouldn't have to worry about what locales were installed. I basically have this function in the various formatting routines; it just needs to be pulled out and exposed.

>> Do you really want a different type, such as "m" for money? > > I don't but I'm sure someone does. I did write a money formatter > sample recipe for the decimal docs so people would have something > to model from.

This becomes easier with the hypothetical "format_number" routine. But this is all orthogonal to the str.format() discussion.

Eric.

From guido at python.org Thu Mar 12 22:33:23 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Mar 2009 14:33:23 -0700 Subject: [Python-ideas] CapPython's use of unbound methods In-Reply-To: <20090312.202410.846948621.mrs@localhost.localdomain> References: <20090312.202410.846948621.mrs@localhost.localdomain> Message-ID: On Thu, Mar 12, 2009 at 1:24 PM, Mark Seaborn wrote: > Guido asked me to explain why the removal of unbound methods in Python > 3.0 causes a problem for enforcing encapsulation in CapPython (an > object-capability subset of Python), which I talked about in a blog > post [1]. It also came up on python-dev [2]. > > Let me try a slightly different example to answer Guido's immediate > question.
> Suppose we have an object x with a private attribute, "_field", defined by a class Foo:
>
> class Foo(object):
>
>     def __init__(self):
>         self._field = "secret"
>
> x = Foo()

Can you add some principals to this example? Who wrote the Foo class definition? Does CapPython have access to the source code for Foo? To the class object? > Suppose CapPython code is handed x. What does it mean to "hand x to CapPython"? Who "is" CapPython? > It should not be able to read > x._field, and the expression x._field will be rejected by CapPython's > static verifier.
>
> However, in Python 3.0, the CapPython code can do this:
>
> class C(object):
>
>     def f(self):
>         return self._field
>
> C.f(x) # returns "secret"
>
> Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is > not being called on an instance of C.

In Python 2.x I could write

class C(Foo):
    def f(self):
        return self._field

or alternatively

class C(x.__class__):

> Guido said, "I don't understand where the function object f gets its > magic powers". > > The answer is that function definitions directly inside class > statements are treated specially by the verifier.

Hm, this sounds like a major change in language semantics, and if I were Sun I'd sue you for using the name "Python" in your product. :-)

> If you wrote the same function definition at the top level:
>
> def f(var):
>     return var._field # rejected
>
> the attribute access would be rejected by the verifier, because "var" > is not a self variable, and private attributes may only be accessed > through self variables. > > I renamed the variable in the example,

What do you mean by this?

> but the name of the variable > makes no difference to whether it is considered to be a self variable.
> > Self variables are defined as follows: > > If a function definition "def f(v1, ...)" appears immediately within a > "class" statement, the function's first argument, v1, is a self > variable, provided that:
> * the "def" is not preceded by any decorators, and
> * "f" is not read anywhere in class scope and is not declared as global.
> > The reason for these two restrictions is to prevent the function > object from escaping and being used directly.

Do you also catch things like

g = getattr
s = 'field'.replace('f', '_f')
print g(x, s)

?

> Mark > > [1] http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-python-30.html > [2] http://mail.python.org/pipermail/python-dev/2008-September/082499.html -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Thu Mar 12 22:36:14 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Mar 2009 10:36:14 +1300 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> Message-ID: <49B9804E.3010501@canterbury.ac.nz> Bruce Leban wrote: > When you say 'user' you mean 'developer'. Having the developer choose > the separators means it *won't* be what the user wants. Why would you > stick in separators if not to display to a user? I agree. I don't see a use case for hard-coding non-standard separators into every format string. So I'm +1 on proposal I and -1 on proposal II. Also +1 on providing a "use the locale" option that's orthogonal to the type specifier.
-- Greg From solipsis at pitrou.net Thu Mar 12 23:25:03 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Mar 2009 22:25:03 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > I agree. I don't see a use case for hard-coding non-standard > separators into every format string. Sorry, but what do you call "non-standard" exactly? From eric at trueblade.com Fri Mar 13 00:31:39 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 19:31:39 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B97D71.5070107@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> Message-ID: <49B99B5B.7060808@trueblade.com> Eric Smith wrote: > I've always thought that we should have a utility function which formats > a number based on the same settings that are in the locale, but not > actually use the locale. 
Something like: > > format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', > grouping=[4, 3, 2]) > >>> '12 34 56 78 765 4321,123'

To be maximally useful (for example, so it could be used in Decimal to implement locale formatting), maybe it should work on strings:

>>> format_number(whole_part='123456787654321',
                  fractional_part='123',
                  decimal_point=',',
                  thousands_sep=' ',
                  grouping=[4, 3, 2])
'12 34 56 78 765 4321,123'

>>> format_number(whole_part='123456787654321',
                  decimal_point=',',
                  thousands_sep='.',
                  grouping=[4, 3, 2])
'12.34.56.78.765.4321'

I think such a method, along with locale.localeconv(), would be the workhorse for much of formatting we've been talking about. It could be fleshed out with the sign and other remaining fields from localeconv(). The key point is that it takes everything as parameters and doesn't use any global state. In particular, it by itself would not reference the locale. I'll probably add such a routine anyway, even if it doesn't get documented as a public API.

Eric.

From python at rcn.com Fri Mar 13 00:37:30 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 16:37:30 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <2385B41B46D1491388AED0C1CE7E3060@RaymondLaptop1> Today's updates to http://www.python.org/dev/peps/pep-0378/

* Specify what width means when thousands separators are present.
* Clarify that the locale module is not being proposed to change.
* Add research on what is done in C-Sharp, MS-Excel, COBOL, and CommonLisp.
* Add more examples.
Raymond From dangyogi at gmail.com Fri Mar 13 00:41:59 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Thu, 12 Mar 2009 19:41:59 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B99B5B.7060808@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <49B99DC7.2020405@gmail.com> Eric Smith wrote: > >>> format_number(whole_part='123456787654321', > decimal_point=',', > thousands_sep='.', > grouping=[4, 3, 2]) > >>> '12.34.56.78.765.4321' > Maybe the 'thousands_sep' parameter should be called 'grouping_sep' (since it doesn't always group by thousands)? -bruce frederiksen From python at rcn.com Fri Mar 13 00:57:19 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 16:57:19 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> >> I've always thought that we should have a utility function which formats >> a number based on the same settings that are in the locale, but not >> actually use the locale. 
Something like: >> >> format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', >> grouping=[4, 3, 2]) >> >>> '12 34 56 78 765 4321,123' > > To be maximally useful (for example, so it could be used in Decimal to > implement locale formatting), maybe it should work on strings: > > >>> format_number(whole_part='123456787654321', > fractional_part='123', > decimal_point=',', > thousands_sep=' ', > grouping=[4, 3, 2]) > >>> '12 34 56 78 765 4321,123'

Whoa guys! I think you're treading very far away from and rejecting the whole idea of PEP 3101 which was to be the one ring to bind them all with format(obj, fmt) having just two arguments and doing nothing but passing them on to obj.__format__() which would be responsible for parsing a format string.

Also, even if you wanted a flexible clear separate tool just for number formatting, I don't think keyword arguments are the way to go. That is a somewhat heavy approach with limited flexibility. The research in PEP 378 shows that for languages needing fine control and extreme versatility in formatting, some kind of picture string is the way to go. MS Excel is a champ at number/date formatting strings: #,##0 and whatnot. They allow negatives to have placeholders, trailing minus signs, parentheses, etc. Columns can be aligned neatly, any type of padding can be used, any type of separator may be specified. The COBOL picture statements also offer flexibility and clarity. Mini-languages of some sort beat the heck out of functions with a zillion optional arguments.

Raymond

"Working with creative thinkers can be like herding cats."
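Raymond's Excel comparison can be made concrete. The everyday Excel picture '#,##0.00' maps roughly onto the mini-language extension under discussion, shown here with the ',' spelling the PEP's later drafts adopted; the correspondence is illustrative, not exact:

```python
# Excel picture      format() spec (with the proposed ',' option)
# #,##0          ->  format(1234567, ',')       == '1,234,567'
# #,##0.00       ->  format(1234567.5, ',.2f')  == '1,234,567.50'
# (#,##0)        ->  no direct equivalent; parenthesized negatives
#                    would still need custom handling
print(format(1234567.5, ',.2f'))
```

Pictures cover more ground (digit placeholders, trailing minus, parentheses), which is the point of the comparison: a mini-language scales where a pile of keyword arguments does not.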
From eric at trueblade.com Fri Mar 13 01:02:20 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 20:02:20 -0400 Subject: [Python-ideas] locale-independent number formatting (was: Rough draft: Proposed format specifier for a thousands separator) In-Reply-To: <49B99DC7.2020405@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <49B99DC7.2020405@gmail.com> Message-ID: <49B9A28C.9020708@trueblade.com> Bruce Frederiksen wrote: > Eric Smith wrote: >> >>> format_number(whole_part='123456787654321', >> decimal_point=',', >> thousands_sep='.', >> grouping=[4, 3, 2]) >> >>> '12.34.56.78.765.4321' >> > Maybe the 'thousands_sep' parameter should be called 'grouping_sep' > (since it doesn't always group by thousands)? > > -bruce frederiksen > thousands_sep is the locale.localeconv() name, which I suggest we use. I suggest that this particular API only support the LC_NUMERIC fields (decimal_point, grouping, thousands_sep), and that maybe we have a separate format_money which supports the LC_MONETARY fields. Eric. From eric at trueblade.com Fri Mar 13 01:08:23 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 20:08:23 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> Message-ID: <49B9A3F7.5010407@trueblade.com> Raymond Hettinger wrote: > Whoa guys! 
I think you're treading very far away from and rejecting the > whole idea of PEP 3101 which was to be the one ring to bind them all with > format(obj, fmt) having just two arguments and doing nothing but > passing them on to obj.__format__() which would be responsible for > parsing a format string.

I completely agree. That's why I said "But this is all orthogonal to the str.format() discussion." I meant "orthogonal" in the "unrelated" sense. I'm completely on board with your PEP 378 as a simple way just to get some simple formatting into numbers.

> Also, even if you wanted a flexible clear separate tool just for number > formatting, I don't think keyword arguments are the way to go. > That is a somewhat heavy approach with limited flexibility. > The research in PEP 378 shows that for languages needing > fine control and extreme versatility in formatting, some kind > of picture string is the way to go. MS Excel is a champ > at number/date formatting strings: #,##0 and whatnot. > They allow negatives to have placeholders, trailing minus signs, > parentheses, etc. Columns can be aligned neatly, any type > of padding can be used, any type of separator may be specified. > The COBOL picture statements also offer flexibility and clarity. > Mini-languages of some sort beat the heck out of functions > with a zillion optional arguments.

I think picture-based formatting is okay and has its place, but a routine like my proposed format_number (which I know is a bad name) is really the heavy lifter for all locale-based number formatting. Decimal shouldn't really have to completely reimplement locale-based formatting, especially when it already exists in the core. I just want to expose it.

Eric.
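Eric's proposed helper is small enough to sketch here. This hypothetical implementation follows the locale.localeconv() convention that the first grouping size applies to the rightmost digits and the final size repeats; it reproduces the outputs from his examples:

```python
def format_number(whole_part, fractional_part='', decimal_point='.',
                  thousands_sep=',', grouping=(3,)):
    # Group digits from the right; the last grouping size repeats,
    # mirroring locale.localeconv()['grouping'].
    sizes = iter(grouping)
    size = next(sizes)
    digits = whole_part
    groups = []
    while digits:
        groups.append(digits[-size:])
        digits = digits[:-size]
        size = next(sizes, size)  # keep repeating the final size
    result = thousands_sep.join(reversed(groups))
    if fractional_part:
        result += decimal_point + fractional_part
    return result
```

With grouping=[3, 2] the same helper produces the Indian-style output discussed earlier in the thread: format_number('121212123', grouping=[3, 2]) gives '12,12,12,123'.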
From python at rcn.com Fri Mar 13 01:18:05 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 17:18:05 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> <49B9A3F7.5010407@trueblade.com> Message-ID: <5BA897EBA01C43DBB1532FA665680CEC@RaymondLaptop1> [Eric Smith] > Decimal shouldn't really > have to completely reimplement locale-based formatting, especially when > it already exists in the core. I just want to expose it. I see. Sounds like you're looking for the parser to have some hooks so that people writing new __format__ methods don't have to start from scratch. > I completely agree. That's why I said "But this is all orthogonal to the > str.format() discussion." I meant "orthogonal" in the "unrelated" sense. Makes sense. Hopefully, we can get this thread back on track for evaluating the proposal for a minor buildout to the existing mini-language. 
Raymond From eric at trueblade.com Fri Mar 13 01:22:01 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 20:22:01 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5BA897EBA01C43DBB1532FA665680CEC@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> <49B9A3F7.5010407@trueblade.com> <5BA897EBA01C43DBB1532FA665680CEC@RaymondLaptop1> Message-ID: <49B9A729.5020207@trueblade.com> Raymond Hettinger wrote: > > [Eric Smith] >> Decimal shouldn't really have to completely reimplement locale-based >> formatting, especially when it already exists in the core. I just want >> to expose it. > > I see. Sounds like you're looking for the parser to have some hooks > so that people writing new __format__ methods don't have to start > from scratch. Not necessarily hooks, but some support routines. I think the standard format specifier parser should be exposed, and also the locale-based formatter should be exposed. These are both unrelated to PEP 378, but they could be used to implement it. They'd be especially useful for non-builtin types like Decimal. > Makes sense. Hopefully, we can get this thread back on track for > evaluating the proposal for a minor buildout to the existing mini-language. Right. Apologies for hijacking it, and especially for not making it clear that I was veering off subject. Eric. 
From greg.ewing at canterbury.ac.nz Fri Mar 13 02:15:55 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Mar 2009 14:15:55 +1300 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> Message-ID: <49B9B3CB.2090906@canterbury.ac.nz> Antoine Pitrou wrote: > Greg Ewing writes: >> I agree. I don't see a use case for hard-coding non-standard >> separators into every format string. > > Sorry, but what do you call "non-standard" exactly? I mean something other than "," and ".". My point is that while it's perfectly reasonable for, e.g. a French programmer to want to format his numbers with dots and commas the other way around, it's *not* reasonable to force him to tediously specify it in each and every format specifier he writes. There needs to be some way of setting it once for the whole program, otherwise it just won't be practical. -- Greg From steve at pearwood.info Fri Mar 13 02:18:55 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 13 Mar 2009 12:18:55 +1100 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: References: <200903120948.28250.steve@pearwood.info> Message-ID: <200903131218.55647.steve@pearwood.info> On Thu, 12 Mar 2009 12:26:40 pm zooko wrote: > > Would there be interest in a filetools module? Replies and > > discussion to python-ideas please. > > I've been using and maintaining a few filesystem hacks for, let's > see, almost nine years now: > > http://allmydata.org/trac/pyutil/browser/pyutil/pyutil/fileutil.py > > (The first version of that was probably written by Greg Smith in > about 1999.) > > I'm sure there are many other such packages. 
A couple of quick searches of pypi turned up these two: > > http://pypi.python.org/pypi/Pythonutils > http://pypi.python.org/pypi/fs > > I wonder if any of them have the sort of functionality you're > thinking of.

Close, but not quite. I'm suggesting a module with a collection of subclasses of file that exhibit modified behaviour. For example:

class FlushOnWrite(file):
    def write(self, data):
        super(FlushOnWrite, self).write(data)
        self.flush()
    # similarly for writelines

class SyncOnWrite(FlushOnWrite):
    # ...

class SyncOnClose(file):
    # ...

plus functions which implement common idioms for safely writing data, making backups on a save, etc. A common idiom for safely over-writing a file while minimising the window of opportunity for file loss is:

* write to a temporary file and close it
* move the original to a backup location
* move the temporary file to where the original was
* if no errors, delete the backup

although when I say "common" what I really mean is that it should be common, but probably isn't :-/ The sort of file handling that is complicated and tedious to get right, and so most developers don't bother, and those that do are re-inventing the wheel. There's a couple of recipes in the Python Cookbook which might be candidates. E.g. the first edition has recipes "Versioning Filenames" by Robin Parmar and "Module: Versioned Backups" by Mitch Chapman. What I DON'T mean is pathname utilities. Nor do I mean mini-applications that operate on files, like renaming file extensions, deleting files that meet some criterion, etc. I don't think they belong in the standard library, and even if they do, they don't belong in this proposed module. My intention is to offer a standard set of tools so people can choose the behaviour that suits their application best, rather than trying to make file() a one-size-fits-all solution.
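The backup idiom Steven describes can be packaged in a few lines. A sketch, not a proposal for the module's actual API: the names are hypothetical, os.replace (Python 3.3+) provides the atomic moves, and error handling is kept minimal:

```python
import os
import tempfile

def safe_overwrite(path, data, backup_suffix='.bak'):
    """Write data to path, keeping the old contents as a backup
    until the new contents are safely on disk."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)   # same filesystem as target
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())              # force the data to disk
        backup = path + backup_suffix
        if os.path.exists(path):
            os.replace(path, backup)          # move the original aside
        os.replace(tmp, path)                 # move the new file into place
        if os.path.exists(backup):
            os.remove(backup)                 # no errors: drop the backup
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
```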
-- Steven D'Aprano From python at rcn.com Fri Mar 13 03:24:22 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 19:24:22 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> <49B9A3F7.5010407@trueblade.com> <5BA897EBA01C43DBB1532FA665680CEC@RaymondLaptop1> <49B9A729.5020207@trueblade.com> Message-ID: <51AA03163D3942C9A74D633B1CD8BC0E@RaymondLaptop1> [Eric Smith] > Right. Apologies for hijacking it, and especially for not making it > clear that I was veering off subject. No problem. It was an interesting side discussion. I've updated the PEP to include your variant that doesn't use T. The examples show that it is much cleaner looking and self-evident. Raymond From jimjjewett at gmail.com Fri Mar 13 04:50:17 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 12 Mar 2009 23:50:17 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B99B5B.7060808@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: On 3/12/09, Eric Smith wrote: > Eric Smith wrote: >> ... formats >> a number based on the same settings that are in the locale, but not >> actually use the locale. ... > The key point is that it takes everything as parameters and doesn't use > any global state. In particular, it by itself would not reference the > locale. Why not? 
You'll need *some* default for decimal_point, and the one from localeconv makes at least as much sense as a hard-coded default. I agree that it shouldn't *change* anything in the locale, and any keywords explicitly passed in should override locale, but if it never looks at locale, you'll get patterns like import locale kw=dict(locale.localeconv()) kw['thousands_sep']=' ' new_util_func(number, **kw) -jJ From solipsis at pitrou.net Fri Mar 13 11:02:22 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 13 Mar 2009 10:02:22 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > My point is that while it's perfectly reasonable > for, e.g. a French programmer to want to format his > numbers with dots and commas the other way around, > it's *not* reasonable to force him to tediously specify > it in each and every format specifier he writes. A program that often formats numbers the same way can factor that into dedicated helpers: def format_float(f): return "{0:T.,2f}".format(f) or even: format_float = "{0:T.,2f}".format Regards Antoine.
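For comparison, the bare ',' flag from the PEP's main proposal composes the same way as the 'T' form shown here. A sketch, runnable on interpreters where PEP 378's comma option is implemented (the helper name is illustrative):

```python
# factor a project-wide float format into a single helper,
# using the ',' thousands flag from the PEP's main proposal
format_float = "{0:,.2f}".format

print(format_float(1234567.891))  # -> 1,234,567.89
```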
From greg.ewing at canterbury.ac.nz Fri Mar 13 11:26:05 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Mar 2009 23:26:05 +1300 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> Message-ID: <49BA34BD.6070005@canterbury.ac.nz> Antoine Pitrou wrote: > A program often formatting numbers the same way can factor that into dedicated > helpers: If that's an acceptable thing to do on a daily basis, then we don't need format strings at all. -- Greg From denis.spir at free.fr Fri Mar 13 11:56:02 2009 From: denis.spir at free.fr (spir) Date: Fri, 13 Mar 2009 11:56:02 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B99B5B.7060808@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <20090313115602.3f9a19b9@o> Le Thu, 12 Mar 2009 19:31:39 -0400, Eric Smith s'exprima ainsi: > > I've always thought that we should have a utility function which formats > > a number based on the same settings that are in the locale, but not > > actually use the locale. 
Something like: > > > > format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', > > grouping=[4, 3, 2]) > > >>> '12 34 56 78 765 4321,123' > > To be maximally useful (for example, so it could be used in Decimal to > implement locale formatting), maybe it should work on strings: > > >>> format_number(whole_part='123456787654321', > fractional_part='123', > decimal_point=',', > thousands_sep=' ', > grouping=[4, 3, 2]) > >>> '12 34 56 78 765 4321,123' > > >>> format_number(whole_part='123456787654321', > decimal_point=',', > thousands_sep='.', > grouping=[4, 3, 2]) > >>> '12.34.56.78.765.4321' I find the overall problem of providing an interface to specify a number format rather challenging. The issue I see is to design a formatting pattern that is simple, clear, _and_ practical. A practical pattern is easy to specify, but then it becomes rather illegible and/or hard to remember, while a legible one ends up excessively verbose. I have the impression, but I may well be wrong, that contrary to a format, a *formatted number* instead seems easy to scan -- with human eyes. So, as a crazy idea, I wonder whether we shouldn't let the user provide an example formatted number instead. This may address most use cases, but probably not all. To make things easier, why not specify a canonical number, such as '-123456.789', of which the user should define the formatted version? Then a smart parser could deduce the format to be applied to further numbers. Below is a purely artificial example. -123456.789 --> kg 00_123_456,79- format: unit: 'kg' unit_pos: LEFT unit_sep: ' ' thousand_sep: '_' fract_sep : ',' sign_pos: RIGHT sign_sep: None padding_char: '0' There are obvious issues: * Does rounding apply to the whole precision (number of significant digits), or to the fractional part only? Then, should the format be interpreted as the most common case (probably fract. rounding), provide a disambiguation flag, provide a flag for the non-default case only?
What if rounding applies after a big number of digits? Should we instead allow the user to provide a longer number? * Similar for padding: does it apply to the length of the whole number or to the integral part (common in financial apps to align decimal signs)? What if the padding applies to a smaller number of digits than that of the canonical number? Should we instead allow the user to provide a shorter number? * probably more... The space of valid formats can be specified using a parsing grammar, so that a parse failure indicates an invalid format, and a "tagged" parse tree provides all the information needed to construct a format object. Really do not know whether this idea is stupid or worth being explored ;-) [But I would well try it for personal use. At least as an everyday-fast-and-easy feature.] Denis ------ la vita e estrany From solipsis at pitrou.net Fri Mar 13 11:58:16 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 13 Mar 2009 10:58:16 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> <49BA34BD.6070005@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > > A program often formatting numbers the same way can factor that into dedicated > > helpers: > > If that's an acceptable thing to do on a daily basis, > then we don't need format strings at all. Why exactly?
From denis.spir at free.fr Fri Mar 13 12:05:25 2009 From: denis.spir at free.fr (spir) Date: Fri, 13 Mar 2009 12:05:25 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <20090313120525.4e3d9aeb@o> Le Thu, 12 Mar 2009 23:50:17 -0400, Jim Jewett s'exprima ainsi: > Why not? You'll need *some* default for decimal_point, and the one > from localeconv makes at least as much sense as a hard-coded default. > > I agree that it shouldn't *change* anything in the locale, and any > keywords explicitly passed in should override locale, but if it never > looks at locale, you'll get patterns like I think this makes much sense. Actually, there may be a principle similar to 'cascade overriding' in CSS sheets: the last one who speaks wins. In the case of number formatting, this could be eg a cascade of: locale format --> coded format --> end-user config format denis ------ la vita e estrany From eric at trueblade.com Fri Mar 13 12:27:14 2009 From: eric at trueblade.com (Eric Smith) Date: Fri, 13 Mar 2009 07:27:14 -0400 Subject: [Python-ideas] String formatting utility functions Message-ID: <49BA4312.3000804@trueblade.com> Jim Jewett wrote: > On 3/12/09, Eric Smith wrote: >> Eric Smith wrote: >>> ... formats >>> a number based on the same settings that are in the locale, but not >>> actually use the locale. > ... > >> The key point is that it takes everything as parameters and doesn't use >> any global state. In particular, it by itself would not reference the >> locale. > > Why not? You'll need *some* default for decimal_point, and the one > from localeconv makes at least as much sense as a hard-coded default. 
I guess you could do this, but I can't see it ever actually being used that way. Do you really want to only specify that you're using commas for thousands, then find that someone has switched the locale to one where a comma is the decimal character? new_util_func(1234.56, thousands_sep=',') '1,234,56' Best to be explicit on what you're expecting. My use case for this function is one where all of the arguments are known and specified every time. Specifically it's for implementing 'n' formatting for Decimal or other numeric types. You will either know the arguments, or want to use every one of them from the locale. If you're using the locale, just call localeconv() and use every value you get back. I don't have a mix-and-match use case. Eric. From eric at trueblade.com Fri Mar 13 12:35:44 2009 From: eric at trueblade.com (Eric Smith) Date: Fri, 13 Mar 2009 07:35:44 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> Message-ID: <49BA4510.4000706@trueblade.com> Antoine Pitrou wrote: > Greg Ewing writes: >> My point is that while it's perfectly reasonable >> for, e.g. a French programmer to want to format his >> numbers with dots and commas the other way around, >> it's *not* reasonable to force him to tediously specify >> it in each and every format specifier he writes. > > A program often formatting numbers the same way can factor that into dedicated > helpers: > > def format_float(f): > return "{0:T.,2f}".format(f) > > or even: > > format_float = "{0:T.,2f}".format Or: float_fmt = 'T.,2f' then you can re-use it everywhere, and multiple times in a single .format() expression: '{0:{fmt}} {1:{fmt}}'.format(3.14, 2.72, fmt=float_fmt) (Try that with %-formatting!
:-) Or with a slight modification to the work I'm doing to implement auto-numbering: '{:{fmt}} {:{fmt}}'.format(3.14, 2.78, fmt=float_fmt) (but this is a different issue!) Eric. From lie.1296 at gmail.com Fri Mar 13 12:46:32 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Fri, 13 Mar 2009 22:46:32 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090313115602.3f9a19b9@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: spir wrote: > Le Thu, 12 Mar 2009 19:31:39 -0400, > Eric Smith s'exprima ainsi: > >>> I've always thought that we should have a utility function which formats >>> a number based on the same settings that are in the locale, but not >>> actually use the locale. Something like: >>> >>> format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', >>> grouping=[4, 3, 2]) >>> >>> '12 34 56 78 765 4321,123' >> To be maximally useful (for example, so it could be used in Decimal to >> implement locale formatting), maybe it should work on strings: >> >> >>> format_number(whole_part='123456787654321', >> fractional_part='123', >> decimal_point=',', >> thousands_sep=' ', >> grouping=[4, 3, 2]) >> >>> '12 34 56 78 765 4321,123' >> >> >>> format_number(whole_part='123456787654321', >> decimal_point=',', >> thousands_sep='.', >> grouping=[4, 3, 2]) >> >>> '12.34.56.78.765.4321' > > > I find the overall problem of providing an interface to specify a number format rather challenging. The issue I see is to design a formatting pattern that is simple, clear, _and_ practicle. A practicle pattern is easy to specify, but then it becomes rather illegible and/or hard to remember, while a legible one ends up excessively verbose. 
> > I have the impression, but I may well be wrong, that contrarily to a format, a *formatted number* instead seems easy to scan -- with human eyes. So, as a crazy idea, I wonder whether we shouldn't let the user provide a example formatted number instead. This may address most of use cases, but probably not all. > > To makes things easier, why not specify a canonical number, such as '-123456.789', of which the user should define the formatted version? Then a smart parser could deduce the format to be applied to further numbers. Below a purely artificial example. > > -123456.789 --> kg 00_123_456,79- > > format: > unit: 'kg' > unit_pos: LEFT > unit_sep: ' ' > thousand_sep: '_' > fract_sep : ',' > sign_pos: RIGHT > sign_sep: None > padding_char: '0' > > There are obvious issues: > * Does rouding apply to whole precision (number of significative digits), or to the fractional part only? Then, should the format be interpreted as the most common case (probably fract. rounding), provide a disambiguation flag, provide a flag for non-default case only? What if rounding applies after a big number of digits? Should we instead allow the user providing a longer number? > * Similar for padding: does it apply to the length of the whole number or to the integral part (common in financial apps to align decimal signs). What if the padding applies to a smaller number of digits than the one of the canonical number. Should we instead allow the user providing a shorter number? > * probably more... > > The space of valid formats can be specified using a parsing grammar, so that a parse failure indicates invalid format, and a "tagged" parse tree provides all the information needed to construct a format object. > > Really do not know whether this idea is stupid or worth beeing explored ;-) [But I would well try it for personal use. At least as everyday-fast-and-easy feature.] 
Your proposal (other than being harder to implement) is similar to the way Excel handled formatting, but instead of a sample number, they use # as a placeholder. If you really want to test-implement it, better try using that. And I think it is impossible for the parser to be that smart to recognize that sign pos should be put in the rear (the smartest parser might only treat it as literal negative). Also it is highly inflexible: what about a custom positive sign? What if I want to use literal -? What about literal number? What about non-latin number? From denis.spir at free.fr Fri Mar 13 13:20:23 2009 From: denis.spir at free.fr (spir) Date: Fri, 13 Mar 2009 13:20:23 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: <20090313132023.1becb505@o> Le Fri, 13 Mar 2009 22:46:32 +1100, Lie Ryan s'exprima ainsi: > spir wrote: > > Le Thu, 12 Mar 2009 19:31:39 -0400, > > Eric Smith s'exprima ainsi: > > > >>> I've always thought that we should have a utility function which > >>> formats a number based on the same settings that are in the locale, but > >>> not actually use the locale.
Something like: > >>> > >>> format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', > >>> grouping=[4, 3, 2]) > >>> >>> '12 34 56 78 765 4321,123' > >> To be maximally useful (for example, so it could be used in Decimal to > >> implement locale formatting), maybe it should work on strings: > >> > >> >>> format_number(whole_part='123456787654321', > >> fractional_part='123', > >> decimal_point=',', > >> thousands_sep=' ', > >> grouping=[4, 3, 2]) > >> >>> '12 34 56 78 765 4321,123' > >> > >> >>> format_number(whole_part='123456787654321', > >> decimal_point=',', > >> thousands_sep='.', > >> grouping=[4, 3, 2]) > >> >>> '12.34.56.78.765.4321' > > > > > > I find the overall problem of providing an interface to specify a number > > format rather challenging. The issue I see is to design a formatting > > pattern that is simple, clear, _and_ practicle. A practicle pattern is > > easy to specify, but then it becomes rather illegible and/or hard to > > remember, while a legible one ends up excessively verbose. > > > > I have the impression, but I may well be wrong, that contrarily to a > > format, a *formatted number* instead seems easy to scan -- with human > > eyes. So, as a crazy idea, I wonder whether we shouldn't let the user > > provide a example formatted number instead. This may address most of use > > cases, but probably not all. > > > > To makes things easier, why not specify a canonical number, such as > > '-123456.789', of which the user should define the formatted version? > > Then a smart parser could deduce the format to be applied to further > > numbers. Below a purely artificial example. 
> > > > -123456.789 --> kg 00_123_456,79- > > > > format: > > unit: 'kg' > > unit_pos: LEFT > > unit_sep: ' ' > > thousand_sep: '_' > > fract_sep : ',' > > sign_pos: RIGHT > > sign_sep: None > > padding_char: '0' > > > > There are obvious issues: > > * Does rouding apply to whole precision (number of significative digits), > > or to the fractional part only? Then, should the format be interpreted as > > the most common case (probably fract. rounding), provide a disambiguation > > flag, provide a flag for non-default case only? What if rounding applies > > after a big number of digits? Should we instead allow the user providing > > a longer number? > > * Similar for padding: does it apply to the length of the whole number or > > to the integral part (common in financial apps to align decimal signs). > > What if the padding applies to a smaller number of digits than the one > > of the canonical number. Should we instead allow the user providing a > > shorter number? > > * probably more... > > > > The space of valid formats can be specified using a parsing grammar, so > > that a parse failure indicates invalid format, and a "tagged" parse tree > > provides all the information needed to construct a format object. > > > > Really do not know whether this idea is stupid or worth beeing > > explored ;-) [But I would well try it for personal use. At least as > > everyday-fast-and-easy feature.] > > Your proposal (other than being harder to implement), is similar to the > way Excel handled formatting, but instead of sample number, they uses # > for placeholder. If you really want to test-implement it, better try > using that. Right. I also think now that the "picture strings" pointed to in the PEP are a better option for such needs, though they probably cannot handle issues such as ambiguity of precision or padding without additional parameters either. The only advantage of my proposal is that the user provides an example, instead of an abstract representation.
> And I think it is impossible for the parser to be that smart to > recognize that sign pos should be put in the rear (the smartest parser > might only treat it as literal negative). Either I do not understand, or it is wrong. You can well have a parse expression allowing either a front or a rear sign, as long as there is a non-ambiguous sign-pattern. What does 'literal negative' mean? > Also it is highly inflexible, > what about custom positive sign? What if I want to use literal -? What > about literal number? What about non-latin number? ~ true. But this applies to any formatting rule, no? You have to specify eg which code point areas are allowed for valid digits -- and that must not overlap with code points allowed as sign, separators, or whatever. Custom signs are not a problem, as long as they do not conflict with digits or seps. Idem for non-latin. These points are not specific to my proposal, they apply to any kind of formatting instead. > What if I want to use literal -? What about literal number? I do not understand your point.
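As a rough prototype of the "deduce the format from an example" idea under discussion: the sketch below infers only the sign position and the two separators from a rendering of the canonical number -123456.789. It assumes Latin digits and single-run separators; the function name and result keys are made up for illustration:

```python
import re

def infer_format(formatted):
    # Deduce a few attributes from how the user renders the canonical
    # number -123456.789.  Naive sketch: Latin digits only, separators
    # are the non-digit runs between digit groups.
    fmt = {}
    if formatted.lstrip().startswith('-'):
        fmt['sign_pos'] = 'LEFT'
    elif formatted.rstrip().endswith('-'):
        fmt['sign_pos'] = 'RIGHT'
    else:
        fmt['sign_pos'] = None
    # alternate runs of digits and non-digits, sign stripped off the ends
    runs = re.findall(r'\d+|\D+', formatted.strip('- '))
    seps = [r for r in runs if not r.isdigit()]
    fmt['fract_sep'] = seps[-1] if seps else None
    fmt['thousand_sep'] = seps[-2] if len(seps) > 1 else None
    return fmt

print(infer_format('123.456,79-'))
# -> {'sign_pos': 'RIGHT', 'fract_sep': ',', 'thousand_sep': '.'}
```

Even this toy version hits the ambiguities raised above (rounding, padding, literal characters); it is only meant to show that the separator/sign part of the inference is mechanical.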
Denis ------ la vita e estrany From lie.1296 at gmail.com Fri Mar 13 16:09:28 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Sat, 14 Mar 2009 02:09:28 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090313132023.1becb505@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <20090313132023.1becb505@o> Message-ID: spir wrote: > Le Fri, 13 Mar 2009 22:46:32 +1100, > Lie Ryan s'exprima ainsi: > >> spir wrote: >>> Le Thu, 12 Mar 2009 19:31:39 -0400, >>> Eric Smith s'exprima ainsi: >>> >>>>> I've always thought that we should have a utility function which >>>>> formats a number based on the same settings that are in the locale, but >>>>> not actually use the locale. Something like: >>>>> >>>>> format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', >>>>> grouping=[4, 3, 2]) >>>>> >>> '12 34 56 78 765 4321,123' >>>> To be maximally useful (for example, so it could be used in Decimal to >>>> implement locale formatting), maybe it should work on strings: >>>> >>>> >>> format_number(whole_part='123456787654321', >>>> fractional_part='123', >>>> decimal_point=',', >>>> thousands_sep=' ', >>>> grouping=[4, 3, 2]) >>>> >>> '12 34 56 78 765 4321,123' >>>> >>>> >>> format_number(whole_part='123456787654321', >>>> decimal_point=',', >>>> thousands_sep='.', >>>> grouping=[4, 3, 2]) >>>> >>> '12.34.56.78.765.4321' >>> >>> >>> I find the overall problem of providing an interface to specify a number >>> format rather challenging. The issue I see is to design a formatting >>> pattern that is simple, clear, _and_ practicle. A practicle pattern is >>> easy to specify, but then it becomes rather illegible and/or hard to >>> remember, while a legible one ends up excessively verbose. 
>>> >>> I have the impression, but I may well be wrong, that contrarily to a >>> format, a *formatted number* instead seems easy to scan -- with human >>> eyes. So, as a crazy idea, I wonder whether we shouldn't let the user >>> provide a example formatted number instead. This may address most of use >>> cases, but probably not all. >>> >>> To makes things easier, why not specify a canonical number, such as >>> '-123456.789', of which the user should define the formatted version? >>> Then a smart parser could deduce the format to be applied to further >>> numbers. Below a purely artificial example. >>> >>> -123456.789 --> kg 00_123_456,79- >>> >>> format: >>> unit: 'kg' >>> unit_pos: LEFT >>> unit_sep: ' ' >>> thousand_sep: '_' >>> fract_sep : ',' >>> sign_pos: RIGHT >>> sign_sep: None >>> padding_char: '0' >>> >>> There are obvious issues: >>> * Does rouding apply to whole precision (number of significative digits), >>> or to the fractional part only? Then, should the format be interpreted as >>> the most common case (probably fract. rounding), provide a disambiguation >>> flag, provide a flag for non-default case only? What if rounding applies >>> after a big number of digits? Should we instead allow the user providing >>> a longer number? >>> * Similar for padding: does it apply to the length of the whole number or >>> to the integral part (common in financial apps to align decimal signs). >>> What if the padding applies to a smaller number of digits than the one >>> of the canonical number. Should we instead allow the user providing a >>> shorter number? >>> * probably more... >>> >>> The space of valid formats can be specified using a parsing grammar, so >>> that a parse failure indicates invalid format, and a "tagged" parse tree >>> provides all the information needed to construct a format object. >>> >>> Really do not know whether this idea is stupid or worth beeing >>> explored ;-) [But I would well try it for personal use. 
At least as >>> everyday-fast-and-easy feature.] >> Your proposal (other than being harder to implement), is similar to the >> way Excel handled formatting, but instead of sample number, they uses # >> for placeholder. If you really want to test-implement it, better try >> using that. > > Right. I also think now that "picture strings" pointed in the PEP are a better option for such needs. While they probably cannot handle issues such as ambiguity of precision or padding without additional parameters, neither. The only advantage of my proposal is that the user provides an example, instead of an abstract representation. > >> And I think it is impossible for the parser to be that smart to >> recognize that sign pos should be put in the rear (the smartest parser >> might only treat it as literal negative). > > ? Either I do not understand, or it is wrong. Partially wrong, when I said "literal negative" I really meant "literal -". > You can well have a parse expression allowing either a front or a rear sign, as long as there is a non-ambiguous sign-pattern. > What does 'literal negative' mean? But what if I want ~ to denote negative number? >> Also it is highly inflexible, >> what about custom positive sign? What if I want to use literal -? What >> about literal number? What about non-latin number? > > ~ true. But this applies to any formatting rule, no? Yes, but using number example introduces lots of ambiguities. You must use parameters to avoid these ambiguities. > You have to specify eg which code point areas are allowed for valid digits -- and that must not overlap with code points allowed as sign, separators, or whatever. > Custom signs are not a problem, as long as they do not conflict with digits or seps. Idem for non-latin. These points are not specific to my proposal, they apply to any kind of formatting instead. How would the example format interpret this: 123 456~ When I want ~ to be the negative sign? What if I want < for negative and > for positive? 
Those are quite hypothetical, but if we're talking about languages that don't use Latin numerals, that sort of thing is very likely to happen. >> What if I want to use literal -? What about literal number? > > I do not understand your point. What if I want my number to look like this: 123-4567 An example format would have a hard time guessing whether the "-" should be a negative sign or a literal "-". Maybe you can use escape characters, but that would turn the example format's strongest point against itself. From python at rcn.com Sat Mar 14 00:40:30 2009 From: python at rcn.com (Raymond Hettinger) Date: Fri, 13 Mar 2009 16:40:30 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Today's updates to: http://www.python.org/dev/peps/pep-0378/ * Summarize commentary to date. * Add APOSTROPHE and non-breaking SPACE to the list of separators. * Add more links to external references. * Detail issues with the locale module. * Clarify how proposal II is parsed. From eric at trueblade.com Sat Mar 14 02:44:40 2009 From: eric at trueblade.com (Eric Smith) Date: Fri, 13 Mar 2009 21:44:40 -0400 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <49B509A3.1080404@trueblade.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <4999D184.3080105@trueblade.com> <499AAF4E.3020506@trueblade.com> <49B509A3.1080404@trueblade.com> Message-ID: <49BB0C08.8050309@trueblade.com> Eric Smith wrote: > I've added a patch to http://bugs.python.org/issue5237 that implements > the basic '{}' functionality in str.format.
I've added another patch to issue 5237 which I believe is production quality. I'll work on tests. >>> '{} {}'.format(1, 2) '1 2' From and-dev at doxdesk.com Sat Mar 14 02:56:41 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 02:56:41 +0100 Subject: [Python-ideas] str.split with padding Message-ID: <49BB0ED9.4000003@doxdesk.com> Here's a simple one I've reinvented in my own apps often enough that it might be worth adding to the built-in split() method: s.split(sep[, maxsplit[, pad]]) pad, if set True, would pad out the returned list with empty strings (strs/unicodes depending on returned datatype) so that the list was always (maxsplit+1) elements long. This allows one to do things like unpacking assignments: user, hostname= address.split('@', 1, True) without having to worry about exceptions when the number of 'sep's in the string is unexpectedly fewer than 'maxsplit'. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From lie.1296 at gmail.com Sat Mar 14 03:43:28 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Sat, 14 Mar 2009 13:43:28 +1100 Subject: [Python-ideas] str.split with padding In-Reply-To: <49BB0ED9.4000003@doxdesk.com> References: <49BB0ED9.4000003@doxdesk.com> Message-ID: And Clover wrote: > Here's a simple one I've reinvented in my own apps often enough that it > might be worth adding to the built-in split() method: > > s.split(sep[, maxsplit[, pad]]) > > pad, if set True, would pad out the returned list with empty strings > (strs/unicodes depending on returned datatype) so that the list was > always (maxsplit+1) elements long. This allows one to do things like > unpacking assignments: > > user, hostname= address.split('@', 1, True) > > without having to worry about exceptions when the number of 'sep's in > the string is unexpectedly fewer than 'maxsplit'. > Can you find a better use case? For splitting email address, I think I would want to know if the address turned out to be invalid (e.g.
it does not contain exactly 1 @s) From and-dev at doxdesk.com Sat Mar 14 04:46:03 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 04:46:03 +0100 Subject: [Python-ideas] str.split with padding In-Reply-To: References: <49BB0ED9.4000003@doxdesk.com> Message-ID: <49BB287B.5020507@doxdesk.com> Lie Ryan wrote: > Can you find a better use case? Well here are some random uses from projects that a search on splitpad (one of the names I used for it) is turning up: command, parameters= splitpad(line, ' ', 1) # get SMTP command y, m, d= splitpad(t, '-', 2) # split date, month and day optional headers, body= splitpad(request, '\n\n', 1) # there might be no body table, column= rsplitpad(colname, '.', 1) # extract SQL [table.]column name id, cat, name, price= splitpad(line, ',', 3) # should be four columns, but editor might have lost trailing commas user, pwd= splitpad(base64.decodestring(authtoken), ':', 1) # will always contain ':' unless malformed pars= dict(splitpad(p, '=', 1) for p in input.split(';')) # no '=value' part is allowable server, version= splitpad(environ.get('SERVER_SOFTWARE', ''), '/', 1) # might not have a version And so on. (Obviously these have an internetty bias, where "be liberal in what you accept" is desirable.) > For splitting email address, I think I would want to know if the address turned > out to be invalid (e.g. it does not contain exactly 1 @s) Maybe, maybe not. In this case I wanted to accept the case of a bare username, with or without '@', as a local user. An empty string instead of an exception for a missing part is something I find very common; it kind of fits with Python's "string processing does what you usually want" behaviour (as compared to other languages that are still tediously throwing exceptions when you try to slice outside the string length range). For example with an HTTP command (eg. "GET / HTTP/1.0"): method, path, version= splitpad(command, ' ', 2) 'version' might be missing, on ancient HTTP/0.9 clients.
'path' could be missing, on malformed requests. In either of those cases I don't want an exception, and I don't particularly want to burden my split code with extra checking; I'll probably have to do further checking on 'path' anyway so setting it to an empty string is the best I can do here. The alternative I use if I can't be bothered to define splitpad() again is something like:

    parts= command.split(' ', 2)
    method= parts[0]
    path= parts[1] if len(parts)>=2 else ''
    ...

which is pretty ugly. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From ncoghlan at gmail.com Sat Mar 14 04:50:25 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 14 Mar 2009 13:50:25 +1000 Subject: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BA4A12.4040402@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1> <49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> Message-ID: <49BB2981.5070008@gmail.com> Joining the discussion over here to add a couple of points that I haven't seen in Raymond's PEP updates on the checkin list:

1. The Single Unix Specification apparently uses an apostrophe as a flag in printf() %-formatting to request inclusion of a thousands separator in a locale aware way [1]. Since the apostrophe is much harder to mistake for a period than a comma is, I would modify my "just a flag" suggestion to use an apostrophe as the flag instead of a comma:

    [[fill]align][sign][#][0][width]['][.precision][type]

The output would still use commas though:

    format(1234, "8.1f")  --> '  1234.0'
    format(1234, "8'.1f") --> ' 1,234.0'
    format(1234, "8d")    --> '    1234'
    format(1234, "8'd")   --> '   1,234'

2.
PEP 3101 *already included* a way to modify the handling of format strings in a consistent way: use a custom string.Formatter subclass instead of relying on the basic str.format method. When the mini language parser is exposed (which I consider a separate issue from this PEP), a locale aware custom formatter is going to find a "include digit separators" flag far more useful than the overly explicit "use this thousands separator and this decimal separator". Cheers, Nick. [1] http://linux.die.net/man/3/printf -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From steve at pearwood.info Sat Mar 14 05:22:42 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 14 Mar 2009 15:22:42 +1100 Subject: [Python-ideas] str.split with padding In-Reply-To: References: <49BB0ED9.4000003@doxdesk.com> Message-ID: <200903141522.42307.steve@pearwood.info> On Sat, 14 Mar 2009 01:43:28 pm Lie Ryan wrote: > And Clover wrote: > > Here's a simple one I've reinvented in my own apps often enough > > that it might be worth adding to the built-in split() method: > > > > s.split(sep[, maxsplit[, pad]]) > > > > pad, if set True, would pad out the returned list with empty > > strings (strs/unicodes depending on returned datatype) so that the > > list was always (maxsplit+1) elements long. This allows one to do > > things like unpacking assignments: > > > > user, hostname= address.split('@', 1, True) > > > > without having to worry about exceptions when the number of 'sep's > > in the string is unexpectedly fewer than 'maxsplit'. > > Can you find a better use case? For splitting email address, I think > I would want to know if the address turned out to be invalid (e.g. it > does not contain exactly 1 @s) What makes you think that email address must contain exactly one @ sign? Email being sent locally may contain zero @ signs, and email being sent externally can contain one or more @ signs.
Andy's code:

    user, hostname= address.split('@', 1, True)

will fail on syntactically valid email addresses like this:

    fred(away @ the pub)@example.com

-- Steven D'Aprano From lie.1296 at gmail.com Sat Mar 14 05:59:18 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Sat, 14 Mar 2009 15:59:18 +1100 Subject: [Python-ideas] str.split with padding In-Reply-To: <200903141522.42307.steve@pearwood.info> References: <49BB0ED9.4000003@doxdesk.com> <200903141522.42307.steve@pearwood.info> Message-ID: Steven D'Aprano wrote: > On Sat, 14 Mar 2009 01:43:28 pm Lie Ryan wrote: >> And Clover wrote: >>> Here's a simple one I've reinvented in my own apps often enough >>> that it might be worth adding to the built-in split() method: >>> >>> s.split(sep[, maxsplit[, pad]]) >>> >>> pad, if set True, would pad out the returned list with empty >>> strings (strs/unicodes depending on returned datatype) so that the >>> list was always (maxsplit+1) elements long. This allows one to do >>> things like unpacking assignments: >>> >>> user, hostname= address.split('@', 1, True) >>> >>> without having to worry about exceptions when the number of 'sep's >>> in the string is unexpectedly fewer than 'maxsplit'. >> Can you find a better use case? For splitting email address, I think >> I would want to know if the address turned out to be invalid (e.g. it >> does not contain exactly 1 @s) > > What makes you think that email address must contain exactly one @ sign? > > Email being sent locally may contain zero @ signs, and email being sent > externally can contain one or more @ signs. Andy's code: > > user, hostname= address.split('@', 1, True) > > will fail on syntactically valid email addresses like this: > > fred(away @ the pub)@example.com

From Wikipedia: RFC invalid e-mail addresses
* Abc.example.com (character @ is missing)
* Abc.@example.com (character dot(.) is last in local part)
* Abc..123@example.com (character dot(.) is double)
* A@b@c@example.com (only one @ is allowed outside quotation marks)
* ()[]\;:,<>@example.com (none of the characters before the @ in this example are allowed outside quotation marks)

Your example is a valid email address if and only if it is enclosed in quotation marks: "fred(away @ the pub)"@example.com From lie.1296 at gmail.com Sat Mar 14 06:04:18 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Sat, 14 Mar 2009 16:04:18 +1100 Subject: [Python-ideas] str.split with padding In-Reply-To: <49BB287B.5020507@doxdesk.com> References: <49BB0ED9.4000003@doxdesk.com> <49BB287B.5020507@doxdesk.com> Message-ID: And Clover wrote: > Lie Ryan wrote: > >> Can you find a better use case? > > Well here are some random uses from projects that a search on splitpad > (one of the names I used for it) is turning up: > > command, parameters= splitpad(line, ' ', 1) # get SMTP command > y, m, d= splitpad(t, '-', 2) # split date; month and day optional > headers, body= splitpad(request, '\n\n', 1) # there might be no body > table, column= rsplitpad(colname, '.', 1) # extract SQL > [table.]column name > id, cat, name, price= splitpad(line, ',', 3) # should be four > columns, but editor might have lost trailing commas > user, pwd= splitpad(base64.decodestring(authtoken), ':', 1) # will > always contain ':' unless malformed > pars= dict(splitpad(p, '=', 1) for p in input.split(';')) # no > '=value' part is allowable > server, version= splitpad(environ.get('SERVER_SOFTWARE', ''), '/', > 1) # might not have a version > > And so on. (Obviously these have an internetty bias, where "be liberal > in what you accept" is desirable.) > >> For splitting email address, I think I would want to know if the >> address turned >> out to be invalid (e.g. it does not contain exactly 1 @s) > > Maybe, maybe not. In this case I wanted to accept the case of a bare > username, with or without '@', as a local user.
An empty string instead > of an exception for a missing part is something I find very common; it > kind of fits with Python's "string processing does what you usually > want" behaviour (as compared to other languages that are still tediously > throwing exceptions when you try to slice outside the string length range). > > For example with an HTTP command (eg. "GET / HTTP/1.0"): > > method, path, version= splitpad(command, ' ', 2) > > 'version' might be missing, on ancient HTTP/0.9 clients. 'path' could be > missing, on malformed requests. In either of those cases I don't want an > exception, and I don't particularly want to burden my split code with > extra checking; I'll probably have to do further checking on 'path' > anyway so setting it to an empty string is the best I can do here. > > The alternative I use if I can't be bothered to define splitpad() again > is something like: > > parts= command.split(' ', 2) > method= parts[0] > path= parts[1] if len(parts)>=2 else '' > .... > > which is pretty ugly. > I am honestly quite surprised: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html From bruce at leapyear.org Sat Mar 14 07:04:26 2009 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 13 Mar 2009 23:04:26 -0700 Subject: [Python-ideas] str.split with padding In-Reply-To: References: <49BB0ED9.4000003@doxdesk.com> <200903141522.42307.steve@pearwood.info> Message-ID: On Fri, Mar 13, 2009 at 9:59 PM, Lie Ryan wrote: > Steven D'Aprano wrote: > >> >> Email being sent locally may contain zero @ signs, and email being sent >> externally can contain one or more @ signs. Andy's code: >> >> user, hostname= address.split('@', 1, True) >> >> will fail on syntactically valid email addresses like this: >> >> fred(away @ the pub)@example.com >> > > From Wikipedia: > RFC invalid e-mail addresses > * Abc.example.com (character @ is missing) > * Abc.@example.com (character dot(.) is last in local part) > * Abc..123@example.com (character dot(.) is double) > * A@b@c@example.com (only one @ is allowed outside quotation marks) > * ()[]\;:,<>@example.com (none of the characters before the @ in this > example are allowed outside quotation marks) > > Your example is a valid email address if and only if it is enclosed in > quotation marks: "fred(away @ the pub)"@example.com > That is valid but not because you can have nested email addresses like that.** The (...) part is a comment. I wouldn't bet that very many mail clients handle that according to the rfc. Many don't handle quoted strings either. And there are those that have a narrow view of which characters are allowed (Hint: if you don't want to get mail from hotmail users, just make sure your email address has '/' in it.) http://www.ietf.org/rfc/rfc0822.txt **Way back people wrote nested email addresses with % replacing the @ in the nested address (sna%foo@bar). I haven't seen that for a while. On topic: Making split more complicated seems like overspecialization. Wouldn't a generic padding function be more useful? FWIW, this has been discussed before. http://bugs.python.org/issue5034 --- Bruce (sorry for the digression) -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at rcn.com Sat Mar 14 07:08:18 2009 From: python at rcn.com (Raymond Hettinger) Date: Fri, 13 Mar 2009 23:08:18 -0700 Subject: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1><49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> <49BB2981.5070008@gmail.com> Message-ID: <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> [Nick Coghlan] > 1. The Single Unix Specification apparently uses an apostrophe as a flag > in printf() %-formatting to request inclusion of a thousands separator > in a locale aware way [1].
We already use C-sharp's "n" flag for a locale aware thousands separator. > Since the apostrophe is much harder to > mistake for a period than a comma is, I would modify my "just a flag" > suggestion to use an apostrophe as the flag instead of a comma: . . . > The output would still use commas though: That doesn't make sense for two reasons:

1. Why mark a non-locale aware form with a flag that indicates locale awareness in another language?
2. It seems to be basic bad design to require an apostrophe to emit commas.

FWIW, the comma-only version of the proposal is probably going to die anyway. The more flexible alternative evolved to something simple and direct. Also, the newsgroup discussion makes it abundantly clear that half the world will rebel if commas are the only supported option. > 2. PEP 3101 *already included* a way to modify the handling of format > strings in a consistent way: use a custom string.Formatter subclass > instead of relying on the basic str.format method. > > When the mini language parser is exposed (which I consider a separate > issue from this PEP), a locale aware custom formatter is going to find a > "include digit separators" flag far more useful than the overly explicit > "use this thousands separator and this decimal separator". Thanks. Will note that in the PEP when I get a chance.
Raymond From denis.spir at free.fr Sat Mar 14 08:49:46 2009 From: denis.spir at free.fr (spir) Date: Sat, 14 Mar 2009 08:49:46 +0100 Subject: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1> <49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> <49BB2981.5070008@gmail.com> <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> Message-ID: <20090314084946.50f64627@o> On Fri, 13 Mar 2009 23:08:18 -0700, "Raymond Hettinger" wrote: > Since the apostrophe is much harder to > > mistake for a period than a comma is, I would modify my "just a flag" > > suggestion to use an apostrophe as the flag instead of a comma: > . . . > > The output would still use commas though: > > That doesn't make sense for two reasons: > 1. Why mark a non-locale aware form with a flag that indicates > locale awareness in another language? > 2. It seems to be basic bad design to require an apostrophe > to emit commas. If I properly understand the PEP (by the way, congratulations for the reformulation -- the motivation section esp. is clearer and more motivat-ing) there are 2 differences between the proposals:

* choose char for thousand-sep
* choose decimal sep

> FWIW, the comma-only version of the proposal is probably going to > die anyway. The more flexible alternative evolved to something simple > and direct. Also, the newsgroup discussion makes it abundantly clear > that half the world will rebel if commas are the only supported option. If the first proposal let the user choose the thousand-sep char it would be more appealing, indeed. As is, it has no chance. Anyway, the second proposal is now rather clear and simple.
In my mind, both separators work together even when there is no possible conflict between the actual chars. +1 for version #2 (more or less as is now) I would just add: The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable space). Denis PS - OT: As the width param is the width of the whole number, how to cope with decimal point alignment, meaning that there should be integral part width/padding instead?

       123.45
         1.2
    123456.789

Maybe this need is mainly in the financial field, so that this will be implicitly addressed because of the 2-digit rounding?

       123.45
         1.20
    123456.79

------ la vita e estrany From g.brandl at gmx.net Sat Mar 14 09:55:11 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 14 Mar 2009 09:55:11 +0100 Subject: [Python-ideas] str.split with padding In-Reply-To: <49BB0ED9.4000003@doxdesk.com> References: <49BB0ED9.4000003@doxdesk.com> Message-ID: And Clover schrieb: > Here's a simple one I've reinvented in my own apps often enough that it > might be worth adding to the built-in split() method: > > s.split(sep[, maxsplit[, pad]]) > > pad, if set True, would pad out the returned list with empty strings > (strs/unicodes depending on returned datatype) so that the list was > always (maxsplit+1) elements long. This allows one to do things like > unpacking assignments: > > user, hostname= address.split('@', 1, True) > > without having to worry about exceptions when the number of 'sep's in > the string is unexpectedly fewer than 'maxsplit'. Note that for maxsplit=1, you can use str.partition().
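[Editor's note] Georg's str.partition() suggestion, sketched on the address example from this thread — the separator itself comes back as the middle element, so the unpacking needs a throwaway name but never raises:

```python
address = 'user@example.com'
user, _, hostname = address.partition('@')

# partition always returns a 3-tuple; when the separator is absent,
# the last two elements are empty strings, much like the proposed padding
local, _, host = 'postmaster'.partition('@')
print((user, hostname), (local, host))
```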
Georg From g.brandl at gmx.net Sat Mar 14 10:24:29 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 14 Mar 2009 10:24:29 +0100 Subject: [Python-ideas] Draft PEP: Standard daemon process library In-Reply-To: <874oxzxujs.fsf@benfinney.id.au> References: <87wscj11fl.fsf@benfinney.id.au> <874oxzxujs.fsf@benfinney.id.au> Message-ID: Ben Finney schrieb: > Howdy all, > > Significant changes in this release: > > * Name the daemon process context class `DaemonContext`, since it > doesn't actually represent a separate daemon. (The reference > implementation will also have a `DaemonRunner` class, but that's > outside the scope of this PEP.) > > * Implement the context manager protocol, allowing use as a 'with' > context manager or via explicit 'open' and 'close' calls. > > * Delegate PID file handling to a `pidfile` object handed to the > `DaemonContext` instance, and used simply as a context manager. > > * Simplify the set of options by using a mapping for signal handlers. > > * Target Python 3.2, since the reference implementation will very > likely not be complete in time for anything earlier. This looks like it should be submitted as a formal PEP now; that should also ensure more interest in it, and an eventual resolution. Georg From solipsis at pitrou.net Sat Mar 14 14:45:12 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 14 Mar 2009 13:45:12 +0000 (UTC) Subject: [Python-ideas] non-breaking space References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1> <49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> <49BB2981.5070008@gmail.com> <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> <20090314084946.50f64627@o> Message-ID: Hello, spir writes: > > I would just add: > The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable space).
Then the proposal should allow for any kind of space characters (that is, any character for which isspace() is True). There are several non-breaking space characters in the unicode character set, with varying character widths, which is important for typography rules. See http://en.wikipedia.org/wiki/Non-breaking_space for some examples. Regards Antoine (playing devil's advocate a bit - but only a bit). From jervisau at gmail.com Sat Mar 14 14:47:01 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 00:47:01 +1100 Subject: [Python-ideas] Inline assignment expression Message-ID: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> http://bugs.python.org/issue1714448 had an interesting proposal that I thought might be worthwhile discussing here.

    if something as x:

however, of greater use would be assignment expressions that allow:

    if (something as x) == other:
        # can now use x.

I propose that we implement assignment expressions that would allow assignments to be made any place that expressions are currently valid. The proposal uses the (nominal) right arrow (RARROW) '->' to indicate the assignment. The form would look like this:

    EXPR -> VAR

which translates to

    VAR = EXPR
    (EXPR)

Expression (EXPR) is evaluated and assigned to target VAR. The value of EXPR is left on the top of stack. another toy example to think about:

    while len(expensive() -> res) == 4:
        dosomething(res)

A patch has been uploaded to the named issue in the bug tracker. I encourage you to try it out (py3k at the moment). As I mentioned earlier the exact syntax is only nominal. We needn't use the RARROW if consensus is against that; it is a simple operation to change this to any of ('becomes', 'into', 'assigns' ... I look forward to your comments.
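[Editor's note, with hindsight — not part of the 2009 exchange] Python did eventually grow an assignment expression: PEP 572's ':=' operator in Python 3.8, with the target on the left rather than a trailing '-> VAR'. The toy while-loop above then reads like this (the `expensive` stand-in here is illustrative only):

```python
def expensive(_state=[5]):
    # Stand-in for the costly call in the example above: each call
    # returns a list one element shorter than the previous one.
    _state[0] -= 1
    return list(range(_state[0]))

results = []
while len(res := expensive()) == 4:  # bind and test in one expression
    results.append(res)

print(results)  # only the length-4 result reached the loop body
```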
Cheers, Jervis From guido at python.org Sat Mar 14 15:57:25 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 14 Mar 2009 07:57:25 -0700 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> Message-ID: If we're going to allow unconstrained assignments inside expressions why don't we use the same syntax as C, C++, Java, JavaScript etc.? But I left this out intentionally for a reason. We would need to have a great deal of evidence that it was a mistake for making a U-turn. Have a happy discussion, --Guido On Sat, Mar 14, 2009 at 6:47 AM, Jervis Whitley wrote: > http://bugs.python.org/issue1714448 had an interesting proposal that I thought > might be worthwhile discussing here. > > if something as x: > > however, of greater use would be assignment expressions that allow: > if (something as x) == other: >     # can now use x. > > I propose that we implement assignment expressions that would allow assignments > to be made any place that expressions are currently valid. The > proposal uses the > (nominal) right arrow (RARROW) '->' to indicate the assignment. The > form would look > like this: > >     EXPR -> VAR > > which translates to > >     VAR = EXPR >     (EXPR) > > Expression (EXPR) is evaluated and assigned to target VAR. The value of EXPR is > left on the top of stack. > > another toy example to think about: > >     while len(expensive() -> res) == 4: >         dosomething(res) > > A patch has been uploaded to the named issue in the bug tracker. I encourage > you to try it out (py3k at the moment). As I mentioned earlier the > exact syntax is only nominal. We needn't use the RARROW if consensus > is against that, it is a simple operation to change this to any of > ('becomes', 'into', 'assigns' ... > > > I look forward to your comments.
> > Cheers, > > Jervis > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Sat Mar 14 17:52:09 2009 From: python at rcn.com (Raymond Hettinger) Date: Sat, 14 Mar 2009 09:52:09 -0700 Subject: [Python-ideas] non-breaking space References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><49B8D702.4040004@trueblade.com><1F032CDC26874991B78E47DBDB333917@RaymondLaptop1><49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com><49BB2981.5070008@gmail.com><103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1><20090314084946.50f64627@o> Message-ID: <161AE36774DA4AAEA22CC88AC07285A5@RaymondLaptop1> denis.spir at ...> writes: >> >> I would just add: >> The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable > space). > > Then the proposal should allow for any kind of space characters (that is, any > character for which isspace() is True). There are several non-breaking space > characters in the unicode character set, with varying character widths, which is > important for typography rules. See > http://en.wikipedia.org/wiki/Non-breaking_space for some examples. > > Regards > > Antoine (playing devil's advocate a bit - but only a bit). Keeping in mind the needs of people writing parsers, I don't think it's a good idea to expand this set. Already, we're not supporting all possible separators whether they be spaces or not. Given just U+0020 and U+00A0, a person can easily do a str.replace() to get to anything else. 
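[Editor's note, with hindsight] Raymond's str.replace() point can be sketched concretely with the ',' specifier that PEP 378 ultimately added to the format mini-language (Python 3.1) — in this 2009 thread the specifier was still only a proposal:

```python
s = format(1234567.89, ',.2f')                    # '1,234,567.89'
nbsp = s.replace(',', '\u00a0')                   # thousands separated by U+00A0
european = s.replace(',', ' ').replace('.', ',')  # '1 234 567,89'
print(nbsp)
print(european)
```

The replacement order matters: swapping the thousands separator first leaves the '.' free to become the European decimal comma.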
Raymond From and-dev at doxdesk.com Sat Mar 14 18:09:58 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 18:09:58 +0100 Subject: [Python-ideas] str.split with padding In-Reply-To: References: <49BB0ED9.4000003@doxdesk.com> Message-ID: <49BBE4E6.1020606@doxdesk.com> Georg Brandl wrote: > Note that for maxsplit=1, you can use str.partition(). Indeed, though it does slightly spoil the cleanness of the unpacking assignment to include a dummy lvalue for the middle element. [Thanks for the on-topic reply! I'm surprised more people haven't felt the need to write unpacking splits like this to be honest, but I guess engaging in SMTP syntax law is much more fun. Yes guys, I'm well aware of the capabilities of the RFC2822 addr-spec format, thanks, and no, it's not relevant to the particular program that example came from. Cheers for the concern though.] -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From and-dev at doxdesk.com Sat Mar 14 18:15:10 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 18:15:10 +0100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> Message-ID: <49BBE61E.6000304@doxdesk.com> Jervis Whitley wrote: > if (something as x) == other: > # can now use x. Interesting. I'd definitely prefer that to the C-style inline assignment syntax: I think it reads better, and there's less chance of the Accidental Assignment Instead Of Comparison trap that has plagued other languages. I remain to be convinced that inline assignment is enough of a win in general, but if implemented, that's the syntax I'd want. 
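[Editor's note] For readers who want to try the thread's idea directly, here is a minimal sketch of the splitpad/rsplitpad pair matching the call signatures used earlier in the thread (hypothetical helpers, not stdlib functions):

```python
def splitpad(s, sep, maxsplit):
    """Like s.split(sep, maxsplit), but padded on the right with empty
    strings so the result always has maxsplit + 1 elements."""
    parts = s.split(sep, maxsplit)
    return parts + [''] * (maxsplit + 1 - len(parts))

def rsplitpad(s, sep, maxsplit):
    """Right-hand variant: missing parts are padded on the *left*,
    so the final fields keep their positions (e.g. [table.]column)."""
    parts = s.rsplit(sep, maxsplit)
    return [''] * (maxsplit + 1 - len(parts)) + parts

# the HTTP example from the thread: version missing on HTTP/0.9 requests
method, path, version = splitpad('GET /', ' ', 2)
print(method, repr(path), repr(version))
```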
-- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From and-dev at doxdesk.com Sat Mar 14 18:15:10 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 18:15:10 +0100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> Message-ID: <49BBE61E.6000304@doxdesk.com> Jervis Whitley wrote: > if (something as x) == other: >     # can now use x. Interesting. I'd definitely prefer that to the C-style inline assignment syntax: I think it reads better, and there's less chance of the Accidental Assignment Instead Of Comparison trap that has plagued other languages. I remain to be convinced that inline assignment is enough of a win in general, but if implemented, that's the syntax I'd want.
From tjreedy at udel.edu Sat Mar 14 20:03:19 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Mar 2009 15:03:19 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090313115602.3f9a19b9@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: spir wrote: > I have the impression, but I may well be wrong, that contrarily to a > format, a *formatted number* instead seems easy to scan -- with human > eyes. So, as a crazy idea, I wonder whether we shouldn't let the user > provide a example formatted number instead. This may address most of > use cases, but probably not all. > > To makes things easier, why not specify a canonical number, such as > '-123456.789', of which the user should define the formatted version? > Then a smart parser could deduce the format to be applied to further > numbers. Below a purely artificial example. > > -123456.789 --> kg 00_123_456,79- > > format: unit: 'kg' unit_pos: LEFT unit_sep: ' ' thousand_sep: '_' > fract_sep : ',' sign_pos: RIGHT sign_sep: None padding_char: '0' Once the .format language is expanded to be able to define grouping separators, one will be able to define functions to turn such templates in field specs. Now many options are allowed would depend on the function. 
From tjreedy at udel.edu Sat Mar 14 20:03:19 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Mar 2009 15:03:19 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090313115602.3f9a19b9@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: spir wrote: > I have the impression, but I may well be wrong, that contrarily to a > format, a *formatted number* instead seems easy to scan -- with human > eyes. So, as a crazy idea, I wonder whether we shouldn't let the user > provide an example formatted number instead. This may address most of > use cases, but probably not all. > > To make things easier, why not specify a canonical number, such as > '-123456.789', of which the user should define the formatted version? > Then a smart parser could deduce the format to be applied to further > numbers. Below a purely artificial example. > > -123456.789 --> kg 00_123_456,79- > > format: unit: 'kg'  unit_pos: LEFT  unit_sep: ' '  thousand_sep: '_' > fract_sep: ','  sign_pos: RIGHT  sign_sep: None  padding_char: '0' Once the .format language is expanded to be able to define grouping separators, one will be able to define functions to turn such templates into field specs. How many options are allowed would depend on the function.
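[Editor's note] spir's example-driven idea could be prototyped along these lines — a rough, hypothetical sketch that deduces just the two separators from a formatted rendering of a canonical number; everything here is illustrative, not a proposed API:

```python
def deduce_separators(example):
    """Guess (thousands_sep, decimal_sep) from a formatted sample such
    as '00_123_456,79'. Hypothetical sketch, not a full template parser:
    without a fraction part a lone separator is ambiguous, which is why
    spir suggests the canonical number -123456.789 as the sample."""
    core = example.strip().strip('-')           # drop sign and edge padding
    seps = [c for c in core if not c.isdigit()]
    if not seps:
        return None, None
    decimal = seps[-1]                          # last separator starts the fraction
    thousands = seps[0] if seps[0] != decimal else None
    return thousands, decimal

print(deduce_separators('00_123_456,79-'))      # from spir's artificial example
```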
Terry Jan Reedy From tjreedy at udel.edu Sat Mar 14 21:23:39 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Mar 2009 16:23:39 -0400 Subject: [Python-ideas] str.split with padding In-Reply-To: <49BB0ED9.4000003@doxdesk.com> References: <49BB0ED9.4000003@doxdesk.com> Message-ID: And Clover wrote: > Here's a simple one I've reinvented in my own apps often enough that it > might be worth adding to the built-in split() method: > > s.split(sep[, maxsplit[, pad]]) > > pad, if set True, would pad out the returned list with empty strings > (strs/unicodes depending on returned datatype) so that the list was > always (maxsplit+1) elements long. This allows one to do things like > unpacking assignments: > > user, hostname= address.split('@', 1, True) > > without having to worry about exceptions when the number of ?sep?s in > the string is unexpectedly fewer than ?maxsplit?. I would make pad = . Example use case: major,minor,micro = pyversion.split('.', 2, '0') # 3.0 = 3.0.0, etc. # or major,minor,micro = (int(s) for s in pyversion.split('.', 2, '0') ) I suppose a counter argument is than one could write (pyversion+'.0').split('.',2) Terry Jan Reedy From jervisau at gmail.com Sat Mar 14 23:52:06 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 09:52:06 +1100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> Message-ID: <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> > why don't we use the same syntax as C, C++, Java, JavaScript etc.? I have deliberately chosen to use a different syntax (right assignment for one, and a different, albeit nominal, operator) than C, C++ to address the concern that a user may unintentionally assign when they wanted to compare. 
http://bugs.python.org/issue1714448 the issue that I was responding to, also recognised the need to move away from a C-style assignment for a similar situation (I have also written a patch, not posted yet, to address their situation.) > But I left this out intentionally for a reason. We would need to have > a great deal of evidence that it was a mistake for making a U-turn. I realise that this is a trivial (to implement) patch and that it must have been left out of Python for a reason, however I am sure that with an explicit and elegant enough syntax this can shake the feeling that it is un-pythonic. I have drafted a PEP with some of the basic discussion included and some example situations. It does, however, fail to discuss issues of precedence and implementations in other languages at this stage. As implemented, the precedence for this operation is just below a BoolOp and above a BinOp so things like test() as x == answer should work and (for example) 4 * 4 as x == 16 # True I read your answer as a -0.5; if it is dead in the water, let me know we can close the Issue as a 'Wont Fix'. Cheers, Jervis From jervisau at gmail.com Sat Mar 14 23:59:38 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 09:59:38 +1100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <49BBE61E.6000304@doxdesk.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> Message-ID: <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> > >> if (something as x) == other: >>    # can now use x. > > Interesting. I'd definitely prefer that to the C-style inline assignment > syntax: I think it reads better, and there's less chance of the Accidental > Assignment Instead Of Comparison trap that has plagued other languages. However, allowing this "something as x" syntax for assignment would cause confusion with the "with contextmanager as x" scenario. 
"as" was chosen in their case because the expr contextmanager is not assigned to x. While I do like the "as" syntax too, I have not endorsed it for the above reason. However, this does show that using the "as" makes this somehow pythonic looking, and there must be an alternative that keeps this far enough away from the assignment expressions in other languages so as to avoid the trap you mention. > I remain to be convinced that inline assignment is enough of a win in > general, but if implemented, that's the syntax I'd want. Cheers, Jervis From tjreedy at udel.edu Sun Mar 15 02:52:47 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Mar 2009 21:52:47 -0400 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> Message-ID: Jervis Whitley wrote: > [Guido] >> But I left this out intentionally for a reason. We would need to have >> a great deal of evidence that it was a mistake for making a U-turn. > I read your answer as a -0.5, if it is dead in the water, let me know > we can close the Issue as a 'Wont Fix'. Having read similar discussions over the last decade, I read it as about -.995 ;-) In other words, not quite as dead as adding braces, but close. tjr From cmjohnson.mailinglist at gmail.com Sun Mar 15 02:57:42 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Sat, 14 Mar 2009 15:57:42 -1000 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> Message-ID: <3bdda690903141857n34f517b0r537f0cca668006b7@mail.gmail.com> Jervis Whitley wrote: > so things like > > ? ?test() as x == answer > > should work and (for example) > > ? 
4 * 4 as x == 16 # True Would "4 * 4 == 16 as x" be equivalent to "(4 * 4 == 16) as x" or "4 * 4 == (16 as x)"? Either way, I suspect this is dead in the water, but I guess that issue should be clarified. -- Carl From jervisau at gmail.com Sun Mar 15 03:13:54 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 13:13:54 +1100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <3bdda690903141857n34f517b0r537f0cca668006b7@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> <3bdda690903141857n34f517b0r537f0cca668006b7@mail.gmail.com> Message-ID: <8e63a5ce0903141913o6374acf3h6fa37c67537c225a@mail.gmail.com> > > Would "4 * 4 == 16 as x" be equivalent to "(4 * 4 == 16) as x" or "4 * > 4 == (16 as x)"? > > Either way, I suspect this is dead in the water, but I guess that > issue should be clarified. > This is one of the matters for discussion here. I much prefer the latter; that is, the assignment expression has precedence below that of BoolOp but above BinOp. Cheers. Try running the patch; if nothing else, kicking the tires a bit is fun (note that I nominally use '->' instead of 'as' in the patch.) I know that since writing it, it has been, if nothing else, fun doing assignments in expressions. From ben+python at benfinney.id.au Sun Mar 15 03:31:21 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 15 Mar 2009 13:31:21 +1100 Subject: [Python-ideas] Draft PEP (version 0.5): Standard daemon process library References: <87wscj11fl.fsf@benfinney.id.au> <874oxzxujs.fsf@benfinney.id.au> Message-ID: <87eiwzv7nq.fsf_-_@benfinney.id.au> Georg Brandl writes: > Ben Finney schrieb: > > Howdy all, > > > > Significant changes in this release [of the draft PEP]: > > This looks like it should be submitted as a formal PEP now; that > should also ensure more interest in it, and an eventual resolution. Thanks for the support. 
Unless anyone has strong objections within the next day or so, I'll submit this as a PEP. -- \ "With Lisp or Forth, a master programmer has unlimited power | `\ and expressiveness. With Python, even a regular guy can reach | _o__) for the stars." --Raymond Hettinger | Ben Finney From guido at python.org Sun Mar 15 04:33:54 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 14 Mar 2009 20:33:54 -0700 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> Message-ID: On Sat, Mar 14, 2009 at 3:52 PM, Jervis Whitley wrote: > I read your answer as a -0.5, if it is dead in the water, let me know > we can close the Issue as a 'Wont Fix'. You're asking the wrong guy. :-) If it were up to me this would never go through. So, yes, a solid -1 from me. If anything, I'm stronger against your "as" version than against C-style "=", (a) because the latter draws more attention to the assignment (I hate side effects buried deeply) and (b) because it's more familiar. The existing "as" syntaxes have a different purpose; they are top-level only so they cannot be deeply buried. But numerically I'd still be -1 on the "=" syntax too. I'm just throwing that preference for "=" out in case a future BDFL or someone forking the language wants to do it differently. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Sun Mar 15 05:39:05 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Mar 2009 14:39:05 +1000 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> Message-ID: <49BC8669.6000004@gmail.com> Jervis Whitley wrote: >>> if (something as x) == other: >>> # can now use x. >> Interesting. I'd definitely prefer that to the C-style inline assignment >> syntax: I think it reads better, and there's less chance of the Accidental >> Assignment Instead Of Comparison trap that has plagued other languages. > > However, allowing this "something as x" syntax for assignment would cause > confusion with the "with contextmanager as x" scenario. "as" was chosen in their > case because the expr contextmanager is not assigned to x. > While I do like the "as" syntax too, I have not endorsed it for the > above reason. If you look at the current uses for 'as' it is never for direct assignment: import x as y: 'x' is not a normal expression here, it's a reference into the module namespace. The value assigned to 'y' has nothing to do with what you would get if you evaluated the expression 'x' in the current namespace. with x as y: 'y' is assigned the value of x.__enter__(), not x except x as y: 'y' is assigned the value of a raised exception that meets the criteria "isinstance(y, x)". In all three cases, while the value eventually assigned to 'y' is *related* to the value of 'x' in some way, it isn't necessarily 'x' itself that is assigned (although the with statement version can sometimes give that impression, since many __enter__() methods finish with "return self"). Proposals for implicit assignment tend to founder on one of two objections: A. 
The proposal uses existing assignment syntax ('x = y') and runs afoul of the C embedded assignment ambiguity problem (i.e. did the programmer intentionally write "if x = y:" or did they actually mean to write "if x == y:"?) B. The proposal uses different assignment syntax ('x -> y', 'y <- x', 'x as y') and runs afoul of the question of why are there two forms of assignment statement? (Since any expression can be a statement, the new embedded assignment syntax would either work as a statement as well, or else a special rule would have to be added to the compiler to say "cannot use embedded assignment expression as statement - use an assignment statement instead"). There are also a couple of more general points of confusion related to nested namespaces as far as embedded assignments go: 1. Assignments inside lambda expressions, list/dict/set comprehensions and generator expressions (all of which create their own local namespace) won't affect the current scope, but assignment in any other expression *will* affect the current scope. Just to make things even more confusing, assignments in the outermost iterator of a comprehension or genexp actually *will* affect the current scope. 2. Since global and nonlocal declarations only affect the current namespace, they're subject to the same kind of confusion as happens with local assignments: they won't affect assignments embedded inside lambda expressions, comprehensions and genexps (except for the outermost iterator for the latter two expression groups). With the way nested namespaces are set up, allowing embedded assignments would just be a recipe for long-term confusion even if it did occasionally make some algorithms fractionally easier to write down. Cheers, Nick. 
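Nick's first point, that comprehensions and genexps get a namespace of their own except for the outermost iterator, can be illustrated with ordinary loop variables on a modern Python 3 (an editorial example, not from the thread):

```python
x = 10
# The comprehension's loop variable lives in the comprehension's own
# scope, so the outer x is untouched afterwards:
squares = [x * x for x in range(5)]
assert x == 10

# By contrast, the *outermost* iterator expression is evaluated
# immediately, in the current scope, when a genexp is created:
def make_gen():
    data = [1, 2, 3]
    g = (n for n in data)
    del data  # too late: iter(data) was already captured by the genexp
    return g

assert list(make_gen()) == [1, 2, 3]
```

This is exactly the scope boundary that embedded assignments would have to cross in some cases but not others, which is the source of the confusion Nick describes.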
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 15 05:58:38 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Mar 2009 14:58:38 +1000 Subject: [Python-ideas] A real limitation of contextlib.nested() Message-ID: <49BC8AFE.8050209@gmail.com> I missed the discussion about potentially adding syntactic support for multiple context managers in the with statement, but figured I should mention a real limitation of contextlib.nested that *would* be fixed by adding dedicated syntactic support. There's a genuine semantic difference between this: with cmA(): with cmB(): # whatever and this: with nested(cmA(), cmB()): # whatever The latter is actually more accurately translated as: mgr1, mgr2 = cmA(), cmB() with mgr1: with mgr2: # whatever That is, when using nested() the later context managers are created outside the scope of the earlier context managers. So, to use Christian's example from the previous discussion: with lock: with open(infile) as fin: with open(outfile, 'w') as fout: fout.write(fin.read()) Using contextlib.nested for that would be outright broken: with nested(lock, open(infile), open(outfile, 'w')) as (_, fin, fout): fout.write(fin.read()) 1. The files are opened without acquiring the lock first 2. If an IOError is raised while opening "outfile", then "infile" doesn't get closed immediately I created issue 5491 [1] to point out that the contextlib.nested docs could do with being tweaked to make this limitation clearer. 
Dedicated syntax (such as the form that Christian proposed) would fix this problem: with lock, (open(infile) as fin), (open(outfile, 'w') as fout): fout.write(fin.read()) Of course, a custom context manager doesn't suffer any problems either: @contextmanager def synced_io(lock, infile, outfile): with lock: with open(infile) as fin: with open(outfile, 'w') as fout: yield fin, fout with synced_io(lock, infile, outfile) as (fin, fout): fout.write(fin.read()) Cheers, Nick. [1] http://bugs.python.org/issue5491 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jervisau at gmail.com Sun Mar 15 07:04:00 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 17:04:00 +1100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <49BC8669.6000004@gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> <49BC8669.6000004@gmail.com> Message-ID: <8e63a5ce0903142304o70607d3ck51f51aa3e0411e8a@mail.gmail.com> > With the way nested namespaces are set up, allowing embedded assignments > would just be a recipe for long term confusion even if it did > occasionally make some algorithms fractionally easier to write down. > > Cheers, > Nick. Agreed. I won't be taking this argument any further. Could we close issue http://bugs.python.org/issue1714448? Cheers, Jervis From greg at krypto.org Sun Mar 15 08:45:23 2009 From: greg at krypto.org (Gregory P. 
Smith) Date: Sun, 15 Mar 2009 00:45:23 -0700 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <49BC8AFE.8050209@gmail.com> References: <49BC8AFE.8050209@gmail.com> Message-ID: <52dc1c820903150045u792715cbhbc18a0c18f93bfa8@mail.gmail.com> On Sat, Mar 14, 2009 at 9:58 PM, Nick Coghlan wrote: > I missed the discussion about potentially adding syntactic support for > multiple context managers in the with statement, but figured I should > mention a real limitation of contextlib.nested that *would* be fixed by > adding dedicated syntactic support. > > There's a genuine semantic difference between this: > > with cmA(): > with cmB(): > # whatever > > and this: > > with nested(cmA(), cmB()): > # whatever > > The latter is actually more accurately translated as: > > mgr1, mgr2 = cmA(), cmB(): > with mgr1: > with mgr2: > # whatever > > That is, when using nested() the later context managers are created > outside the scope of the earlier context managers. > > So, to use Christian's example from the previous discussion: > > with lock: > with open(infile) as fin: > with open(outfile, 'w') as fout: > fout.write(fin.read()) > > Using contextlib.nested for that would be outright broken: > > with nested(lock, open(infile), open(outfile) as (_, fin, fout): > fout.write(fin.read()) > > 1. The files are opened without acquiring the lock first > 2. If an IOError is raised while opening "outfile", then "infile" > doesn't get closed immediately > > I created issue 5491 [1] to point out that the contextlib.nested docs > could do with being tweaked to make this limitation clearer. 
> > Dedicated syntax (such as the form that Christian proposed) would fix > this problem: > > with lock, (open(infile) as fin), (open(outfile, 'w') as fout): > fout.write(fin.read()) > > Of course, a custom context manager doesn't suffer any problems either: > > @contextmanager > def synced_io(lock, infile, outfile): > with lock: > with open(infile) as fin: > with open(outfile) as fout: > yield fin, fout > > with synced_io(lock, infile, outfile) as (fin, fout): > fout.write(fin.read()) fwiw, I believe you could write a version of nested that generates the above code based on the parameters but I believe it'd be disgustingly slow... @contextmanager def slow_nested(*args): code_lines = [] vars = [] code_lines.append('@contextmanager') code_lines.append('def _nested(*args):') for idx in xrange(len(args)): vars.append('c%d' % idx) code_lines.append('%swith args[%d] as %s:' % (' '*(idx+1), idx, vars[-1])) code_lines.append('%syield %s' % (' '*(len(args)+1), ','.join(vars))) code = '\n'.join(code_lines) print 'CODE:\n', code exec(code) yield _nested(*args) > > > Cheers, > Nick. > > [1] http://bugs.python.org/issue5491 > > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sun Mar 15 11:24:53 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Mar 2009 20:24:53 +1000 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <52dc1c820903150045u792715cbhbc18a0c18f93bfa8@mail.gmail.com> References: <49BC8AFE.8050209@gmail.com> <52dc1c820903150045u792715cbhbc18a0c18f93bfa8@mail.gmail.com> Message-ID: <49BCD775.1010602@gmail.com> > fwiw, I believe you could write a version of nested that generates the > above code based on the parameters but I believe it'd be disgustingly > slow... > > @contextmanager > def slow_nested(*args): > code_lines = [] > vars = [] > code_lines.append('@contextmanager') > code_lines.append('def _nested(*args):') > for idx in xrange(len(args)): > vars.append('c%d' % idx) > code_lines.append('%swith args[%d] as %s:' % (' '*(idx+1), idx, > vars[-1])) > code_lines.append('%syield %s' % (' '*(len(args)+1), ','.join(vars))) > code = '\n'.join(code_lines) > print 'CODE:\n', code > exec(code) > yield _nested(*args) Unfortunately, that doesn't help: the problem is the fact that the arguments to nested() are evaluated *before* the call to nested() itself. A version with lazily evaluated arguments (i.e. accepting zero-argument callables that create the context managers instead of accepting the context managers themselves) could do the trick, but that then becomes enough of a pain to use that it wouldn't offer much benefit over writing a dedicated context manager. Cheers, Nick. 
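The lazily-evaluated variant Nick describes, accepting zero-argument callables instead of already-constructed context managers, might be sketched like this (a hypothetical helper, not part of contextlib):

```python
from contextlib import contextmanager

@contextmanager
def lazy_nested(*factories):
    """Like contextlib.nested, but each argument is a zero-argument
    callable, so later managers are only *created* inside the scope of
    the earlier ones (fixing the ordering problem Nick points out)."""
    if not factories:
        yield ()
    else:
        first, rest = factories[0], factories[1:]
        with first() as value:  # constructed and entered lazily
            with lazy_nested(*rest) as values:
                yield (value,) + values
```

The call site then reads lazy_nested(lambda: lock, lambda: open(infile), lambda: open(outfile, 'w')), and the lambda noise is exactly the usability cost Nick alludes to. For what it's worth, the stdlib eventually addressed this family of problems with contextlib.ExitStack in Python 3.3.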
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From grosser.meister.morti at gmx.net Sun Mar 15 13:21:02 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Sun, 15 Mar 2009 13:21:02 +0100 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <49BC8AFE.8050209@gmail.com> References: <49BC8AFE.8050209@gmail.com> Message-ID: <49BCF2AE.5030701@gmx.net> Oh, you're right! +1 for adding the new syntax. Nick Coghlan wrote: > I missed the discussion about potentially adding syntactic support for > multiple context managers in the with statement, but figured I should > mention a real limitation of contextlib.nested that *would* be fixed by > adding dedicated syntactic support. > > There's a genuine semantic difference between this: > > with cmA(): > with cmB(): > # whatever > > and this: > > with nested(cmA(), cmB()): > # whatever > ... From rrr at ronadam.com Sun Mar 15 13:58:57 2009 From: rrr at ronadam.com (Ron Adam) Date: Sun, 15 Mar 2009 07:58:57 -0500 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <49BC8AFE.8050209@gmail.com> References: <49BC8AFE.8050209@gmail.com> Message-ID: <49BCFB91.9040006@ronadam.com> Nick Coghlan wrote: > Dedicated syntax (such as the form that Christian proposed) would fix > this problem: > > with lock, (open(infile) as fin), (open(outfile, 'w') as fout): > fout.write(fin.read()) Could 'and' possibly be used since it is a flow control operator in Python. 
with lock and open(infile) as fin and open(outfile, 'w') as fout: fout.write(fin.read()) Ron From aahz at pythoncraft.com Sun Mar 15 15:04:13 2009 From: aahz at pythoncraft.com (Aahz) Date: Sun, 15 Mar 2009 07:04:13 -0700 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <49BC8669.6000004@gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> <49BC8669.6000004@gmail.com> Message-ID: <20090315140413.GA26355@panix.com> On Sun, Mar 15, 2009, Nick Coghlan wrote: > > B. The proposal uses different assignment syntax ('x -> y', 'y <- x', 'x > as y') and runs afoul of the question of why are there two forms of > assignment statement? (Since any expression can be a statement, the new > embedded assignment syntax would either work as a statement as well, or > else a special rule would have to added to the compiler to say "cannot > use embedded assignment expression as statement - use an assignment > statement instead"). Just for the record, the most common different syntax suggested has historically been Pascal's ``:=`` -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Adopt A Process -- stop killing all your children! 
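The := spelling Aahz mentions is, as it happens, the one Python eventually adopted for inline assignment (the "walrus" operator of PEP 572, Python 3.8):

```python
# PEP 572 named expressions: assignment usable inside an expression.
values = [2, 8, 91, 3]
if (n := len(values)) > 3:
    result = f"{n} values"
assert result == "4 values"

# And, on the comprehension-scoping question raised in this thread,
# PEP 572 chose to bind named expressions in the *enclosing* scope:
total = [y := v * 2 for v in values][-1]
assert y == 6  # y deliberately "leaks" out of the comprehension
```

Notably, the adopted syntax is an expression-only form: `x := y` is not valid as a bare statement, which is one answer to Nick's "two forms of assignment" objection.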
From Scott.Daniels at Acm.Org Sun Mar 15 17:58:53 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 15 Mar 2009 09:58:53 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > Todays updates to: http://www.python.org/dev/peps/pep-0378/ > > * Summarize commentary to date. > * Add APOSTROPHE and non-breaking SPACE to the list of separators. > * Add more links to external references. > * Detail issues with the locale module. > * Clarify how proposal II is parsed. Still doesn't specify how to deal with digits beyond the decimal point. I don't really care what the choice is, but I do care that the choice is specified. Is the precision in digits, or is it the width of the post-decimal point field? If the latter, does a precision of 4 end with a comma or not? In particular, what should (format(9876.54321, "13,.5f"), format(9876.54321, "12,.4f")) produce? Possible "reasonable" answers: A ' 9,876.54321', ' 9,876.5432' B ' 9,876.543,21', ' 9,876.543,2' C ' 9,876.543,2', ' 9,876.543,' D ' 9,876.543,2', ' 9,876.543' I prefer B, but I can see an argument for any of the four above. 
--Scott David Daniels Scott.Daniels at Acm.Org From eric at trueblade.com Sun Mar 15 18:28:56 2009 From: eric at trueblade.com (Eric Smith) Date: Sun, 15 Mar 2009 13:28:56 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: <49BD3AD8.80905@trueblade.com> Scott David Daniels wrote: > Still doesn't specify to digits beyond the decimal point. I don't > really care what the choice is, but I do care that the choice is > specified. Is the precision in digits, or is it width of the post- > decimal point field? If the latter, does a precision of 4 end with > a comma or not? > > In particular, what should (format(9876.54321, "13,.5f"), > format(9876.54321, "12,.4f")) produce? > Possible "reasonable" answers: > A ' 9,876.54321', ' 9,876.5432' > B ' 9,876.543,21', ' 9,876.543,2' > C ' 9,876.543,2', ' 9,876.543,' > D ' 9,876.543,2', ' 9,876.543' > I prefer B, but I can see an argument for any of the four above. The C locale functions don't support grouping to the right of the decimal. I don't think I've ever seen a system that supports it. Do you have any examples? I'd say A. Eric. 
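For reference, the grouping behavior that PEP 378 ultimately shipped (Python 2.7 and 3.1) matches Eric's answer A: the ',' option groups digits only to the left of the decimal point, and the precision counts plain digits:

```python
# PEP 378 as implemented: grouping never extends past the decimal point.
assert format(9876.54321, "13,.5f") == "  9,876.54321"   # Scott's option A
assert format(9876.54321, "12,.4f") == "  9,876.5432"
assert format(1234567, ",d") == "1,234,567"
```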
From grosser.meister.morti at gmx.net Sun Mar 15 18:35:05 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Sun, 15 Mar 2009 18:35:05 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: <49BD3C49.1020904@gmx.net> Scott David Daniels wrote: > Raymond Hettinger wrote: >> Todays updates to: http://www.python.org/dev/peps/pep-0378/ >> >> * Summarize commentary to date. >> * Add APOSTROPHE and non-breaking SPACE to the list of separators. >> * Add more links to external references. >> * Detail issues with the locale module. >> * Clarify how proposal II is parsed. > Still doesn't specify to digits beyond the decimal point. I don't > really care what the choice is, but I do care that the choice is > specified. Is the precision in digits, or is it width of the post- > decimal point field? If the latter, does a precision of 4 end with > a comma or not? > > In particular, what should (format(9876.54321, "13,.5f"), > format(9876.54321, "12,.4f")) produce? > Possible "reasonable" answers: > A ' 9,876.54321', ' 9,876.5432' > B ' 9,876.543,21', ' 9,876.543,2' > C ' 9,876.543,2', ' 9,876.543,' > D ' 9,876.543,2', ' 9,876.543' > I prefer B, but I can see an argument for any of the four above. > > > --Scott David Daniels > Scott.Daniels at Acm.Org > Has anyone mentioned yet that in German you write the following? 10.000.000,000.001 (In German, ',' and '.' are swapped.) Is this aspect taken into account? How is i18n/l10n managed? 
-panzi From g.brandl at gmx.net Sun Mar 15 20:18:51 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 15 Mar 2009 20:18:51 +0100 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <49BCFB91.9040006@ronadam.com> References: <49BC8AFE.8050209@gmail.com> <49BCFB91.9040006@ronadam.com> Message-ID: Ron Adam schrieb: > > Nick Coghlan wrote: > >> Dedicated syntax (such as the form that Christian proposed) would fix >> this problem: >> >> with lock, (open(infile) as fin), (open(outfile, 'w') as fout): >> fout.write(fin.read()) > > Could 'and' possibly be used sense it is a flow control operator in python. > > with lock > and open(infile) as fin > and open(outfile, 'w' as fout: > fout.write(fin.read()) But it isn't a control flow operator. It is a boolean operator, and since "with" expressions are expressions, it's perfectly valid there. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From Scott.Daniels at Acm.Org Sun Mar 15 21:41:38 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 15 Mar 2009 13:41:38 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BD3AD8.80905@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> <49BD3AD8.80905@trueblade.com> Message-ID: Eric Smith wrote: > Scott David Daniels wrote: >> Still doesn't specify [how to deal with] digits beyond the decimal point.... 
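On the l10n question: PEP 378's ',' is deliberately locale-independent (it always emits COMMA), while locale-aware grouping remains the job of the pre-existing 'n' format type and the locale module. A quick contrast under the portable C locale, which defines no grouping at all:

```python
import locale

# The C locale is the portable baseline; real l10n would set e.g. "de_DE",
# but which locale names exist is platform-dependent.
locale.setlocale(locale.LC_ALL, "C")

assert format(1234567, ",d") == "1,234,567"  # "," ignores the locale
assert format(1234567, "n") == "1234567"     # "n" consults it (no grouping in C)
```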
>> what should (format(9876.54321, "13,.5f"), format(9876.54321, "12,.4f")) produce? >> A ' 9,876.54321', ' 9,876.5432' >> B ' 9,876.543,21', ' 9,876.543,2' >> C ' 9,876.543,2', ' 9,876.543,' >> D ' 9,876.543,2', ' 9,876.543' >> I prefer B, but I can see an argument for any of the four above. > > The C locale functions don't support grouping to the right of the > decimal. I don't think I've ever seen a system that supports it. Do you > have any examples? I've only used separators to check digits below the decimal point. Most high-precision tables of constants that I've seen use 5-digit grouping (e.g. wikipedia for pi): 3.14159 26535 89793 23846 26433 83279 50288 41971 69399 37510 But 3 on the left and 5 on the right really seems to be too much. > I'd say A. For me, A and B are the "preferable" solutions; I just think the PEP needs to say what it chooses. --Scott David Daniels Scott.Daniels at Acm.Org From greg.ewing at canterbury.ac.nz Sun Mar 15 22:33:44 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Mar 2009 09:33:44 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> <49BA34BD.6070005@canterbury.ac.nz> Message-ID: <49BD7438.6090006@canterbury.ac.nz> Antoine Pitrou wrote: > Greg Ewing writes: > >>If that's an acceptable thing to do on a daily basis, >>then we don't need format strings at all. Because you can do all your formatting by calling a function to format each number and then concatenating the results with whatever other text you want. You can do that now, but someone invented format strings, so they must have wanted a more convenient way of going about it. 
-- Greg From rasky at develer.com Sun Mar 15 23:20:56 2009 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 15 Mar 2009 22:20:56 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: On Thu, 12 Mar 2009 00:49:24 -0700, Raymond Hettinger wrote: > [spir] >> Probably you know that already, but it doesn't hurt anyway. In french >> and most rroman languages comma is the standard decimal sep; and either >> space or dot is used, when necessary, to sep thousands. (It's veeery >> difficult for me to read even short numbers with commas used as >> thousand separator.) >> >> en: 1,234,567.89 >> fr: 1.234.567,89 >> or: 1 234 567,89 I'll note that the international standard is to use just space: http://www.bipm.org/jsp/en/ViewCGPMResolution.jsp?CGPM=22&RES=10 reaffirms that "Numbers may be divided in groups of three in order to facilitate reading; neither dots nor commas are ever inserted in the spaces between groups", as stated in Resolution 7 of the 9th CGPM, 1948. In Italian, we use a character which is not available on the keyboard, nor can I find it in a Unicode map, so let's ignore it :) On computers, we usually simply put a period between thousands. -- Giovanni Bajo Develer S.r.l. 
http://www.develer.com From pyideas at rebertia.com Sun Mar 15 23:34:54 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 15 Mar 2009 15:34:54 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> On Sun, Mar 15, 2009 at 3:20 PM, Giovanni Bajo wrote: > On Thu, 12 Mar 2009 00:49:24 -0700, Raymond Hettinger wrote: > >> [spir] >>> Probably you know that already, but it doesn't hurt anyway. In french >>> and most rroman languages comma is the standard decimal sep; and either >>> space or dot is used, when necessary, to sep thousands. (It's veeery >>> difficult for me to read even short numbers with commas used as >>> thousand separator.) >>> >>> en: 1,234,567.89 >>> fr: 1.234.567,89 >>> or: 1 234 567,89 > > I'll notice that the international standard is to use just space: > > http://www.bipm.org/jsp/en/ViewCGPMResolution.jsp?CGPM=22&RES=10 Of course, that's primarily a /scientific/ standard; others have explained that commas are apparently the international /financial/ standard. "Aren't standards great? There's so many to choose from!" This thread continues to get more complicated by the day... 
(Localization doth be *hard*) Cheers, Chris -- I have a blog: http://blog.rebertia.com From greg.ewing at canterbury.ac.nz Sun Mar 15 23:37:25 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Mar 2009 10:37:25 +1200 Subject: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1> <49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> <49BB2981.5070008@gmail.com> <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> Message-ID: <49BD8325.7060404@canterbury.ac.nz> Raymond Hettinger wrote: > 1. Why mark a non-locale aware form with a flag that indicates > locale awareness in another language. > 2. It seems to be basic bad design to require an apostrophe > to emit commas. Okay, so how about: comma - always use a comma apostrophe - use the locale And for the decimal point: dot - always use a dot semicolon - use the locale -- Greg From greg.ewing at canterbury.ac.nz Sun Mar 15 23:58:28 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Mar 2009 10:58:28 +1200 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <20090315140413.GA26355@panix.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> <49BC8669.6000004@gmail.com> <20090315140413.GA26355@panix.com> Message-ID: <49BD8814.7050406@canterbury.ac.nz> Aahz wrote: > Just for the record, the most common different syntax suggested has > historically been Pascal's ``:=`` I'd rather reserve that for a possible future "in-place assignment" operator, though. 
-- Greg From greg.ewing at canterbury.ac.nz Mon Mar 16 00:01:50 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Mar 2009 11:01:50 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: <49BD88DE.20807@canterbury.ac.nz> Scott David Daniels wrote: > B ' 9,876.543,21', ' 9,876.543,2' > C ' 9,876.543,2', ' 9,876.543,' > D ' 9,876.543,2', ' 9,876.543' What??? On the planet I come from, nobody uses separators for digits *after* the decimal point, unless perhaps if they're spaces. Certainly never commas. -- Greg From solipsis at pitrou.net Mon Mar 16 01:10:51 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 16 Mar 2009 00:10:51 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> <49BA34BD.6070005@canterbury.ac.nz> <49BD7438.6090006@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > You can do that now, but someone invented format > strings, so they must have wanted a more convenient way > of going about it. I don't see how that contradicts what I said and you don't seem eager to produce understandable explanations, so I'll leave it there. 
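The behavior Greg expects is option A, and it is what PEP 378 ultimately specified: grouping applies only to the left of the decimal point. Assuming a Python recent enough to support the ',' specifier (2.7/3.1 or later), Scott's two test cases come out as:

```python
# Option A: the ',' groups only the integer part; digits after the
# decimal point are never separated.
print(repr(format(9876.54321, "13,.5f")))  # '  9,876.54321'
print(repr(format(9876.54321, "12,.4f")))  # '  9,876.5432'
```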
From python at rcn.com Mon Mar 16 07:05:42 2009 From: python at rcn.com (Raymond Hettinger) Date: Sun, 15 Mar 2009 23:05:42 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o><09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: > Raymond Hettinger wrote: >> * Summarize commentary to date. >> * Add APOSTROPHE and non-breaking SPACE to the list of separators. >> * Add more links to external references. >> * Detail issues with the locale module. >> * Clarify how proposal II is parsed. [Scott David Daniels] > Still doesn't specify to digits beyond the decimal point. Will clarify that the intent is to put thousands separators only to the left of the decimal point. > In particular, what should (format(9876.54321, "13,.5f"), > format(9876.54321, "12,.4f")) produce? > Possible "reasonable" answers: > A ' 9,876.54321', ' 9,876.5432' > B ' 9,876.543,21', ' 9,876.543,2' > C ' 9,876.543,2', ' 9,876.543,' > D ' 9,876.543,2', ' 9,876.543' > I prefer B, but I can see an argument for any of the four above. Am proposing A. That matches the existing precedent in the locale module: >>> locale.setlocale(locale.LC_ALL, 'English_United States.1252') 'English_United States.1252' >>> locale.format("%15.8f", pi*1000, grouping=True) ' 3,141.59265359' It also matches what my adding machines have done, what my HP calculator does, how Excel handles thousands grouping, and the other examples cited in the PEP. Am thinking that anything else would be a new, made-up requirement. The closest I've seen to this is grouping of digits in long sequences of pi and in logarithm tables. It may be useful to someone somewhere, but am not going to propose it for the PEP.
Raymond From python at rcn.com Mon Mar 16 08:01:32 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 00:01:32 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> <49BD3C49.1020904@gmx.net> Message-ID: [Mathias Panzenböck] > Has anyone mentioned yet that in German you write the following? > 10.000.000,000.001 These are all red herrings. The proposal is not about internationalization and it says as much. There is no doubt that everyone and his brother can think up a different convention for writing down numbers. The PEP proposes a non-localized way to specify one of several separators to group thousands to the left of the decimal point. At least one way (spaces or underscores) should be readable, understandable, and useful to folks from many diverse backgrounds. It is not the intention to be able to reproduce everything that a person can think up. That would be a fool's errand. Raymond From python at rcn.com Mon Mar 16 08:27:03 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 00:27:03 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> <49BD3AD8.80905@trueblade.com> Message-ID: [Scott David Daniels] >> I'd say A.
> For me, A and B are the "preferable" solutions; I just think > the PEP needs to say what it chooses. Thanks. I've updated the PEP to say A explicitly. Raymond From python at rcn.com Mon Mar 16 08:31:19 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 00:31:19 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for athousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> Message-ID: [Chris Rebert] > This thread continues to get more complicated by the day... > (Localization doth be *hard*) Good thing the PEP is *not* about localization :-) It does not attempt to cater to every possible way to write numbers. Instead, it offers a handful of choices for thousands groupings. At least one of those choices (perhaps spaces or underscores) should be readable and useful in many (though not all) contexts. Raymond From ncoghlan at gmail.com Mon Mar 16 13:03:41 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 16 Mar 2009 22:03:41 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for athousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> Message-ID: <49BE401D.5050908@gmail.com> Raymond Hettinger wrote: > > [Chris Rebert] >> This thread continues to get more complicated by the day... >> (Localization doth be *hard*) > > Good thing the PEP is *not* about localization :-) > It does not attempt to cater to every possible way to write numbers. > Instead, it offers a handful of choices for thousands groupings. > At least one of those choices (perhaps spaces or underscores) > should be readable and useful in many (though not all) contexts. 
Emphatically agreed that this PEP shouldn't be targeted at end-user output for a commercial product. There are plenty of good solutions for that already in the l10n/i18n space. What is currently missing (and what the PEP will provide) is the ability to easily output more readable comparatively large integers for debugging output or quick and dirty "internal" scripts that are not intended for wide distribution. Having had my eyes glaze over attempting to decipher overly long integers in debugging output, I look forward to the day when I no longer have to write my own formatting functions to deal with that (even if the time when I can use 2.7 or 3.1 day to day is still somewhere in the dim distant future...) On a completely different topic, I noticed that the PEP doesn't currently state what the thousands separator means for bases other than 10 (i.e. octal, hex, binary). Is it ignored? Always delineates groups of 3 digits as for decimal numbers? Delineates an "appropriate" group size (e.g. 3 for octal, 4 for hex and binary)? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From denis.spir at free.fr Mon Mar 16 15:10:36 2009 From: denis.spir at free.fr (spir) Date: Mon, 16 Mar 2009 15:10:36 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> Message-ID: <20090316151036.67755239@o> Le Sun, 15 Mar 2009 15:34:54 -0700, Chris Rebert s'exprima ainsi: > commas are apparently the international /financial/ > standard Certainly not! I am very surprised to read that. Do you mean the standard in english-speaking countries? Or in countries which currency is $ or ?? 
see http://en.wikipedia.org/wiki/Decimal_point#Examples_of_use Denis ------ la vita e estrany From eric at trueblade.com Mon Mar 16 15:23:25 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 10:23:25 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers Message-ID: <49BE60DD.7010603@trueblade.com> I'd like to add a function (or method) to parse str.format()'s standard mini-language format specifiers. It's hard to get right, and if PEP 378 is accepted, it gets more complex. The primary use case is for non-builtin numeric types that want to add __format__, and want it to support the same mini-language that the built in types support. For example see issue 2110, where Mark Dickinson implements his own version for Decimal, and suggests it be moved elsewhere. This function exists in Objects/stringlib/formatter.h, and will just need to be exposed to Python code. I propose a function that takes a single str (or unicode) and returns a named tuple with the appropriate values filled in. So, is such a function desirable, and if so, where would it go? I could expose it through the string module, which is where the sort-of-related Formatter class lives. It could be a method on str and unicode, but I'm not sure that's most appropriate. Eric. 
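To make Eric's proposal concrete, a parser for the mini-language could be sketched in pure Python. This is a hypothetical sketch (the names `FormatSpec` and `parse_format_spec` are invented, and the real code lives in Objects/stringlib/formatter.h), following the documented grammar `[[fill]align][sign][#][0][width][,][.precision][type]` with PEP 378's ',' option included:

```python
import re
from collections import namedtuple

# Hypothetical result type for the proposed parsing function.
FormatSpec = namedtuple(
    "FormatSpec",
    "fill align sign alternate zero width comma precision type")

# Grammar: [[fill]align][sign][#][0][width][,][.precision][type]
_SPEC_RE = re.compile(r"""
    (?:(?P<fill>.)?(?P<align>[<>=^]))?
    (?P<sign>[-+\ ])?
    (?P<alternate>\#)?
    (?P<zero>0)?
    (?P<width>\d+)?
    (?P<comma>,)?
    (?:\.(?P<precision>\d+))?
    (?P<type>[bcdeEfFgGnosxX%])?
    $""", re.VERBOSE | re.DOTALL)

def parse_format_spec(spec):
    """Parse a standard format specifier into a FormatSpec tuple."""
    m = _SPEC_RE.match(spec)
    if m is None:
        raise ValueError("invalid format specifier: %r" % spec)
    return FormatSpec(**m.groupdict())

print(parse_format_spec("13,.5f"))  # width='13', comma=',', precision='5', type='f'
print(parse_format_spec("*<10d"))   # fill='*', align='<', width='10', type='d'
```

Fields that are absent come back as None, which is one argument for the named-tuple return value Eric suggests: every field has a fixed slot whether or not it appeared in the spec.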
From pyideas at rebertia.com Mon Mar 16 15:55:23 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 16 Mar 2009 07:55:23 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090316151036.67755239@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> <20090316151036.67755239@o> Message-ID: <50697b2c0903160755r2cc830b9p222fa5748942ea5@mail.gmail.com> On Mon, Mar 16, 2009 at 7:10 AM, spir wrote: > Le Sun, 15 Mar 2009 15:34:54 -0700, > Chris Rebert s'exprima ainsi: > >> commas are apparently the international /financial/ >> standard > > Certainly not! I am very surprised to read that. Do you mean the standard in english-speaking countries? Or in countries which currency is $ or ?? > > see http://en.wikipedia.org/wiki/Decimal_point#Examples_of_use Yes, I know it varies from nation to nation, but apparently less so when specifically working internationally (like you, I was surprised there was a standard at all); see earlier response by Raymond Hettinger in the parallel c.l.p thread. Relevant quote: """ I'm a CPA, was a 15 year division controller for a Fortune 500 company, and an auditor for an international accounting firm. Believe me when I say it is the norm in finance. """ "It" referring to period-as-decimal-point and comma-as-thousands-separator notation. Cheers, Chris -- I have a blog: http://blog.rebertia.com From eric at trueblade.com Mon Mar 16 16:15:06 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 11:15:06 -0400 Subject: [Python-ideas] Add a function to parse str.format() mini-language specifiers In-Reply-To: <49BE60DD.7010603@trueblade.com> References: <49BE60DD.7010603@trueblade.com> Message-ID: <49BE6CFA.9010808@trueblade.com> The subject shouldn't have said "Added". It's not a done deal! Eric. 
Eric Smith wrote: > I'd like to add a function (or method) to parse str.format()'s standard > mini-language format specifiers. It's hard to get right, and if PEP 378 > is accepted, it gets more complex. > > The primary use case is for non-builtin numeric types that want to add > __format__, and want it to support the same mini-language that the built > in types support. For example see issue 2110, where Mark Dickinson > implements his own version for Decimal, and suggests it be moved elsewhere. > > This function exists in Objects/stringlib/formatter.h, and will just > need to be exposed to Python code. I propose a function that takes a > single str (or unicode) and returns a named tuple with the appropriate > values filled in. > > So, is such a function desirable, and if so, where would it go? I could > expose it through the string module, which is where the sort-of-related > Formatter class lives. > > It could be a method on str and unicode, but I'm not sure that's most > appropriate. > > Eric. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From dickinsm at gmail.com Mon Mar 16 17:26:10 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Mon, 16 Mar 2009 16:26:10 +0000 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BE60DD.7010603@trueblade.com> References: <49BE60DD.7010603@trueblade.com> Message-ID: <5c6f2a5d0903160926t629bff1eif9444fec547e8a78@mail.gmail.com> On Mon, Mar 16, 2009 at 2:23 PM, Eric Smith wrote: > I'd like to add a function (or method) to parse str.format()'s standard > mini-language format specifiers. +1 from me. (Of course. :-) Once the 'n' format code goes in, decimal.py will contain over 200 lines of Python code that really has very little to do with the decimal module at all. 
I'd like to see that code move somewhere else, partly out of a desire to unclutter the decimal module, and partly to make it easier to cope with changes and new features in the formatting mini-language. Out of curiosity, does anyone know of any numeric types (other than Decimal) that might benefit from this? Something like the '_format_align' function from decimal.py might also be of general use: it just does the job of padding and aligning a numeric string (as well as dealing with the sign). > be exposed to Python code. I propose a function that takes a single str (or > unicode) and returns a named tuple with the appropriate values filled in. Are there advantages to using a named tuple instead of a dict? If there's a possibility that some fields may or may not be defined depending on the value of other fields, then a dict may make more sense. (Not sure whether this can happen with the mini-language in its current form.) > So, is such a function desirable, and if so, where would it go? Yes, and don't know! > It could be a method on str and unicode, but I'm not sure that's most > appropriate. Doesn't seem right to me, either. Mark From eric at trueblade.com Mon Mar 16 17:31:51 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 12:31:51 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <5c6f2a5d0903160926t629bff1eif9444fec547e8a78@mail.gmail.com> References: <49BE60DD.7010603@trueblade.com> <5c6f2a5d0903160926t629bff1eif9444fec547e8a78@mail.gmail.com> Message-ID: <49BE7EF7.6090904@trueblade.com> Mark Dickinson wrote: > Something like the '_format_align' function from decimal.py > might also be of general use: it just does the job of padding > and aligning a numeric string (as well as dealing with the sign). Standby. That's next on my list of proposals. 
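As a sketch of the kind of shared helper Mark describes — hypothetical code in the spirit of decimal.py's `_format_align`, not the actual implementation — the padding/alignment step might look like:

```python
def align_numeric(sign, body, fill=" ", align=">", width=0):
    # Hypothetical helper: 'sign' is '', '-' or '+'; 'body' is the
    # already-formatted digit string. Pads to 'width' per the parsed
    # fill/align fields of a format specifier.
    text = sign + body
    padding = fill * max(0, width - len(text))
    if align == "<":
        return text + padding
    if align == ">":
        return padding + text
    if align == "^":
        half = len(padding) // 2
        return padding[:half] + text + padding[half:]
    if align == "=":  # pad between sign and digits, as a '0' fill does
        return sign + padding + body
    raise ValueError("unknown alignment: %r" % align)

print(align_numeric("-", "9876.54", fill="0", align="=", width=10))  # -009876.54
```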
From hwpuschm at yahoo.de Mon Mar 16 17:37:59 2009 From: hwpuschm at yahoo.de (Heinrich W Puschmann) Date: Mon, 16 Mar 2009 16:37:59 +0000 (GMT) Subject: [Python-ideas] Keyword same in right hand side of assignments Message-ID: <239732.95222.qm@web25802.mail.ukl.yahoo.com> Python-Ideas: Keyword same in right hand side of assignments ------------------------------------------------------------------------------------------- It is proposed to introduce a keyword "same", to be used in the right hand side of assignments, as follows: "xx = same + 5" or "xx = 5 + same" synonymous with "xx += 5" "value = 2*same + 5" synonymous with "value *= 2; value += 5" "switch = 1 - same" synonymous with "switch *= -1; switch += 1" "lst = same + [5,6]" synonymous with "lst += [5,6]" "lst = [5,6] + same" synonymous with "lst = [5,6] + lst" "lst[2] = 1/same" synonymous with "lst[2] **= -1" and so on. IN GENERALITY, the effect of the keyword same would be the following: The keyword same should only appear in the right hand side expression of an assignment, any number of times and wherever a name can appear. The left hand side of the assignment would have to be bound to some object. The keyword same is substituted by a name, which is bound to the object binding the left hand side of the assignment. The expression at the right hand side is evaluated. The left hand side identifier is bound to the result of the expression. Since I am not a developer, I have no idea of how difficult or how easy it would be to implement such a feature. As a programmer, however, I believe that it improves readability and user friendliness, and that it fully adjusts to the Python philosophy. It is very hard for me to discern whether a similar idea has already been proposed by somebody else. I did not yet have any useful application for statements like "xx = xx.do(*args)" or "xx = yy.do(xx,*args)" or "xx = xx.do(xx,*args)"
but as a matter of generalization, they should also allow substitution of the right hand side appearances of xx by the keyword same. Heinrich Puschmann, Ulm From python at rcn.com Mon Mar 16 17:42:52 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 09:42:52 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><20090312083401.33cc525b@o><50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com><20090316151036.67755239@o> <50697b2c0903160755r2cc830b9p222fa5748942ea5@mail.gmail.com> Message-ID: <49380553492E4209BB05DA4149DD3C8F@RaymondLaptop1> > "It" referring to period-as-decimal-point and > comma-as-thousands-separator notation. Guys, I just meant that grouping of thousands is common in finance. The actual grouping separator varies. Raymond From eric at trueblade.com Mon Mar 16 17:51:24 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 12:51:24 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Message-ID: <49BE838C.9000906@trueblade.com> I vote we move ahead with Proposal II from PEP 378. I don't think there's anything else to add to the discussion. Eric.
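For readers following along: Proposal II places the ',' immediately before the precision in the standard format specifier. Assuming a Python that shipped PEP 378 (2.7/3.1 or later), it reads like this:

```python
# Proposal II of PEP 378: a ',' in the format spec groups thousands.
print(format(1234567, ",d"))    # 1,234,567
print("{:,}".format(1234567))   # 1,234,567
print(format(1234567, "15,d"))  # '      1,234,567' (width still applies)
```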
From pyideas at rebertia.com Mon Mar 16 18:24:40 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 16 Mar 2009 10:24:40 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49380553492E4209BB05DA4149DD3C8F@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> <20090316151036.67755239@o> <50697b2c0903160755r2cc830b9p222fa5748942ea5@mail.gmail.com> <49380553492E4209BB05DA4149DD3C8F@RaymondLaptop1> Message-ID: <50697b2c0903161024p72a8d1c9hb4e052242cf8aec8@mail.gmail.com> On Mon, Mar 16, 2009 at 9:42 AM, Raymond Hettinger wrote: > >> "It" referring to period-as-decimal-point and >> comma-as-thousands-separator notation. > > Guys, I just meant that grouping of thousands is common in finance. > The actual grouping separator varies. Ah, evidently I misinterpreted. Apologies. Cheers, Chris From zac256 at gmail.com Mon Mar 16 18:25:00 2009 From: zac256 at gmail.com (Zac Burns) Date: Mon, 16 Mar 2009 10:25:00 -0700 Subject: [Python-ideas] PEP links in docs Message-ID: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> I would like to see links to relevant PEP / mailing list docs in the main docs. For example http://docs.python.org/library/dircache.html after "Deprecated since version 2.6" could include a (PEP x.x) link where I could read why it was deprecated and probably what to use in its place. The docs right now are quite excellent describing the "what" about everything, but often have little to say about the "why". This is a good thing, but links would surely help those that want to learn more. -- Zachary Burns (407)590-4814 Aim - Zac256FL Production Engineer (Digital Overlord) Zindagi Games From rdmurray at bitdance.com Mon Mar 16 17:48:50 2009 From: rdmurray at bitdance.com (R.
David Murray) Date: Mon, 16 Mar 2009 16:48:50 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> <20090316151036.67755239@o> <50697b2c0903160755r2cc830b9p222fa5748942ea5@mail.gmail.com> Message-ID: Chris Rebert wrote: > On Mon, Mar 16, 2009 at 7:10 AM, spir wrote: > > Le Sun, 15 Mar 2009 15:34:54 -0700, > > Chris Rebert s'exprima ainsi: > > > >> commas are apparently the international /financial/ > >> standard > > > > Certainly not! I am very surprised to read that. Do you mean the standard in english-speaking countries? Or in countries which currency is $ or ?? > > > > see http://en.wikipedia.org/wiki/Decimal_point#Examples_of_use > > Yes, I know it varies from nation to nation, but apparently less so > when specifically working internationally (like you, I was surprised > there was a standard at all); see earlier response by Raymond > Hettinger in the parallel c.l.p thread. Relevant quote: > > """ > I'm a CPA, was a 15 year division controller > for a Fortune 500 company, and an auditor for an international > accounting firm. Believe me when I say it is the norm in finance. > """ > > "It" referring to period-as-decimal-point and > comma-as-thousands-separator notation. Regardless of any standards, I find it interesting that I just now ran into exactly the use case that prompted Raymond to propose this addition. I need to format a one-off report for a client, and it would be _most_ helpful if I could easily tell Python to format the numbers, which are reporting bytes transmitted, with comma thousands separators for clarity. I guess that means I'm +1 for some form of this making it through. -- R. David Murray http://www.bitdance.com From rdmurray at bitdance.com Mon Mar 16 18:57:14 2009 From: rdmurray at bitdance.com (R. 
David Murray) Date: Mon, 16 Mar 2009 17:57:14 +0000 (UTC) Subject: [Python-ideas] Keyword same in right hand side of assignments References: <239732.95222.qm@web25802.mail.ukl.yahoo.com> Message-ID: Heinrich W Puschmann wrote: > > Python-Ideas: Keyword same in right hand side of assignments > ------------------------------------------------------------------------------------------- > > It is proposed to introduce a Keyword "same", > to be used in the right hand side of assignments, as follows: > ? > ? "xx = same + 5" or "xx = 5 + same"? synonymous with? "xx += 5" > ? "value =? 2*same + 5"? synonymous with "value =*2; value +=5" > ? "switch = 1 - same"? synonymous with "switch *-1; switch +=1" > ? "lst = same + [5,6]"? synonymous with? "lst += [5,6]" > ? "lst = [5,6] + same" synonymous with? "lst = [5,6] + lst" > ? "lst[2] = 1/same" synonymous with? "lst[2] **=-1" > ? > and so on. > ? > ? > ? > IN GENERALITY, the effect of keyword same would be the following: > > ? > The keyword same should only appear in > the right hand side expression of an assignment, > any number of times and wherever a name can appear. > The left hand side of the assignment would have to be bound to some object. > ? > The keyword same is substituted by a name, > which is bound to the object binding the left hand side of the assignment. > The? expression at the right hand side is evaluated. > The left hand side identifyer is bound to the result of the expression. Given your definition, this: "lst = same + [5,6]"? synonymous with? "lst += [5,6]" is not true. By your definition (and a programmer's naive expectation based on other python semantics), lst = same + [5,6] translates to lst = lst + [5,6]. But: >>> old = lst = [1] >>> lst = lst + [5,6] >>> old, lst ([1], [1, 5, 6]) >>> old is lst False While: >>> old = lst = [1] >>> lst += [5,6] >>> old, lst ([1, 5, 6], [1, 5, 6]) >>> old is lst True So the two are _not_ synonymous. That point aside, I do not see the utility of this feature. 
To me, it means that my eye would have to scan backward to the start of the line to find out what 'same' was, whereas in the current formulation the answer is right under my eyes. Code is more about reading than it is about writing, since reading code is something done much more often than writing it, so I'd rather keep it easier to read. Personally I wouldn't find typing 'same' any easier than typing the variable, anyway. -- R. David Murray http://www.bitdance.com From steve at pearwood.info Mon Mar 16 19:18:35 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 17 Mar 2009 05:18:35 +1100 Subject: [Python-ideas] Keyword same in right hand side of assignments In-Reply-To: <239732.95222.qm@web25802.mail.ukl.yahoo.com> References: <239732.95222.qm@web25802.mail.ukl.yahoo.com> Message-ID: <200903170518.36011.steve@pearwood.info> On Tue, 17 Mar 2009 03:37:59 am Heinrich W Puschmann wrote: > Python-Ideas: Keyword same in right hand side of assignments > --------------------------------------------------------------------- >---------------------- > > It is proposed to introduce a keyword "same", > to be used in the right hand side of assignments, as follows: > > "xx = same + 5" or "xx = 5 + same" synonymous with "xx += 5" > "value = 2*same + 5" synonymous with "value *= 2; value += 5" > "switch = 1 - same" synonymous with "switch *= -1; switch += 1" > "lst = same + [5,6]" synonymous with "lst += [5,6]" > "lst = [5,6] + same" synonymous with "lst = [5,6] + lst" > "lst[2] = 1/same" synonymous with "lst[2] **= -1" > > and so on. What's the point? Why not just write the following? xx = xx + 5 value = 2*value + 5 switch = 1 - switch lst = lst + [5,6] lst = [5,6] + lst lst[2] = 1/lst[2] What value do we gain by creating a new keyword that obscures what the assignment does?
-- Steven D'Aprano From denis.spir at free.fr Mon Mar 16 19:59:40 2009 From: denis.spir at free.fr (spir) Date: Mon, 16 Mar 2009 19:59:40 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BE838C.9000906@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> Message-ID: <20090316195940.6e4e1061@o> Le Mon, 16 Mar 2009 12:51:24 -0400, Eric Smith s'exprima ainsi: > I vote we move ahead with Proposal II from PEP 378. I don't think > there's anything else to add to the discussion. > > Eric. Agree. denis ------ la vita e estrany From denis.spir at free.fr Mon Mar 16 19:59:09 2009 From: denis.spir at free.fr (spir) Date: Mon, 16 Mar 2009 19:59:09 +0100 Subject: [Python-ideas] PEP links in docs In-Reply-To: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> References: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> Message-ID: <20090316195909.6c81f5e6@o> Le Mon, 16 Mar 2009 10:25:00 -0700, Zac Burns s'exprima ainsi: > I would like to see links to relevant PEP / mailing list docs in the main > docs. +++ PEPs -- with purpose and rationale, and deeply reviewed -- are most often great. They are actually often more legible even for non-specialists, simply because they tell you why. If one doesn't understand the PEP's introduction, then yes, probably this person does not need the feature ;-)
start answering the infamous "why?"; meaning the purpose, issue,... I really do not agree with the "if you don't know why, you don't need it" (elitist) argument. Denis ------ la vita e estrany From python at rcn.com Mon Mar 16 20:19:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 12:19:24 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: Guido, The conversation on the thousands separator seems to have wound down and the PEP has stabilized: http://www.python.org/dev/peps/pep-0378/ Please pronounce. Raymond ----- Original Message ----- From: "spir" To: Sent: Monday, March 16, 2009 11:59 AM Subject: Re: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) > Le Mon, 16 Mar 2009 12:51:24 -0400, > Eric Smith s'exprima ainsi: > >> I vote we move ahead with Proposal II from PEP 378. I don't think >> there's anything else to add to the discussion. >> >> Eric. > > Agree. > > denis > ------ > la vita e estrany > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From sligocki at gmail.com Mon Mar 16 21:02:05 2009 From: sligocki at gmail.com (Shawn Ligocki) Date: Mon, 16 Mar 2009 13:02:05 -0700 Subject: [Python-ideas] PEP links in docs In-Reply-To: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> References: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> Message-ID: <2f680a9a0903161302v3a00494dhe912c402372346a4@mail.gmail.com> +2, Sign me up. The "why" is the most annoyingly avoided question in teaching anything. If someone leads this, I'll help contribute! 
Cheers, Shawn Ligocki sligocki at gmail.com On Mon, Mar 16, 2009 at 10:25 AM, Zac Burns wrote: > I would like to see links to relevant PEP / mailing list docs in the main > docs. > > For example http://docs.python.org/library/dircache.html after > "Deprecated since version 2.6" could include a (PEP x.x) link where I > could read why it was deprecated and probably what to use in it's > place. > > The docs right now are quite excellent describing the "what" about > everything, but often have little to say about the "why". This is a > good thing, but links would surely help those that want to learn more. > > -- > Zachary Burns > (407)590-4814 > Aim - Zac256FL > Production Engineer (Digital Overlord) > Zindagi Games > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Mon Mar 16 21:08:33 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 16:08:33 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BE60DD.7010603@trueblade.com> References: <49BE60DD.7010603@trueblade.com> Message-ID: Eric Smith wrote: > I'd like to add a function (or method) to parse str.format()'s standard > mini-language format specifiers. It's hard to get right, and if PEP 378 > is accepted, it gets more complex. > > The primary use case is for non-builtin numeric types that want to add > __format__, and want it to support the same mini-language that the built > in types support. For example see issue 2110, where Mark Dickinson > implements his own version for Decimal, and suggests it be moved elsewhere. > > This function exists in Objects/stringlib/formatter.h, and will just > need to be exposed to Python code. 
I propose a function that takes a > single str (or unicode) and returns a named tuple with the appropriate > values filled in. > > So, is such a function desirable, and if so, Yes, but I would take it further and consider the string and dict/named-tuple as alternate interfaces to the formatting machinery. So I would a) add an inverse function that would take a dict or named tuple and produce the field specifier as a string (or raise ValueError). Such a string could be embedded into a complete format string. Some people might prefer this specification method. b) amend built-in format() to take a dict/n-t as the second argument on the basis that it is silly to transform the parse result back into a string just to be parsed again. This would make repeated calls to format faster by eliminating the parsing step. > where would it go? I could expose it through the string module, which is where the sort-of-related > Formatter class lives. That seems the most obvious place. > > It could be a method on str and unicode, but I'm not sure that's most > appropriate. Terry Jan Reedy From eric at trueblade.com Mon Mar 16 21:31:20 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 16:31:20 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: References: <49BE60DD.7010603@trueblade.com> Message-ID: <49BEB718.6020708@trueblade.com> Terry Reedy wrote: > Eric Smith wrote: >> I'd like to add a function (or method) to parse str.format()'s >> standard mini-language format specifiers. It's hard to get right, and >> if PEP 378 is accepted, it gets more complex. ... >> So, is such a function desirable, and if so, > > Yes, but I would take it further and consider the string and > dict/named-tuple as alternate interfaces to the formatting machinery. So > I would If the only use case for this is for non-builtin numeric types, I'd vote for a named tuple.
But since Mark (who's one of the primary users) also raised the dict issue, I'll give it some thought. > a) add an inverse function that would take a dict or named tuple and > produce the field specifier as a string (or raise ValueError). Such a > string could be embedded into a complete format string. Some people > might prefer this specification method. This is a pretty simple transformation. I'm not so sure it's all that useful. > b) amend built-in format() to take a dict/n-t as the second argument on > the basis that it is silly to transform the parse result back into a > string just to be parsed again. This would make repeated calls to > format faster by eliminating the parsing step. But format() works for any type, including ones that don't understand the standard mini-language. What would they do with this info? From tjreedy at udel.edu Mon Mar 16 21:53:35 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 16:53:35 -0400 Subject: [Python-ideas] Keyword same in right hand side of assignments In-Reply-To: <239732.95222.qm@web25802.mail.ukl.yahoo.com> References: <239732.95222.qm@web25802.mail.ukl.yahoo.com> Message-ID: Heinrich W Puschmann wrote: > Python-Ideas: Keyword same in right hand side of assignments > ------------------------------------------------------------------------------------------- > > It is proposed to introduce a Keyword "same", > to be used in the right hand side of assignments, as follows: > > "xx = same + 5" or "xx = 5 + same" synonymous with "xx += 5" Only if + is commutative and xx is immutable, and even then, see below. > "value = 2*same + 5" synonymous with "value *= 2; value += 5" > "switch = 1 - same" synonymous with "switch *= -1; switch += 1" > "lst = same + [5,6]" synonymous with "lst += [5,6]" > "lst = [5,6] + same" synonymous with "lst = [5,6] + lst" > "lst[2] = 1/same" synonymous with "lst[2] **= -1" > > and so on.
This is an intriguing idea, which seems to extend the idea of augmented assignment, and which would seem most useful for complicated target expressions or for multiple uses of the pre-existing object bound to the target. It is similar in that the target must have a pre-existing binding. While it might work best for one target, it might not have to be so restricted. But one problem is that it changes and complicates the meaning of assignment statements. Currently, target-expression(s) = source-expression means evaluate source-expression; then evaluate the target expression(s) (left to right) and bind it (them) to the source object (or objects produced by iterating). This proposal requires that the target be evaluated first, resolved to an object, bound to 'same' (or the internal equivalent) and, after the assignment, unbound from 'same'. Since targets are expressions and not objects, I believe the target expression would have to be re-evaluated (without major change to the virtual machine) to make the binding, so this construction would not save a target evaluation and would not be synonymous with augmented assignment even if it otherwise would be. So lst[2] = 1/same would really be equivalent to ____ = lst[2]; lst[2] = 1/____; del ____ Another problem is that any new short keyword breaks code and therefore needs a strong justification that it will also improve a substantial amount of other code.
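Terry's expansion of `lst[2] = 1/same` can be made concrete. A minimal sketch of the proposed desugaring, assuming a hypothetical temporary name `_same` and an `index()` helper introduced purely to show that the target expression gets evaluated twice:

```python
# Hypothetical desugaring of the proposed "lst[2] = 1/same".
# The target expression is evaluated once to fetch the old value and
# again to store the new one, so its side effects happen twice.

calls = []

def index():
    calls.append(1)       # record each evaluation of the index expression
    return 2

lst = [10, 20, 4.0, 30]

_same = lst[index()]      # first evaluation: fetch the pre-existing value
lst[index()] = 1 / _same  # second evaluation: bind the new value
del _same

print(lst)                # [10, 20, 0.25, 30]
print(len(calls))         # 2 -- the index expression ran twice
```

The doubled evaluation is exactly why the construct would not be synonymous with `lst[2] **= -1`, which evaluates the target reference only once.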
Terry Jan Reedy From guido at python.org Mon Mar 16 22:05:47 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 14:05:47 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: On Mon, Mar 16, 2009 at 12:19 PM, Raymond Hettinger wrote: > The conversation on the thousands separator seems to have wound down > and the PEP has stabilized: http://www.python.org/dev/peps/pep-0378/ > > Please pronounce. That's not a PEP, it's just a summary of a discussion without any choice. :-) Typically PEPs put the discussion of alternatives in some section at the end, after the specification and other stuff relevant going forward. Just to add more fuel to the fire, did anyone propose refactoring the problem into (a) a way to produce output with a thousands separator, and (b) a way to localize formats? We could solve (a) by adding a comma to all numeric format languages along the lines of Nick's proposal, and we could solve (b) either now or later by adding some other flag that means "use locale-specific numeric formatting for this value". Or perhaps there could be two separate flags corresponding to the grouping and monetary arguments to locale.format(). I'd be happy to punt on (b) until later. This is somewhat analogous to the approach for strftime() which has syntax to invoke locale-specific formatting (%a, %A, %b, %B, %c, %p, %x, %X). I guess in the end this means I am in favor of Nick's alternative. One thing I don't understand: the PEP seems to exclude the 'e' and 'g' format. I would think that in case 'g' defers to 'f' it should act the same, and in case it defers to 'e', well, in the future (under (b) above) that could still change the period into a comma, right?
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Mon Mar 16 22:52:26 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Mar 2009 07:52:26 +1000 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BE60DD.7010603@trueblade.com> References: <49BE60DD.7010603@trueblade.com> Message-ID: <49BECA1A.9020100@gmail.com> Eric Smith wrote: > So, is such a function desirable, and if so, where would it go? I could > expose it through the string module, which is where the sort-of-related > Formatter class lives. string.parse_format and string.build_format perhaps? The inverse operation would be useful if you just wanted to do something like "use a default precision of 3" but otherwise leave things up to the original object.

def custom_format(fmt, value):
    details = string.parse_format(fmt)
    if details["precision"] is None:  # Assumes None indicates missing
        details["precision"] = 3
    fmt = string.build_format(details)
    return format(fmt, value)

While having to rebuild and reparse the string is a little annoying, changing that would involve changing the spec for the __format__ magic method and I don't think we want to go there. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From g.brandl at gmx.net Mon Mar 16 22:56:24 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 16 Mar 2009 22:56:24 +0100 Subject: [Python-ideas] PEP links in docs In-Reply-To: <2f680a9a0903161302v3a00494dhe912c402372346a4@mail.gmail.com> References: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> <2f680a9a0903161302v3a00494dhe912c402372346a4@mail.gmail.com> Message-ID: Feel free to send patches, as small as they may seem. Mark up PEP numbers in reST like this -- :pep:`42` -- to get automatic linking. Georg Shawn Ligocki schrieb: > +2, Sign me up.
> > The "why" is the most annoyingly avoided question in teaching anything. > > If someone leads this, I'll help contribute! > > Cheers, > > Shawn Ligocki > sligocki at gmail.com > > > On Mon, Mar 16, 2009 at 10:25 AM, Zac Burns > > wrote: > > I would like to see links to relevant PEP / mailing list docs in the > main docs. > > For example http://docs.python.org/library/dircache.html after > "Deprecated since version 2.6" could include a (PEP x.x) link where I > could read why it was deprecated and probably what to use in it's > place. > > The docs right now are quite excellent describing the "what" about > everything, but often have little to say about the "why". This is a > good thing, but links would surely help those that want to learn more. > > -- > Zachary Burns > (407)590-4814 > Aim - Zac256FL > Production Engineer (Digital Overlord) > Zindagi Games > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. 
From python at rcn.com Mon Mar 16 23:05:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 15:05:24 -0700 Subject: [Python-ideas] PEP links in docs References: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com><2f680a9a0903161302v3a00494dhe912c402372346a4@mail.gmail.com> Message-ID: <671579EA9C314A70AE166A0173289355@RaymondLaptop1> > Feel free to send patches, as small as they may seem. Mark up PEP numbers > in reST like this -- :pep:`42` -- to get automatic linking. We might want to include a PEP index in the documentation but I think it's a really bad idea to include links from within the docs. The PEPs get out of date quickly. They document an early decision but not its ultimate evolution and diffusion through-out the language. Raymond From tjreedy at udel.edu Mon Mar 16 23:09:47 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 18:09:47 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BEB718.6020708@trueblade.com> References: <49BE60DD.7010603@trueblade.com> <49BEB718.6020708@trueblade.com> Message-ID: Eric Smith wrote: > Terry Reedy wrote: >> Eric Smith wrote: >>> I'd like to add a function (or method) to parse str.format()'s >>> standard mini-language format specifiers. It's hard to get right, and >>> if PEP 378 is accepted, it gets more complex. > ... >>> So, is such a function desirable, and if so, >> >> Yes, but I would take it further and and consider the string and >> dict/named-tuple as alternate interfaces to the formatting machinery. >> So I would > > If the only use case for this is for non-builtin numeric types, I'd vote > for a named tuple. But since Mark (who's one of the primary users) also > raised the dict issue, I'll give it some thought. > >> a) add an inverse function that would take a dict or named tuple and >> produce the field specifier as a string (or raise ValueError). 
Such a >> string could be embedded into a complete format string. Some people >> might prefer this specification method. > > This is a pretty simple transformation. I'm not so sure it's all that > useful. > >> b> amend built-in format() to take a dict/n-t as the second argument >> on the basis that it is silly to transform the parse result back into >> a string just to be parsed again. This would make repeated calls to >> format faster by eliminating the parsing step. > > But format() works for any type, including ones that don't understand > the standard mini-language. What would they do with this info? The same thing they would do (whatever that is) if the second argument were instead the equivalent unparsed format-spec string. The easiest implementation of what I am proposing would be for the parse_spec function whose output you propose to expose were to recognize when its input is not a string but previous parse output. Just as iter(iter(ob)) is iter(ob), parse(parse(spec)) should be the same as parse(spec). tjr From tjreedy at udel.edu Mon Mar 16 23:19:55 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 18:19:55 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BECA1A.9020100@gmail.com> References: <49BE60DD.7010603@trueblade.com> <49BECA1A.9020100@gmail.com> Message-ID: Nick Coghlan wrote: > Eric Smith wrote: >> So, is such a function desirable, and if so, where would it go? I could >> expose it through the string module, which is where the sort-of-related >> Formatter class lives. > > string.parse_format and string.build_format perhaps? The inverse > operation would be useful if you just wanted to do something like "use a > default precision of 3" but otherwise leave things up to the original > object. 
>
> def custom_format(fmt, value):
>     details = string.parse_format(fmt)
>     if details["precision"] is None:  # Assumes None indicates missing
>         details["precision"] = 3
>     fmt = string.build_format(details)
>     return format(fmt, value)

return format(value, fmt)  # ;-)

> While having to rebuild and reparse the string is a little annoying, yes > changing that would involve changing the spec for the __format__ magic > method and I don't think we want to go there. If parse_format were idempotent for its output like iter, then the change would, I think, be pretty minimal. I am assuming here that the __format__ method calls the parse_format(fmt) function that Eric proposed to expose. If details == parse_format(details) == parse_format(build_format(details)), then the rebuild and reparse is not needed and passing details instead of the rebuilt string should be transparent to __format__. tjr From python at rcn.com Mon Mar 16 23:24:26 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 15:24:26 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> [Guido van Rossum] > Typically PEPs put the discussion of alternatives in some > section at the end, after the specification and other stuff relevant > going forward. Okay, re-arranged to make it more peppy. > I guess in the end this means I am in favor of Nick's alternative. Was hoping you would be more attracted to the other proposal which meets more people's needs right out of the box. No matter what country you're in, it's nice to have the option to switch to spaces or underscores regardless of your local convention. In the end, most respondents seemed to support the more flexible version (Eric's proposal).
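The idempotence Terry asks for -- details == parse_format(details) -- can be sketched with a toy parser. `parse_format` here is hypothetical (the real function was never added in this form) and handles only a "[width][.precision][type]" subset of the mini-language:

```python
# Toy idempotent parse_format: already-parsed input passes through
# unchanged, so callers may supply either a spec string or its parse,
# the way iter(iter(ob)) is iter(ob).

def parse_format(spec):
    if isinstance(spec, dict):        # already parsed: return as-is
        return spec
    type_char = spec[-1] if spec and spec[-1].isalpha() else ''
    body = spec[:-1] if type_char else spec
    if '.' in body:
        width, precision = body.split('.', 1)
    else:
        width, precision = body, ''
    return {'width': width, 'precision': precision, 'type': type_char}

details = parse_format('8.1f')
assert details == {'width': '8', 'precision': '1', 'type': 'f'}
assert parse_format(details) is details   # idempotent, like iter()
```

With this property, a `__format__` method calling `parse_format` could transparently accept either the string spec or the parsed structure, avoiding the rebuild-and-reparse round trip.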
> One thing I don't understand: the PEP seems to exclude the 'e' and 'g' > format. I would think that in case 'g' defers to 'f' it should act the > same, and in case it defers to 'e', well, in the future (under (b) > above) that could still change the period into a comma, right? Makes sense. So noted in the PEP. Raymond From ncoghlan at gmail.com Mon Mar 16 23:37:41 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Mar 2009 08:37:41 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> Message-ID: <49BED4B5.2060100@gmail.com> Raymond Hettinger wrote: >> I guess in the end this means I am in favor of Nick's alternative. > > Was hoping you would be more attracted to the other proposal > which more people's needs right out of the box. No matter > what country you're in, it's nice to have the option to switch > to spaces or underscores regardless of your local convention. I actually prefer proposal II as well. It provides a decent quick solution for one-off scripts and debugging output, while leaving proper l10n/i18n support to the appropriate (heavier) tools. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From guido at python.org Mon Mar 16 23:46:37 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 15:46:37 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> Message-ID: On Mon, Mar 16, 2009 at 3:24 PM, Raymond Hettinger wrote: > [Guido van Rossum] >> >> Typically PEPs put the discussion of alternatives in some >> section at the end, after the specification and other stuff relevant >> going forward. > > Okay, re-arranged to make it more peppy. >> I guess in the end this means I am in favor of Nick's alternative. > > Was hoping you would be more attracted to the other proposal > which meets more people's needs right out of the box. No matter > what country you're in, it's nice to have the option to switch > to spaces or underscores regardless of your local convention. Your preference wasn't clear from the PEP. :-) > In the end, most respondents seemed to support the more flexible > version (Eric's proposal). Well, Python survived for about 19 years without having a way to override the decimal point *except* by using the locale module. I guess that divides our users in two classes: (1) Those for whom the default (C) locale is sufficient -- either because they live in the US (1a), or because they're used to programming languages' US-centric approach (1b). (2) Those who absolutely need their numbers formatted for a locale -- either because they want to write heavy-duty localized code (2a), or because their locale doesn't use a comma and their end users would be upset to see US-formatted numbers (2b).
For category (1), Nick's minimal proposal is good enough; someone in category (1b) who can live with a US-centric decimal point can also live with a US-centric thousands separator. For category (2a), Eric's proposal is not good enough. Which leaves category (2b), which must be pretty small because they've apparently put up with using the locale module anyways. >> One thing I don't understand: the PEP seems to exclude the 'e' and 'g' >> format. I would think that in case 'g' defers to 'f' it should act the >> same, and in case it defers to 'e', well, in the future (under (b) >> above) that could still change the period into a comma, right? > > Makes sense. So noted in the PEP. On Mon, Mar 16, 2009 at 3:37 PM, Nick Coghlan wrote: > I actually prefer proposal II as well. It provides a decent quick > solution for one-off scripts and debugging output, while leaving proper > l10n/i18n support to the appropriate (heavier) tools. For debugging output and one-offs I don't think the period-vs-comma issue matters much; I'd expect those all to fall in category (1). Another way to look at it is: adding a thousands separator makes a *huge* difference for a large group of potential users, because interpreting numbers with more than 5 or 6 digits is very cumbersome otherwise. However adding a facility to specify a different character for the decimal point and for the separator only matters for a much smaller group of people (2b only), and IMO isn't worth the extra syntactic complexities. I would much rather add syntactic complexity to address a larger issue like (2a). I also have to say that I find Eric's proposal a bit ambiguous: why shouldn't {:8,d} mean "insert commas between thousands"?
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Mar 17 00:02:52 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 16:02:52 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> Message-ID: <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> > I also have to say that I find Eric's proposal a bit ambiguous: why > shouldn't {:8,d} mean "insert commas between thousands"? It does. That is the sixth example listed:

    format(1234, "8.1f")     -->    '  1234.0'
    format(1234, "8,1f")     -->    '  1234,0'
    format(1234, "8.,1f")    -->    ' 1.234,0'
    format(1234, "8 ,f")     -->    ' 1 234,0'
    format(1234, "8d")       -->    '    1234'
    format(1234, "8,d")      -->    '   1,234'
    format(1234, "8_d")      -->    '   1_234'

Raymond From guido at python.org Tue Mar 17 00:06:12 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 16:06:12 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> Message-ID: On Mon, Mar 16, 2009 at 4:02 PM, Raymond Hettinger wrote: >> I also have to say that I find Eric's proposal a bit ambiguous: why >> shouldn't {:8,d} mean "insert commas between thousands"? > > It does. That is the sixth example listed:
>
>  format(1234, "8.1f")     -->    '  1234.0'
>  format(1234, "8,1f")     -->    '  1234,0'
>  format(1234, "8.,1f")    -->    ' 1.234,0'
>  format(1234, "8 ,f")     -->    ' 1 234,0'
>  format(1234, "8d")       -->    '    1234'
>  format(1234, "8,d")      -->    '   1,234'
>  format(1234, "8_d")      -->    '   1_234'

Argh! So "8,1f" means "use comma instead of point" whereas "8,1d" means "use comma as 1000 separator"? You guys can't seriously propose that. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Mar 17 00:25:58 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 16:25:58 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> Message-ID: <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> > Argh! So "8,1f" means "use comma instead of point" whereas "8,1d" > means "use comma as 1000 separator"? They both mean use the comma for the thousands separator. The decimal separator only gets overridden as part of the precision specification if provided: format(1234, "8,1f") --> ' 1234,0' Originally, I proposed prefixing the thousands separator with the letter T: format(1234, "8T,d") --> ' 1,234'. That made it crystal clear that the next character was the thousands separator. But people found it to be ugly and reacted badly. Eric then noticed that the T wasn't essential as long as the decimal separator is tightly associated with the precision specifier. If you find that to be screwy, then I guess Nick's comma-only alternative wins. Or, there is an alternative that is a little more flexible. Make the thousands separator one of SPACE, UNDERSCORE, COMMA, or APOSTROPHE, leaving out the DOT which is reserved to be the sole decimal separator. That is unambiguous but doesn't help folks who want both a DOT thousands separator and COMMA decimal separator.
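Raymond's rule -- the separator attached to the precision is the decimal separator, and any separator before it marks thousands -- can be sketched as a toy parser. `parse_sep_spec` is hypothetical and covers only spec shapes like those in the examples above; it is an illustration of the proposed semantics, not the proposed implementation:

```python
import re

# Toy parser for the separator rule in the examples: for 'f', the
# separator introducing the precision is the decimal separator and any
# separator before it marks thousands; for 'd', a lone separator
# marks thousands.

def parse_sep_spec(spec):
    m = re.match(r"(\d*)([.,_' ]?)([.,_' ]?)(\d*)([df])$", spec)
    if m is None:
        raise ValueError("unsupported toy spec: %r" % spec)
    width, sep1, sep2, precision, typ = m.groups()
    if typ == "d":
        return {"width": width, "thousands": sep1, "type": typ}
    if sep2:                      # two separators: thousands, then decimal
        thousands, decimal = sep1, sep2
    else:                         # one (or none): it is the decimal point
        thousands, decimal = "", sep1 or "."
    return {"width": width, "thousands": thousands,
            "decimal": decimal, "precision": precision, "type": typ}

assert parse_sep_spec("8,1f")["decimal"] == ","      # "8,1f"  -> 1234,0
assert parse_sep_spec("8.,1f")["thousands"] == "."   # "8.,1f" -> 1.234,0
assert parse_sep_spec("8,d")["thousands"] == ","     # "8,d"   -> 1,234
```

The branch on the type character makes the ambiguity Guido objects to explicit: the very same spec character means "decimal point" for 'f' and "thousands separator" for 'd'.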
Raymond From guido at python.org Tue Mar 17 00:42:10 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 16:42:10 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: On Mon, Mar 16, 2009 at 4:25 PM, Raymond Hettinger wrote: >> Argh! So "8,1f" means "use comma instead of point" whereas "8,1d" >> means "use comma as 1000 separator"? > > They both mean use the comma for the thousands separator. The decimal > separator only gets overridden as part of the precision specification if > provided: format(1234, "8,1f") --> ' 1234,0' So I misread, but it is exceedingly subtle indeed: apparently if there's *one* special character it's the decimal point with 'f' and the thousands separator with 'd'; only 'f' supports *two* special characters and then the *first* one is the decimal point. The fact that we need so many emails to sort this out makes it clear that this proposal will lead to endless user confusion. > Originally, I proposed prefixing the thousands separator with the letter T: > format(1234, "8T,d") --> ' 1,234'. That made it crystal > clear that the next character was the thousands separator. But people found > it to be ugly and reacted badly. Eric then noticed that the T wasn't > essential as long as the decimal separator is tightly associated with the > precision specifier. > > If you find that to be screwy, then I guess Nick's comma-only alternative > wins. Yes. > Or, there is an alternative that is a little more flexible. Make the > thousands separator one of SPACE, UNDERSCORE, COMMA, or APOSTROPHE, leaving > out the DOT which is reserved to be the sole decimal separator. That is > unambiguous but doesn't help folks who want both a DOT thousands separator > and COMMA decimal separator. Right. Let's go ahead with Nick's proposal and put ways of specifying alternate separators (either via the locale or hardcoded) on the back burner. Note that, unlike with the original % syntax, in .format() strings we can easily append extra syntax to the end. E.g. format(1234.5, "08,.1f;L") could mean "use the locale", whereas format(1234.5, "08,.1f;T=_;D=,") could mean "use '_' for thousands, ',' for decimal point". But please, let's put this off and get Nick's simple proposal in first. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Mar 17 00:45:56 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Mar 2009 11:45:56 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090316195940.6e4e1061@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: <49BEE4B4.6040709@canterbury.ac.nz> > Eric Smith s'exprima ainsi: > >>I vote we move ahead with Proposal II from PEP 378. Looks fairly good to me.
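For reference, the comma-only form Guido endorses here is the syntax that PEP 378 eventually standardized (Python 2.7/3.1+), so its behavior can be shown directly in any modern Python:

```python
# The comma-only alternative: ',' in the format spec always means
# "group thousands with commas"; the decimal point stays '.'.

print(format(1234, "8,d"))           # '   1,234'
print(format(1234567.891, ",.1f"))   # '1,234,567.9'
print("{:,}".format(1234567))        # '1,234,567'
```

There is no way in this form to swap the comma and period, which is exactly the simplification being argued for: alternate separators are deferred to the locale machinery or to later syntax.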
-- Greg From guido at python.org Tue Mar 17 00:48:50 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 16:48:50 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BEE4B4.6040709@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <49BEE4B4.6040709@canterbury.ac.nz> Message-ID: On Mon, Mar 16, 2009 at 4:45 PM, Greg Ewing wrote: >> Eric Smith s'exprima ainsi: >> >>> >>> I vote we move ahead with Proposal II from PEP 378. > > Looks fairly good to me. Of course this is by now ambiguous -- the latest version of the PEP no longer numbers the versions I and II, and has Nick's version second. (Which may be reversed by the time you read this if Raymond keeps updating the PEP in real time. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Tue Mar 17 00:54:44 2009 From: aahz at pythoncraft.com (Aahz) Date: Mon, 16 Mar 2009 16:54:44 -0700 Subject: [Python-ideas] PEP links in docs In-Reply-To: <671579EA9C314A70AE166A0173289355@RaymondLaptop1> References: <671579EA9C314A70AE166A0173289355@RaymondLaptop1> Message-ID: <20090316235444.GB26292@panix.com> On Mon, Mar 16, 2009, Raymond Hettinger wrote: >attribution for Georg Brandl deleted: >> >> Feel free to send patches, as small as they may seem. Mark up PEP numbers >> in reST like this -- :pep:`42` -- to get automatic linking. > > We might want to include a PEP index in the documentation > but I think it's a really bad idea to include links from within > the docs. The PEPs get out of date quickly. They document > an early decision but not its ultimate evolution and diffusion > through-out the language. 
There are two reasons to link to PEPs:

* Provide the historical context
* Give detailed info lacking in the docs

The first purpose will always exist, and I see no reason to delete such a link if documented as such. Links for the latter purpose can be removed when the docs are updated to match what's available in the PEP. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Adopt A Process -- stop killing all your children! From tjreedy at udel.edu Tue Mar 17 01:01:53 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 20:01:53 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: Guido van Rossum wrote: > On Mon, Mar 16, 2009 at 12:19 PM, Raymond Hettinger wrote: >> The conversation on the thousands separator seems to have wound down >> and the PEP has stabilized: http://www.python.org/dev/peps/pep-0378/ >> >> Please pronounce. > > That's not a PEP, it's just a summary of a discussion without any > choice. :-) I hope Raymond can understand this. To me, the choice presented is to add the Main Proposal syntax extension, or not. > Typically PEPs put the discussion of alternatives in some > section at the end, after the specification and other stuff relevant > going forward. You want more alternatives than Nick's Alternative Proposal, discussed at the end? I believe most of the other ideas on the list were directed at some sense of (b) below. > Just to add more fuel to the fire, did anyone propose refactoring the > problem into (a) a way to produce output with a thousands separator, > and (b) a way to localize formats? Since a way to produce output with a choice of thousands separators is a necessary part of a way to localize formats, I am not sure of what distinction you are trying to draw. 
'Localize formats' has two quite distinct meanings: 'format this number in a particular way (which can vary from number to number or at least user to user)' versus 'format all numbers according to a particular national standard'. > We could solve (a) by adding a > comma to all numeric format languages along Nick's proposal, Raymond's current proposal, based on discussion, is to offer users a choice of 5 chars as thousands separators (and allow a choice of decimal separator). Nick's proposal is to only offer comma as thousands separator. While the latter meets my current parochial needs, I favor the more inclusive approach. > and we could solve (b) either now Raymond's main proposal partially solves that now (which is to say, completely solves that now for most of the world) in the first sense I gave for (b), on a case-by-case basis. > or later by adding some other flag that > means "use locale-specific numeric formatting for this value". As I understand from Raymond's introductory comments and those in the locale module docs, the global C locale setting is not intended to be changed on an output-by-output basis. Hence, while useful for nationalizing software, it is not so useful for individualized output from global software. > perhaps there could be two separate flags corresponding to the > grouping and monetary arguments to locale.format(). The flags just say to use the global locale settings, which have the limitations indicated above. Raymond's proposal is that a Python programmer should be better able to say "Format this number how I (or a particular user) want it to be formatted, regardless of the 'locale' setting". > I'd be happy to punt on (b) until later. > This is somewhat analogous to the approach for strftime() which has > syntax to invoke locale-specific formatting (%a, %A, %b, %B, %c, %p, > %x, %X). With the attendant pluses and minuses. > I guess in the end this means I am in favor of Nick's alternative. 
I fail to see how this follows from your previous comments. > One thing I don't understand: the PEP seems to exclude the 'e' and 'g' > format. Both proposals claim to include e and g. However, since thousands separators only apply to the left of the decimal point, and e notation only has one digit to the left, no thousands separator proposal will apply to e (and g when it produces e). The only known separator used to the left is a space, typically in groups of 5 digits, in some math tables. The decimal separator part of the PEP *does* apply to e and g. > I would think that in case 'g' defers to 'f' it should act the > same, and in case it defers to 'e', well, in the future (under (b) > above) that could still change the period into a comma, right? With the main proposal, one could simply specify, for instance, '8,1f' instead of '8.1f' to make that change *now*. I consider that much better than post-processing, which Nick's alternative would continue to require, and which gets worse with thousands separators added. Terry Jan Reedy From python at rcn.com Tue Mar 17 01:12:20 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 17:12:20 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o><49BEE4B4.6040709@canterbury.ac.nz> Message-ID: >>>> I vote we move ahead with Proposal II from PEP 378. >> >> Looks fairly good to me. > > Of course this is by now ambiguous -- the latest version of the PEP no > longer numbers the versions I and II, and has Nick's version second. > (Which may be reversed by the time you read this if Raymond keeps > updating the PEP in real time. :-) To keep the conversation in sync with today's real-time updates, I've put back in the "perma-names", Proposal I (nick's) and Proposal II (eric's). 
Raymond From greg.ewing at canterbury.ac.nz Tue Mar 17 01:36:39 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Mar 2009 12:36:39 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <49BEE4B4.6040709@canterbury.ac.nz> Message-ID: <49BEF097.2080408@canterbury.ac.nz> Guido van Rossum wrote: > Of course this is by now ambiguous -- the latest version of the PEP no > longer numbers the versions I and II To be clear, I'm in favour of Nick's version. (I share your concern about the apparent ambiguities in Eric's version -- it confused me too the first few times I read it!) -- Greg From greg.ewing at canterbury.ac.nz Tue Mar 17 01:41:11 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Mar 2009 12:41:11 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: <49BEF1A7.3030007@canterbury.ac.nz> Concerning the difficulty of exchanging "." and "," by post-processing, it might be generally useful to have a swap(s1, s2) method on strings that would replace occurrences of s1 by s2 and vice versa. -- Greg From rdmurray at bitdance.com Tue Mar 17 02:55:51 2009 From: rdmurray at bitdance.com (R. 
David Murray) Date: Tue, 17 Mar 2009 01:55:51 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <49BEE4B4.6040709@canterbury.ac.nz> <49BEF097.2080408@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > To be clear, I'm in favour of Nick's version. > > (I share your concern about the apparent ambiguities > in Eric's version -- it confused me too the first > few times I read it!) I'll chime in in favor of the simpler proposal and leaving the 'specify what characters to use' ability for later. That's the way I've felt from the beginning of the discussion, for what it's worth. It feels like the factoring Guido talked about ("yes I want thousands separators" and then separately "here's what I want to use for thousands/decimal separators") is the correct way to break down the problem. -- R. David Murray http://www.bitdance.com From guido at python.org Tue Mar 17 03:06:30 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 19:06:30 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: Our emails crossed. On Mon, Mar 16, 2009 at 5:01 PM, Terry Reedy wrote: > Guido van Rossum wrote: >> >> On Mon, Mar 16, 2009 at 12:19 PM, Raymond Hettinger >> wrote: >>> >>> The conversation on the thousands separator seems to have wound down >>> and the PEP has stabilized: ?http://www.python.org/dev/peps/pep-0378/ >>> >>> Please pronounce. >> >> That's not a PEP, it's just a summary of a discussion without any >> choice. :-) > > I hope Raymond can understand this. ?To me, the choice presented is to add > the Main Proposal syntax extension, or not. 
> > ?Typically PEPs put the discussion of alternatives in some >> >> section at the end, after the specification and other stuff relevant >> going forward. > > You want more alternatives than the Nick's Alternative Proposal, discussed > at the end? ?I believe most of the other ideas on the list were directed at > some sense of (b) below. > >> Just to add more fuel to the fire, did anyone propose refactoring the >> problem into (a) a way to produce output with a thousands separator, >> and (b) a way to localize formats? > > Since a way to produce output with a choice of thousands separators is a > necessary part of a way localize formats, I am not sure of what distinction > you are trying to draw. > > 'Localize formats' has two quite distinct meanings: 'format this number in a > particular way (which can vary from number to number or at least user to > user)' versus 'format all numbers according to a particular national > standard'. > >> We could solve (a) by adding a >> >> comma to all numeric format languages along Nick's proposal, > > Raymond current proposal, based on discussion, is to offer users a choice of > 5 chars as thousands separators (and allow a choice of decimal separator). > ?Nick's proposal is to only offer comma as thousands separator. ?While the > latter meets my current parochial needs, I favor the more inclusive > approach. > >> and we could solve (b) either now > > Raymond's main proposal partially solves that now (which is to say, > completely solves than now for most of the world) in the first sense I gave > for (b), on a case-by-case basis. > >> or later by adding some other flag that >> >> means "use locale-specific numeric formatting for this value". > > As I understand from Raymond's introductory comments and those in the locale > module docs, the global C locale setting is not intended to be changed on an > output-by-output basis. 
?Hence, while useful for nationalizing software, it > is not so useful for individualized output from global software. > >> perhaps there could be two separate flags corresponding to the >> grouping and monetary arguments to locale.format(). > > The flags just say to use the global locale settings, which have the > limitations indicated above. ?Raymond's proposal is that a Python programmer > should be better able to say "Format this number how I (or a particular > user) want it to be formatted, regardless of the 'locale' setting". > >> I'd be happy to punt on (b) until later. > >> This is somewhat analogous to the approach for strftime() which has >> syntax to invoke locale-specific formatting (%a, %A, %b, %B, %c, %p, >> %x, %X). > > With the attendant pluses and minuses. > >> I guess in the end this means I am in favor of Nick's alternative. > > I fail to see how this follows from your previous comments. > >> One thing I don't understand: the PEP seems to exclude the 'e' and 'g' >> format. > > Both proposals claim to include e and g. ?However, since thousands > separators only apply to the left of the decimal point, and e notation only > has one digit to the left, no thousands separator proposal will apply the e > (and g when it produces e). ?The only known separator used to the left is a > space, typically in groups of 5 digits, in some math tables. ?The decimal > separator part of the PEP *does* apply to e and g. > >> I would think that in case 'g' defers to 'f' it should act the >> same, and in case it defers to 'e', well, in the future (under (b) >> above) that could still change the period into a comma, right? > > With the main proposal, one could simply specify, for instance, '8,1f' > instead of '8.1f' to make that change *now*. ?I consider that much better > than post-processing, which Nick's alternative would continue to require, > and which gets worse with thousands separators added. 
> > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Mar 17 03:33:16 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 19:33:16 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: > Right. Let's go ahead with Nick's proposal and put ways of specifying > alternate separators (either via the locale or hardcoded) on the back > burner Mark PEP 378 as accepted with Nick's original comma-only version? Raymond From guido at python.org Tue Mar 17 04:14:58 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 20:14:58 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: On Mon, Mar 16, 2009 at 7:33 PM, Raymond Hettinger wrote: >> Right. Let's go ahead with Nick's proposal and put ways of specifying >> alternate separators (either via the locale or hardcoded) on the back >> burner > > Mark PEP 378 as accepted with Nick's original comma-only version? OK, done. Looking forward to a swift implementation! 
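PEP 378 did get its swift implementation: the accepted comma-only option shipped in Python 2.7 and 3.1, so the behavior agreed on here can be checked directly in any modern interpreter. A minimal sketch:

```python
# PEP 378's accepted comma option: a ',' in the format spec groups
# thousands with commas (implemented in Python 2.7 / 3.1).
print(format(1234567, ",d"))        # integer grouping
print(format(1234567.891, ",.2f"))  # works with precision for floats
print("{:,}".format(1234567))       # same option via str.format
```

The comma is purely presentational; the locale-driven and custom-separator variants discussed above were deliberately left on the back burner.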
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From eric at trueblade.com Tue Mar 17 09:23:44 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 04:23:44 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: <49BF5E10.6060205@trueblade.com> Guido van Rossum wrote: > On Mon, Mar 16, 2009 at 7:33 PM, Raymond Hettinger wrote: >>> Right. Let's go ahead with Nick's proposal and put ways of specifying >>> alternate separators (either via the locale or hardcoded) on the back >>> burner >> Mark PEP 378 as accepted with Nick's original comma-only version? > > OK, done. Looking forward to a swift implementation! > I'm on it. Eric. From dickinsm at gmail.com Tue Mar 17 10:01:27 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 17 Mar 2009 09:01:27 +0000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> On Tue, Mar 17, 2009 at 3:14 AM, Guido van Rossum wrote: > On Mon, Mar 16, 2009 at 7:33 PM, Raymond Hettinger wrote: >> Mark PEP 378 as accepted with Nick's original comma-only version? > > OK, done. Looking forward to a swift implementation! I'll implement this for Decimal; it shouldn't take long. One question from the PEP, which I've been too slow to read until this morning: should commas appear in the zero-filled part of a number? 
That is, should format(1234, "09,d") give '00001,234' or '0,001,234'? The PEP specifies that format(1234, "08,d") should give '0001,234', but that's something of a special case: ',001,234' isn't really a viable alternative. Mark From cmjohnson.mailinglist at gmail.com Tue Mar 17 10:30:12 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Mon, 16 Mar 2009 23:30:12 -1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BEF1A7.3030007@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <49BEF1A7.3030007@canterbury.ac.nz> Message-ID: <3bdda690903170230w24a9352cu9621d87fc0b0d59f@mail.gmail.com> Greg Ewing wrote: > Concerning the difficulty of exchanging "." and "," by > post-processing, it might be generally useful to have > a swap(s1, s2) method on strings that would replace > occurrences of s1 by s2 and vice versa. I would appreciate having that. There are a lot of small jobs where str.translate and re are overkill, but the s.replace(s1, TEMPCHAR); s.replace(s2, s1); s.replace(TEMPCHAR, s2) dance is awkward, since you're not sure what you can safely use as a tempchar. -- Carl Johnson From python at rcn.com Tue Mar 17 10:50:28 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 02:50:28 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> Message-ID: <15F41E9FF37D47E98CBCD67418787D49@RaymondLaptop1> [Mark] > One question from the PEP, which I've been too slow to read until > this morning: should commas appear in the zero-filled part of a > number? 
I think it should. That lets all the commas and decimals line up vertically. Anything else would look weird.

>>> for n in seq:
...     print format(n, "09,d")
...
1,234,567
0,000,001
0,255,989

Raymond From denis.spir at free.fr Tue Mar 17 11:28:02 2009 From: denis.spir at free.fr (spir) Date: Tue, 17 Mar 2009 11:28:02 +0100 Subject: [Python-ideas] PEP links in docs In-Reply-To: <20090316235444.GB26292@panix.com> References: <671579EA9C314A70AE166A0173289355@RaymondLaptop1> <20090316235444.GB26292@panix.com> Message-ID: <20090317112802.4ef95767@o> Le Mon, 16 Mar 2009 16:54:44 -0700, Aahz s'exprima ainsi: > On Mon, Mar 16, 2009, Raymond Hettinger wrote: > >attribution for Georg Brandl deleted: > >> > >> Feel free to send patches, as small as they may seem. Mark up PEP > >> numbers in reST like this -- :pep:`42` -- to get automatic linking. > > > > We might want to include a PEP index in the documentation > > but I think it's a really bad idea to include links from within > > the docs. The PEPs get out of date quickly. They document > > an early decision but not its ultimate evolution and diffusion > > throughout the language. > > There are two reasons to link to PEPs: > > * Provide the historical context Agree with Raymond. It should be made clear along with the pointer that the pointed PEP could be outdated. Maybe simply writing (original PEP: :pep:`42`) is enough? The word 'original' implicitly stating that things could have changed? > * Give detailed info lacking in the docs > > The first purpose will always exist, and I see no reason to delete such a link if documented as such. Links for the latter purpose can be removed when the docs are updated to match what's available in the PEP. To me the most important aspect is not about having details (still, it's important). Instead it is to get an (even partial or obscure) answer to "why", which is often simply missing from the official docs. This answer is necessary to interpret the "what" and/or "how". 
Nobody can properly understand a feature description without knowing its purpose. In the best case, the reader will guess it; in the worst, he will guess wrong. Explicit is better than... also for docs! There is also a pedagogical aspect that I find even more relevant. An inexperienced programmer or pythonist should at least be able to figure out vaguely what this or that feature is about. No doubt this is very difficult to achieve -- especially for experts! I imagine a 2-stage "why" introduction to every feature description in the docs: the first one targeted to non-specialists, the second one more technical. (I'm sure that the first one would also sometimes help specialists.) The pedagogical one must be worded by, or reviewed by, or written in collaboration with, "tutors" that are able to imagine where/how/why newbies may get stuck. It need not be long -- sometimes a few words may be enough. (But it will sometimes be the hardest part ;-). It would also benefit from newbie feedback. I wonder whether this task could be partially delegated to the python-tutor mailing list activists. Denis PS: If ever, I volunteer to take part in this kind of task -- for the French version. [Have been technical trainer (in automation) in a previous life.] 
------ la vita e estrany From greg.ewing at canterbury.ac.nz Tue Mar 17 12:04:34 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Mar 2009 23:04:34 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> Message-ID: <49BF83C2.9090505@canterbury.ac.nz> Mark Dickinson wrote: > The PEP specifies that format(1234, "08,d") > should give '0001,234', but that's something of a special case: > ',001,234' isn't really a viable alternative. Both of those look equally unviable to me. I don't think I'd ever use zero filling together with commas myself, as it looks decidedly weird, but if I had to pick a meaning for format(1234, "08,d") I think I would make it ' 001,234' the reasoning being that since a comma falls on the first position of an 8-char field, you can never put a digit there, and putting a comma at the beginning is no use. If there are more than 6 digits, then you get a comma plus an extra digit, making the field overflow to 9 characters, e.g. 
format(1234567, "08,d") gives '1,234,567' -- Greg From eric at trueblade.com Tue Mar 17 12:15:13 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 07:15:13 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> Message-ID: <49BF8641.5080609@trueblade.com> Mark Dickinson wrote: > On Tue, Mar 17, 2009 at 3:14 AM, Guido van Rossum wrote: >> On Mon, Mar 16, 2009 at 7:33 PM, Raymond Hettinger wrote: >>> Mark PEP 378 as accepted with Nick's original comma-only version? >> OK, done. Looking forward to a swift implementation! > > I'll implement this for Decimal; it shouldn't take long. > > One question from the PEP, which I've been too slow to read until > this morning: should commas appear in the zero-filled part of a > number? That is, should format(1234, "09,d") give '00001,234' > or '0,001,234'? The PEP specifies that format(1234, "08,d") > should give '0001,234', but that's something of a special case: > ',001,234' isn't really a viable alternative. Hmm. No good answers here. I'd vote for not putting the commas in the leading zeros. I don't think anyone would ever actually use this combination, and putting commas there complicates things due to the special case with the first digit. Plus, they're not inserted by the 'n' formatter, and no one has complained (which might mean no one's using it, of course). 
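For what it's worth, the implementation that eventually landed settled this zero-fill question: commas are inserted into the zero padding, and a would-be leading comma is avoided by widening the result with one extra zero. A sketch of the behavior as shipped, checked against current CPython:

```python
# Zero filling combined with ',' as eventually implemented:
# commas appear inside the padding, and a leading comma is replaced
# by an extra zero, widening the result beyond the width if needed.
print(format(1234, "09,d"))     # fits the width of 9 exactly
print(format(1234, "08,d"))     # widened from 8 to 9 characters
print(format(1234567, "08,d"))  # overflows the width anyway
```

So the commas-line-up-vertically position won out, with a width overflow handling the leading-comma corner case.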
In 2.6:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF8')
'en_US.UTF8'
>>> format(12345, '010n')
'000012,345'
>>> format(12345, '09n')
'00012,345'
>>> format(12345, '08n')
'0012,345'
>>> format(12345, '07n')
'012,345'
>>> format(12345, '06n')
'12,345'
>>>

From hwpuschm at yahoo.de Tue Mar 17 13:23:23 2009 From: hwpuschm at yahoo.de (hwpuschm at yahoo.de) Date: Tue, 17 Mar 2009 12:23:23 +0000 (GMT) Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) Message-ID: <339525.47450.qm@web25805.mail.ukl.yahoo.com> Thank you very much for correctly remarking that the "definition" I formulated contradicts the examples I gave and is therefore utterly inadequate:

> It is proposed to introduce a Keyword "same",
> to be used in the right hand side of assignments, as
> follows:
>
> "xx = same + 5" or "...DELETED..." synonymous with "xx += 5"
> "value = 2*same + 5" synonymous with "value *= 2; value += 5"
> "switch = 1 - same" synonymous with "switch *= -1; switch += 1"
> "lst = same + [5,6]" synonymous with "lst += [5,6]"
>
> "lst = [5,6] + same" synonymous with "...DELETED..."
> "lst[2] = 1/same" synonymous with "lst[2] **= -1"
>
> and so on.

What I would like is to extend the augmented assignment and make it easy to understand for naive readers. I hope the following literary definition is consistent enough to convey the correct meaning: "whenever it is possible, modify the target IN PLACE according to the right hand side expression. If it is not possible to do such a thing, substitute the target object with an object that is built according to the right hand side expression and subsequently deleted" The following examples should be correct:

"xx = same + 5" synonymous with "xx += 5"
"value = 2*same + 5" synonymous with "value *= 2; value += 5"
"switch = 1 - same" synonymous with "switch *= -1; switch += 1"
"lst = same + [5,6]" synonymous with "lst += [5,6]"
"lst[2] = 1/same" synonymous with 
"lst[2] **= -1"

The following examples would be extensions:

"lst = [5,6] + same" synonymous with "lst.reverse(); lst.extend([6,5]); lst.reverse()"
"inmutable = same*(same+1)" synonymous with "unused = inmutable + 1; inmutable *= unused; del unused"

There seems to be no really simple expression for the above extensions, and I take that as an indication that the proposed feature could be quite useful. From andre.roberge at gmail.com Tue Mar 17 13:41:42 2009 From: andre.roberge at gmail.com (Andre Roberge) Date: Tue, 17 Mar 2009 09:41:42 -0300 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <339525.47450.qm@web25805.mail.ukl.yahoo.com> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> Message-ID: <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> On Tue, Mar 17, 2009 at 9:23 AM, wrote: > > Thank you very much for correctly remarking that the "definition" I > formulated contradicts the examples I gave and is therefore utterly > inadequate: > > > It is proposed to introduce a Keyword "same", > > to be used in the right hand side of assignments, as > > follows: I once wrote a blog post on how an expression like "N = N + 1" was confusing to beginners, so I'm sympathetic with the underlying idea. (http://aroberge.blogspot.com/2005/05/n-n-1.html Note that there are much better explanations for naming objects in Python than this old post I wrote.) However, I am -1 on this proposal. IMO, it decreases readability and achieves very little in terms of clearing up the confusion. Quick test: which is the easier to read and get right?

n = same + 1
n = sane + 1
n = n + 1

André

> > > > > "xx = same + 5" or "...DELETED..." 
synonymous with "xx += 5" > > "value = 2*same + 5" synonymous with "value =*2; > > value +=5" > > "switch = 1 - same" synonymous with "switch *-1; > > switch +=1" > > "lst = same + [5,6]" synonymous with "lst += [5,6]" > > > > "lst = [5,6] + same" synonymous with "...DELETED..." > > "lst[2] = 1/same" synonymous with "lst[2] **=-1" > > > > and so on. > > What I would like is to extend the augmented assignment > and make it easy to understand for naive readers. > I hope the following literary definition > is consistent enough to convey the correct meaning: > "whenever it is possible, modify the target IN PLACE > according to the right hand side expression. > If it is not possible to do such a thing, > substitute the target object with > an object that is build according to the right hand side expression > and subsequently deleted" > > The following examples should be correct: > "xx = same + 5" synonymous with "xx += 5" > "value = 2*same + 5" synonymous with "value =*2; value +=5" > "switch = 1 - same" synonymous with "switch *-1; switch +=1" > "lst = same + [5,6]" synonymous with "lst += [5,6]" > "lst[2] = 1/same" synonymous with "lst[2] **=-1" > The following examples would be extensions: > "lst = [5,6] + same" synonymous with > "lst.reverse(); lst.extend([6,5]); lst.reverse()" > "inmutable = same*(same+1)" synonymous with > "unused=inmutable+1; inmutable*=unused; del unused" > > There seems to be no really simple expression for the above extensions, > and I take that as an indication > that the proposed feature could be quite useful. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From dickinsm at gmail.com  Tue Mar 17 13:57:26 2009
From: dickinsm at gmail.com (Mark Dickinson)
Date: Tue, 17 Mar 2009 12:57:26 +0000
Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev)
In-Reply-To: <49BF8641.5080609@trueblade.com>
References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1>
	<385A4935485649C38210DC189A08C9BC@RaymondLaptop1>
	<5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1>
	<5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com>
	<49BF8641.5080609@trueblade.com>
Message-ID: <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com>

On Tue, Mar 17, 2009 at 11:15 AM, Eric Smith wrote:
> Hmm. No good answers here. I'd vote for not putting the commas in the
> leading zeros. I don't think anyone would ever actually use this
> combination, and putting commas there complicates things due to the special
> case with the first digit.
>
> Plus, they're not inserted by the 'n' formatter, and no one has complained
> (which might mean no one's using it, of course).

But they *are* inserted by locale.format, and presumably no-one has
complained about that either. :-)

>>> locale.format('%014f', 123.456, grouping=1)
'0,000,123.456000'

It appears that locale.format adds the thousand separators after the
fact, so the issue with the leading comma doesn't come up. That also
means that the relationship between the field width (14 in this case)
and the string length (16) is somewhat obscured.
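[The "after the fact" insertion described above can be sketched in a few
lines of plain Python. This is only an illustration of the mechanism, not
locale's actual implementation; `group` is a made-up helper name.]

```python
def group(digits, sep=',', size=3):
    """Insert sep every `size` digits, counting from the right."""
    out = []
    for i, ch in enumerate(reversed(digits)):
        if i and i % size == 0:
            out.append(sep)
        out.append(ch)
    return ''.join(reversed(out))

# Zero-pad first with plain %-formatting, then split off the integer
# part and regroup it -- the separators are added "after the fact",
# so the final string grows past the requested width of 14.
padded = '%014f' % 123.456          # '0000123.456000', width 14
intpart, frac = padded.split('.')
result = group(intpart) + '.' + frac
```

Running this reproduces the 16-character `'0,000,123.456000'` shown above,
which is exactly why the field width and the final string length diverge.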
Mark From dickinsm at gmail.com Tue Mar 17 14:18:42 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 17 Mar 2009 13:18:42 +0000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BF83C2.9090505@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF83C2.9090505@canterbury.ac.nz> Message-ID: <5c6f2a5d0903170618w758d994dyb9254b1bcc3a9660@mail.gmail.com> On Tue, Mar 17, 2009 at 11:04 AM, Greg Ewing wrote: > [...] as it looks decidedly weird, but if I had > to pick a meaning for format(1234, "08,d") I think > I would make it > > ?' 001,234' Yes, that looks better than either of the alternatives I gave. I think I prefer that commas *do* appear in the zero padding, though as Eric says, it does add some extra complication to the code. In the case of the decimal code that complication is significant, mainly because of the need to figure out how much space is available for the zeros *before* doing the comma insertion. Mark From eric at trueblade.com Tue Mar 17 14:24:04 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 09:24:04 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> Message-ID: <49BFA474.9040308@trueblade.com> Mark Dickinson wrote: > On Tue, Mar 17, 2009 at 11:15 AM, Eric Smith wrote: >> Hmm. 
No good answers here. I'd vote for not putting the commas in the >> leading zeros. I don't think anyone would ever actually use this >> combination, and putting commas there complicates things due to the special >> case with the first digit. >> >> Plus, they're not inserted by the 'n' formatter, and no one has complained >> (which might mean no one's using it, of course). > > But they *are* inserted by locale.format, and > presumably no-one has complained about that either. :-) > >>>> format('%014f', 123.456, grouping=1) > '0,000,123.456000' > > It appears that locale.format adds the thousand separators after > the fact, so the issue with the leading comma doesn't come up. > That also means that the relationship between the field width (14 > in this case) and the string length (16) is somewhat obscured. Ick. Presumably you specified a width because that's how wide you wanted the output to be! I still like leaving the commas out of leading zeros. From jh at improva.dk Tue Mar 17 14:21:46 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 17 Mar 2009 14:21:46 +0100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> Message-ID: <49BFA3EA.70108@improva.dk> Andre Roberge wrote: > > > On Tue, Mar 17, 2009 at 9:23 AM, > wrote: > > > Thank you very much for correctly remarking that the "definition" > I formulated contradicts the examples I gave and is therefore > utterly inadecuate: > > > It is proposed to introduce a Keyword "same", > > to be used in the right hand side of assignments, as > > follows: > > > > I once wrote a blog post on how an expression like "N = N + 1" was > confusing to beginners, so I'm sympathetic with the underlying idea. 
> (http://aroberge.blogspot.com/2005/05/n-n-1.html Note that there > are much better explanations for naming objects in Python than this > old post I wrote.) > > However, I am -1 on this proposal. IMO, it decreases readability and > achieves very little in terms of clearing up the confusion. > > Quick test: which is the easier to read and get right? > > n = same + 1 > n = sane + 1 > n = n + 1 > I believe that as soon as the left-hand side stops being a simple variable and it is used in non-trivial expressions on the right-hand side, using the keyword would help clarify the intent. What I mean is that the examples you should be looking at are more like: A[n+1] = same*same + 1 B[2*j].foo = frobnicate(same, same+1) ... If you try expanding these into current python with minimal change in semantics you will end up with something like _1 = n+1 _2 = A[_1] A[_1] = _2*_2 + 1 del _1 del _2 _1 = B[2*j] _2 = _1.foo _1.foo = frobnicate(_2, _2+1) del _1 del _2 which is much less readable. I still think that the cost of a new keyword is probably too high a price to pay, but I like the idea. Jacob From guido at python.org Tue Mar 17 15:17:10 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Mar 2009 07:17:10 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BFA474.9040308@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> Message-ID: On Tue, Mar 17, 2009 at 6:24 AM, Eric Smith wrote: > Mark Dickinson wrote: >> >> On Tue, Mar 17, 2009 at 11:15 AM, Eric Smith wrote: >>> >>> Hmm. No good answers here. I'd vote for not putting the commas in the >>> leading zeros. 
?I don't think anyone would ever actually use this >>> combination, and putting commas there complicates things due to the >>> special >>> case with the first digit. >>> >>> Plus, they're not inserted by the 'n' formatter, and no one has >>> complained >>> (which might mean no one's using it, of course). >> >> But they *are* inserted by locale.format, and >> presumably no-one has complained about that either. :-) >> >>>>> format('%014f', 123.456, grouping=1) >> >> '0,000,123.456000' >> >> It appears that locale.format adds the thousand separators after >> the fact, so the issue with the leading comma doesn't come up. >> That also means that the relationship between the field width (14 >> in this case) and the string length (16) is somewhat obscured. > > Ick. Presumably you specified a width because that's how wide you wanted the > output to be! > > I still like leaving the commas out of leading zeros. Ick, the discrepancy between the behavior of locale.format() and PEP 378 is unfortunate. I agree that the given width should include the commas, but I strongly feel that leading zeros should be comma-fied just like everything else. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From eric at trueblade.com Tue Mar 17 15:56:07 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 10:56:07 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170618w758d994dyb9254b1bcc3a9660@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF83C2.9090505@canterbury.ac.nz> <5c6f2a5d0903170618w758d994dyb9254b1bcc3a9660@mail.gmail.com> Message-ID: <49BFBA07.1080105@trueblade.com> Mark Dickinson wrote: > On Tue, Mar 17, 2009 at 11:04 AM, Greg Ewing > wrote: >> [...] 
as it looks decidedly weird, but if I had >> to pick a meaning for format(1234, "08,d") I think >> I would make it >> >> ' 001,234' > > Yes, that looks better than either of the alternatives I gave. > > I think I prefer that commas *do* appear in the zero padding, though > as Eric says, it does add some extra complication to the code. In > the case of the decimal code that complication is significant, mainly > because of the need to figure out how much space is available > for the zeros *before* doing the comma insertion. If you look at _PyString_InsertThousandsGrouping, you'll see that it gets called twice. Once to compute the size, and once to actually do the inserting. From eric at trueblade.com Tue Mar 17 15:58:17 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 10:58:17 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> Message-ID: <49BFBA89.3060406@trueblade.com> Guido van Rossum wrote: > Ick, the discrepancy between the behavior of locale.format() and PEP > 378 is unfortunate. I agree that the given width should include the > commas, but I strongly feel that leading zeros should be comma-fied > just like everything else. And what happens when the comma would be the first character? ,012,345 0012,345 or something else? From mishok13 at gmail.com Tue Mar 17 16:00:52 2009 From: mishok13 at gmail.com (Andrii V. 
Mishkovskyi)
Date: Tue, 17 Mar 2009 17:00:52 +0200
Subject: [Python-ideas] dict '+' operator and slicing support for pop
Message-ID: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com>

First of all, this is my first attempt at submitting an idea to
python-ideas. So, here it goes. :)

1. Add ability to use '+' operator for dicts

I often wonder why list and tuple instances have '+' and '+='
operators but dicts don't?
It's not that rare in my code (and code written by others, as it
seems) that i have to write:

a.update(b)
return a

I do understand that adding additional magic method may be
inappropriate for dict, but I think it would be nice addition to a
language. So, my proposal is that:

x = a + b
would become equivalent to
x = dict(a, **b)

a += b
would become equivalent to
a.update(b)

And the example I gave before would be translated to:

return a + b

Note that there is a difference between these two examples in
semantics: the latter one creates a new dict. But that's what the
user doesn't care about in 99% of use-cases.

A very basic implementation in Python:

>>> class Dict(dict):
...     def __add__(self, other):
...         return self.__class__(self, **other)
...     def __iadd__(self, other):
...         self.update(other)
...         return self
...
>>> a = Dict(foo=12, bar=14, baz=16)
>>> b = Dict(spam=13, eggs=17, bacon=19)
>>> a + b
{'bar': 14, 'spam': 13, 'bacon': 19, 'eggs': 17, 'foo': 12, 'baz': 16}
>>> a += b
>>> a
{'bar': 14, 'spam': 13, 'bacon': 19, 'eggs': 17, 'foo': 12, 'baz': 16}
>>> b
{'eggs': 17, 'bacon': 19, 'spam': 13}

Note: if this is ever going to be implemented, then the Mapping ABCs
will have to implement these methods too, which doesn't sound
backwards-compatible to me. :)

2. Ability to use slices in pop

This was discussed earlier (4.5 years ago, actually) in this thread:
http://mail.python.org/pipermail/python-dev/2004-November/049895.html
Even though the original request became a full-blown proposal (PEP-3132)
and was implemented in py3k, the internal discussion about pop allowing
slices as arguments has gone silent. There was some positive feedback
from Python developers and I think I can provide a patch for this
functionality in 2 weeks. Is there still some interest in this? There
is nothing really hard about it; it would be something like this:

>>> class List(list):
...     def pop(self, index_or_slice):
...         ret = self[index_or_slice]
...         del self[index_or_slice]
...         return ret
...
>>> x = List(range(10))
>>> x.pop(slice(1, 4))
[1, 2, 3]
>>> x
[0, 4, 5, 6, 7, 8, 9]
>>> x.pop(5)
8
>>> x
[0, 4, 5, 6, 7, 9]

Note: some people think that pop returning either a list or a single
item depending on what is being passed to pop() is bad or something. I
don't see a problem here, because a simple some_list[index_or_slice]
can also return a list or just one item depending on what type
`index_or_slice` is.

--
Wbr, Andrii V. Mishkovskyi.

He's got a heart of a little child, and he keeps it in a jar on his desk.
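[A small runnable sketch of the merge semantics proposed above, using
plain dicts on a modern Python; the variable names are illustrative only.
Note how the left dict's value for a common key silently disappears from
the result -- the point the later objections in this thread focus on.]

```python
# Semantics of the proposed "a + b", spelled with today's tools:
# build a new dict from a, then let b's entries win for common keys
# (keyword unpacking generally requires b's keys to be strings).
a = {'foo': 12, 'bar': 14, 'baz': 16}
b = {'bar': 99, 'spam': 13}

merged = dict(a, **b)

# Union of the keys, with b's value winning on the collision:
assert merged == {'foo': 12, 'bar': 99, 'baz': 16, 'spam': 13}
assert merged is not a      # '+' would build a new object...
assert a['bar'] == 14       # ...and leave a untouched; a's 14 is
                            # simply absent from the merged result
```

A real operator implementation would not be limited to string keys the
way the `dict(a, **b)` spelling is.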
From dickinsm at gmail.com Tue Mar 17 16:12:20 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 17 Mar 2009 15:12:20 +0000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BFBA89.3060406@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> Message-ID: <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> On Tue, Mar 17, 2009 at 2:58 PM, Eric Smith wrote: > And what happens when the comma would be the first character? > > ,012,345 > 0012,345 > > or something else? Options are: (A) ",012,345" (B) "0012,345" (C) " 012,345" (D) "0,012,345" (E) write-in option here I vote for (D): it's one character too large, but the given precision is only supposed to be a minimum anyway. We already end up with a length-9 string when formatting 1234567. (D) is the minimum width string that: doesn't look weird (like (A) and (B)), has length at least 8, and is still in the right basic format (C) would be my second choice, but I find the extra space padding to be somewhat arbitrary (why a space? why not some other padding character?) 
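[Option (D) above amounts to "keep prepending zeros until the grouped
string reaches the minimum width, overshooting rather than leading with a
separator". A hedged sketch of that rule, with made-up helper names, not
the eventual C implementation:]

```python
def group(digits, sep=','):
    """Insert sep every three digits, counting from the right."""
    out = []
    for i, ch in enumerate(reversed(digits)):
        if i and i % 3 == 0:
            out.append(sep)
        out.append(ch)
    return ''.join(reversed(out))

def zero_pad_grouped(n, width):
    """Option (D): prepend zeros until the grouped string is wide enough.

    Because grouping is re-done after each added zero, the result can
    overshoot the requested width instead of leading with a comma.
    """
    digits = str(n)
    while len(group(digits)) < width:
        digits = '0' + digits
    return group(digits)

assert zero_pad_grouped(12345, 8) == '0,012,345'    # the case debated here
assert zero_pad_grouped(1234567, 8) == '1,234,567'  # already wide enough
```

This matches the behaviour PEP 378 ultimately specified: format(1234,
"08,d") gives '0,001,234'.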
Mark

From denis.spir at free.fr  Tue Mar 17 16:36:51 2009
From: denis.spir at free.fr (spir)
Date: Tue, 17 Mar 2009 16:36:51 +0100
Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev)
In-Reply-To: <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com>
References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1>
	<5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com>
	<49BF8641.5080609@trueblade.com>
	<5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com>
	<49BFA474.9040308@trueblade.com>
	<49BFBA89.3060406@trueblade.com>
	<5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com>
Message-ID: <20090317163651.056a2807@o>

On Tue, 17 Mar 2009 15:12:20 +0000, Mark Dickinson wrote:

> On Tue, Mar 17, 2009 at 2:58 PM, Eric Smith wrote:
> > And what happens when the comma would be the first character?
> >
> > ,012,345
> > 0012,345
> >
> > or something else?
>
> Options are:
>
> (A) ",012,345"
> (B) "0012,345"
> (C) " 012,345"
> (D) "0,012,345"
> (E) write-in option here
>
> I vote for (D): it's one character too large, but the given precision
> is only supposed to be a minimum anyway. We already end up
> with a length-9 string when formatting 1234567.
>
> (D) is the minimum width string that:
> doesn't look weird (like (A) and (B)),
> has length at least 8, and
> is still in the right basic format
>
> (C) would be my second choice, but I find the extra space padding
> to be somewhat arbitrary (why a space? why not some other
> padding character?)

I agree with all the comments above.
* A is ... (censured).
* B does not comply with the user's choice.
* D is the best in theory, but would trouble table-like vertical alignment.
* So only C remains for me.

Also, the issue here comes from user inconsistency: a (total) width of 8
simply cannot fit with group separators every 3 digits (warning?).
At best, there should be some information on this topic to avoid bad surprises, but then the implementation should not care much. > Mark Denis ------ la vita e estrany From dickinsm at gmail.com Tue Mar 17 16:53:38 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 17 Mar 2009 15:53:38 +0000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090317163651.056a2807@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> <20090317163651.056a2807@o> Message-ID: <5c6f2a5d0903170853w5bf40eddu1e0c140829240a2f@mail.gmail.com> On Tue, Mar 17, 2009 at 3:36 PM, spir wrote: > * A is ... (censured). > * B does not comply with user choice. > * D is the best in theory, but would trouble table-like vertical alignment. I don't see why it would: could you elaborate? Mark From george.sakkis at gmail.com Tue Mar 17 17:03:16 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Tue, 17 Mar 2009 12:03:16 -0400 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> Message-ID: <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi wrote: > 1. Add ability to use '+' operator for dicts > > I often wonder why list and tuple instances have '+' and '+=' > operators but dicts don't? 
> It's not that rare in my code (and code written by others, as it
> seems) that i have to write:
>
> a.update(b)
> return a
>
> I do understand that adding additional magic method may be
> inappropriate for dict, but I think it would be nice addition to a
> language. So, my proposal is that:
>
> x = a + b
> would become equivalent to
> x = dict(a, **b)
>
> a += b
> would become equivalent to
> a.update(b)

That's one way to define dict addition but it's not the only, or even,
the best one. It's hard to put in words exactly why but I expect "a+b"
to take into account the full state of the operands, not just a part
of it. In your proposal the values of the first dict for the common
keys are effectively ignored, which doesn't seem to me as a good fit
for an additive operation. I would find at least as reasonable and
intuitive the following definition that doesn't leak information:

def sum_dicts(*dicts):
    from collections import defaultdict
    s = defaultdict(list)
    for d in dicts:
        for k,v in d.iteritems():
            s[k].append(v)
    return s

>>> d1 = {'a':2,'b':5}
>>> d2 = {'a':2,'c':6,'z':3}
>>> d3 = {'b':2,'c':5}
>>> sum_dicts(d1,d2,d3)
defaultdict(<type 'list'>, {'a': [2, 2], 'c': [6, 5], 'b': [5, 2], 'z': [3]})

George

From guido at python.org  Tue Mar 17 17:36:10 2009
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Mar 2009 09:36:10 -0700
Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev)
In-Reply-To: <49BFA474.9040308@trueblade.com>
References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1>
	<5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1>
	<5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com>
	<49BF8641.5080609@trueblade.com>
	<5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com>
	<49BFA474.9040308@trueblade.com>
Message-ID: 

On Tue, Mar 17, 2009 at 8:12 AM, Mark Dickinson wrote:
> On Tue, Mar 17, 2009 at 2:58 PM, Eric
Smith wrote: >> And what happens when the comma would be the first character? >> >> ,012,345 >> 0012,345 >> >> or something else? > > Options are: > > (A) ",012,345" > (B) "0012,345" Neither (A) nor (B) is acceptable. > (C) " 012,345" > (D) "0,012,345" > (E) write-in option here > > I vote for (D): ?it's one character too large, but the given precision > is only supposed to be a minimum anyway. ?We already end up > with a length-9 string when formatting 1234567. > > (D) is the minimum width string that: > ?doesn't look weird (like (A) and (B)), > ?has length at least 8, and > ?is still in the right basic format > > (C) would be my second choice, but I find the extra space padding > to be somewhat arbitrary (why a space? why not some other > padding character?) It's tough to choose between (C) and (D). I guess we'll have to look at use cases for leading zeros. I can think of two use cases for leading zeros are: (1) To avoid font-width issues -- many variable-width fonts are designed so that all digits have the same width, but their (default) space is much narrower. (2) To avoid fraud when printing certain documents -- it's easier to insert a '1' in front of a small number than to change a '0' into something else. Since both use cases are trying to avoid spaces, I think (D) is the winner here. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From josiah.carlson at gmail.com Tue Mar 17 18:36:19 2009 From: josiah.carlson at gmail.com (Josiah Carlson) Date: Tue, 17 Mar 2009 10:36:19 -0700 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: On Tue, Mar 17, 2009 at 9:03 AM, George Sakkis wrote: > On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi > wrote: > >> 1. 
Add ability to use '+' operator for dicts >> >> I often wonder why list and tuple instances have '+' and '+=' >> operators but dicts don't? >> It's not that rare in my code (and code written by others, as it >> seems) that i have to write: >> >> a.update(b) >> return a >> >> I do understand that adding additional magic method may be >> inappropriate for dict, but I think it would be nice addition to a >> language. So, my proposal is that: >> >> x = a + b >> would become equivalent to >> x = dict(a, **b) >> >> a += b >> would become equivalent to >> a.update(b) > > That's one way to define dict addition but it's not the only, or even, > the best one. It's hard to put in words exactly why but I expect "a+b" > to take into account the full state of the operands, not just a part > of it. In your proposal the values of the first dict for the common > keys are effectively ignored, which doesn't seem to me as a good fit > for an additive operation. I would find at least as reasonable and > intuitive the following definition that doesn't leak information: > > def sum_dicts(*dicts): > ? ?from collections import defaultdict > ? ?s = defaultdict(list) > ? ?for d in dicts: > ? ? ? ?for k,v in d.iteritems(): > ? ? ? ? ? ?s[k].append(v) > ? ?return s > >>>> d1 = {'a':2,'b':5} >>>> d2 = {'a':2,'c':6,'z':3} >>>> d3 = {'b':2,'c':5} >>>> sum_dicts(d1,d2,d3) > defaultdict(, {'a': [2, 2], 'c': [6, 5], 'b': [5, 2], 'z': [3]}) Both of the ideas suffer from "+ is no longer commutative", which sort-of bothers me. I say sort-of, because I would actually prefer Andrii's semantics over yours, and if you prefer the elements from b, you use 'a + b', but if you prefer the elements from a, you use 'b + a'. Then again, I'm tending towards a -.75 on the entire idea; despite it being convenient, I can see non-comutativity as being confusing. As for the list slice popping...I'm tending towards a -1. 
While I can see the convenience in some cases, I'm just not sure it's
compelling enough (especially because you need to generate the slice in
advance of using it). As stated in the past... not all 2-line functions
need to be built-in or syntax. I don't believe either of these is able
to pass the "is it necessary as part of a compelling use-case?" question.

 - Josiah

From guido at python.org  Tue Mar 17 18:39:38 2009
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Mar 2009 10:39:38 -0700
Subject: [Python-ideas] dict '+' operator and slicing support for pop
In-Reply-To: <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com>
References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com>
	<91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com>
Message-ID: 

Because there are so many different ways to think about this, it's
better not to guess and force the user to be explicit.

On Tue, Mar 17, 2009 at 9:03 AM, George Sakkis wrote:
> On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi
> wrote:
>
>> 1. Add ability to use '+' operator for dicts
>>
>> I often wonder why list and tuple instances have '+' and '+='
>> operators but dicts don't?
>> It's not that rare in my code (and code written by others, as it
>> seems) that i have to write:
>>
>> a.update(b)
>> return a
>>
>> I do understand that adding additional magic method may be
>> inappropriate for dict, but I think it would be nice addition to a
>> language. So, my proposal is that:
>>
>> x = a + b
>> would become equivalent to
>> x = dict(a, **b)
>>
>> a += b
>> would become equivalent to
>> a.update(b)
>
> That's one way to define dict addition but it's not the only, or even,
> the best one. It's hard to put in words exactly why but I expect "a+b"
> to take into account the full state of the operands, not just a part
> of it. In your proposal the values of the first dict for the common
> keys are effectively ignored, which doesn't seem to me as a good fit
> for an additive operation.
I would find at least as reasonable and > intuitive the following definition that doesn't leak information: > > def sum_dicts(*dicts): > ? ?from collections import defaultdict > ? ?s = defaultdict(list) > ? ?for d in dicts: > ? ? ? ?for k,v in d.iteritems(): > ? ? ? ? ? ?s[k].append(v) > ? ?return s > >>>> d1 = {'a':2,'b':5} >>>> d2 = {'a':2,'c':6,'z':3} >>>> d3 = {'b':2,'c':5} >>>> sum_dicts(d1,d2,d3) > defaultdict(, {'a': [2, 2], 'c': [6, 5], 'b': [5, 2], 'z': [3]}) > > George > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From qrczak at knm.org.pl Tue Mar 17 18:55:53 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Tue, 17 Mar 2009 18:55:53 +0100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <339525.47450.qm@web25805.mail.ukl.yahoo.com> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> Message-ID: <3f4107910903171055s33384a02jb32a843ab0ff6d1c@mail.gmail.com> On Tue, Mar 17, 2009 at 13:23, wrote: > ? "lst = [5,6] + same" synonymous with > ? ? ? "lst.reverse(); lst.extend([6,5]); lst.reverse()" What about: lst = x + same ? -- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From python at zesty.ca Tue Mar 17 18:58:47 2009 From: python at zesty.ca (Ka-Ping Yee) Date: Tue, 17 Mar 2009 10:58:47 -0700 (PDT) Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: On Tue, 17 Mar 2009, Josiah Carlson wrote: > On Tue, Mar 17, 2009 at 9:03 AM, George Sakkis wrote: >> On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi >> wrote: >> >>> 1. 
Add ability to use '+' operator for dicts
>>>
>
> Both of the ideas suffer from "+ is no longer commutative", which
> sort-of bothers me.

I don't find that a convincing argument, since + is not commutative
for lists or tuples either. Andrii's original proposal is the most
natural interpretation -- notice that if x and y are dicts:

    dict(x.items()) gives x

    dict(x.items() + y.items()) gives x + y

That looks perfectly consistent to me.

George's counter-proposal doesn't make sense to me at all -- it
messes up the types of all the values in the dict.
?And it's > inconsistent with the built-in behaviour of + with other types: > it doesn't add lists element-by-element, so it shouldn't add > dicts element-by-element either. Not to put words into people's mouths, but it seems like the concern was really less over the non-commutativity and move over the fact that data from the first dict gets silently clobbered by the second dict; whereas in the list, tuple, and string cases, no data is ever lost in the process. Cheers, Chris -- I have a blog: http://blog.rebertia.com From python at rcn.com Tue Mar 17 19:17:02 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 11:17:02 -0700 Subject: [Python-ideas] dict '+' operator and slicing support for pop References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com><91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: >> a.update(b) >> return a Why take two short, simple lines with unequivocal meaning and then abbreviate them with something mysterious (or at least something with multiple possible interpretations)? Mappings exist in many languages now. Can you point to another language that has found it worthwhile to have both an update() method and an addition operator? Also, consider that dicts are one of our most basic APIs and many other objects model that API. It behooves us to keep that API as simple and thin as possible. IMO, this change would be gratuituous. None of the code presented so far is significantly improved. Essentially, we're looking at a trivial abbreviation, not an actual offering of new capabilities. -1 all the way around. 
Raymond From george.sakkis at gmail.com Tue Mar 17 19:26:44 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Tue, 17 Mar 2009 14:26:44 -0400 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: <50697b2c0903171114n1a721492q9a62bcac07b7c0e@mail.gmail.com> References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> <50697b2c0903171114n1a721492q9a62bcac07b7c0e@mail.gmail.com> Message-ID: <91ad5bf80903171126t3c427d31p89338f7b6ac172f8@mail.gmail.com> On Tue, Mar 17, 2009 at 2:14 PM, Chris Rebert wrote: > On Tue, Mar 17, 2009 at 10:58 AM, Ka-Ping Yee wrote: >> On Tue, 17 Mar 2009, Josiah Carlson wrote: >> >>> On Tue, Mar 17, 2009 at 9:03 AM, George Sakkis >>> wrote: >>>> >>>> On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi >>>> wrote: >>>> >>>>> 1. Add ability to use '+' operator for dicts >>>>> >>> >>> Both of the ideas suffer from "+ is no longer commutative", which >>> sort-of bothers me. >> >> I don't find that a convincing argument, since + is not commutative >> for lists or tuples either. Andrii's original proposal is the most >> natural interpretation -- notice that if x and y are dicts: >> >> dict(x.items()) gives x >> >> dict(x.items() + y.items()) gives x + y >> >> That looks perfectly consistent to me. >> >> George's counter-proposal doesn't make sense to me at all -- it >> messes up the types of all the values in the dict. And it's >> inconsistent with the built-in behaviour of + with other types: >> it doesn't add lists element-by-element, so it shouldn't add >> dicts element-by-element either. > > Not to put words into people's mouths, but it seems like the concern > was really less over the non-commutativity and more over the fact that > data from the first dict gets silently clobbered by the second dict; > whereas in the list, tuple, and string cases, no data is ever lost in > the process.
Just to be clear, I'm between -0.5 and -1 to the whole idea; my counter-proposal was simply meant to point out the potential ambiguity in semantics and the fact that the original proposal silently loses data. George From greg.ewing at canterbury.ac.nz Tue Mar 17 21:39:55 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Mar 2009 08:39:55 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> Message-ID: <49C00A9B.4080509@canterbury.ac.nz> Mark Dickinson wrote: >>>>format('%014f', 123.456, grouping=1) > > '0,000,123.456000' > > That also means that the relationship between the field width (14 > in this case) and the string length (16) is somewhat obscured. I'd consider that part a bug that we shouldn't imitate. The field width should always be what you say it is, unless the value is too big to fit. -- Greg From python at rcn.com Tue Mar 17 21:40:51 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 13:40:51 -0700 Subject: [Python-ideas] Customizing format() Message-ID: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> I've been exploring how to customize our thousands separators and decimal separators and wanted to offer-up an idea. It arose when I was looking at Java's DecimalFormat class and its customization tool DecimalFormatSymbols http://java.sun.com/javase/6/docs/api/java/text/DecimalFormat.html . Also, I looked at how regular expression patterns provide options to change the meaning of its special characters using (?iLmsux). I. 
Simplest version -- Translation pairs format(1234, "8,.1f") --> ' 1,234.0' format(1234, "(,_)8,.1f") --> ' 1_234.0' format(1234, "(,_)(.,)8,.1f") --> ' 1_234,0' This approach is very easy to implement and it doesn't make life difficult for the parser which can continue to look for just a comma and period with their standardized meaning. It also fits nicely in our current framework and doesn't require any changes to the format() builtin. Of all the options, I find this one to be the easiest to read. Also, this version makes it easy to employ a couple of techniques to factor-out formatting decisions. Here's a gettext() style approach. def _(s): return '(,.)(.,)' + s . . . format(x, _('8.1f')) Here's another approach using implicit string concatenation: DEB = '(,_)' # style for debugging EXT = '(, )' # style for external display . . . format(x, DEB '8.1f') format(y, EXT '8d') There are probably many ways to factor-out the decision. We don't need to decide which is best, we just need to make it possible. One other thought, this approach makes it possible to customize all of the characters that are currently hardwired (including zero and space padding characters and the 'E' or 'e' exponent symbols). II. Javaesque version -- FormatSymbols object This is essentially the same idea as previous one but involves modifying the format() builtin to accept a symbols object and pass it to __format__ methods. This moves the work outside of the format string itself: DEB = FormatSymbols(comma='_') EXT = FormatSymbols(comma=' ') . . . format(x, '8.1f', DEB) format(y, '8d', EXT) The advantage is that this technique is easily extendable beyond simple symbol translations and could possibly allow specification of grouping sizes in hundreds and whatnot. It also looks more like a real program as opposed to a formatting mini-language. The disadvantage is that it is likely slower and it requires mucking with the currently dirt simple format() / __format__() protocol. 
It may also be harder to integrate with existing __format__ methods which are currently very string oriented. Raymond From greg.ewing at canterbury.ac.nz Tue Mar 17 21:52:20 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Mar 2009 08:52:20 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> Message-ID: <49C00D84.6020109@canterbury.ac.nz> Guido van Rossum wrote: > I agree that the given width should include the > commas, but I strongly feel that leading zeros should be comma-fied > just like everything else. I think we need some use cases before a proper decision can be made about this. If you were using comma-separated zero-filled numbers, what would your objective be, and what choice would best fulfill it? -- Greg From python at rcn.com Tue Mar 17 22:00:15 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 14:00:15 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1><5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com><49BF8641.5080609@trueblade.com><5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com><49BFA474.9040308@trueblade.com> <49C00D84.6020109@canterbury.ac.nz> Message-ID: <8C68FC2427504B9E90D340FD2664385B@RaymondLaptop1> [Guido van Rossum] >> I agree that the given width should include the >> commas, but I strongly feel that leading zeros should be comma-fied >> just like everything else. 
+1 [Greg Ewing] > I think we need some use cases before a proper > decision can be made about this. If you were using > comma-separated zero-filled numbers, what would > your objective be, and what choice would best > fulfill it? I gave one example of writing out numbers in columns and that makes it clear that putting commas in the leading zeros is the right thing to do (anything else looks unusably weird). Also, as Guido pointed-out, anyone specifying zero-padding is saying that they intend to not be showing spaces where digits would go. Our choice ought to respect that intention. Raymond From greg.ewing at canterbury.ac.nz Tue Mar 17 22:03:54 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Mar 2009 09:03:54 +1200 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: <49C0103A.1070308@canterbury.ac.nz> George Sakkis wrote: > It's hard to put in words exactly why but I expect "a+b" > to take into account the full state of the operands, not just a part > of it. I think one expects a + operator to be somehow symmetrical with respect to its operands. The lopsidedness of dict updating violates this expectation, and so is better represented by an asymmetrical syntax. 
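A minimal illustration of that lopsidedness, using the items()-concatenation reading of '+' proposed earlier in the thread (the list() wrappers are mine, added so the sketch also runs on Python 3, where items() returns a view):

```python
x = {'a': 1}
y = {'a': 2}

# Under the items()-concatenation reading of "+", the right operand
# silently wins on key collisions, so x + y != y + x whenever keys overlap.
merged_xy = dict(list(x.items()) + list(y.items()))
merged_yx = dict(list(y.items()) + list(x.items()))

print(merged_xy)  # {'a': 2}
print(merged_yx)  # {'a': 1}
```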
-- Greg From eric at trueblade.com Tue Mar 17 22:13:14 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 17:13:14 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BF8641.5080609@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> Message-ID: <49C0126A.3060405@trueblade.com> Eric Smith wrote: > Mark Dickinson wrote: >> One question from the PEP, which I've been too slow to read until >> this morning: should commas appear in the zero-filled part of a >> number? That is, should format(1234, "09,d") give '00001,234' >> or '0,001,234'? The PEP specifies that format(1234, "08,d") >> should give '0001,234', but that's something of a special case: >> ',001,234' isn't really a viable alternative. > > Hmm. No good answers here. I'd vote for not putting the commas in the > leading zeros. I don't think anyone would ever actually use this > combination, and putting commas there complicates things due to the > special case with the first digit. > > Plus, they're not inserted by the 'n' formatter, and no one has > complained (which might mean no one's using it, of course). > > In 2.6: > > >>> import locale > >>> locale.setlocale(locale.LC_ALL, 'en_US.UTF8') > 'en_US.UTF8' > >>> format(12345, '010n') > '000012,345' > >>> format(12345, '09n') > '00012,345' > >>> format(12345, '08n') > '0012,345' > >>> format(12345, '07n') > '012,345' > >>> format(12345, '06n') > '12,345' I think this is a bug that should be fixed in the same way we implement it for PEP 378. It's more complex for 'n', because you might have funny groupings (like every 3, then 2).
But I hope our solution for PEP 378 will generalize to this case, too. Eric. From greg.ewing at canterbury.ac.nz Tue Mar 17 22:41:11 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Mar 2009 09:41:11 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> Message-ID: <49C018F7.8080201@canterbury.ac.nz> Guido van Rossum wrote: > (1) To avoid font-width issues -- many > variable-width fonts are designed so that all digits have the same > width, but their (default) space is much narrower. That's a good point. This alone doesn't necessarily rule out (A), though. It could be considered a case of user stupidity if they specify a field width that results in a comma at the beginning and don't like the result. It doesn't necessarily rule out (C) either, since there will always be a space at the beginning unless the value overflows, and then all your alignment guarantees are blown away anyhow. > (2) To avoid fraud > when printing certain documents -- it's easier to insert a '1' in > front of a small number than to change a '0' into something else. However it's easy to add a '1' before a string of leading zeroes if there's a sliver of space available, so it's better still to fill with some other character such as '*'. You need a cooperative font for that to work.
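For illustration, here is how the two fill characters compare on a Python that implements the ',' option under discussion (PEP 378 landed in 2.7/3.1; the zero-fill output shown assumes the comma-fied-zeros behaviour Guido favours, which is what was eventually implemented):

```python
amount = 1234.5

# '0' fill: the leading zeros are comma-fied, leaving digit-shaped
# characters that a fraudster could overwrite
print(format(amount, '012,.2f'))   # '0,001,234.50'

# '*' fill: check-protection style, nothing digit-shaped to alter
print(format(amount, '*>12,.2f'))  # '****1,234.50'
```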
-- Greg From tjreedy at udel.edu Tue Mar 17 22:43:10 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 17 Mar 2009 17:43:10 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > I've been exploring how to customize our thousands separators and decimal > separators and wanted to offer-up an idea. It arose when I was looking > at Java's DecimalFormat class and its customization tool > DecimalFormatSymbols > http://java.sun.com/javase/6/docs/api/java/text/DecimalFormat.html . > Also, I looked at how regular expression patterns provide options to change > the meaning of its special characters using (?iLmsux). > > I. Simplest version -- Translation pairs > > format(1234, "8,.1f") --> ' 1,234.0' > format(1234, "(,_)8,.1f") --> ' 1_234.0' > format(1234, "(,_)(.,)8,.1f") --> ' 1_234,0' > > This approach is very easy to implement and it doesn't make life difficult > for the parser which can continue to look for just a comma and period > with their standardized meaning. It also fits nicely in our current > framework > and doesn't require any changes to the format() builtin. Of all the > options, > I find this one to be the easiest to read. I strongly prefer suffix to prefix modification. The format gives the overall structure of the output, the rest are details, which a reader may not care so much about. > Also, this version makes it easy to employ a couple of techniques to > factor-out These techniques apply to any "augment the basic format with an affix" method. > formatting decisions. Here's a gettext() style approach. > > def _(s): > return '(,.)(.,)' + s > . . . > format(x, _('8.1f')) > > Here's another approach using implicit string concatenation: > > DEB = '(,_)' # style for debugging > EXT = '(, )' # style for external display > . . . 
> format(x, DEB '8.1f') > format(y, EXT '8d') > > There are probably many ways to factor-out the decision. We don't need to > decide which is best, we just need to make it possible. > > One other thought, this approach makes it possible to customize all of the > characters that are currently hardwired (including zero and space padding > characters and the 'E' or 'e' exponent symbols). Any "augment the format with affixes" method should do the same. I prefer at most a separator (;) between affixes rather than fences around them. I also prefer mnemonic key letters to mark the start of each affix, such as in Guido's quick suggestion: Thousands, Decimal_point, Exponent, Grouping, Pad_char, Money, and so on. But I do not think '=' is needed. Since the replacement will almost always be a single non-capital letter char, I am not sure a separator is even needed, but it would make parsing much easier. G would be followed by one or more digits indicating grouping from Decimal_point leftward, with the last repeated. If grouping by 9s is not large enough, allow a-f to get grouping up to 15 ;-). Example above would be format(1234, '8.1f;T.;P,') > II. Javaesque version -- FormatSymbols object > > This is essentially the same idea as previous one but involves modifying > the format() builtin to accept a symbols object and pass it to > __format__ methods. This moves the work outside of the format string > itself: > > DEB = FormatSymbols(comma='_') > EXT = FormatSymbols(comma=' ') > . . .
It may also be harder to integrate > with existing __format__ methods which are currently very string oriented. I suggested in the thread on exposing the format parse result that the resulting structure (dict or named tuple) could become an alternative, wordy interface to the format functions. I think the mini-language itself should stay mini. Terry Jan Reedy From guido at python.org Tue Mar 17 22:50:46 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Mar 2009 14:50:46 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C018F7.8080201@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> <49C018F7.8080201@canterbury.ac.nz> Message-ID: On Tue, Mar 17, 2009 at 2:41 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> >> (1) To avoid font-width issues -- many >> variable-width fonts are designed so that all digits have the same >> width, but their (default) space is much narrower. > > That's a good point. > > This alone doesn't necessarily rule out (A), though. > It could be considered a case of user stupidity if they > specify a field width that results in a comma at the > beginning and don't like the result. (A) is ruled out on the basis of aesthetics alone. > It doesn't necessarily rule out (C) either, since there > will always be a space at the beginning unless the value > overflows, and then all your alignment guarantees are > blown away anyhow. > > (2) To avoid fraud >> >> when printing certain documents -- it's easier to insert a '1' in >> front of a small number than to change a '0' into something else.
> > However it's easy to add a '1' before a string of leading > zeroes if there's a sliver of space available, so it's > better still to fill with some other character such as > '*'. You need a cooperative font for that to work. What I've seen is the '$' sign immediately in front, e.g. $001,000.00. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Mar 17 23:25:18 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 15:25:18 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> Message-ID: [Terry Reedy] > I strongly prefer suffix to prefix modification. Given the way that the formatting parsers are written, I think suffix would work just as well as prefix. Also, your idea may help with the mental parsing as well (because the rest of the format string uses the untranslated symbols so that translation pairs should be at the end). >> Also, this version makes it easy to employ a couple of techniques to >> factor-out > > These techniques apply to any "augment the basic format with an > affix" method. Right. > I also prefer mnemonic key letters to mark the start of each affix, ... > format(1234, '8.1f;T.;P,') I think it's better to be explicit that periods are translated to commas and commas to periods. Introducing a new letter just adds more to the memory load and makes the notation more verbose. In the previous newsgroup discussions, people reacted badly to letter mnemonics finding them to be so ugly that they would refuse to use them (remember the early proposal of format(x, "8T,.f")). Also, the translation pairs approach lets you swap other hardwired characters like the E or a 0 pad.
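The translation-pairs idea is cheap to prototype on top of today's format() by peeling '(xy)' pairs off the front of the spec and translating the result string (a sketch only; `format_with_pairs` is a made-up name, and a real version would live inside the spec parser rather than post-process the output):

```python
def format_with_pairs(value, spec):
    # Peel leading '(xy)' translation pairs off the format spec.
    pairs = []
    while spec.startswith('(') and len(spec) >= 4 and spec[3] == ')':
        pairs.append((spec[1], spec[2]))
        spec = spec[4:]
    # Format with the remaining standard spec, then translate the output.
    # Building one dict applies all pairs simultaneously, so a swap like
    # '(,.)(.,)' works without the two pairs clobbering each other.
    table = dict(pairs)
    return ''.join(table.get(ch, ch) for ch in format(value, spec))

print(format_with_pairs(1234, '8,.1f'))          # ' 1,234.0'
print(format_with_pairs(1234, '(,_)8,.1f'))      # ' 1_234.0'
print(format_with_pairs(1234, '(,_)(.,)8,.1f'))  # ' 1_234,0'
```

(The ',' option in the spec needs Python 2.7/3.1 or later, where PEP 378 is implemented.)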
Raymond From steve at pearwood.info Wed Mar 18 00:29:59 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Mar 2009 10:29:59 +1100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <339525.47450.qm@web25805.mail.ukl.yahoo.com> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> Message-ID: <200903181029.59485.steve@pearwood.info> On Tue, 17 Mar 2009 11:23:23 pm hwpuschm at yahoo.de wrote: > What I would like is to extend the augmented assignment > and make it easy to understand for naive readers. [...] > The following examples would be extensions: > "lst = [5,6] + same" synonymous with > "lst.reverse(); lst.extend([6,5]); lst.reverse()" > "inmutable = same*(same+1)" synonymous with > "unused=inmutable+1; inmutable*=unused; del unused" > > There seems to be no really simple expression for the above > extensions Instead of the proposed "lst = [5,6] + same" or the obfuscated "lst.reverse(); lst.extend([6,5]); lst.reverse()", what about this simple assignment? lst = [5, 6] + lst Instead of the proposed "inmutable = same*(same+1)" or the obfuscated "unused=inmutable+1; inmutable*=unused; del unused", what about the simple: inmutable = inmutable*(inmutable+1) Since your claimed intention is to make it easy for naive users, why replace the standard idiom: xx += 5 with an assignment containing a mysterious "same"? Many of those naive users will surely assume "same" is a variable name, not a magic keyword, and spend much time looking for where it is assigned. I predict that if your idea goes ahead, we'll get dozens of questions "I can't find where the variable same gets its value from", and we'll have to explain that it is a magic variable that gets its value from the left hand side of the assignment. One last question -- what should happen here?
x, y, z = (same, same+1, same+2) -- Steven D'Aprano From steve at pearwood.info Wed Mar 18 00:30:09 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Mar 2009 10:30:09 +1100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <49BFA3EA.70108@improva.dk> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> <49BFA3EA.70108@improva.dk> Message-ID: <200903181030.09394.steve@pearwood.info> On Wed, 18 Mar 2009 12:21:46 am Jacob Holm wrote: > I believe that as soon as the left-hand side stops being a simple > variable and it is used in non-trivial expressions on the right-hand > side, using the keyword would help clarify the intent. What I mean > is that the examples you should be looking at are more like: > > A[n+1] = same*same + 1 > B[2*j].foo = frobnicate(same, same+1) > ... > > If you try expanding these into current python with minimal change in > semantics you will end up with something like > > _1 = n+1 > _2 = A[_1] > A[_1] = _2*_2 + 1 > del _1 > del _2 > > _1 = B[2*j] > _2 = _1.foo > _1.foo = frobnicate(_2, _2+1) > del _1 > del _2 > > which is much less readable. Of course it is, because it's obfuscated. What's with the leading underscore names? Inside a function, they're not accessible to outside callers, so the notion of "private" and "public" doesn't apply, and in module-level code you delete them at the end, so they won't be imported because they no longer exist. (BTW, there's no need to delete the names one at a time. "del _1, _2" does what you want.) What's wrong with the clear, simple and obvious? A[n+1] = A[n+1]**2 + 1 If you really care about calculating n+1 twice then just use a meaningful name instead of an obfuscated name. 
This clarifies the intent of the code, instead of hiding it: index = n+1 # or even just i A[index] = A[index]**2 + 1 Likewise: tmp = B[2*j].foo B[2*j].foo = frobnicate(tmp, tmp+1) Or any combination of standard idioms. If you really insist, you can even delete the temporary names afterwards, but why would you bother inside a function? -- Steven D'Aprano From jh at improva.dk Wed Mar 18 01:16:21 2009 From: jh at improva.dk (Jacob Holm) Date: Wed, 18 Mar 2009 01:16:21 +0100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <200903181030.09394.steve@pearwood.info> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> <49BFA3EA.70108@improva.dk> <200903181030.09394.steve@pearwood.info> Message-ID: <49C03D55.7050604@improva.dk> Steven D'Aprano wrote: > On Wed, 18 Mar 2009 12:21:46 am Jacob Holm wrote: > > >> I believe that as soon as the left-hand side stops being a simple >> variable and it is used in non-trivial expressions on the right-hand >> side, using the keyword would help clarify the intent. What I mean >> is that the examples you should be looking at are more like: >> >> A[n+1] = same*same + 1 >> B[2*j].foo = frobnicate(same, same+1) >> ... >> >> If you try expanding these into current python with minimal change in >> semantics you will end up with something like >> >> _1 = n+1 >> _2 = A[_1] >> A[_1] = _2*_2 + 1 >> del _1 >> del _2 >> >> _1 = B[2*j] >> _2 = _1.foo >> _1.foo = frobnicate(_2, _2+1) >> del _1 >> del _2 >> >> which is much less readable. >> > > Of course it is, because it's obfuscated. What's with the leading > underscore names? Inside a function, they're not accessible to outside > callers, so the notion of "private" and "public" doesn't apply, and in > module-level code you delete them at the end, so they won't be imported > because they no longer exist. (BTW, there's no need to delete the names > one at a time. 
"del _1, _2" does what you want.) > > What's wrong with the clear, simple and obvious? > > A[n+1] = A[n+1]**2 + 1 > > What is wrong is that it computes (n+1) twice, and it uses a different operator to avoid doing the __getitem__ twice. The whole point of the exercise was to get as close as possible to what I think the expression using "same" should mean. I tried to follow the common style for that kind of expansion as seen elsewhere on this list to make that clear. Obviously I failed. Jacob From python at rcn.com Wed Mar 18 01:25:42 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 17:25:42 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> Message-ID: <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Mark Dickinson's test code suggested a good, extensible approach to the problem. Here's the idea in a nutshell: format(value, format_spec='', conventions=None) 'calls value.__format__(format_spec, conventions)' Where conventions is an optional dictionary with formatting control values. Any value object can accept custom controls, but the names for standard ones would be taken from the standards provided by localeconv(): { 'decimal_point': '.', 'grouping': [3, 0], 'negative_sign': '-', 'positive_sign': '', 'thousands_sep': ','} That would let you store several locales using localeconv() and use them at will, thus solving the global variable and threading problems with locale: import locale loc = locale.getlocale() # get current locale locale.setlocale(locale.LC_ALL, 'de_DE') DE = locale.localeconv() locale.setlocale(locale.LC_ALL, 'en_US') US = locale.localeconv() locale.setlocale(locale.LC_ALL, loc) # restore saved locale . . . format(x, '8,.f', DE) format(y, '8,d', US) It also lets you write your own conventions on the fly: DEB = dict(thousands_sep='_') # style for debugging EXT = dict(thousands_sep=',') # style for external display . . .
format(x, '8.1f', DEB) format(y, '8d', EXT) Raymond From python at rcn.com Wed Mar 18 01:34:16 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 17:34:16 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: > Where conventions is an optional dictionary with formatting control values. Any value object can accept custom controls, but the > names for standard ones would be taken from the standards provided by localeconv(): Forgot to mention that this approach makes life easier on people writing __format__ methods because it lets them re-use the work they've already done to implement the "n" type specifier. Also, this approach is very similar to the one taken in Java with its DecimalFormatSymbols object. The main differences are that they use a custom class instead of a dictionary, that we would use standard names that work well with localeconv(), and that our approach is extensible for use with custom formatters (i.e. the datetime module could have its own set of key/value pairs for formatting controls). Raymond From jh at improva.dk Wed Mar 18 01:43:52 2009 From: jh at improva.dk (Jacob Holm) Date: Wed, 18 Mar 2009 01:43:52 +0100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <200903181029.59485.steve@pearwood.info> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <200903181029.59485.steve@pearwood.info> Message-ID: <49C043C8.90708@improva.dk> Steven D'Aprano wrote: > One last question -- what should happen here? > > x, y, z = (same, same+1, same+2) > > > Obviously a TypeError, as you cannot add one or two to a tuple... :) But the same question for the statement x, y, z = (same, foo(same), bar(same)) has a simple obvious answer (at least to my eyes).
It should be equivalent to: tmp = x, y, z x, y, z = (tmp, foo(tmp), bar(tmp)) However, on rereading the proposal(s), I can see that all the operations using same are supposedly defined in terms of augmented assignment operators which really doesn't make any sense to me. (Augmented assignment is one of the very few things in python I find really unclean. I know practicality beats purity and all that, but it just doesn't sit right with me). It is quite possible that I have been interpreting this whole idea differently than the original author intended. If so, I apologize for the confusion. In any case I am at best -0.5 on it, because the benefit does not outweigh the cost of adding a new keyword. Jacob From tjreedy at udel.edu Wed Mar 18 02:50:14 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 17 Mar 2009 21:50:14 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > > [Terry Reedy] >> I strongly prefer suffix to prefix modification. > > Given the way that the formatting parsers are written, > I think suffix would work just as well as prefix. Also, > your idea may help with the mental parsing as well > (because the rest of the format string uses the untranslated symbols so > that translation pairs should > be at the end). > > >>> Also, this version makes it easy to employ a couple of techniques to >>> factor-out >> >> These techniques apply to any "augment the basic format with an >> affix" method. > > Right. > > >> I also prefer mnemonic key letters to mark the start of each affix, > ... >> format(1234, '8.1f;T.;P,') This should have been format(1234, '8.1f;T.;D,') > > > I think it's better to be explicit that periods are translated to commas > and commas to periods. Introducing a new letter just adds more to > the memory load and makes the notation more verbose.
In the > previous newsgroup discussions, people reacted badly to letter > mnemonics finding them to be so ugly that they would refuse to use them > (remember the early proposal of format(x, "8T,.f")). > > Also, the translation pairs approach lets you swap other hardwired > characters like the E or a 0 pad. So does the key letter approach. The pairs approach does not allow easy alteration of the grouping spec, because there is no hard-wired char to swap, unless you would allow something cryptic like (3,(4,2,3)) (for India, I believe). Even with the translation pair, one could use a separator rather than fences. format(1234, '8.1f;T.;D,') # could be format(1234, '8,.1f;,.;.,') The two approaches could even be mixed by using a char only when clearer, such as 'G' for grouping instead of '3' for the existing grouping value. I think whatever scheme is adopted should be complete. Terry Jan Reedy From lie.1296 at gmail.com Wed Mar 18 03:27:26 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Wed, 18 Mar 2009 13:27:26 +1100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <49C03D55.7050604@improva.dk> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> <49BFA3EA.70108@improva.dk> <200903181030.09394.steve@pearwood.info> <49C03D55.7050604@improva.dk> Message-ID: Jacob Holm wrote: > What is wrong is that it computes (n+1) twice, and it uses a different > operator to avoid doing the __getitem__ twice. The whole point of the > exercise was to get as close as possible to what I think the expression > using "same" should mean. I tried to follow the common style for that > kind of expansion as seen elsewhere on this list to make that clear. > Obviously I failed. > Unless in a very tight loop, I see no reason why computing n+1 and __getitem__ twice is a problem. And using a temporary variable is sufficiently clear unless your temporary variable's name starts with _.
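Spelled out, the temporary-variable version of the thread's running example "A[n+1] = same*same + 1" computes the subscript and the __getitem__ once each (a sketch with made-up data):

```python
A = [0, 0, 5, 0]
n = 1

index = n + 1        # the subscript expression is computed once
old = A[index]       # one __getitem__
A[index] = old * old + 1

print(A)  # [0, 0, 26, 0]
```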
From lie.1296 at gmail.com Wed Mar 18 03:42:27 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Wed, 18 Mar 2009 13:42:27 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> <49C018F7.8080201@canterbury.ac.nz> Message-ID: Guido van Rossum wrote: >> >> (2) To avoid fraud >>> when printing certain documents -- it's easier to insert a '1' in >>> front of a small number than to change a '0' into something else. >> However it's easy to add a '1' before a string of leading >> zeroes if there's a sliver of space available, so it's >> better still to fill with some other character such as >> '*'. You need a cooperative font for that to work. > > What I've seen is the '$' sign immediately in front, e.g. $001,000.00. > I think I'd rather see something like: $==1,000.00== I wouldn't use zeroes, if I were the bank. It is bad on the aesthetics, and too easy to fraud. 
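The cheque-style fill being discussed is expressible with str.format's fill and align options (together with the ',' grouping option this very thread produced, PEP 378, available from Python 3.1):

```python
# '*' as the fill character with '>' alignment pads the unused width,
# leaving no room to sneak an extra leading digit in front:
print("${:*>12,.2f}".format(1000))   # $****1,000.00

# Compare zero fill, where the padding reads as part of the number:
print("${:012,.2f}".format(1000))    # $0,001,000.00
```

Note that zero fill with ',' groups the padding zeros too, which is exactly the "0,000,000.89" behaviour debated elsewhere in the thread.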
From lie.1296 at gmail.com Wed Mar 18 03:47:22 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Wed, 18 Mar 2009 13:47:22 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C00A9B.4080509@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49C00A9B.4080509@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Mark Dickinson wrote: > >>>>> format('%014f', 123.456, grouping=1) >> >> '0,000,123.456000' >> >> That also means that the relationship between the field width (14 >> in this case) and the string length (16) is somewhat obscured. > > I'd consider that part a bug that we shouldn't imitate. > The field width should always be what you say it is, > unless the value is too big to fit. > Should there be an option for using hard-width? If the hard-width flag is on, then if the value is too big to fit, the number will get trimmed instead of changing the width (and perhaps there would be a prepend character). So: width: 4, number: 123456, prepend char '<' gives "<456". So as not to break table alignment... From python at rcn.com Wed Mar 18 05:56:12 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 21:56:12 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: Am curious whether you guys like this proposal? Raymond ----- Original Message ----- [Raymond Hettinger] > Mark Dickinson's test code suggested a good, extensible approach to the problem.
Here's the idea in a nutshell: > > format(value, format_spec='', conventions=None) > 'calls value.__format__(format_spec, conventions)' > > Where conventions is an optional dictionary with formatting control values. Any value object can accept custom controls, but the > names for standard ones would be taken from the standards provided by localeconv(): > > { > 'decimal_point': '.', > 'grouping': [3, 0], > 'negative_sign': '-', > 'positive_sign': '', > 'thousands_sep': ','} > > That would let you store several locales using localeconv() and use them at will, thus solving the global variable and threading > problems with locale: > > import locale > loc = locale.getlocale() # get current locale > locale.setlocale(locale.LC_ALL, 'de_DE') > DE = locale.localeconv() > locale.setlocale(locale.LC_ALL, 'en_US') > US = locale.localeconv() > locale.setlocale(locale.LC_ALL, loc) # restore saved locale > > . . . > > format(x, '8,.f', DE) > format(y, '8,d', US) > > > It also lets you write your own conventions on the fly: > > DEB = dict(thousands_sep='_') # style for debugging > EXT = dict(thousands_sep=',') # style for external display > . . . > format(x, '8.1f', DEB) > format(y, '8d', EXT) > > > Raymond > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From mishok13 at gmail.com Wed Mar 18 08:52:35 2009 From: mishok13 at gmail.com (Andrii V.
Mishkovskyi) Date: Wed, 18 Mar 2009 09:52:35 +0200 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: <192840a00903180052y568c9108v2bfed293a4ac2d5f@mail.gmail.com> On Tue, Mar 17, 2009 at 8:17 PM, Raymond Hettinger wrote: > >>> a.update(b) >>> return a > > Why take two short, simple lines with unequivocal meaning > and then abbreviate them with something mysterious (or > at least something with multiple possible interpretations)? > > Mappings exist in many languages now. Can you point > to another language that has found it worthwhile to have > both an update() method and an addition operator? > > Also, consider that dicts are one of our most basic APIs > and many other objects model that API. It behooves us > to keep that API as simple and thin as possible. > > IMO, this change would be gratuitous. None of the code > presented so far is significantly improved. Essentially, we're > looking at a trivial abbreviation, not an actual offering of > new capabilities. Reasonable enough. > > -1 all the way around. Does that also mean -1 on the list.pop() accepting slices proposal? > > > Raymond > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Wbr, Andrii V. Mishkovskyi. He's got a heart of a little child, and he keeps it in a jar on his desk.
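When an expression form of Raymond's two short lines really is wanted, a three-line helper provides it without any language change (the name merged is hypothetical):

```python
def merged(a, b):
    """The two-line idiom from the thread, wrapped up: a new dict
    with a's entries, updated by b's (b wins on duplicate keys)."""
    result = dict(a)
    result.update(b)
    return result

print(merged({'x': 1, 'y': 2}, {'y': 20, 'z': 3}))  # {'x': 1, 'y': 20, 'z': 3}
```

This is the usual counter-argument to a dict '+' operator: the composition is already trivially expressible, so the operator would only save a function definition.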
From denis.spir at free.fr Wed Mar 18 09:05:25 2009 From: denis.spir at free.fr (spir) Date: Wed, 18 Mar 2009 09:05:25 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C00D84.6020109@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49C00D84.6020109@canterbury.ac.nz> Message-ID: <20090318090525.708d29ba@o> On Wed, 18 Mar 2009 08:52:20 +1200, Greg Ewing wrote: > Guido van Rossum wrote: > > I agree that the given width should include the > > commas, but I strongly feel that leading zeros should be comma-fied > > just like everything else. > > I think we need some use cases before a proper > decision can be made about this. If you were using > comma-separated zero-filled numbers, what would > your objective be, and what choice would best > fulfill it? > I think the point is just this: 0,000,000.89 1,234,567.89 looks right. 0000000.89 1,234,567.89 looks wrong. 000000000.89 1,234,567.89 looks wrong. ------ la vita e estrany From solipsis at pitrou.net Wed Mar 18 11:42:47 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 18 Mar 2009 10:42:47 +0000 (UTC) Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: Raymond Hettinger writes: > > Am curious whether you guys like this proposal? I find it good for the builtin format() function, but how does it work for str.format()?
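One way Antoine's question could be answered today is a string.Formatter subclass that carries the conventions dict and post-processes each numeric field. This is a hedged sketch, not part of the actual proposal; the class name and the ','/'.' translation step are mine, and it ignores localeconv()'s 'grouping' list:

```python
from string import Formatter

class ConvFormatter(Formatter):
    """Sketch: format numeric fields with the standard ',' and '.'
    characters, then swap in the separators from a conventions dict."""

    def __init__(self, conventions=None):
        super().__init__()
        self.conventions = conventions or {}

    def format_field(self, value, format_spec):
        text = super().format_field(value, format_spec)
        if self.conventions and isinstance(value, (int, float)):
            table = {ord(','): self.conventions.get('thousands_sep', ','),
                     ord('.'): self.conventions.get('decimal_point', '.')}
            text = text.translate(table)
        return text

DE = {'thousands_sep': '.', 'decimal_point': ','}
print(ConvFormatter(DE).format("value: {0:,.2f}", 1234567.89))  # value: 1.234.567,89
```

The conventions travel with the Formatter instance rather than with each format string, which sidesteps the global-state problem of locale without touching the parser.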
From steve at pearwood.info Wed Mar 18 12:39:40 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Mar 2009 22:39:40 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Message-ID: <200903182239.40862.steve@pearwood.info> On Wed, 18 Mar 2009 01:42:27 pm Lie Ryan wrote: > Guido van Rossum wrote: > >> (2) To avoid fraud > >> > >>> when printing certain documents -- it's easier to insert a '1' in > >>> front of a small number than to change a '0' into something else. > >> > >> However it's easy to add a '1' before a string of leading > >> zeroes if there's a sliver of space available, so it's > >> better still to fill with some other character such as > >> '*'. You need a cooperative font for that to work. > > > > What I've seen is the '$' sign immediately in front, e.g. > > $001,000.00. > > I think I'd rather see something like: $==1,000.00== > > I wouldn't use zeroes, if I were the bank. It is bad on the > aesthetics, and too easy to fraud. What I've generally seen on cheques is $****1,000.00 -- Steven D'Aprano From steve at pearwood.info Wed Mar 18 12:45:11 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Mar 2009 22:45:11 +1100 Subject: [Python-ideas] Customizing format() In-Reply-To: <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: <200903182245.11720.steve@pearwood.info> On Wed, 18 Mar 2009 11:25:42 am Raymond Hettinger wrote: > Mark Dickinson's test code suggested a good, extensible approach to > the problem. Here's the idea in a nutshell: > > format(value, format_spec='', conventions=None) > 'calls value.__format__(format_spec, conventions)' For what was supposed to be a nice, simple way of formatting numbers, it sure became confusing. 
So thank you for the nutshell. I like this idea, especially if it means we can simplify the format_spec. Can we have the format_spec in a nutshell too? > Where conventions is an optional dictionary with formatting control > values. Any value object can accept custom controls, but the names > for standard ones would be taken from the standards provided by > localeconv(): > > { > 'decimal_point': '.', > 'grouping': [3, 0], > 'negative_sign': '-', > 'positive_sign': '', > 'thousands_sep': ','} Presumably we value compatibility with localeconv()? If not, then perhaps a better name for 'thousands_sep' is 'group_sep', on account that if you group by something other than 3 it won't represent thousands. Would this allow you to format a float like this? 1,234,567.89012 34567 89012 (group by threes for the integer part, and by fives for the fractional part). Or is that out-of-scope for this proposal? +1 for a conventions dict. Good plan! -- Steven D'Aprano From eric at trueblade.com Wed Mar 18 12:45:55 2009 From: eric at trueblade.com (Eric Smith) Date: Wed, 18 Mar 2009 07:45:55 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: <49C0DEF3.2070905@trueblade.com> Antoine Pitrou wrote: > Raymond Hettinger writes: >> Am curious whether you guys like this proposal? > > I find it good for the builtin format() function, but how does it work for > str.format()? I agree: I like it, but it's not enough. I use str.format() way more often than I hope to ever use builtin format(). If we make any change, I'd rather see it focused on the format mini-language. Eric. 
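Steven's threes-and-fives example is easy to produce by hand, which shows what a fully general 'grouping' convention would have to encode. A sketch with hypothetical names (it works on the already-formatted digit string, not on the format spec):

```python
def group_digits(number_text, int_group=3, frac_group=5,
                 int_sep=',', frac_sep=' '):
    """Group the integer part from the right and the fractional part
    from the left, as in Steven's example."""
    int_part, _, frac_part = number_text.partition('.')
    # group integer digits from the right
    chunks = []
    while len(int_part) > int_group:
        chunks.append(int_part[-int_group:])
        int_part = int_part[:-int_group]
    chunks.append(int_part)
    grouped_int = int_sep.join(reversed(chunks))
    # group fractional digits from the left
    grouped_frac = frac_sep.join(frac_part[i:i + frac_group]
                                 for i in range(0, len(frac_part), frac_group))
    return grouped_int + ('.' + grouped_frac if grouped_frac else '')

print(group_digits('1234567.890123456789012'))  # 1,234,567.89012 34567 89012
```

Note the two directions: integer grouping counts from the decimal point leftward, fractional grouping from the decimal point rightward, which is why a single 'grouping' list in the conventions dict would not cover this case.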
From ncoghlan at gmail.com Wed Mar 18 13:03:07 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Mar 2009 22:03:07 +1000 Subject: [Python-ideas] Customizing format() In-Reply-To: <49C0DEF3.2070905@trueblade.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> Message-ID: <49C0E2FB.6010301@gmail.com> Eric Smith wrote: > Antoine Pitrou wrote: >> Raymond Hettinger writes: >>> Am curious whether you guys like this proposal? >> >> I find it good for the builtin format() function, but how does it work >> for >> str.format()? > > I agree: I like it, but it's not enough. I use str.format() way more > often than I hope to ever use builtin format(). If we make any change, > I'd rather see it focused on the format mini-language. Perhaps we could add a new ! type to the formatting language that allows the developer to mark a particular argument as the conventions dictionary? Then you could do something like: # DE and US dicts as per Raymond's format() example fmt = "The value is {:,.5f}{!conv}" fmt.format(num, DE) fmt.format(num, US) fmt.format(num, dict(thousands_sep="'")) As with !a and !s, you could use any normal field specifier to select the conventions dictionary. Obviously, the formatting arguments would be ignored for that particular field. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Wed Mar 18 13:09:13 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Mar 2009 22:09:13 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <200903182239.40862.steve@pearwood.info> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> Message-ID: <49C0E469.2080207@gmail.com> Steven D'Aprano wrote: > What I've generally seen on cheques is $****1,000.00 Interestingly, str.format will actually be able to produce directly in 3.1: "${:*>,.2f}".format(value) ...although that makes seq[::-1] look positively coherent :) Wondering-who-will-ask-for-a-{!verbose}-string-formatting-flag'ly, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From python at rcn.com Wed Mar 18 13:44:53 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 18 Mar 2009 05:44:53 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> Message-ID: <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> >> What I've generally seen on cheques is $****1,000.00 > > Interestingly, str.format will actually be able to produce directly in 3.1: > > "${:*>,.2f}".format(value) What we have already in SVN courtesy of Mark Dickinson: >>> from decimal import Decimal >>> value = Decimal(1000) >>> "${:*>12,.2f}".format(value) '$****1,000.00' Raymond From lie.1296 at gmail.com Wed Mar 18 14:41:56 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Thu, 19 Mar 2009 00:41:56 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier 
for a thousands separator (discussion moved from python-dev) In-Reply-To: <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > >>> What I've generally seen on cheques is $****1,000.00 >> >> Interestingly, str.format will actually be able to produce directly in >> 3.1: >> >> "${:*>,.2f}".format(value) > > What we have already in SVN courtesy of Mark Dickinson: > >>>> from decimal import Decimal >>>> value = Decimal(1000) >>>> "${:*>12,.2f}".format(value) > '$****1,000.00' > Anything but zeroes that isn't too similar to numeric character should be fine for "finance-related number". PS: On this side of the world, the commas and the dots are reversed so I would not dream any solution that doesn't encompass at least that (which doesn't require additional function wrapping). I'd personally prefer fully customizable separator, as my personal preference is using space and decimal commas PPS: I HAVE A HISTORY OF BEING ADMITTED TO A MENTAL INSTITUTION AFTER SEEING NUMBERS WITH COMMAS USED AS THOUSAND SEPARATOR. PPPS: The next statement is a lie. PPPPS: The mental institution thing is true. From lie.1296 at gmail.com Wed Mar 18 14:47:25 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Thu, 19 Mar 2009 00:47:25 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> Message-ID: Lie Ryan wrote: > Anything but zeroes that isn't too similar to numeric character should > be fine for "finance-related number". 
> > PS: On this side of the world, the commas and the dots are reversed so I > would not dream any solution that doesn't encompass at least that (which > doesn't require additional function wrapping). I'd personally prefer > fully customizable separator, as my personal preference is using space > and decimal commas > PPS: I HAVE A HISTORY OF BEING ADMITTED TO A MENTAL INSTITUTION AFTER > SEEING NUMBERS WITH COMMAS USED AS THOUSAND SEPARATOR. > PPPS: The next statement is a lie. > PPPPS: The mental institution thing is true. PPPPPS: The first postscript includes financial institution PPPPPPS: The fact that you can wrap the formatting in function call is not an excuse for not providing fully customizable separators. PPPPPPPS: The financial world != American financial institutions From solipsis at pitrou.net Wed Mar 18 14:57:50 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 18 Mar 2009 13:57:50 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> Message-ID: Lie Ryan writes: > > > PPS: I HAVE A HISTORY OF BEING ADMITTED TO A MENTAL INSTITUTION AFTER > > SEEING NUMBERS WITH COMMAS USED AS THOUSAND SEPARATOR. > > PPPS: The next statement is a lie. > > PPPPS: The mental institution thing is true. I am fully sympathetic. > PPPPPPPS: The financial world != American financial institutions Agreed, but they have the largest debts. Therefore, real-life examples of commas used as thousands separators should include a negative sign. 
From gerald.britton at gmail.com Wed Mar 18 18:10:52 2009 From: gerald.britton at gmail.com (Gerald Britton) Date: Wed, 18 Mar 2009 13:10:52 -0400 Subject: [Python-ideas] thoughts on generator.throw() Message-ID: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Today I was reviewing changes in Python 2.5 and I noticed the generator throw() method for the first time. While thinking about what it does and why, a question arose in my mind: Why is it called "throw"? (Yes, I know that Java and possibly other languages use this keyword!) In Python, we have long had a "raise" statement to raise exceptions. I would have thought that the generator method would have been called "raise" as well. But then I saw that it would have been impossible to implement since "raise" is a Python keyword. *Then* I wondered why "raise" is a keyword and not a function. If it were a function you could use it easily in places where today you cannot: if 'foo' == 'bar' or raise(FooBar): # only proceed if 'foo' equals 'bar' otherwise raise FooBar exception is invalid syntax because raise is not a function. Now, I can get around it: def raise_(exception): raise exception ... if 'foo' == 'bar' or raise_(FooBar): ... I have a similar question about the "assert" statement. It could possibly benefit from being a function instead. Of course, changing this would break lots of code, but maybe not any more than making print a function as in 3.0. Thoughts? -- Gerald Britton From solipsis at pitrou.net Wed Mar 18 18:20:11 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 18 Mar 2009 17:20:11 +0000 (UTC) Subject: [Python-ideas] thoughts on generator.throw() References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: Gerald Britton writes: > > But then I > saw that it would have been impossible to implement since "raise" is a > Python keyword. *Then* I wondered why "raise" is a keyword and not a > function. 
If it were a function you could use it easily in places > where today you cannot: > > if 'foo' == 'bar' or raise(FooBar): # only proceed if 'foo' > equals 'bar' otherwise raise FooBar exception I find this horrible, awfully Perlish. Non-local control transfers should stick out clearly when reading source code, not be hidden at the end of a conditional. As for why raise is a keyword, I think there are several explanations: - raise is a control flow operation, as are "return", "continue", "break" and others. - raise has to create a traceback capturing the current frame stack, which is easier with a dedicated bytecode. - raise should be decently fast, which is easier with a dedicated bytecode. > I have a similar question about the "assert" statement. It could > possibly benefit from being a function instead. I think the point is that assert is entirely a no-op when the interpreter is run with "-O", while there would be a significant overhead if it was a regular function call. But I agree that the situation is less clear-cut than with the raise statement. Regards Antoine. From leif.walsh at gmail.com Wed Mar 18 18:42:42 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Wed, 18 Mar 2009 13:42:42 -0400 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: On Wed, Mar 18, 2009 at 1:20 PM, Antoine Pitrou wrote: > - raise should be decently fast, which is easier with a dedicated bytecode. Why does raise have to be decently fast? In my average case, at least, it's encountered at most once per program execution. Even if I was good about catching exceptions, the point is that they're _exceptional_ cases, so they shouldn't be happening very often. I'm not about to say raise should be a function, but I don't think it's got a huge speed requirement. 
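For reference, Gerald's raise_() workaround from the start of the thread runs as advertised (FooBar and check are hypothetical names; Antoine's readability objection still applies):

```python
class FooBar(Exception):
    pass

def raise_(exception):
    """Gerald's workaround: a plain function wrapper so that raising
    can appear in expression position."""
    raise exception

def check(a, b):
    # proceed only if a == b, otherwise raise FooBar
    return (a == b) or raise_(FooBar('%r != %r' % (a, b)))

print(check('foo', 'foo'))        # True
try:
    check('foo', 'bar')
except FooBar as exc:
    print('caught:', exc)         # caught: 'foo' != 'bar'
```

Because `or` short-circuits, raise_() is only evaluated when the condition is false, which is what makes the idiom work and also what makes the control flow easy to miss when reading.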
-- Cheers, Leif From george.sakkis at gmail.com Wed Mar 18 19:07:47 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 18 Mar 2009 14:07:47 -0400 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> On Wed, Mar 18, 2009 at 1:42 PM, Leif Walsh wrote: > On Wed, Mar 18, 2009 at 1:20 PM, Antoine Pitrou wrote: >> - raise should be decently fast, which is easier with a dedicated bytecode. > > Why does raise have to be decently fast? > > In my average case, at least, it's encountered at most once per > program execution. Even if I was good about catching exceptions, the > point is that they're _exceptional_ cases, so they shouldn't be > happening very often. That's not always true; StopIteration comes to mind. George From denis.spir at free.fr Wed Mar 18 19:23:13 2009 From: denis.spir at free.fr (spir) Date: Wed, 18 Mar 2009 19:23:13 +0100 Subject: [Python-ideas] logics (was:thoughts on generator.throw()) In-Reply-To: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: <20090318192313.7bc137f5@o> On Wed, 18 Mar 2009 13:10:52 -0400, Gerald Britton wrote: > *Then* I wondered why "raise" is a keyword and not a > function. If it were a function you could use it easily in places > where today you cannot: > > if 'foo' == 'bar' or raise(FooBar): # only proceed if 'foo' > equals 'bar' otherwise raise FooBar exception > > is invalid syntax because raise is not a function. I'm very happy this is invalid syntax :-) I consider this kind of practice conceptual distortion. More precisely: an abuse of both (!) flow control and logical operator semantics. It reminds me of joyful (hum!) times with C routines written by "clever" people.
Denis PS I would go much farther than Python about logical types and operators. Lazy evaluation is ok, because the alternative is not simpler: if n != 0 and 1/n > threshold: But I'm not happy at all with the following: >>> (3==3) + 1 2 >>> 1 or True 1 I think logical operators (and or not) should accept only logical values. And logical values should not operate with numbers. ------ la vita e estrany From cmjohnson.mailinglist at gmail.com Wed Mar 18 19:33:56 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Wed, 18 Mar 2009 08:33:56 -1000 Subject: [Python-ideas] Customizing format() In-Reply-To: <49C0E2FB.6010301@gmail.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> <49C0E2FB.6010301@gmail.com> Message-ID: <3bdda690903181133g35ceb1fhfbe2e96577807123@mail.gmail.com> I haven't entirely been following this conversation, so I may be missing something, but what about something like: "Balance = ${balance:{minilang}}".format(balance=1.00, minilang=mini_formatter(thousands_sep=",", ...)) That way, even if the mini-language gets really confusing we'll have an easy to call function that manages it.
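The nested braces in Carl's snippet already parse today: the format spec may itself contain replacement fields, filled in from the same arguments before the spec is applied. Here minilang is just a plain spec string, not his hypothetical mini_formatter():

```python
# The spec position can hold replacement fields of its own:
print("Balance = ${balance:{minilang}}".format(balance=1234.5,
                                               minilang=',.2f'))
# Balance = $1,234.50

# The same nesting handles computed widths and precisions:
print("{:{width}.{prec}f}".format(3.14159, width=10, prec=3))  # '     3.142'
```

So a helper that merely builds a spec string from keyword arguments would slot straight into the existing machinery.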
I always thought it was weird that things co From cmjohnson.mailinglist at gmail.com Wed Mar 18 19:36:44 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Wed, 18 Mar 2009 08:36:44 -1000 Subject: [Python-ideas] Customizing format() In-Reply-To: <3bdda690903181133g35ceb1fhfbe2e96577807123@mail.gmail.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> <49C0E2FB.6010301@gmail.com> <3bdda690903181133g35ceb1fhfbe2e96577807123@mail.gmail.com> Message-ID: <3bdda690903181136h7dbf7095l50c85014bb29e63d@mail.gmail.com> Carl Johnson wrote: > I haven't entirely been following this conversation, so I may be > missing something, but what about something like: > > "Balance = ${balance:{minilang}}".format(balance=1.00, > minilang=mini_formatter(thousands_sep=",", ...)) > > That way, even if the mini-language gets really confusing we'll have > an easy to call function that manages it. I always thought it was > weird that things co Sorry, Google mail has been being weird lately, signing me out and suddenly sending mail, etc. ?So, I always thought it was weird that {}s could nest in the new format language, but if we have that capability, we may as well use it. -- Carl From arnodel at googlemail.com Wed Mar 18 20:55:41 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Wed, 18 Mar 2009 19:55:41 +0000 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> Message-ID: On 18 Mar 2009, at 18:07, George Sakkis wrote: > On Wed, Mar 18, 2009 at 1:42 PM, Leif Walsh > wrote: >> On Wed, Mar 18, 2009 at 1:20 PM, Antoine Pitrou >> wrote: >>> - raise should be decently fast, which is easier with a dedicated >>> bytecode. >> >> Why does raise have to be decently fast? 
>> >> In my average case, at least, it's encountered at most once per >> program execution. Even if I was good about catching exceptions, the >> point is that they're _exceptional_ cases, so they shouldn't be >> happening very often. > > That's not always true; StopIteration comes to mind. But StopIteration is not usually raised explicitly. -- Arnaud From leif.walsh at gmail.com Wed Mar 18 20:58:06 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Wed, 18 Mar 2009 15:58:06 -0400 (EDT) Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: Message-ID: On Wed, Mar 18, 2009 at 3:55 PM, Arnaud Delobelle wrote: > But StopIteration is not usually raised explicitly. He's got a point though, raise should be fast. -- Cheers, Leif -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 270 bytes Desc: OpenPGP digital signature URL: From python at rcn.com Wed Mar 18 21:04:12 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 18 Mar 2009 13:04:12 -0700 Subject: [Python-ideas] thoughts on generator.throw() References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com><91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> Message-ID: >>>. Even if I was good about catching exceptions, the >>> point is that they're _exceptional_ cases, so they shouldn't be >>> happening very often. Not everyone programs that way. Python has long advertised exceptions for other than the exceptional. You're misapplying C++ lore to Python. 
Raymond From tjreedy at udel.edu Wed Mar 18 21:14:38 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 Mar 2009 16:14:38 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: <49C0DEF3.2070905@trueblade.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> Message-ID: Eric Smith wrote: > Antoine Pitrou wrote: >> Raymond Hettinger writes: >>> Am curious whether you guys like this proposal? >> >> I find it good for the builtin format() function, but how does it work >> for >> str.format()? > > I agree: I like it, but it's not enough. I use str.format() way more > often than I hope to ever use builtin format(). If we make any change, > I'd rather see it focused on the format mini-language. I agree. My impression was that format() was added mostly for consistency with the policy of having a 'public' interface to special methods, and that .__format__ was added to support str.format. Hence, any new capability of .__format__ must be accessible from format strings with replacement fields. tjr From tjreedy at udel.edu Wed Mar 18 21:33:49 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 Mar 2009 16:33:49 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: <49C0E2FB.6010301@gmail.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> <49C0E2FB.6010301@gmail.com> Message-ID: Nick Coghlan wrote: > Eric Smith wrote: >> I agree: I like it, but it's not enough. I use str.format() way more >> often than I hope to ever use builtin format(). If we make any change, >> I'd rather see it focused on the format mini-language. > > Perhaps we could add a new ! type to the formatting language that allows > the developer to mark a particular argument as the conventions > dictionary? 
Then you could do something like: > > # DE and US dicts as per Raymond's format() example > fmt = "The value is {:,.5f}{!conv}" A new conversion specifier should follow the current pattern and be a single letter, such as 'c' for 'custom' or 'd' for dict. If, as I would expect, str.format scans left to right and interprets and replaces each field spec as it goes, then the above would not work. So put the conversion field before the fields it applies to. This, of course, makes string formatting stateful. With a 'shift lock' field added, an 'unshift' field should also be added. This, though, has the problem that a blank 'field-name' will in 3.1 either be auto-numbered or flagged as an error (if there are other explicitly numbered fields). I am a little uneasy about 'replacement fields' that are not really replacement fields. > fmt.format(num, DE) > fmt.format(num, US) > fmt.format(num, dict(thousands_sep=''')) > > As with !a and !s, you could use any normal field specifier to select > the conventions dictionary. Obviously, the formatting arguments would be > ignored for that particular field. Terry Jan Reedy From python at rcn.com Wed Mar 18 21:37:47 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 18 Mar 2009 13:37:47 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com><49C0E2FB.6010301@gmail.com> Message-ID: >> # DE and US dicts as per Raymond's format() example >> fmt = "The value is {:,.5f}{!conv}" > > A new conversion specifier should follow the current pattern and be a > single letter, such as 'c' for 'custom' or 'd' for dict. > > If, as I would expect, str.format scans left to right and interprets and > replaces each field spec as it goes, then the above would not work. So > put the conversion field before the fields it applies to. 
My interpretation is that the conv-dictionary applies to the whole string (not field-by-field) and that it can go at the end (because it doesn't affect parsing, rather it applies to the translation phase). Raymond From tjreedy at udel.edu Wed Mar 18 22:27:08 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 Mar 2009 17:27:08 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com><49C0E2FB.6010301@gmail.com> Message-ID: Raymond Hettinger wrote: > > >>> # DE and US dicts as per Raymond's format() example >>> fmt = "The value is {:,.5f}{!conv}" >> >> A new conversion specifier should follow the current pattern and be a >> single letter, such as 'c' for 'custom' or 'd' for dict. >> >> If, as I would expect, str.format scans left to right and interprets >> and replaces each field spec as it goes, then the above would not >> work. So put the conversion field before the fields it applies to. > > My interpretation is that the conv-dictionary applies to the whole > string (not field-by-field) That was not specified. If so, then a statement like """A number such as {0:15.2f} can be formatted many ways: USA: {0:15,.2f}, EU: {0:15f}, India: {0:15f}, China {0:15f}""" would not be possible. Why not allow extra flexibility? Unless the conversion is set by setting a global variable a la locale, the c-dict will be *used* field-by-field in each call to ob.__format__(fmt, conv), so there is no reason to force each call in a particular series to use the same conversion. > and that it can go at the end (because > it doesn't affect parsing, rather it applies to the translation phase). We agree that parsing out the conversion spec must happen before the translation it affects.
If, as I supposed above (because of how I would think to write the code), parsing and translation are intermixed, then parsing the spec *after* translation will not work. Even if they are done in two batches, it would still be easy to rebind the c-dict var during the second-phase scan of the replacement fields. Terry Jan Reedy From python at rcn.com Wed Mar 18 22:44:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 18 Mar 2009 14:44:24 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com><49C0E2FB.6010301@gmail.com> Message-ID: <203D79CECB0B411AA2C2762C06900068@RaymondLaptop1> >> My interpretation is that the conv-dictionary applies to the whole >> string (not field-by-field) > > That was not specified. If so, then a statement like > """A number such as {0:15.2f} can be formatted many ways: > USA: {0:15,.2f), EU: {0:15f}, > India: {0:15f), China {0:15f)" > would not be possible. > > Why not allow extra flexibility? Unless the conversion is set by > setting a global variable ala locale, the c-dict will be *used* > field-by-field in each call to ob.__format__(fmt, conv), so there is no > reason to force each call in a particular series to use the same conversion. -1 Unattractive and unnecessary hyper-generalization. Raymond From aahz at pythoncraft.com Wed Mar 18 23:06:12 2009 From: aahz at pythoncraft.com (Aahz) Date: Wed, 18 Mar 2009 15:06:12 -0700 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> Message-ID: <20090318220612.GA7221@panix.com> On Wed, Mar 18, 2009, Arnaud Delobelle wrote: > On 18 Mar 2009, at 18:07, George Sakkis wrote: >> On Wed, Mar 18, 2009 at 1:42 PM, Leif Walsh >> wrote: >>> >>> Why does raise have to be decently fast? 
>>> In my average case, at least, it's encountered at most once per >>> program execution. Even if I was good about catching exceptions, the >>> point is that they're _exceptional_ cases, so they shouldn't be >>> happening very often. >> >> That's not always true; StopIteration comes to mind. > > But StopIteration is not usually raised explicitly. This is a standard Python idiom: try: for field in curr_fields: for item in record[field]: item = item.lower() for filter in excludes: if match(item, filter): raise Excluded except Excluded: continue -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Programming language design is not a rational science. Most reasoning about it is at best rationalization of gut feelings, and at worst plain wrong." --GvR, python-ideas, 2009-3-1 From ncoghlan at gmail.com Wed Mar 18 23:06:52 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 08:06:52 +1000 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> Message-ID: <49C1707C.7090105@gmail.com> Terry Reedy wrote: > I agree. My impression was that format() was added mostly for > consistency with the policy of having a 'public' interface to special > methods, and that .__format__ was added to support str.format. Hence, > any new capability of .__format__ must be accessible from format strings > with replacement fields. format() was also added because the PEP 3101 syntax is pretty heavyweight when it comes to formatting a single value: "%.2f" % (x) and "{0:.2f}".format(x) Being able to write format(x, ".2f") instead meant dropping 4 characters (now 3 with str.format autonumbering) over the latter option. Agreed that any solution in this area needs to help with str.format() and not just format() though. Cheers, Nick.
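The three spellings compared above can be checked side by side (a quick sketch; note that the builtin takes the value first, then the format spec):

```python
x = 1234.5678

a = "%.2f" % x              # classic %-formatting
b = "{0:.2f}".format(x)     # PEP 3101 str.format
c = format(x, ".2f")        # builtin format(value, format_spec)

assert a == b == c == "1234.57"
```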
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From steve at pearwood.info Wed Mar 18 23:19:55 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Mar 2009 09:19:55 +1100 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> Message-ID: <200903190919.55427.steve@pearwood.info> On Thu, 19 Mar 2009 06:55:41 am Arnaud Delobelle wrote: > On 18 Mar 2009, at 18:07, George Sakkis wrote: > > On Wed, Mar 18, 2009 at 1:42 PM, Leif Walsh > > > > wrote: > >> On Wed, Mar 18, 2009 at 1:20 PM, Antoine Pitrou > >> > >> wrote: > >>> - raise should be decently fast, which is easier with a dedicated > >>> bytecode. > >> > >> Why does raise have to be decently fast? > >> > >> In my average case, at least, it's encountered at most once per > >> program execution. Even if I was good about catching exceptions, > >> the point is that they're _exceptional_ cases, so they shouldn't > >> be happening very often. > > > > That's not always true; StopIteration comes to mind. > > But StopIteration is not usually raised explicitly. It still has to be raised. It's not just StopIteration either, the iteration protocol also catches IndexError: >>> class C(object): ... def __getitem__(self, i): ... if i < 3: return i ... else: raise IndexError ... >>> c = C() >>> for i in c: ... print i ... 
0 1 2 >>> -- Steven D'Aprano From ncoghlan at gmail.com Wed Mar 18 23:21:09 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 08:21:09 +1000 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> <49C0E2FB.6010301@gmail.com> Message-ID: <49C173D5.7090700@gmail.com> Terry Reedy wrote: > Nick Coghlan wrote: > A new conversion specifier should follow the current pattern and be a > single letter, such as 'c' for 'custom' or 'd' for dict. Because those characters already have other meanings in the string formatting mini-language (as do many possible single digit codes). The suggested name "!conv" was chosen based on the existing localeconv() function name. > If, as I would expect, str.format scans left to right and interprets and > replaces each field spec as it goes, then the above would not work. So > put the conversion field before the fields it applies to. I believe you're currently right - I'm not sure how hard it would be to change it to a two step process (parse the whole string first into an internal parse tree then go through and format each identified field). As for why I formatted the example the way I did: the {!conv} isn't all that interesting, since it just says "I accept a conventions dictionary". Having it at the front of the format string would give it too much prominence. > This, of course, makes string formatting stateful. With a 'shift lock' > field added, an 'unshift' field should also be added. This, though, has > the problem that a blank 'field-name' will in 3.1 either be > auto-numbered or flagged as an error (if there are other explicitly > numbered fields). Aside from not producing any output, the !conv field would still have to obey all the rules for field naming/numbering.
So if your format string used explicit numbering instead of auto-numbering then the !conv would need to be explicitly numbered as well. I agree that having "format fields which are not format fields" isn't ideal, but the alternative is likely to be something like yet-another-string-formatting-method which accepts a positional only conventions dictionary as its first argument. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Wed Mar 18 23:30:56 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 08:30:56 +1000 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: <49C17620.6000204@gmail.com> Gerald Britton wrote: > Today I was reviewing changes in Python 2.5 and I noticed the > generator throw() method for the first time. While thinking about > what it does and why, a question arose in my mind: > > Why is it called "throw"? (Yes, I know that Java and possibly other > languages use this keyword!) In Python, we have long had a "raise" > statement to raise exceptions. I would have thought that the > generator method would have been called "raise" as well. But then I > saw that it would have been impossible to implement since "raise" is a > Python keyword. Actually, it was also called throw because it says "raise this exception over *there* (i.e inside the generator)". We're throwing the exception "over the fence" as it were. That was a rationalisation of a necessity (see the description in PEP 342), but still a good idea. > *Then* I wondered why "raise" is a keyword and not a > function. 
Because the compiler needs to see it and insert the appropriate commands into the bytecode to tell the interpreter to find the nearest exception handler or finally block and resume execution there. While you could probably figure out a way to do that without dedicated bytecode, I doubt it would do good things to the structure of the eval loop. > I have a similar question about the "assert" statement. It could > possibly benefit from being a function instead. Of course, changing > this would break lots of code, but maybe not any more than making > print a function as in 3.0. As others have said, so the compiler can drop it when optimisation is switched on. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Wed Mar 18 23:33:43 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 08:33:43 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> Message-ID: <49C176C7.7090309@gmail.com> Antoine Pitrou wrote: > Agreed, but they have the largest debts. > Therefore, real-life examples of commas used as thousands separators should > include a negative sign. A. :) B. All I can suggest is to try to think of the "commas as separators in format()" situation as being in the same vein as that whole "let use English keywords where possible" idea :) Hopefully a way will be found to provide a less English-centric but still easy to use formatting system eventually, but in the meantime Python *is* a language that looks like English pseudocode... 
Cheers, Nick -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From rdmurray at bitdance.com Thu Mar 19 01:37:55 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 19 Mar 2009 00:37:55 +0000 (UTC) Subject: [Python-ideas] logics (was:thoughts on generator.throw()) References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> <20090318192313.7bc137f5@o> Message-ID: spir wrote: > Le Wed, 18 Mar 2009 13:10:52 -0400, > I would go much farther than python about logical types and operators. > > Lazy evaluation is ok, because the alternative is not simpler: > if n != 0 and 1/n > threshold: > > But I'm not happy at all with the following: > >>> (3==3) + 1 > 2 > >>> 1 or True > 1 > > I think logical operators (and or not) should accept only logical values. And > logical values should not operate with numbers. I might be argued into agreeing with you about the first case, but it might be a logical consequence of the implementation of the second case. Or it might be an historical accident, since True used to be 1. (But the statement still gives that result in Python3, so unless it was just overlooked in the cleanup, someone must think it is a good idea.) But I would very definitely not want to give up the second example. Having the shortcut logical operators return the actual value that was last evaluated is just too darn useful :) -- R.
David Murray http://www.bitdance.com From denis.spir at free.fr Thu Mar 19 10:12:20 2009 From: denis.spir at free.fr (spir) Date: Thu, 19 Mar 2009 10:12:20 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C176C7.7090309@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> <49C176C7.7090309@gmail.com> Message-ID: <20090319101220.41e580c7@o> Le Thu, 19 Mar 2009 08:33:43 +1000, Nick Coghlan s'exprima ainsi: > B. All I can suggest is to try to think of the "commas as separators in > format()" situation as being in the same vein as that whole "let use > English keywords where possible" idea :) This is a wrong rationale. The readers of python keywords are the community of pythonistas (*); while the readers of documents produced by apps written in python can be any kind of people. "1,234,567.89" is more or less illegible for people not used to english conventions. Specifying the separator(s) is definitely a bad idea imo. I have not understood the proposal to be intended only for debug, but for all kinds of quick and/or unpublished development. Even in the first case, having numbers output in the format your eyes are used to is a nice & worthwhile help. Imagine you -- and all programmers, and millions of users -- had to cope with numbers like "1.234.567,89" all the time only because someone decided (for any reason) that separators must be fixed, and this format is the obvious one.
Denis (*) ditto about english naming, comments, & doc inside standard library ------ la vita e estrany From steve at pearwood.info Thu Mar 19 10:27:27 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Mar 2009 20:27:27 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090319101220.41e580c7@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> Message-ID: <200903192027.27510.steve@pearwood.info> On Thu, 19 Mar 2009 08:12:20 pm spir wrote: > Le Thu, 19 Mar 2009 08:33:43 +1000, > > Nick Coghlan s'exprima ainsi: > > B. All I can suggest is to try to think of the "commas as > > separators in format()" situation as being in the same vein as that > > whole "let use English keywords where possible" idea :) > > This is a wrong rationale. The readers of python keywords is the > community of pythonistas (*); while the readers of documents produced > by apps written in python can be any kind of people. "1,234,567.89" > is more or less illegible for people not used to english conventions. > Specifying the separator(s) is definitely a bad idea imo. I have not > understood the proposal to be intended only for debug, but for all > kinds of quick and/or unpublished developpment. Even in the first > case, having numbers output in the format your eyes are used to is a > nice & worthful help. Imagine you -- and all programmers, and > millions of users -- would have to cope with numbers like > "1.234.567,89" all the time only because someone decided (for any > reason) that separators must be fixed, and this format is the obvious > one. It would be sub-optimal but hardly "more or less illegible". 
But then I'm not American and therefore I'm already used to people misspelling colour as "color", centre as "center", and biscuit as "cookie" *wink* Nevertheless, I agree that for output, we shouldn't hard-code the decimal and thousands separator as "." and "," respectively -- although as an English-speaker, I'd be happy for those choices to be the default. But surely with Raymond and Mark's idea about passing a dict derived from locale, this is no longer an issue? Are hard-coded separators still on the table? -- Steven D'Aprano From fredrik.johansson at gmail.com Thu Mar 19 10:59:08 2009 From: fredrik.johansson at gmail.com (Fredrik Johansson) Date: Thu, 19 Mar 2009 10:59:08 +0100 Subject: [Python-ideas] Builtin test function Message-ID: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> There's been some discussion about automatic test discovery lately. Here's a random (not in any way thought through) idea: add a builtin function test() that runs tests associated with a given function, class, module, or object. Example: >>> import myproject >>> test(myproject.MainClass) ... >>> test(myproject) ... By default, test(obj) could simply run all doctests in docstrings attached to obj. For modules, it could also look for unittest.TestCase instances, and perhaps do some more advanced test discovery. test() could implement some keyword options to control exactly what and what not to do. There could perhaps also be a corresponding __test__ method/function for implementing custom test runners. 
Fredrik From robertc at robertcollins.net Thu Mar 19 11:23:57 2009 From: robertc at robertcollins.net (Robert Collins) Date: Thu, 19 Mar 2009 21:23:57 +1100 Subject: [Python-ideas] Builtin test function In-Reply-To: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> References: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> Message-ID: <1237458237.15722.206.camel@lifeless-64> On Thu, 2009-03-19 at 10:59 +0100, Fredrik Johansson wrote: > There's been some discussion about automatic test discovery lately. > Here's a random (not in any way thought through) idea: add a builtin > function test() that runs tests associated with a given function, > class, module, or object. This takes out all of the [useful] configuration for output - parallel testing, distributed testing, testing from an IDE etc. I'd love to see something like bzr's load_tests module scope hook honoured by the default test loader. It makes test discovery compatible with test customisation. I'd be happy to put a patch together. -Rob -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part URL: From steve at pearwood.info Thu Mar 19 11:48:53 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Mar 2009 21:48:53 +1100 Subject: [Python-ideas] Builtin test function In-Reply-To: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> References: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> Message-ID: <200903192148.54461.steve@pearwood.info> On Thu, 19 Mar 2009 08:59:08 pm Fredrik Johansson wrote: > There's been some discussion about automatic test discovery lately. > Here's a random (not in any way thought through) idea: add a builtin > function test() that runs tests associated with a given function, > class, module, or object. Improved testing is always welcome, but why a built-in? 
I know testing is important, but is it so common and important that we need it at our fingertips, so to speak, and can't even import a module first before running tests? What's the benefit to making it a built-in instead of part of a test module? -- Steven D'Aprano From ncoghlan at gmail.com Thu Mar 19 12:16:14 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 21:16:14 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <200903192027.27510.steve@pearwood.info> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> <200903192027.27510.steve@pearwood.info> Message-ID: <49C2297E.302@gmail.com> Steven D'Aprano wrote: > But surely with Raymond and Mark's idea about passing a dict derived > from locale, this is no longer an issue? Are hard-coded separators > still on the table? That's a separate discussion, not part of PEP 377. The comma in PEP 377 is hardcoded, just like the decimal point. If formatting becomes more configurable it will be via a new PEP. What I don't get here is that anyone writing "quick and dirty" scripts that still needed locale-appropriate output for non-developer end users* already couldn't use %-formatting or str.format for the task. The decimal point was wrong and there was no way at all to insert a thousands separator. If it's only a matter of localisation, then the locale module can do the job and the affected developers are probably already using it. If it's a matter of internationalisation, then that involves a lot more than just a comma here and there, and again, affected developers will already be using an appropriate tool. The PEP provides a quick way to make big numbers more readable when the intended audience is either the developer themselves (i.e. debugging messages), or an audience of IT types (e.g. system administrators).
Yes, it is inadequate in many situations for formatting strings for display to non-developer end users - that isn't a new problem, and PEP 377 doesn't make it any worse than it already was. Cheers, Nick. *(Note that such scripts actually sound neither quick nor dirty to me - as soon as you're producing output for non-developers you have to pay far more attention to the formatting and other presentation aspects, whether those readers are native English speakers or not) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From rdmurray at bitdance.com Thu Mar 19 12:42:55 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 19 Mar 2009 11:42:55 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> Message-ID: spir wrote: > Le Thu, 19 Mar 2009 08:33:43 +1000, Nick Coghlan > s'exprima ainsi: > > > B. All I can suggest is to try to think of the "commas as separators in > > format()" situation as being in the same vein as that whole "let use > > English keywords where possible" idea :) > > This is a wrong rationale. The readers of python keywords is the community of > pythonistas (*); while the readers of documents produced by apps written in > python can be any kind of people. "1,234,567.89" is more or less illegible But the thing currently approved, using ',' to indicate that thousands separators should be used, is _exactly_ like the keyword situation. It's something that the programmer types and reads. Controlling what character actually gets used in the output is a separate issue that still needs to be addressed, to my understanding.
For now, we are defaulting to English, just like usual ;) -- R. David Murray http://www.bitdance.com From eric at trueblade.com Thu Mar 19 12:50:17 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 19 Mar 2009 07:50:17 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C2297E.302@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> <200903192027.27510.steve@pearwood.info> <49C2297E.302@gmail.com> Message-ID: <49C23179.2090703@trueblade.com> > That's a separate discussion, not part of PEP 377. The comma in PEP 377 > is hardcoded, just like the decimal point. If formatting becomes more > configurable it will be via a new PEP. For the record, it's PEP 378. Eric. From ncoghlan at gmail.com Thu Mar 19 13:04:44 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 22:04:44 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C23179.2090703@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> <200903192027.27510.steve@pearwood.info> <49C2297E.302@gmail.com> <49C23179.2090703@trueblade.com> Message-ID: <49C234DC.7010401@gmail.com> Eric Smith wrote: >> That's a separate discussion, not part of PEP 377. The comma in PEP 377 >> is hardcoded, just like the decimal point. If formatting becomes more >> configurable it will be via a new PEP. > > For the record, it's PEP 378. Sorry about that - got my PEP numbers mixed up (377 is floating around in my brain since I still have to update it with Guido's rejection). Cheers, Nick. 
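For reference, the hard-coded behaviour PEP 378 describes, as it later shipped in Python 2.7 and 3.1, looks like this:

```python
# ',' in the format spec inserts a hard-coded comma as the thousands
# separator; the decimal point stays a hard-coded '.'.
assert format(1234567, ',d') == '1,234,567'
assert format(1234567.891, ',.2f') == '1,234,567.89'
assert '{0:,.2f}'.format(-1234567.891) == '-1,234,567.89'
```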
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jh at improva.dk Thu Mar 19 15:38:26 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 19 Mar 2009 15:38:26 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49AB1F90.7070201@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> Message-ID: <49C258E2.8050505@improva.dk> Hi Greg Greg Ewing wrote: > I've made another couple of tweaks to the formal semantics > (so as not to over-specify when the iterator methods are > looked up). > > Latest version of the PEP, together with the prototype > implementation and other related material, is available > here: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/ > I am working on my own patch, based on rev2 of yours from the above link and the algorithm I have been going on about. It is currently working, and is even slightly faster than yours in every test I have (much faster in some, that was the whole point). I still need to do a bit of cleanup before I throw it to the wolves though... Anyway, I have a few questions/comments to your patch. 1. There is a small refcounting bug in your gen_iternext function. On success, it returns without decref'ing "yf". 2. In the comment for "gen_undelegate" you mention "certain recursive situations" where a generator may lose its frame before we get a chance to clear f_yieldfrom. Can you elaborate? I can't think of any, and haven't been able to catch any with asserts in a debug-build using my own patch. However, if they exist I will need to handle it somehow and knowing what they are would certainly help. 3. It looks like you are not calling "close" properly from "next", "send" and "throw". This makes no difference when delegating to a generator (the missing close would be a no-op), but would be an issue when delegating to a non-generator. 4. 
It looks like your "gen_close" does not try to throw a GeneratorExit before calling close when delegating to a non-generator. I think it should, to match the description of "close" in PEP342 and the expansion in your PEP. Other than that, great work. It would have taken me ages to figure out all the necessary changes to the grammar, parser, ... and so on by myself. In fact I probably wouldn't even have tried. I hope this helps, and promise to publish my own version of the patch once I think it is fit for public consumption. Best regards - Jacob From mrs at mythic-beasts.com Fri Mar 20 00:12:49 2009 From: mrs at mythic-beasts.com (Mark Seaborn) Date: Thu, 19 Mar 2009 23:12:49 +0000 (GMT) Subject: [Python-ideas] CapPython's use of unbound methods In-Reply-To: References: <20090312.202410.846948621.mrs@localhost.localdomain> Message-ID: <20090319.231249.343185657.mrs@localhost.localdomain> Guido van Rossum wrote: > On Thu, Mar 12, 2009 at 1:24 PM, Mark Seaborn wrote: > > Suppose we have an object x with a private attribute, "_field", > > defined by a class Foo: > > > > class Foo(object): > > > > def __init__(self): > > self._field = "secret" > > > > x = Foo() > > Can you add some principals to this example? Who wrote the Foo class > definition? Does CapPython have access to the source code for Foo? To > the class object? OK, suppose we have two principals, Alice and Bob. Alice receives a string from Bob. Alice instantiates the string using CapPython's safe_eval() function, getting back a module object that contains a function object. Alice passes the function an object x. Alice's intention is that the function should not be able to get hold of the contents of x._field, no matter what string Bob supplies.
To make this more concrete, this is what Alice executes, with source_from_bob defined in a string literal for the sake of example: source_from_bob = """ class C: def f(self): return self._field def entry_point(x): C.f(x) # potentially gets the secret object in Python 3.0 """ import safeeval secret = object() class Foo(object): def __init__(self): self._field = secret x = Foo() module = safeeval.safe_eval(source_from_bob, safeeval.Environment()) module.entry_point(x) In this example, Bob's code is not given access to the class object Foo. Furthermore, Bob should not be able to get access to the class Foo from the instance x. The type() builtin is not considered to be safe in CapPython so it is not included in the default environment. Bob's code is not given access to the source code for class Foo. But even if Bob is aware of Alice's source code, it should not affect whether Bob can get hold of the secret object. By the way, you can try out the example by getting the code from the Bazaar repository: bzr branch http://bazaar.launchpad.net/%7Emrs/cappython/trunk cappython > > However, in Python 3.0, the CapPython code can do this: > > > > class C(object): > > > > ? ?def f(self): > > ? ? ? ?return self._field > > > > C.f(x) # returns "secret" > > > > Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is > > not being called on an instance of C. > > In Python 2.x I could write > > class C(Foo): > def f(self): > return self._field In the example above, Bob's code is not given access to Foo, so Bob cannot do this. But you are right, if Bob's code were passed Foo as well as x, Bob could do this. Suppose Alice wanted to give Bob access to class Foo, perhaps so that Bob could create derived classes. It is still possible for Alice to do that safely, if Alice defines Foo differently. 
Alice can pass the secret object to Foo's constructor instead of having the class definition get its reference to the secret object from an enclosing scope:

class Foo(object):
    def __init__(self, arg):
        self._field = arg

secret = object()
x = Foo(secret)
module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
module.entry_point(x, Foo)

Bob can create his own objects derived from Foo, but cannot use his access to Foo to break encapsulation of instance x. Foo is now authorityless, in the sense that it does not capture "secret" from its enclosing environment, unlike the previous definition.

> or alternatively
>
> class C(x.__class__):
>

The verifier would reject x.__class__, so this is not possible.

> > Guido said, "I don't understand where the function object f gets its
> > magic powers".
> >
> > The answer is that function definitions directly inside class
> > statements are treated specially by the verifier.
>
> Hm, this sounds like a major change in language semantics, and if I
> were Sun I'd sue you for using the name "Python" in your product. :-)

Damn, the makers of Typed Lambda Calculus had better watch out for legal action from the makers of Lambda Calculus(tm) too... :-) Is it really a major change in semantics if it's just a subset? ;-) To some extent the verifier's check of only accessing private attributes through self is just checking a coding style that I already follow when writing Python code (except sometimes for writing test cases). Of course some of the verifier's checks, such as only allowing attribute assignments through self, are a lot more draconian than coding style checks.

> > If you wrote the same function definition at the top level:
> >
> > def f(var):
> >     return var._field # rejected
> >
> > the attribute access would be rejected by the verifier, because "var"
> > is not a self variable, and private attributes may only be accessed
> > through self variables.
> > > > I renamed the variable in the example,
>
> What do you mean by this?

I just mean that I applied alpha conversion.

def f(self):
    return self._field

is equivalent to

def f(var):
    return var._field

Whether these function definitions are accepted by the verifier depends on their context.

> Do you also catch things like
>
> g = getattr
> s = 'field'.replace('f', '_f')
>
> print g(x, s)
>
> ?

The default environment doesn't provide the real getattr() function. It provides a wrapped version that rejects private attribute names.

Mark

From greg.ewing at canterbury.ac.nz Fri Mar 20 01:23:48 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Mar 2009 12:23:48 +1200 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C258E2.8050505@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> Message-ID: <49C2E214.1040003@canterbury.ac.nz> Jacob Holm wrote:

> 1. There is a small refcounting bug in your gen_iternext function. On
> success, it returns without decref'ing "yf".

Thanks, I'll fix that.

> 2. In the comment for "gen_undelegate" you mention "certain recursive
> situations" where a generator may lose its frame before we get a
> chance to clear f_yieldfrom. Can you elaborate?

I can't remember the details, but I definitely ran into one during development, which is why I added that function. Have you tried running all of my tests?

> 3. It looks like you are not calling "close" properly from "next",
> "send" and "throw".

I'm not sure what you mean by that. Can you provide an example that doesn't behave as expected?

> 4. It looks like your "gen_close" does not try to throw a
> GeneratorExit before calling close when delegating to a
> non-generator.

I'm not sure what you mean here either. Regardless of the type of sub-iterator, it should end up getting to the part which does

if (!PyErr_Occurred())
    PyErr_SetNone(PyExc_GeneratorExit);

Again, an example that doesn't behave properly would help.
-- Greg From jh at improva.dk Fri Mar 20 02:04:42 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 02:04:42 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C2E214.1040003@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> Message-ID: <49C2EBAA.9020106@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: > >> 2. In the comment for "gen_undelegate" you mention "certain recursive >> situations" where a generator may lose its frame before we get a >> chance to clear f_yieldfrom. Can you elaborate? > > I can't remember the details, but I definitely ran into one > during development, which is why I added that function. Have > you tried running all of my tests? Yup. All tests pass, except for your test19 where my traceback is different. > --- expected/test19.py.out 2009-02-22 09:51:26.000000000 +0100 +++ > actual/test19.py.out 2009-03-20 01:50:28.000000000 +0100 @@ -7,8 +7,8 > @@ Traceback (most recent call last): File "test19.py", line 20, in > for y in gi: - File "test19.py", line 16, in g2 - yield from > gi File "test19.py", line 9, in g1 yield from g2() + File "test19.py", > line 16, in g2 + yield from gi ValueError: generator already executing I am not quite sure why that is, but I actually think mine is better. >> 3. It looks like you are not calling "close" properly from "next", >> "send" and "throw". > > I'm not sure what you mean by that. Can you provide an > example that doesn't behave as expected? Sure, see below. >> 4. It looks like your "gen_close" does not try to throw a >> GeneratorExit before calling close when delegating to a >> non-generator. > > I'm not sure what you mean here either. Regardless of the > type of sub-iterator, it should end up getting to the > part which does > > if (!PyErr_Occurred()) > PyErr_SetNone(PyExc_GeneratorExit); > > Again, and example that doesn't behave properly would > help. > Of course. 
Here is a demonstration/test...

class iterator(object):
    """Simple iterator that counts to n while writing what is done to it"""
    def __init__(self, n):
        self.ctr = iter(xrange(n))
    def __iter__(self):
        return self
    def close(self):
        print "Close"
    def next(self):
        print "Next"
        return self.ctr.next()
    def send(self, val):
        print "Send", val
        return self.ctr.next()
    def throw(self, *args):
        print "Throw:", args
        return self.ctr.next()

def generator(n):
    yield from iterator(n)

g = generator(1)
g.next()
try:
    g.next()
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(1)
g.next()
try:
    g.send(1)
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(1)
g.next()
try:
    g.throw(ValueError)
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(2)
g.next()
try:
    g.next()
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(2)
g.next()
try:
    g.send(1)
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(2)
g.next()
try:
    g.throw(ValueError)
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

And here is the output I would expect based on the relevant PEPs.

Next
Next
Close
<type 'exceptions.StopIteration'>
--
Next
Send 1
Close
<type 'exceptions.StopIteration'>
--
Next
Throw: (<type 'exceptions.ValueError'>,)
Close
<type 'exceptions.StopIteration'>
--
Next
Next
No exception
Throw: (<type 'exceptions.GeneratorExit'>,)
Close
--
Next
Send 1
No exception
Throw: (<type 'exceptions.GeneratorExit'>,)
Close
--
Next
Throw: (<type 'exceptions.ValueError'>,)
No exception
Throw: (<type 'exceptions.GeneratorExit'>,)
Close
--

However, when I run this using your patch, the first 3 "Close" messages, and the 3 "GeneratorExit" messages are missing. Did that help?
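(Editorial note: the close-in-a-finally behaviour Jacob expects can be sketched in pure Python. The `delegate` generator below is a hypothetical stand-in for the yield-from expansion, not Greg's patch; it is spelled in modern Python 3 so it runs as-is, and `Counter` replaces the printing iterator with a logging one.)

```python
def delegate(it):
    # Rough sketch of delegation to a sub-iterator: however the block
    # is left (exhaustion, an exception, GeneratorExit), the finally
    # clause closes the sub-iterator, which is why a "Close" is expected.
    try:
        while True:
            try:
                v = next(it)
            except StopIteration:
                return          # sub-iterator exhausted: delegation ends
            yield v
    finally:
        close = getattr(it, 'close', None)
        if close is not None:
            close()

class Counter:
    """Counts to n while logging what is done to it."""
    def __init__(self, n):
        self.it = iter(range(n))
        self.log = []
    def __iter__(self):
        return self
    def __next__(self):
        self.log.append('Next')
        return next(self.it)
    def close(self):
        self.log.append('Close')

def gen(c):
    for v in delegate(c):
        yield v

c = Counter(1)
g = gen(c)
print(next(g))      # 0
try:
    next(g)         # Counter is exhausted; finally closes it first
except StopIteration:
    pass
print(c.log)        # ['Next', 'Next', 'Close']
```

The point is that `delegate` leaves its try block in every case, so the sub-iterator's close() is reached before the StopIteration ever gets back to the caller.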
- Jacob From jh at improva.dk Fri Mar 20 02:07:34 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 02:07:34 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C2EBAA.9020106@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> Message-ID: <49C2EC56.4020704@improva.dk> Sorry about the garbled diff... Here is the real diff between expected and actual output when I run my patch on test19. - Jacob --- expected/test19.py.out 2009-02-22 09:51:26.000000000 +0100 +++ actual/test19.py.out 2009-03-20 02:06:52.000000000 +0100 @@ -7,8 +7,8 @@ Traceback (most recent call last): File "test19.py", line 20, in for y in gi: - File "test19.py", line 16, in g2 - yield from gi File "test19.py", line 9, in g1 yield from g2() + File "test19.py", line 16, in g2 + yield from gi ValueError: generator already executing From greg.ewing at canterbury.ac.nz Fri Mar 20 02:33:47 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Mar 2009 13:33:47 +1200 Subject: [Python-ideas] Revision to yield-from implementation Message-ID: <49C2F27B.1020102@canterbury.ac.nz> I have uploaded a small revision to my prototype implementation of the yield-from statement to fix a small refcounting bug. http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/ -- Greg From greg.ewing at canterbury.ac.nz Fri Mar 20 05:09:13 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Mar 2009 17:09:13 +1300 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C2EBAA.9020106@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> Message-ID: <49C316E9.1090103@canterbury.ac.nz> Jacob Holm wrote: > Of course. Here is a demonstration/test... 
> > However, when I run this using your patch, the first 3 "Close" messages,
> and the 3 "GeneratorExit" messages are missing.

I don't understand why you expect to get the output you present. Can you explain your reasoning with reference to the relevant sections of the relevant PEPs that you mention?

-- Greg

From jh at improva.dk Fri Mar 20 10:33:58 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 10:33:58 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C316E9.1090103@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> Message-ID: <49C36306.4040002@improva.dk> Greg Ewing wrote:

> Jacob Holm wrote:
>
>> Of course. Here is a demonstration/test...
>>
>> However, when I run this using your patch, the first 3 "Close"
>> messages, and the 3 "GeneratorExit" messages are missing.
>
> I don't understand why you expect to get the output
> you present. Can you explain your reasoning with
> reference to the relevant sections of the relevant
> PEPs that you mention?
>

Starting with "Close". The only reason I expect *any* "Close" message is that the expansion in your PEP explicitly calls close in the finally clause. It makes no distinction between different ways of exiting the block, so I'd expect one call for each time it is exited.

The "GeneratorExit", I expect due to the description of close in PEP 342:

def close(self):
    try:
        self.throw(GeneratorExit)
    except (GeneratorExit, StopIteration):
        pass
    else:
        raise RuntimeError("generator ignored GeneratorExit")

When the generator is closed (due to the del g lines in the example), this says to throw a GeneratorExit and handle the result. If we do this manually, the throw will be delegated to the iterator, which will print the "Throw: (<type 'exceptions.GeneratorExit'>,)" message. Do I make sense yet?
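(Editorial note: the PEP 342 behaviour described above is easy to check against any generator that supports close(); a minimal sketch with invented generator names `squeaky` and `stubborn`, in Python 3 syntax:)

```python
def squeaky(log):
    # cooperative generator: notes the GeneratorExit and re-raises it,
    # so close() succeeds silently
    try:
        yield 1
    except GeneratorExit:
        log.append('GeneratorExit')
        raise

log = []
g = squeaky(log)
next(g)        # run up to the yield
g.close()      # per PEP 342, equivalent to throwing GeneratorExit
print(log)     # ['GeneratorExit']

def stubborn():
    # swallows GeneratorExit and keeps yielding, so close() must
    # raise RuntimeError("generator ignored GeneratorExit")
    while True:
        try:
            yield 1
        except GeneratorExit:
            pass

s = stubborn()
next(s)
try:
    s.close()
    ignored = False
except RuntimeError:
    ignored = True
print(ignored)   # True
```

This is exactly the throw-then-check sequence the quoted close() definition spells out.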
- Jacob From gerald.britton at gmail.com Fri Mar 20 14:54:09 2009 From: gerald.britton at gmail.com (Gerald Britton) Date: Fri, 20 Mar 2009 09:54:09 -0400 Subject: [Python-ideas] Interactive trace function Message-ID: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> Years ago I worked for a while in REXX. It was a fairly advanced scripting language for its time and we used it on some substantial mainframe projects. One thing that I liked and miss in Python is the "trace" statement. Basically, you could insert a "trace" statement in a REXX program and then, when you ran it, it would step through the program a line at a time. This gave you the chance to follow your code through tricky sections and display variables after each step. I know it saved me hours of troubleshooting of tricky problems. I would like to know if anyone has ever proposed something similar for Python. It would work something like this: 1. In an interactive session, you could issue a trace command. Thereafter, whatever you ran would be done a step at a time, with a terminal prompt after every statement for you to print anything you like that would help you understand the state of the program at that point. 2. From the command line, you could add a --trace option, or something like it, to ask Python to launch the program interactively with trace enabled, which would work as described above. 3. If you have a problematic piece of code, you could insert a trace statement just before the troubled section. Then when you ran the program, when it came to the trace statement, it would begin an interactive trace at that point as described above. (You would have to start your program from the command line for this to make sense.) Has something like this ever come up before? Is there a way to do this today? 
-- Gerald Britton From jh at improva.dk Fri Mar 20 15:26:23 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 15:26:23 +0100 Subject: [Python-ideas] Interactive trace function In-Reply-To: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> References: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> Message-ID: <49C3A78F.3040707@improva.dk> Gerald Britton wrote: > Years ago I worked for a while in REXX. It was a fairly advanced > scripting language for its time and we used it on some substantial > mainframe projects. One thing that I liked and miss in Python is the > "trace" statement. > > Basically, you could insert a "trace" statement in a REXX program and > then, when you ran it, it would step through the program a line at a > time. This gave you the chance to follow your code through tricky > sections and display variables after each step. I know it saved me > hours of troubleshooting of tricky problems. > > I would like to know if anyone has ever proposed something similar for > Python. It would work something like this: > > 1. In an interactive session, you could issue a trace command. > Thereafter, whatever you ran would be done a step at a time, with a > terminal prompt after every statement for you to print anything you > like that would help you understand the state of the program at that > point. > > 2. From the command line, you could add a --trace option, or something > like it, to ask Python to launch the program interactively with trace > enabled, which would work as described above. > > 3. If you have a problematic piece of code, you could insert a trace > statement just before the troubled section. Then when you ran the > program, when it came to the trace statement, it would begin an > interactive trace at that point as described above. (You would have to > start your program from the command line for this to make sense.) > > Has something like this ever come up before? Is there a way to do this today? 
>
>
>

How about:

import pdb; pdb.set_trace()

I use that for debugging all the time...

- Jacob

From paul.bedaride at gmail.com Fri Mar 20 15:42:51 2009 From: paul.bedaride at gmail.com (paul bedaride) Date: Fri, 20 Mar 2009 15:42:51 +0100 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack In-Reply-To: References: Message-ID:

Yes, it's true that you can easily do the pack part with zip(*[iter(l)]*size), and the slicing with zip(*[l[i:len(l)-(slice-1-i)] for i in range(slice)]). You could also do both at once, but you get something more complicated. It's also true that with izip you could get an iterator. I use this pack function a lot in my code, and it's more readable than the zip version. The question is whether people really use this kind of function on lists, or if it's just me (which is entirely possible).

paul bedaride

On Fri, Mar 20, 2009 at 3:07 PM, Isaac Morland wrote:
> On Fri, 20 Mar 2009, paul bedaride wrote:
>
>> I propose a new function for list for pack values of a list and
>> sliding over them:
>>
>> then we can do things like this:
>> for i, j, k in pack(range(10), 3, partialend=False):
>>     print i, j, k
>>
>> I propose this because i need a lot of times pack and slide function
>> over list and this one
>> combine the two in a generator way.
>
> See the Python documentation for zip():
>
> http://docs.python.org/library/functions.html#zip
>
> And this article in which somebody independently rediscovers the idea:
>
> http://drj11.wordpress.com/2009/01/28/my-python-dream-about-groups/
>
> Summary: except for the "partialend" parameter, this can already be done in
> a single line. It is not for me to say whether this nevertheless would be
> useful as a library routine (if only perhaps to make it easy to specify
> "partialend" explicitly).
>
> It seems to me that sometimes one would want izip instead of zip.
> And I
> think you could get the effect of partialend=True in 2.6 by using
> izip_longest (except with an iterator result rather than a list).
>
>> def pack(l, size=2, slide=2, partialend=True):
>>     lenght = len(l)
>>     for p in range(0,lenght-size,slide):
>>         def packet():
>>             for i in range(size):
>>                 yield l[p+i]
>>         yield packet()
>>     p = p + slide
>>     if partialend or lenght-p == size:
>>         def packet():
>>             for i in range(lenght-p):
>>                 yield l[p+i]
>>         yield packet()
>
> Isaac Morland                  CSCF Web Guru
> DC 2554C, x36650               WWW Software Specialist
>

From tom at vector-seven.com Fri Mar 20 15:38:39 2009 From: tom at vector-seven.com (Thomas Lee) Date: Sat, 21 Mar 2009 01:38:39 +1100 Subject: [Python-ideas] A read-only, dict-like optparse.Value Message-ID: <49C3AA6F.3080908@vector-seven.com>

Hi folks,

Would anybody support the idea of read-only dict-like behaviour of "options" for the following code:

====

from optparse import OptionParser
parser = OptionParser()
parser.add_option("--host", dest="host", default="localhost")
parser.add_option("--port", dest="port", default=1234)
parser.add_option("--path", dest="path", default="/tmp")
options, args = parser.parse_args()

====

As it is, you have to "know" what possible attributes are present on the options (effectively the set of "dest" attributes) -- I often implement something like the following because recently I've had to use command line options in a bunch of format strings:

def make_options_dict(options):
    known_options = ("host", "port", "path")
    return dict(zip(known_options, [getattr(options, attr) for attr in known_options]))

I don't mind having to do this, but having to hard code the options in there feels a bit nasty. Just as useful for my particular use case (i.e. passing the options "dict" to a format string) would be something along the lines of options.todict() or dict(options).
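(Editorial note: the attribute set is already discoverable at runtime, because optparse stores each "dest" as a plain instance attribute on the Values object; a short sketch using the builtin vars(), based on the options defined above:)

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("--host", dest="host", default="localhost")
parser.add_option("--port", dest="port", default=1234)
parser.add_option("--path", dest="path", default="/tmp")
options, args = parser.parse_args([])   # empty argv so the defaults apply

# optparse.Values keeps every "dest" in its instance __dict__, so the
# generic vars() builtin already gives the dict view being asked for:
opts = vars(options)
formatted = "%(host)s:%(port)s%(path)s" % opts
print(formatted)    # localhost:1234/tmp
```

This covers the format-string use case without hard-coding the option names anywhere.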
Even a way to know the set of "dest" attributes that are defined on "options" would be cleaner. e.g. options_dict = dict(zip(options.all(), [getattr(options, attr) for attr in options.all()])) Where options.all() returns all the option "dest" attribute names. Or something to that effect. Any thoughts? Cheers, T From rdmurray at bitdance.com Fri Mar 20 17:55:04 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 20 Mar 2009 16:55:04 +0000 (UTC) Subject: [Python-ideas] A read-only, dict-like optparse.Value References: <49C3AA6F.3080908@vector-seven.com> Message-ID: Thomas Lee wrote: > Hi folks, > > Would anybody support the idea of read-only dict-like behaviour of > "options" for the following code: > > ==== > > from optparse import OptionParser > parser = OptionParser() > parser.add_option("--host", dest="host" default="localhost") > parser.add_option("--port", dest="port", default=1234) > parser.add_option("--path", dest="path", default="/tmp") > options, args = parser.parse_args() > > ==== I presume you know that dest is redundant there? I ask because you wanted to avoid retyping the option names later :) > As it is, you have to "know" what possible attributes are present on the > options (effectively the set of "dest" attributes) -- I often implement > something like the following because recently I've had to use command > line options in a bunch of format strings: > > def make_options_dict(options): > known_options = ("host", "port", "path") > return dict(zip(known_options, [getattr(options, attr) for attr in > known_options])) > > I don't mind having to do this, but having to hard code the options in > there feels a bit nasty. Just as useful for my particular use case (i.e > passing the options "dict" to a format string) would be something along > the lines of options.todict() or dict(options). Even a way to know the > set of "dest" attributes that are defined on "options" would be cleaner. > e.g. 
Well, given the implementation of optparse, you could do: options.__dict__.items() But exposing the full dictionary interface on options strikes me as a reasonable idea. I don't see any particular reason to make it read-only, either. (NB: The Values class has some...interesting...methods that I wasn't aware of that look somewhat intriguing. And they aren't read-only.) > options_dict = dict(zip(options.all(), [getattr(options, attr) for attr > in options.all()])) > > Where options.all() returns all the option "dest" attribute names. > > Or something to that effect. > > Any thoughts? I don't see any reason not to just duck type the options object as dictionary-like and use the normal dictionary method names to access the information you want. Off the cuff it seems like a good idea to expose an interface to this information. Hmm. Then I could do globals().update(options.items()), which would simplify some of my code :) (Whether or not that is a good idea is a different question!) -- R. David Murray http://www.bitdance.com From python at rcn.com Fri Mar 20 18:01:15 2009 From: python at rcn.com (Raymond Hettinger) Date: Fri, 20 Mar 2009 10:01:15 -0700 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack References: Message-ID: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> >> I propose a new function for list for pack values of a list and >> sliding over them: >> >> then we can do things like this: >> for i, j, k in pack(range(10), 3, partialend=False): >> print i, j, k . . . >> def pack(l, size=2, slide=2, partialend=True): >> lenght = len(l) >> for p in range(0,lenght-size,slide): >> def packet(): >> for i in range(size): >> yield l[p+i] >> yield packet() >> p = p + slide >> if partialend or lenght-p == size: >> def packet(): >> for i in range(lenght-p): >> yield l[p+i] >> yield packet() This has been discussed before and rejected. There were several considerations. 
The itertools recipes already include simple patterns for grouper() and pairwise() that are easy to use as primitives in your code or to serve as models for variants.

The design of pack() itself is questionable. It attempts to be a Swiss Army Knife by parameterizing all possible variations (length of window, length to slide, and how to handle end-cases). This design makes the tool harder to learn and use, and it makes the implementation more complex. That complexity isn't necessary. Use cases would typically fall into grouper cases where the window length equals the slide length or into cases that slide one element at a time. You don't win anything by combining the two cases except for making the tool harder to learn and use.

The pairwise() recipe could be generalized to larger windows, but seemed like less of a good idea after closely examining potential use cases. For cases that used a larger window, there always seemed to be a better solution than extending pairwise(). For instance, a twenty-day moving average is better implemented with a deque(maxlen=20) and a running sum than with an iterator returning tuples of length twenty -- that approach does a lot of unnecessary work shifting elements in the tuple, turning an O(n) process into an O(m*n) process.

For short windows, like pairwise() itself, the issue is not one of total running time; instead, the problem is that almost every proposed use case was better coded as a simple Python loop, saving previous values with a step like: oldvalue = value. Having pairwise() or tripletwise() tended to be a distraction away from better solutions. Also, the pure python approach was more general as it allowed accumulations: total += value.

While your proposed function has been re-invented a number of times, that doesn't mean it's a good idea. It is more an exercise in what can be done, not in what should be done.

Raymond

From rdmurray at bitdance.com Fri Mar 20 18:02:27 2009 From: rdmurray at bitdance.com (R.
David Murray) Date: Fri, 20 Mar 2009 17:02:27 +0000 (UTC) Subject: [Python-ideas] Interactive trace function References: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> <49C3A78F.3040707@improva.dk> Message-ID: Jacob Holm wrote: > Gerald Britton wrote: > > Years ago I worked for a while in REXX. It was a fairly advanced > > scripting language for its time and we used it on some substantial > > mainframe projects. One thing that I liked and miss in Python is the > > "trace" statement. > > > > Basically, you could insert a "trace" statement in a REXX program and > > then, when you ran it, it would step through the program a line at a > > time. This gave you the chance to follow your code through tricky > > sections and display variables after each step. I know it saved me > > hours of troubleshooting of tricky problems. > > > > I would like to know if anyone has ever proposed something similar for > > Python. It would work something like this: > > > > 1. In an interactive session, you could issue a trace command. > > Thereafter, whatever you ran would be done a step at a time, with a > > terminal prompt after every statement for you to print anything you > > like that would help you understand the state of the program at that > > point. > > > > 2. From the command line, you could add a --trace option, or something > > like it, to ask Python to launch the program interactively with trace > > enabled, which would work as described above. > > > > 3. If you have a problematic piece of code, you could insert a trace > > statement just before the troubled section. Then when you ran the > > program, when it came to the trace statement, it would begin an > > interactive trace at that point as described above. (You would have to > > start your program from the command line for this to make sense.) > > > > Has something like this ever come up before? Is there a way to do this today? 
>> >
>> >
>> >
>> How about:
>>
>> import pdb; pdb.set_trace()
>>
>> I use that for debugging all the time...
>
> I just learned about this one, which is also sometimes useful (when you
> _don't_ want the interactive prompt, you just want to see the sequence
> of execution):
>
>     python -m trace -t
>
> --
> R. David Murray           http://www.bitdance.com

From paul.bedaride at gmail.com Fri Mar 20 18:32:32 2009 From: paul.bedaride at gmail.com (paul bedaride) Date: Fri, 20 Mar 2009 18:32:32 +0100 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack In-Reply-To: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> References: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> Message-ID:

Now that I have discovered itertools I think you are right, but maybe the pack function could be renamed iwinslice (which is, after all, its real name) and added to itertools?

paul bedaride

On Fri, Mar 20, 2009 at 6:01 PM, Raymond Hettinger wrote:
>
>>> I propose a new function for list for pack values of a list and
>>> sliding over them:
>>>
>>> then we can do things like this:
>>> for i, j, k in pack(range(10), 3, partialend=False):
>>>     print i, j, k
>
> . . .
>>>
>>> def pack(l, size=2, slide=2, partialend=True):
>>>     lenght = len(l)
>>>     for p in range(0,lenght-size,slide):
>>>         def packet():
>>>             for i in range(size):
>>>                 yield l[p+i]
>>>         yield packet()
>>>     p = p + slide
>>>     if partialend or lenght-p == size:
>>>         def packet():
>>>             for i in range(lenght-p):
>>>                 yield l[p+i]
>>>         yield packet()
>
> This has been discussed before and rejected.
>
> There were several considerations. The itertools recipes already include
> simple patterns for grouper() and pairwise() that are easy
> to use as primitives in your code or to serve as models for variants.
> The design of pack() itself is questionable. It attempts to be a Swiss Army
> Knife by parameterizing all possible variations
> (length of window, length to slide, and how to handle end-cases).
> This design makes the tool harder to learn and use, and it makes
> the implementation more complex.
> That complexity isn't necessary. Use cases would typically fall
> into grouper cases where the window length equals the slide
> length or into cases that slide one element at a time. You don't
> win anything by combining the two cases except for making
> the tool harder to learn and use.
>
> The pairwise() recipe could be generalized to larger windows,
> but seemed like less of a good idea after closely examining potential
> use cases. For cases that used a larger window, there always
> seemed to be a better solution than extending pairwise(). For
> instance, a twenty-day moving average is better implemented with
> a deque(maxlen=20) and a running sum than with an iterator
> returning tuples of length twenty -- that approach does a lot of
> unnecessary work shifting elements in the tuple, turning an
> O(n) process into an O(m*n) process.
>
> For short windows, like pairwise() itself, the issue is not one of
> total running time; instead, the problem is that almost every proposed use
> case was better coded as a simple Python loop,
> saving previous values with a step like: oldvalue = value.
> Having pairwise() or tripletwise() tended to be a distraction away
> from better solutions. Also, the pure python approach was more
> general as it allowed accumulations: total += value.
> While your proposed function has been re-invented a number of
> times, that doesn't mean it's a good idea. It is more an exercise
> in what can be done, not in what should be done.
> > > Raymond > From gerald.britton at gmail.com Fri Mar 20 18:53:36 2009 From: gerald.britton at gmail.com (Gerald Britton) Date: Fri, 20 Mar 2009 13:53:36 -0400 Subject: [Python-ideas] Interactive trace function In-Reply-To: References: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> <49C3A78F.3040707@improva.dk> Message-ID: <5d1a32000903201053y752849edw96397d9e74b7dba1@mail.gmail.com> Thanks for the tip! btw, did you get the --ignore-dir option to work? I'm trying to run this on a large system (hundreds of modules) and ignore most of the modules and packages to focus on just a few. However, it processes all of them anyway On Fri, Mar 20, 2009 at 1:02 PM, R. David Murray wrote: > Jacob Holm wrote: >> Gerald Britton wrote: >> > Years ago I worked for a while in REXX. ?It was a fairly advanced >> > scripting language for its time and we used it on some substantial >> > mainframe projects. ?One thing that I liked and miss in Python is the >> > "trace" statement. >> > >> > Basically, you could insert a "trace" statement in a REXX program and >> > then, when you ran it, it would step through the program a line at a >> > time. ?This gave you the chance to follow your code through tricky >> > sections and display variables after each step. ?I know it saved me >> > hours of troubleshooting of tricky problems. >> > >> > I would like to know if anyone has ever proposed something similar for >> > Python. ?It would work something like this: >> > >> > 1. In an interactive session, you could issue a trace command. >> > Thereafter, whatever you ran would be done a step at a time, with a >> > terminal prompt after every statement for you to print anything you >> > like that would help you understand the state of the program at that >> > point. >> > >> > 2. From the command line, you could add a --trace option, or something >> > like it, to ask Python to launch the program interactively with trace >> > enabled, which would work as described above. >> > >> > 3. 
If you have a problematic piece of code, you could insert a trace >> > statement just before the troubled section. Then when you ran the >> > program, when it came to the trace statement, it would begin an >> > interactive trace at that point as described above. (You would have to >> > start your program from the command line for this to make sense.) >> > >> > Has something like this ever come up before? Is there a way to do this today? >> > >> > >> > >> How about: >> >> import pdb; pdb.set_trace() >> >> I use that for debugging all the time... > > I just learned about this one, which is also sometimes useful (when you > _don't_ want the interactive prompt, you just want to see the sequence > of execution): > > python -m trace -t > > -- > R. David Murray           http://www.bitdance.com > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Gerald Britton From rdmurray at bitdance.com Fri Mar 20 19:02:50 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 20 Mar 2009 14:02:50 -0400 (EDT) Subject: [Python-ideas] Interactive trace function In-Reply-To: <5d1a32000903201053y752849edw96397d9e74b7dba1@mail.gmail.com> References: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> <49C3A78F.3040707@improva.dk> <5d1a32000903201053y752849edw96397d9e74b7dba1@mail.gmail.com> Message-ID: On Fri, 20 Mar 2009 at 13:53, Gerald Britton wrote: > On Fri, Mar 20, 2009 at 1:02 PM, R. David Murray wrote: >> I just learned about this one, which is also sometimes useful (when you >> _don't_ want the interactive prompt, you just want to see the sequence >> of execution): >> >> python -m trace -t > > Thanks for the tip! btw, did you get the --ignore-dir option to work? > I'm trying to run this on a large system (hundreds of modules) and > ignore most of the modules and packages to focus on just a few.
> However, it processes all of them anyway [top posting fixed for clarity] Haven't tried that option, I've only used it for small programs so far. -- R. David Murray http://www.bitdance.com From josiah.carlson at gmail.com Fri Mar 20 20:09:11 2009 From: josiah.carlson at gmail.com (Josiah Carlson) Date: Fri, 20 Mar 2009 12:09:11 -0700 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack In-Reply-To: References: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> Message-ID: iwinslice() is just as bad of a name as any of the others. I have seen the equivalent of window(iterator, size=2, step=1), which works as you would expect (both as the output, as well as the implementation), with size and step both limited to 5 (because if you are doing things with more than 5 items at a time...you probably really want something else, and in certain cases, you can use multiple window calls to compose larger groups). I'd be a -0 on the feature, because as Raymond says, it's trivial to implement with a deque. And as I've said before, not all x-line functions should be built-in. - Josiah On Fri, Mar 20, 2009 at 10:32 AM, paul bedaride wrote: > Now I discover itertools I think you are right, but maybe the pack > function could be > renamed iwinslice (at the end it's its real name), and add it to itertools ?? > > paul bedaride > > On Fri, Mar 20, 2009 at 6:01 PM, Raymond Hettinger wrote: >> >>>> I propose a new function for list for pack values of a list and >>>> sliding over them: >>>> >>>> then we can do things like this: >>>> for i, j, k in pack(range(10), 3, partialend=False): >>>> print i, j, k >> >> . . .
>>>> >>>> def pack(l, size=2, slide=2, partialend=True): >>>> lenght = len(l) >>>> for p in range(0,lenght-size,slide): >>>> def packet(): >>>> for i in range(size): >>>> yield l[p+i] >>>> yield packet() >>>> p = p + slide >>>> if partialend or lenght-p == size: >>>> def packet(): >>>> for i in range(lenght-p): >>>> yield l[p+i] >>>> yield packet() >> >> This has been discussed before and rejected. >> >> There were several considerations. The itertools recipes already include >> simple patterns for grouper() and pairwise() that are easy >> to use as primitives in your code or to serve as models for variants. >> The design of pack() itself is questionable. It attempts to be a Swiss Army >> Knife by parameterizing all possible variations >> (length of window, length to slide, and how to handle end-cases). >> This design makes the tool harder to learn and use, and it makes >> the implementation more complex. >> That complexity isn't necessary. Use cases would typically fall >> into grouper cases where the window length equals the slide >> length or into cases that slide one element at a time. You don't >> win anything by combining the two cases except for making >> the tool harder to learn and use. >> >> The pairwise() recipe could be generalized to larger windows, >> but seemed like less of a good idea after closely examining potential >> use cases. For cases that used a larger window, there always >> seemed to be a better solution than extending pairwise(). For >> instance, a twenty-day moving average is better implemented with >> a deque(maxlen=20) and a running sum than with an iterator >> returning tuples of length twenty -- that approach does a lot of >> unnecessary work shifting elements in the tuple, turning an >> O(n) process into an O(m*n) process.
>> >> For short windows, like pairwise() itself, the issue is not one of >> total running time; instead, the problem is that almost every proposed use >> case was better coded as a simple Python loop, >> saving the previous values with a step like: oldvalue = value. >> Having pairwise() or tripletwise() tended to be a distraction away >> from better solutions. Also, the pure python approach was more >> general as it allowed accumulations: total += value. >> While your proposed function has been re-invented a number of >> times, that doesn't mean it's a good idea. It is more an exercise >> in what can be done, not in what should be done. >> >> >> Raymond >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From g.brandl at gmx.net Fri Mar 20 20:14:43 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 20 Mar 2009 20:14:43 +0100 Subject: [Python-ideas] A read-only, dict-like optparse.Value In-Reply-To: <49C3AA6F.3080908@vector-seven.com> References: <49C3AA6F.3080908@vector-seven.com> Message-ID: Thomas Lee schrieb: > Hi folks, > > Would anybody support the idea of read-only dict-like behaviour of > "options" for the following code: > > ==== > > from optparse import OptionParser > parser = OptionParser() > parser.add_option("--host", dest="host", default="localhost") > parser.add_option("--port", dest="port", default=1234) > parser.add_option("--path", dest="path", default="/tmp") > options, args = parser.parse_args() > > ==== > > As it is, you have to "know" what possible attributes are present on the > options (effectively the set of "dest" attributes) -- I often implement > something like the following because recently I've had to use command > line options in a bunch of format strings: > > def make_options_dict(options): > known_options = ("host", "port", "path") > return dict(zip(known_options, [getattr(options, attr) for attr in >
known_options])) Perhaps options_dict = vars(options) already does what you want? optparse seems to set nonpresent attributes for options to None. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From jh at improva.dk Fri Mar 20 20:32:46 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 20:32:46 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C36306.4040002@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> Message-ID: <49C3EF5E.1050807@improva.dk> Jacob Holm wrote: > The "GeneratorExit", I expect due to the description of close in PEP 342: > > def close(self): > try: > self.throw(GeneratorExit) > except (GeneratorExit, StopIteration): > pass > else: > raise RuntimeError("generator ignored GeneratorExit") > > When the generator is closed (due to the del g lines in the example), > this says to throw a GeneratorExit and handle the result. If we do > this manually, the throw will be delegated to the iterator, which will > print the "Throw: (,)" message. > It turns out I was wrong about the GeneratorExit. What I missed is that starting from 2.6, GeneratorExit no longer subclasses Exception, and so it wouldn't be thrown at the iterator. So move along, nothing to see here ... 
:) - Jacob From fredrik.johansson at gmail.com Fri Mar 20 21:03:55 2009 From: fredrik.johansson at gmail.com (Fredrik Johansson) Date: Fri, 20 Mar 2009 21:03:55 +0100 Subject: [Python-ideas] Builtin test function In-Reply-To: <200903192148.54461.steve@pearwood.info> References: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> <200903192148.54461.steve@pearwood.info> Message-ID: <3d0cebfb0903201303q12b27e2atc5597a4b012286ee@mail.gmail.com> On Thu, Mar 19, 2009 at 11:48 AM, Steven D'Aprano wrote: > On Thu, 19 Mar 2009 08:59:08 pm Fredrik Johansson wrote: >> There's been some discussion about automatic test discovery lately. >> Here's a random (not in any way thought through) idea: add a builtin >> function test() that runs tests associated with a given function, >> class, module, or object. > > Improved testing is always welcome, but why a built-in? > > I know testing is important, but is it so common and important that we > need it at our fingertips, so to speak, and can't even import a module > first before running tests? What's the benefit to making it a built-in > instead of part of a test module? It would just be a convenience, and I'm just throwing the idea out. The advantage would be a uniform and very simple interface for testing any module, without having to know whether I should import doctest, unittest or something else (and having to remember the commands used by each framework). It would certainly not be a replacement for more advanced test frameworks. Fredrik From rdmurray at bitdance.com Fri Mar 20 22:02:33 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 20 Mar 2009 21:02:33 +0000 (UTC) Subject: [Python-ideas] file.read() doesn't read the whole file References: <676ac298-0a44-4820-80dd-166a0363d45f@y38g2000prg.googlegroups.com> Message-ID: Sreejith K wrote: > I'm using the above code in a python-fuse's file class's read > function. The offset and length are 0 and 4096 respectively for my > test inputs.
When I open a file and read the 4096 bytes from offset, > only a few lines are printed, not the whole file. Actually the file is > only a few bytes. But when I tried reading from the Interactive mode > of python it gave the whole file. > > Is there any problem using read() method in fuse-python ? > > Also statements like break and continue behave weirdly in fuse > functions. Any help is appreciated.... If you think break and continue are behaving differently in a python program that is providing a fuse filesystem implementation, then your understanding of what your code is doing is faulty in some fashion. The fact that python is being called from fuse isn't going to change the semantics of the language. So I think you need to do some debugging to understand what's actually going on when your code gets called. As someone else suggested, if you are perceiving that the data read is short because of what you see at the os level when reading the data from the fuse-plus-your-application filesystem, that is after your python code returns the data to the fuse infrastructure, then that is probably where your problem is and not in the python read itself. (I'm assuming here that the read in question is taking place in a python method called from fuse and is reading real data from a real file...if that assumption is wrong and you are actually reading from a file _provided through_ fuse, then you need to look to your fuse file system implementation for answers.) -- R. David Murray http://www.bitdance.com From tom at vector-seven.com Fri Mar 20 23:21:42 2009 From: tom at vector-seven.com (Thomas Lee) Date: Sat, 21 Mar 2009 09:21:42 +1100 Subject: [Python-ideas] A read-only, dict-like optparse.Value In-Reply-To: References: <49C3AA6F.3080908@vector-seven.com> Message-ID: <49C416F6.3040009@vector-seven.com> Thanks Georg, this is pretty much exactly what I was looking for. Somehow I had never heard of the vars builtin before!
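(Editorial sketch: the vars() trick Georg describes works on any object with a __dict__, which is what optparse's option containers are. The stand-in class below is hypothetical, used so the snippet does not depend on optparse or command-line parsing; it is written in modern Python 3 syntax.)

```python
class Values:
    """Stand-in for optparse's option container: options are plain attributes."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

options = Values(host="localhost", port=1234, path="/tmp")

# vars(obj) returns obj.__dict__, so no hand-maintained list of
# "dest" names is needed to feed the options into format strings.
options_dict = vars(options)
print("{host}:{port}{path}".format(**options_dict))
```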
Regards, Tom Georg Brandl wrote: > Thomas Lee schrieb: > >> Hi folks, >> >> Would anybody support the idea of read-only dict-like behaviour of >> "options" for the following code: >> >> ==== >> >> from optparse import OptionParser >> parser = OptionParser() >> parser.add_option("--host", dest="host", default="localhost") >> parser.add_option("--port", dest="port", default=1234) >> parser.add_option("--path", dest="path", default="/tmp") >> options, args = parser.parse_args() >> >> ==== >> >> As it is, you have to "know" what possible attributes are present on the >> options (effectively the set of "dest" attributes) -- I often implement >> something like the following because recently I've had to use command >> line options in a bunch of format strings: >> >> def make_options_dict(options): >> known_options = ("host", "port", "path") >> return dict(zip(known_options, [getattr(options, attr) for attr in >> known_options])) >> > > Perhaps > > options_dict = vars(options) > > already does what you want? optparse seems to set nonpresent attributes > for options to None. > > Georg > > From greg.ewing at canterbury.ac.nz Sat Mar 21 01:35:02 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Mar 2009 12:35:02 +1200 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C3EF5E.1050807@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> Message-ID: <49C43636.9080402@canterbury.ac.nz> > Jacob Holm wrote: > >> The "GeneratorExit", I expect due to the description of close in PEP 342: >> >> def close(self): >> try: >> self.throw(GeneratorExit) >> except (GeneratorExit, StopIteration): >> pass >> else: >> raise RuntimeError("generator ignored GeneratorExit") Hmmm...
well, my PEP kind of supersedes that when a yield-from is in effect, because it specifies that the subiterator is finalized first by attempting to call its 'close' method, not by throwing GeneratorExit into it. After that, GeneratorExit is used to finalize the delegating generator. The reasoning is that GeneratorExit is an implementation detail of generators, not something iterators in general should be expected to deal with. -- Greg From jh at improva.dk Sat Mar 21 02:04:05 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 21 Mar 2009 02:04:05 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C43636.9080402@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> Message-ID: <49C43D05.3010903@improva.dk> Greg Ewing wrote: >> Jacob Holm wrote: >> >>> The "GeneratorExit", I expect due to the description of close in PEP >>> 342: >>> >>> def close(self): >>> try: >>> self.throw(GeneratorExit) >>> except (GeneratorExit, StopIteration): >>> pass >>> else: >>> raise RuntimeError("generator ignored GeneratorExit") > > Hmmm... well, my PEP kind of supersedes that when a yield-from > is in effect, because it specifies that the subiterator is > finalized first by attempting to call its 'close' method, not > by throwing GeneratorExit into it. After that, GeneratorExit is > used to finalize the delegating generator. > > The reasoning is that GeneratorExit is an implementation > detail of generators, not something iterators in general should > be expected to deal with. > As already mentioned in another mail to this list (maybe you missed it?), the expansion in your PEP actually has the behaviour you expect for the GeneratorExit example because GeneratorExit doesn't inherit from Exception. 
No need to redefine anything here. Your patch is right, I was wrong, end of story... The other mismatch, concerning the missing "close" calls to the iterator, I still believe to be an issue. It is debatable whether the issue is mostly with the PEP or the implementation, but they don't match up as it is... - Jacob From greg.ewing at canterbury.ac.nz Sat Mar 21 02:44:20 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Mar 2009 13:44:20 +1200 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C43D05.3010903@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> Message-ID: <49C44674.5030107@canterbury.ac.nz> Jacob Holm wrote: > the expansion in your PEP actually has the behaviour you expect > for the GeneratorExit example because GeneratorExit doesn't inherit from > Exception. That's an accident, though, and it's possible I should have specified BaseException there. I still consider the explanation I gave to be the true one. > The other mismatch, concerning the missing "close" calls to the > iterator, I still believe to be an issue. Can you elaborate on that? I thought at first you were expecting the implicit close of the generator that happens before it's deallocated to be passed on to the subiterator, but some of your examples seem to have the close happening *before* the del gen, so I'm confused.
-- Greg From ncoghlan at gmail.com Sat Mar 21 05:35:12 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 21 Mar 2009 14:35:12 +1000 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack In-Reply-To: References: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> Message-ID: <49C46E80.3060808@gmail.com> Josiah Carlson wrote: > iwinslice() is just as bad of a name as any of the others. > > I have seen the equivalent of window(iterator, size=2, step=1), which > works as you would expect (both as the output, as well as the > implementation), with size and step both limited to 5 (because if you > are doing things with more than 5 items at a time...you probably > really want something else, and in certain cases, you can use multiple > window calls to compose larger groups). Oops, I didn't realise this thread had moved over here, so I just repeated what yourself and Raymond said over on python-dev. Oh well... > I'd be a -0 on the feature, because as Raymond says, it's trivial to > implement with a deque. And as I've said before, not all x line > functions should be built-in. That does raise the possibility of adding "iterator windowing done right" by including a deque based implementation in itertools (or at least in the itertools recipes page). For example, the following continuously yields the same deque, but each time the contents represent a new window onto the underlying data:

from collections import deque
from itertools import islice

def window(iterable, size=2, step=1, overlap=0):
    itr = iter(iterable)
    new_per_window = size - overlap
    contents = deque(islice(itr, 0, size*step, step), size)
    while True:
        yield contents
        new_data = list(islice(itr, 0, new_per_window*step, step))
        if len(new_data) < new_per_window:
            break
        contents.extend(new_data)

(There are other ways of doing it that involve less data copying, but the above way seems to be the most straightforward) Cheers, Nick.
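(Editorial sketch: one subtlety of Nick's generator is that it yields the *same* deque object on every step, so a caller that wants to keep windows around must copy them. The check below restates the function from the message so the snippet is self-contained, adds the itertools import it relies on, and is written in Python 3 syntax.)

```python
from collections import deque
from itertools import islice

def window(iterable, size=2, step=1, overlap=0):
    # Same algorithm as in the message above: one bounded deque is
    # reused as the window; (size - overlap) new items arrive per step.
    itr = iter(iterable)
    new_per_window = size - overlap
    contents = deque(islice(itr, 0, size*step, step), size)
    while True:
        yield contents
        new_data = list(islice(itr, 0, new_per_window*step, step))
        if len(new_data) < new_per_window:
            break
        contents.extend(new_data)

# Copy each window with tuple(w) as it is produced:
pairs = [tuple(w) for w in window(range(5), size=2, overlap=1)]   # pairwise
groups = [tuple(w) for w in window(range(6), size=2, overlap=0)]  # grouper
```

With overlap=1 this reproduces the pairwise() behaviour, and with overlap=0 the grouper() behaviour, which is exactly the unification Raymond argued against earlier in the thread.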
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jh at improva.dk Sat Mar 21 12:58:51 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 21 Mar 2009 12:58:51 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C44674.5030107@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> Message-ID: <49C4D67B.4010109@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> the expansion in your PEP actually has the behaviour you expect for >> the GeneratorExit example because GeneratorExit doesn't inherit from >> Exception. > > That's an accident, though, and it's possible I should have > specified BaseException there. I still consider the explanation > I gave to be the true one. In that case, I think a clarification in the PEP would be in order. I like the fact that the PEP-342 description of close does the right thing though. If you want BaseException instead of Exception in the PEP, maybe you could replace the:

except Exception, _e:

line with:

except GeneratorExit:
    raise
except BaseException, _e:

This would make it clearer that the behavior of close is intentional, and would still allow delegating the throw of any exception not inheriting from GeneratorExit to the subiterator. > >> The other mismatch, concerning the missing "close" calls to the >> iterator, I still believe to be an issue. > > Can you elaborate on that? I thought at first you were expecting > the implicit close of the generator that happens before it's > deallocated to be passed on to the subiterator, but some of your > examples seem to have the close happening *before* the del gen, > so I'm confused.
>
Yes, I can see that the use of implicit close in that example was a mistake, and that I should have added a few more output lines to clarify the intent. The close is definitely intended to happen before the del in the examples. I have a better example here, with inline comments explaining what I think should happen at critical points (and why):

class iterator(object):
    """Simple iterator that counts to n while writing what is done to it"""
    def __init__(self, n):
        self.ctr = iter(xrange(n))
    def __iter__(self):
        return self
    def close(self):
        print "Close"
    def next(self):
        print "Next"
        return self.ctr.next()
    # no send method!
    # no throw method!

def generator(n):
    try:
        print "Getting first value from iterator"
        result = yield from iterator(n)
        print "Iterator returned", result
    finally:
        print "Generator closing"

g = generator(1)
g.next()
try:
    print "Calling g.next()"
    # This causes a StopIteration in iterator.next(). After grabbing
    # the value in the "except StopIteration" clause of the PEP
    # expansion, the "finally" clause calls iterator.close(). Any
    # other exception raised by next (or by send or throw if the
    # iterator had those) would also be handled by the finally
    # clause. For well-behaved iterators, these calls to close would
    # be no-ops, but they are still required by the PEP as written.
    g.next()
except Exception, e:
    print type(e)
else:
    print 'No exception'
# This close should be a no-op. The exception we just caught should
# have already closed the generator.
g.close()
print '--'

g = generator(1)
g.next()
try:
    print "Calling g.send(42)"
    # This causes an AttributeError when looking up the "send" method.
    # The finally clause from the PEP expansion makes sure
    # iterator.close() is called. This call is *not* expected to be a
    # no-op.
    g.send(42)
except Exception, e:
    print type(e)
else:
    print 'No exception'
# This close should be a no-op. The exception we just caught should
# have already closed the generator.
g.close()
print '--'

g = generator(1)
g.next()
try:
    print "Calling g.throw(ValueError)"
    # Since iterator does not have a "throw" method, the ValueError is
    # raised directly in the yield-from expansion in the generator.
    # The finally clause ensures that iterator.close() is called.
    # This call is *not* expected to be a no-op.
    g.throw(ValueError)
except Exception, e:
    print type(e)
else:
    print 'No exception'
# This close should be a no-op. The exception we just caught should
# have already closed the generator.
g.close()
print '--'

g = generator(1)
g.next()
try:
    print "Calling g.throw(StopIteration(42))"
    # The iterator still does not have a "throw" method, so the
    # StopIteration is raised directly in the yield-from expansion.
    # Then the exception is caught and converted to a value for the
    # yield-from expression. Before the generator sees the value, the
    # finally clause makes sure that iterator.close() is called. This
    # call is *not* expected to be a no-op.
    g.throw(StopIteration(42))
except Exception, e:
    print type(e)
else:
    print 'No exception'
# This close should be a no-op. The exception we just caught should
# have already closed the generator.
g.close()
print '--'

There are really four examples here. The first one is essentially the same as last time, I just expanded the output a bit. The next two examples are corner cases where the missing close makes a real difference, even for well-behaved iterators (this is not the case in the first example). The fourth example catches a bug in the current version of my patch, and shows a potentially interesting use of an iterator without a send method in a yield-from expression. The issue I have with your patch is that iterator.close() is not called in any of the four examples, even though my reading of the PEP suggests it should be. (I have confirmed that my reading matches the PEP by manually replacing the yield-from in the generator with the expansion from the PEP, just to be sure...)
The expected output is:

Getting first value from iterator
Next
Calling g.next()
Next
Close
Iterator returned None
Generator closing
--
Getting first value from iterator
Next
Calling g.send(42)
Close
Generator closing
--
Getting first value from iterator
Next
Calling g.throw(ValueError)
Close
Generator closing
--
Getting first value from iterator
Next
Calling g.throw(StopIteration(42))
Close
Iterator returned 42
Generator closing
--

From greg.ewing at canterbury.ac.nz Sat Mar 21 22:43:54 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Mar 2009 09:43:54 +1200 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C4D67B.4010109@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> Message-ID: <49C55F9A.6070305@canterbury.ac.nz> Jacob Holm wrote: > # This causes a StopIteration in iterator.next(). After grabbing > # the value in the "except StopIteration" clause of the PEP > # expansion, the "finally" clause calls iterator.close(). Okay, I see what you mean now. That's a bug in the expansion. Once an iterator has raised StopIteration, it has presumably already finalized itself, so calling its close() method shouldn't be necessary, and I hadn't intended that it should be called in that case. I'll update the PEP accordingly.
-- Greg From greg.ewing at canterbury.ac.nz Sun Mar 22 00:15:56 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Mar 2009 11:15:56 +1200 Subject: [Python-ideas] Yield-From: Revamped expansion In-Reply-To: <49C55F9A.6070305@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> Message-ID: <49C5752C.2080704@canterbury.ac.nz> I'm thinking about replacing the expansion with the following, which hopefully fixes a couple of concerns that were raised recently without breaking anything else. Can anyone see any remaining ways in which it doesn't match the textual description in the Proposal section? (It still isn't *quite* right, because it doesn't distinguish between a GeneratorExit explicitly thrown in and one resulting from calling close() on the delegating generator. I may need to revise the text and/or my implementation on that point, because I want the inline-expansion interpretation to hold.) 
_i = iter(EXPR)
try:
    _u = _i.next()
except StopIteration, _e:
    _r = _e.value
else:
    while 1:
        try:
            _v = yield _u
        except GeneratorExit:
            _m = getattr(_i, 'close', None)
            if _m is not None:
                _m()
            raise
        except BaseException, _e:
            _m = getattr(_i, 'throw', None)
            if _m is not None:
                _u = _m(_e)
            else:
                raise
        else:
            try:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
            except StopIteration, _e:
                _r = _e.value
                break
RESULT = _r

-- Greg From ncoghlan at gmail.com Sun Mar 22 00:22:00 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 22 Mar 2009 09:22:00 +1000 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C55F9A.6070305@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> Message-ID: <49C57698.7030808@gmail.com> Greg Ewing wrote: > Jacob Holm wrote: > >> # This causes a StopIteration in iterator.next(). After grabbing >> # the value in the "except StopIteration" clause of the PEP >> # expansion, the "finally" clause calls iterator.close(). > > Okay, I see what you mean now. That's a bug in the expansion. > Once an iterator has raised StopIteration, it has presumably > already finalized itself, so calling its close() method > shouldn't be necessary, and I hadn't intended that it should > be called in that case. close() *should* still be called in that case - the current expansion in the PEP is correct. It is the *iterator's* job to make sure that multiple calls to close() (or calling close() on a finished iterator) don't cause problems. The syntax shouldn't be trying to second guess whether or not calling close() is necessary or not - it should just be calling it, period.
>>> def gen():
...     yield 1
...
>>> g = gen()
>>> g.next()
1
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> g.close()
>>> g.close()
>>> g2 = gen()
>>> g.close()
>>> g.close()
>>> g3 = gen()
>>> g3.next()
1
>>> g.close()
>>> g.close()

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 22 00:28:27 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 22 Mar 2009 09:28:27 +1000 Subject: [Python-ideas] Yield-From: Revamped expansion In-Reply-To: <49C5752C.2080704@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C5752C.2080704@canterbury.ac.nz> Message-ID: <49C5781B.30102@gmail.com> Greg Ewing wrote: > I'm thinking about replacing the expansion with the > following, which hopefully fixes a couple of concerns > that were raised recently without breaking anything else. > > Can anyone see any remaining ways in which it doesn't > match the textual description in the Proposal section? > > (It still isn't *quite* right, because it doesn't > distinguish between a GeneratorExit explicitly thrown > in and one resulting from calling close() on the > delegating generator. I may need to revise the text > and/or my implementation on that point, because I want > the inline-expansion interpretation to hold.)
>
> _i = iter(EXPR)
> try:
>     _u = _i.next()
> except StopIteration, _e:
>     _r = _e.value
> else:
>     while 1:
>         try:
>             _v = yield _u
>         except GeneratorExit:
>             _m = getattr(_i, 'close', None)
>             if _m is not None:
>                 _m()
>             raise
>         except BaseException, _e:
>             _m = getattr(_i, 'throw', None)
>             if _m is not None:
>                 _u = _m(_e)
>             else:
>                 raise
>         else:
>             try:
>                 if _v is None:
>                     _u = _i.next()
>                 else:
>                     _u = _i.send(_v)
>             except StopIteration, _e:
>                 _r = _e.value
>                 break
> RESULT = _r
>

I'd adjust the inner exception handlers to exploit the fact that
SystemExit and GeneratorExit don't inherit from Exception:

_i = iter(EXPR)
try:
    _u = _i.next()
except StopIteration, _e:
    _r = _e.value
else:
    while 1:
        try:
            _v = yield _u
        except Exception, _e:
            _m = getattr(_i, 'throw', None)
            if _m is not None:
                _u = _m(_e)
            else:
                raise
        except:
            # Covers SystemExit, GeneratorExit and
            # anything else that doesn't inherit
            # from Exception
            _m = getattr(_i, 'close', None)
            if _m is not None:
                _m()
            raise
        else:
            try:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
            except StopIteration, _e:
                _r = _e.value
                break
RESULT = _r

I think Antoine and PJE are right that the PEP needs some more actual
use cases though.

Cheers,
Nick.
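The split Nick proposes hinges on which handler an exception falls into. That routing can be checked in isolation (Python 3 syntax; the helper name is invented for illustration, and `except BaseException` stands in for the bare `except:` clause above):

```python
def handler_for(exc):
    """Report which branch of the proposed expansion would see exc."""
    try:
        raise exc
    except Exception:
        return "forwarded to subiterator via throw()"
    except BaseException:
        # stands in for the bare 'except:' clause in the expansion
        return "subiterator close()d, exception re-raised"

# Ordinary exceptions go to the subiterator; terminal ones do not.
assert handler_for(ValueError("boom")) == "forwarded to subiterator via throw()"
assert handler_for(GeneratorExit()) == "subiterator close()d, exception re-raised"
assert handler_for(KeyboardInterrupt()) == "subiterator close()d, exception re-raised"
assert handler_for(SystemExit()) == "subiterator close()d, exception re-raised"
```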
-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From greg.ewing at canterbury.ac.nz  Sun Mar 22 09:15:42 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Mar 2009 20:15:42 +1200
Subject: [Python-ideas] Revised**7 PEP on Yield-From
In-Reply-To: <49C57698.7030808@gmail.com>
Message-ID: <49C5F3AE.4060402@canterbury.ac.nz>

Nick Coghlan wrote:
> The syntax shouldn't be trying to second-guess
> whether or not calling close() is necessary - it should just be
> calling it, period.

But *why* should it be called? Just as calling close() after the
iterator has finished shouldn't do any harm, *not* doing so shouldn't
do any harm either, and some implementation strategies (my current one
included) would have to go out of their way to call close() in that
case.
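Greg's premise - that an iterator which has raised StopIteration has already finalized itself - is easy to see with a generator (a sketch in modern Python 3 syntax):

```python
log = []

def gen():
    try:
        yield 1
    finally:
        log.append("cleaned up")   # runs as the generator finishes

g = gen()
assert list(g) == [1]              # exhausting it raises StopIteration...
assert log == ["cleaned up"]       # ...and it has already finalized itself
g.close()                          # a later close() is then a harmless no-op
assert log == ["cleaned up"]
```

So for generators specifically, both positions hold: the extra close() neither helps nor hurts, which is why the argument turns on non-generator iterators and implementation convenience.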
-- 
Greg

From greg.ewing at canterbury.ac.nz  Sun Mar 22 09:23:11 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Mar 2009 20:23:11 +1200
Subject: [Python-ideas] Yield-From: Revamped expansion
In-Reply-To: <49C5781B.30102@gmail.com>
Message-ID: <49C5F56F.7060001@canterbury.ac.nz>

Nick Coghlan wrote:
> I'd adjust the inner exception handlers to exploit the fact that
> SystemExit and GeneratorExit don't inherit from Exception:

But then anything thrown in that didn't inherit from Exception would
bypass giving the subiterator a chance to handle it, which doesn't seem
right.

The more I think about this, the more I'm wondering whether I shouldn't
ever try to call close() on the subiterator at all, and just rely on it
to finalize itself when it's deallocated. That would solve all problems
concerning when and if close() calls should be made (the answer would
be "never"). It would also avoid the problem of a partially exhausted
iterator that's still in use by something else getting prematurely
finalized, which is another thing that's been bothering me.

Here's another expansion based on that idea. When we've finished with
the subiterator for whatever reason -- it raised StopIteration,
something got thrown in, we got closed ourselves, etc. -- we simply
drop our reference to it. If that causes it to be deallocated, it's
responsible for cleaning itself up however it sees fit.
_i = iter(EXPR)
try:
    try:
        _u = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except BaseException, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration, _e:
                    _r = _e.value
                    break
finally:
    del _i
RESULT = _r

> I think Antoine and PJE are right that the PEP needs some more actual
> use cases though.

The examples I have are a bit big to put in the PEP itself, but I can
include some links.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sun Mar 22 10:26:08 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Mar 2009 21:26:08 +1200
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C5F3AE.4060402@canterbury.ac.nz>
Message-ID: <49C60430.7030108@canterbury.ac.nz>

I'm having trouble making up my mind how GeneratorExit
should be handled.

My feeling is that GeneratorExit is a peculiarity of
generators that other kinds of iterators shouldn't have
to know about. So, if you close() a generator, that
shouldn't imply throwing GeneratorExit into the
subiterator -- rather, the subiterator should simply
be dropped and then the delegating generator finalized
as usual.

If the subiterator happens to be another generator,
dropping the last reference to it will cause it to
be closed, in which case it will raise its own
GeneratorExit.
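The behaviour Greg is relying on here is CPython's reference counting: dropping the last reference to a suspended generator closes it, while an extra live reference (as with a shared subiterator) keeps it alive. A sketch (Python 3 syntax; deterministic only on CPython - other implementations may finalize later):

```python
log = []

def sub():
    try:
        yield 1
        yield 2
    finally:
        log.append("finalized")

g = sub()
alias = g          # a second reference, as with a shared subiterator
next(g)            # suspend the generator mid-body
del g              # CPython: another reference survives, nothing happens
assert log == []
del alias          # last reference dropped -> close() -> GeneratorExit
assert log == ["finalized"]
```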
Other kinds of iterators can finalize
themselves however they see fit, and don't need to
pretend they're generators and understand
GeneratorExit.

For consistency, this implies that a GeneratorExit
explicitly thrown in using throw() shouldn't be
forwarded to the subiterator either, even if it has
a throw() method.

To do otherwise would require making a distinction that
can't be expressed in the Python expansion. Also, it
seems elegant to preserve the property that if g is a
generator then g.close() and g.throw(GeneratorExit) are
exactly equivalent.

What do people think about this?

-- 
Greg

From ncoghlan at gmail.com  Sun Mar 22 12:55:50 2009
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 22 Mar 2009 21:55:50 +1000
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C60430.7030108@canterbury.ac.nz>
Message-ID: <49C62746.7080200@gmail.com>

Greg Ewing wrote:
> To do otherwise would require making a distinction that
> can't be expressed in the Python expansion. Also, it
> seems elegant to preserve the property that if g is a
> generator then g.close() and g.throw(GeneratorExit) are
> exactly equivalent.
>
> What do people think about this?

That whole question is why I suggested rephrasing the question of which
exceptions are passed to the subiterator in Exception vs BaseException
terms. The only acknowledged direct subclasses of BaseException are
KeyboardInterrupt, SystemExit and GeneratorExit.
The purpose of those exceptions is to say "drop what you're doing and
bail out any which way you can". Terminating the outermost generator in
those cases and letting the subiterators clean up as best they can
sounds like a perfectly reasonable option to me. The alternative is to
catch BaseException and throw absolutely everything (including
GeneratorExit) into the subiterator. The in-between options that you're
describing would appear to just complicate the semantics to no great
purpose.

Note that you may also be pursuing a false consistency here, since
g.close() has never been equivalent to g.throw(GeneratorExit), as the
latter propagates the exception back into the current scope while the
former suppresses it (example was run using 2.5.2):

>>> def gen(): yield
...
>>> g = gen()
>>> g.next()
>>> g.close()
>>> g2 = gen()
>>> g2.next()
>>> g2.throw(GeneratorExit)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in gen
GeneratorExit

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From jh at improva.dk  Sun Mar 22 13:42:36 2009
From: jh at improva.dk (Jacob Holm)
Date: Sun, 22 Mar 2009 13:42:36 +0100
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C62746.7080200@gmail.com>
Message-ID: <49C6323C.1020400@improva.dk>

Nick Coghlan wrote:
> Greg Ewing wrote:
>
>> To do otherwise would require making a distinction that
>> can't be expressed in the Python expansion. Also, it
>> seems elegant to preserve the property that if g is a
>> generator then g.close() and g.throw(GeneratorExit) are
>> exactly equivalent.
>>
>> What do people think about this?
>
> That whole question is why I suggested rephrasing the question of which
> exceptions are passed to the subiterator in Exception vs BaseException
> terms. The only acknowledged direct subclasses of BaseException are
> KeyboardInterrupt, SystemExit and GeneratorExit. The purpose of those
> exceptions is to say "drop what you're doing and bail out any which way
> you can". Terminating the outermost generator in those cases and letting
> the subiterators clean up as best they can sounds like a perfectly
> reasonable option to me. The alternative is to catch BaseException and
> throw absolutely everything (including GeneratorExit) into the
> subiterator. The in-between options that you're describing would appear
> to just complicate the semantics to no great purpose.

Well, since GeneratorExit is specifically about generators, I don't see
a problem in special-casing that one and just let everything else be
thrown at the subgenerator.
I would also be OK with just throwing everything (including
GeneratorExit) there, as that makes the implementation of throw a bit
simpler.

> Note that you may also be pursuing a false consistency here, since
> g.close() has never been equivalent to g.throw(GeneratorExit), as the
> latter propagates the exception back into the current scope while the
> former suppresses it (example was run using 2.5.2):

I believe that the "exact equivalence" Greg was talking about is the
description of close from PEP 342. It is nice that the semantics of
close can be described so easily in terms of throw.

I like the idea of not having an explicit close in the expansion at
all. In most cases the refcounting will take care of it anyway (at
least in CPython), and when there are multiple references you might
actually want to not close. Code that needs it can add the explicit
close themselves by putting the yield-from in a try...finally or a
with... block.

- Jacob

From jh at improva.dk  Sun Mar 22 15:35:46 2009
From: jh at improva.dk (Jacob Holm)
Date: Sun, 22 Mar 2009 15:35:46 +0100
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C60430.7030108@canterbury.ac.nz>
Message-ID: <49C64CC2.1050608@improva.dk>

Greg Ewing wrote:
> I'm having trouble making up my mind how GeneratorExit
> should be handled.
>
> My feeling is that GeneratorExit is a peculiarity of
> generators that other kinds of iterators shouldn't have
> to know about.

They don't, see below.
> So, if you close() a generator, that
> shouldn't imply throwing GeneratorExit into the
> subiterator -- rather, the subiterator should simply
> be dropped and then the delegating generator finalized
> as usual.
>
> If the subiterator happens to be another generator,
> dropping the last reference to it will cause it to
> be closed, in which case it will raise its own
> GeneratorExit.

This is only true in CPython, but that shouldn't be a problem. If you
really need the subiterator to be closed at that point, wrapping the
yield-from in the appropriate try...finally... or with... block will do
the trick.

> Other kinds of iterators can finalize
> themselves however they see fit, and don't need to
> pretend they're generators and understand
> GeneratorExit.

They don't have to understand GeneratorExit at all. As long as they
know how to clean up after themselves when thrown an exception they
cannot handle, things will just work. GeneratorExit is no different
from SystemExit or KeyboardInterrupt in that regard.

> For consistency, this implies that a GeneratorExit
> explicitly thrown in using throw() shouldn't be
> forwarded to the subiterator either, even if it has
> a throw() method.

I agree that if close() doesn't throw the GeneratorExit to the
subiterator, then throw() shouldn't either.

> To do otherwise would require making a distinction that
> can't be expressed in the Python expansion. Also, it
> seems elegant to preserve the property that if g is a
> generator then g.close() and g.throw(GeneratorExit) are
> exactly equivalent.

Not exactly equivalent, but related in the simple way described in
PEP 342.

> What do people think about this?
>

If I understand you correctly, what you want can be described by the
following expansion:

_i = iter(EXPR)
try:
    _u = _i.next()
    while 1:
        try:
            _v = yield _u
        except GeneratorExit:
            raise
        except BaseException, _e:
            _m = getattr(_i, 'throw', None)
            if _m is not None:
                _u = _m(_e)
            else:
                raise
        else:
            if _v is None:
                _u = _i.next()
            else:
                _u = _i.send(_v)
except StopIteration, _e:
    RESULT = _e.value
finally:
    _i = _u = _v = _e = _m = None
    del _i, _u, _v, _e, _m

(except for minor details like the possible method caching).

I like this version because it makes it easier to share subiterators if
you need to. The explicit close in the earlier proposals meant that as
soon as one generator delegating to the shared iterator was closed, the
shared one would be as well. No, I don't have a concrete use case for
this, but I think it is the least surprising behavior we could choose
for closing shared subiterators. As mentioned above, you can still
explicitly request that the subiterator be closed with the delegating
generator by wrapping the yield-from in a try...finally... or with...
block.

If I understand Nick correctly, he would like to drop the "except
GeneratorExit: raise" part, and possibly change BaseException to
Exception. I don't like the idea of just dropping the "except
GeneratorExit: raise", as that brings us back to the situation where
shared subiterators are less useful. If we also change BaseException to
Exception, the only difference is that it will no longer be possible to
throw exceptions like SystemExit and KeyboardInterrupt that don't
inherit from Exception to a subiterator. Again, I don't have a concrete
use case, but I think putting an arbitrary restriction like that in a
language construct is a bad idea. One example where this would cause
surprises is if you split part of a generator function (that for one
reason or another needs to handle these exceptions) into a separate
generator and call it using yield from.
Throwing an exception to the refactored generator could then have a
different meaning than before the refactoring, and there would be no
easy way to fix this.

Just my 2 cents...

- Jacob

From ncoghlan at gmail.com  Sun Mar 22 22:08:57 2009
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 Mar 2009 07:08:57 +1000
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C64CC2.1050608@improva.dk>
Message-ID: <49C6A8E9.4010705@gmail.com>

Jacob Holm wrote:
> If I understand Nick correctly, he would like to drop the "except
> GeneratorExit: raise" part, and possibly change BaseException to
> Exception. I don't like the idea of just dropping the "except
> GeneratorExit: raise", as that brings us back to the situation where
> shared subiterators are less useful. If we also change BaseException to
> Exception, the only difference is that it will no longer be possible to
> throw exceptions like SystemExit and KeyboardInterrupt that don't
> inherit from Exception to a subiterator.

Note that as of 2.6, GeneratorExit doesn't inherit from Exception either
- it now inherits directly from BaseException, just like the other two
terminal exceptions:

Python 2.6+ (trunk:66863M, Oct 9 2008, 21:32:59)
>>> BaseException.__subclasses__()
[<type 'exceptions.Exception'>, <type 'exceptions.GeneratorExit'>,
<type 'exceptions.SystemExit'>, <type 'exceptions.KeyboardInterrupt'>]

All I'm saying is that if GeneratorExit doesn't get passed down then
neither should SystemExit nor KeyboardInterrupt, while if the latter
two *do* get passed down, then so should GeneratorExit.
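The hierarchy Nick describes for 2.6 still holds in modern Python 3 and can be checked directly:

```python
# All three "terminal" exceptions sit beside Exception directly under
# BaseException, so an 'except Exception' clause lets them all through.
for exc in (KeyboardInterrupt, SystemExit, GeneratorExit):
    assert issubclass(exc, BaseException)
    assert not issubclass(exc, Exception)
```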
Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From guido at python.org  Sun Mar 22 23:31:15 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 22 Mar 2009 15:31:15 -0700
Subject: [Python-ideas] CapPython's use of unbound methods
In-Reply-To: <20090319.231249.343185657.mrs@localhost.localdomain>
References: <20090312.202410.846948621.mrs@localhost.localdomain> <20090319.231249.343185657.mrs@localhost.localdomain>
Message-ID:

On Thu, Mar 19, 2009 at 4:12 PM, Mark Seaborn wrote:
> Guido van Rossum wrote:
>
>> On Thu, Mar 12, 2009 at 1:24 PM, Mark Seaborn wrote:
>> > Suppose we have an object x with a private attribute, "_field",
>> > defined by a class Foo:
>> >
>> > class Foo(object):
>> >
>> >     def __init__(self):
>> >         self._field = "secret"
>> >
>> > x = Foo()
>>
>> Can you add some principals to this example? Who wrote the Foo class
>> definition? Does CapPython have access to the source code for Foo? To
>> the class object?
>
> OK, suppose we have two principals, Alice and Bob. Alice receives a
> string from Bob. Alice instantiates the string using CapPython's
> safe_eval() function, getting back a module object that contains a
> function object. Alice passes the function an object x. Alice's
> intention is that the function should not be able to get hold of the
> contents of x._field, no matter what string Bob supplies.
>
> To make this more concrete, this is what Alice executes, with
> source_from_bob defined in a string literal for the sake of example:
>
> source_from_bob = """
> class C:
>     def f(self):
>         return self._field
> def entry_point(x):
>     C.f(x) # potentially gets the secret object in Python 3.0
> """
>
> import safeeval
>
> secret = object()
>
> class Foo(object):
>     def __init__(self):
>         self._field = secret
>
> x = Foo()
> module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
> module.entry_point(x)
>
> In this example, Bob's code is not given access to the class object
> Foo. Furthermore, Bob should not be able to get access to the class
> Foo from the instance x. The type() builtin is not considered to be
> safe in CapPython so it is not included in the default environment.
>
> Bob's code is not given access to the source code for class Foo. But
> even if Bob is aware of Alice's source code, it should not affect
> whether Bob can get hold of the secret object.

OK, I think I understand all this, except I don't have much of an idea
of what subset of the language Bob is allowed to use.

> By the way, you can try out the example by getting the code from the
> Bazaar repository:
> bzr branch http://bazaar.launchpad.net/%7Emrs/cappython/trunk cappython

If you don't mind I will try to avoid downloading your source a little
longer.

>> > However, in Python 3.0, the CapPython code can do this:
>> >
>> > class C(object):
>> >
>> >     def f(self):
>> >         return self._field
>> >
>> > C.f(x) # returns "secret"
>> >
>> > Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is
>> > not being called on an instance of C.
>>
>> In Python 2.x I could write
>>
>> class C(Foo):
>>     def f(self):
>>         return self._field
>
> In the example above, Bob's code is not given access to Foo, so Bob
> cannot do this. But you are right, if Bob's code were passed Foo as
> well as x, Bob could do this.
>
> Suppose Alice wanted to give Bob access to class Foo, perhaps so that
> Bob could create derived classes. It is still possible for Alice to
> do that safely, if Alice defines Foo differently. Alice can pass the
> secret object to Foo's constructor instead of having the class
> definition get its reference to the secret object from an enclosing
> scope:
>
> class Foo(object):
>
>     def __init__(self, arg):
>         self._field = arg
>
> secret = object()
> x = Foo(secret)
> module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
> module.entry_point(x, Foo)
>
> Bob can create his own objects derived from Foo, but cannot use his
> access to Foo to break encapsulation of instance x. Foo is now
> authorityless, in the sense that it does not capture "secret" from its
> enclosing environment, unlike the previous definition.
>
>> or alternatively
>>
>> class C(x.__class__):
>>
>
> The verifier would reject x.__class__, so this is not possible.
>
>> > Guido said, "I don't understand where the function object f gets its
>> > magic powers".
>> >
>> > The answer is that function definitions directly inside class
>> > statements are treated specially by the verifier.
>>
>> Hm, this sounds like a major change in language semantics, and if I
>> were Sun I'd sue you for using the name "Python" in your product. :-)
>
> Damn, the makers of Typed Lambda Calculus had better watch out for
> legal action from the makers of Lambda Calculus(tm) too... :-) Is it
> really a major change in semantics if it's just a subset? ;-)

Well yes. The empty subset is also a subset. :-)

More seriously, IIUC you are disallowing all use of attribute names
starting with underscores, which not only invalidates most Python code
in practical use (though you might not care about that) but also
disallows the use of many features that are considered part of the
language, such as access to __dict__ and many other introspective
attributes.

> To some extent the verifier's check of only accessing private
> attributes through self is just checking a coding style that I already
> follow when writing Python code (except sometimes for writing test
> cases).

You might wish this to be true, but for most Python programmers, it
isn't. Introspection is a commonly-used part of the language (probably
more so than in Java).
So is the use of attribute names starting with a single underscore
outside the class tree, e.g. by "friend" functions.

> Of course some of the verifier's checks, such as only allowing
> attribute assignments through self, are a lot more draconian than
> coding style checks.

That also sounds like a rather serious hindrance to writing Python as
most people think of it.

>> > If you wrote the same function definition at the top level:
>> >
>> > def f(var):
>> >     return var._field # rejected
>> >
>> > the attribute access would be rejected by the verifier, because "var"
>> > is not a self variable, and private attributes may only be accessed
>> > through self variables.
>> >
>> > I renamed the variable in the example,
>>
>> What do you mean by this?
>
> I just mean that I applied alpha conversion.

BTW that's a new term for me. :-)

> def f(self):
>     return self._field
>
> is equivalent to
>
> def f(var):
>     return var._field

This equivalence is good.

> Whether these function definitions are accepted by the verifier
> depends on their context.

But this isn't. Are you saying that the verifier accepts the use of
self._foo in a method? That would make the scenario of potentially
passing a class defined by Alice into Bob's code much harder to verify
-- now suddenly Alice has to know about a lot of things before she can
be sure that she doesn't leave open a backdoor for Bob.

>> Do you also catch things like
>>
>> g = getattr
>> s = 'field'.replace('f', '_f')
>>
>> print g(x, s)
>>
>> ?
>
> The default environment doesn't provide the real getattr() function.
> It provides a wrapped version that rejects private attribute names.

Do you have a web page describing the precise list of limitations you
apply in your "subset" of Python? Does it support import of some form?
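A wrapped getattr() of the kind Mark describes might look like the following. This is a hedged sketch, not CapPython's actual code - the function name and error message are invented here:

```python
def safe_getattr(obj, name, *default):
    # Hypothetical stand-in for CapPython's wrapped getattr: reject any
    # attribute name that is private by convention before delegating.
    if name.startswith('_'):
        raise AttributeError("private attribute rejected: %r" % name)
    return getattr(obj, name, *default)

class Foo(object):
    def __init__(self, arg):
        self._field = arg
        self.label = "public"

x = Foo(object())
assert safe_getattr(x, "label") == "public"

# The 'field'.replace('f', '_f') trick from the quoted session is caught
# too, because the check runs on the runtime string value.
blocked = False
try:
    safe_getattr(x, "field".replace("f", "_f"))
except AttributeError:
    blocked = True
assert blocked
```

Because the check happens at call time on the actual string, it does not matter how the name was constructed, which is why wrapping getattr() (rather than only static verification) closes that particular hole.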
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jh at improva.dk  Mon Mar 23 01:07:12 2009
From: jh at improva.dk (Jacob Holm)
Date: Mon, 23 Mar 2009 01:07:12 +0100
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C6A8E9.4010705@gmail.com>
Message-ID: <49C6D2B0.9060405@improva.dk>

Hi Nick

Nick Coghlan wrote:
> Jacob Holm wrote:
>
>> If I understand Nick correctly, he would like to drop the "except
>> GeneratorExit: raise" part, and possibly change BaseException to
>> Exception. I don't like the idea of just dropping the "except
>> GeneratorExit: raise", as that brings us back to the situation where
>> shared subiterators are less useful. If we also change BaseException to
>> Exception, the only difference is that it will no longer be possible to
>> throw exceptions like SystemExit and KeyboardInterrupt that don't
>> inherit from Exception to a subiterator.
>
> Note that as of 2.6, GeneratorExit doesn't inherit from Exception either
> - it now inherits directly from BaseException, just like the other two
> terminal exceptions:

I know this.

> All I'm saying is that if GeneratorExit doesn't get passed down then
> neither should SystemExit nor KeyboardInterrupt, while if the latter two
> *do* get passed down, then so should GeneratorExit.

I also know this, and I disagree.
You are saying that because they have the thing in common that they do
*not* inherit from Exception, we should treat them the same. This is
like saying that anything that is not a shade of green should be
treated as red, completely ignoring the possibility of other colors.

I like to see GeneratorExit handled as a special case by yield-from,
because:

1. It already has a special meaning in generators as the exception
   raised in the generator when close is called.

2. It *enables* certain uses of yield-from that would require much more
   work to handle otherwise. I am thinking of the ability to have
   multiple generators yield from the same iterator. Being able to
   close one generator without closing the shared iterator seems like a
   good thing.

3. While the GeneratorExit is not propagated directly, its expected
   effect of finalizing the subiterator *is*. At least in CPython, and
   assuming the subiterator does its finalization in a __del__ method,
   and that the generator holds the only reference. If the subiterator
   is actually a generator, it will even look like the GeneratorExit
   was propagated, due to the PEP 342 definition of close.

I don't like the idea of only throwing exceptions that inherit from
Exception to the subiterator, because it makes the following two
generators behave differently when thrown a non-Exception exception.

def generatorA():
    try:
        x = yield
    except BaseException, e:
        print type(e)
        raise

def generatorB():
    return (yield from generatorA())

The PEP is clearly intended to make them act identically. Quoting from
the PEP: "When the iterator is another generator, the effect is the
same as if the body of the subgenerator were inlined at the point of
the yield from expression". Treating only GeneratorExit special allows
them to behave exactly the same (in CPython).
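For reference, the semantics Jacob argues for here are what PEP 380 eventually specified in Python 3.3: exceptions other than GeneratorExit are forwarded to the subgenerator's throw(). A modernized sketch of the generatorA/generatorB pair (Python 3 names and syntax) shows a KeyboardInterrupt reaching the inner generator and propagating back out:

```python
log = []

def generator_a():
    try:
        yield "ready"
    except KeyboardInterrupt:
        log.append("inner saw KeyboardInterrupt")
        raise

def generator_b():
    return (yield from generator_a())

g = generator_b()
next(g)                          # suspend inside generator_a
try:
    g.throw(KeyboardInterrupt)   # forwarded through the yield from
except KeyboardInterrupt:
    log.append("propagated back out")

assert log == ["inner saw KeyboardInterrupt", "propagated back out"]
```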
If you only propagate exceptions that inherit from Exception, you would
have to write something like:

def generatorC():
    g = generatorA()
    while 1:
        try:
            return (yield from g)
        except Exception:
            # This exception comes from g, so just reraise
            raise
        except BaseException, e:
            # this exception was not propagated by yield-from,
            # do it manually to get the same effect
            yield g.throw(e)

I don't mind that the expansion as written in the PEP becomes very
slightly more complicated, as long as it makes the code using it
simpler to reason about.

- Jacob

From benjamin at python.org  Mon Mar 23 01:13:26 2009
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 23 Mar 2009 00:13:26 +0000 (UTC)
Subject: [Python-ideas] a identity function?
Message-ID:

I've found as I write more and more decorators I need an identity
function often. For example I might write:

def replace_maybe(reason):
    if reason == "good reason":
        return lambda x: x
    def decorator(func):
        # do fancy stuff here
    return decorator

I hate lambdas, so usually I write

def _id(x):
    return x

It'd be nice to have a shortcut in the stdlib, though. Would this go
well in the operator or functools modules?

From python at rcn.com  Mon Mar 23 01:27:26 2009
From: python at rcn.com (Raymond Hettinger)
Date: Sun, 22 Mar 2009 17:27:26 -0700
Subject: [Python-ideas] a identity function?
References:
Message-ID:

[Benjamin Peterson]
> I've found as I write more and more decorators I need an identity
> function often. For example I might write:
>
> def replace_maybe(reason):
>     if reason == "good reason":
>         return lambda x: x
>     def decorator(func):
>         # do fancy stuff here
>     return decorator
>
> I hate lambdas, so usually I write
>
> def _id(x):
>     return x
>
> It'd be nice to have a shortcut in the stdlib, though. Would this go
> well in the operator or functools modules?

-1

I and Paul Rubin considered this a long time ago. It stayed on the todo
list for a while and then fell by the wayside as its downsides became
apparent.
One problem is that in many of the places where it is tempting to use an identity function, it is just a slower way to do something that we should have used a simple if-statement for. In your example there is no cost, but it is terrible to end up with variants of map(func, iterable) where func defaults to lambda x: x.

The other issue is that different signatures were needed for different tasks:

    identity = lambda *args: args
    identity = lambda *args: args[0] if args else None
    identity = lambda x: x

Better to let people write their own trivial pass-throughs and think about the signature and time costs.

Raymond

From greg.ewing at canterbury.ac.nz  Mon Mar 23 01:49:49 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 23 Mar 2009 13:49:49 +1300
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C6A8E9.4010705@gmail.com>
References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C64CC2.1050608@improva.dk> <49C6A8E9.4010705@gmail.com>
Message-ID: <49C6DCAD.3060701@canterbury.ac.nz>

Nick Coghlan wrote:

> All I'm saying is that if GeneratorExit doesn't get passed down then
> neither should SystemExit nor KeyboardInterrupt

That would violate the inlining principle, though. An inlined generator is going to get all exceptions regardless of what they inherit from.

> , while if the latter two
> *do* get passed down, then so should GeneratorExit.

Whereas that would mean a shared subiterator would get prematurely finalized when closing the delegating generator.
So there seems to be no choice about this -- we must pass on all exceptions except GeneratorExit, and we must *not* pass on GeneratorExit itself.

--
Greg

From dangyogi at gmail.com  Mon Mar 23 02:23:19 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Sun, 22 Mar 2009 21:23:19 -0400
Subject: [Python-ideas] Yield-From: Revamped expansion
In-Reply-To: <49C5781B.30102@gmail.com>
References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C5752C.2080704@canterbury.ac.nz> <49C5781B.30102@gmail.com>
Message-ID: <49C6E487.7040002@gmail.com>

Nick Coghlan wrote:
>
> I'd adjust the inner exception handlers to exploit the fact that
> SystemExit and GeneratorExit don't inherit from Exception:
>
> [...]
>     except:
>         # Covers SystemExit, GeneratorExit and
>         # anything else that doesn't inherit
>         # from Exception
>         _m = getattr(_i, 'close', None)
>         if _m is not None:
>             _m()
>         raise

This feels better to me too. Though it seems that _i.throw would be more appropriate than _i.close (except call _i.close if there is no _i.throw -- is it possible to have a close and not a throw?).

I like the idea that "finally" (in try/finally) means finally and not "maybe finally" (which boils down to finally in CPython due to the reference counting collector, but maybe finally in Jython, IronPython or PyPy).

-bruce frederiksen

From dangyogi at gmail.com  Mon Mar 23 03:40:21 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Sun, 22 Mar 2009 22:40:21 -0400
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C60430.7030108@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> Message-ID: <49C6F695.1050100@gmail.com> Greg Ewing wrote: > > My feeling is that GeneratorExit is a peculiarity of > generators that other kinds of iterators shouldn't have > to know about. So, if you close() a generator, that > shouldn't imply throwing GeneratorExit into the > subiterator -- [...] It can only be "thrown into the subiterator" if the subiterator is a generator (i.e., has a throw method) -- in which case, it knows about GeneratorExit. So the hasattr(_i, 'throw') test already covers this case. > > > If the subiterator happens to be another generator, > dropping the last reference to it will cause it to > be closed, [...] NO, NO, NO. Unless you are prepared to say that programs written to this spec are *not* expected to run on any other version of Python other than CPython. CPython is the *only* version with a reference counting collector. And encouraging Python programmers to rely on this invites trouble when they try to port to any other version of Python. I know. I've been there, and have the T-shirt. And it's not pretty. The errors that you get when your finally clauses and context managers aren't run can be quite mysterious. And God help that person if they haven't slept with PEP 342 under their pillow! > Other kinds of iterators can finalize > themselves however they see fit, and don't need to > pretend they're generators and understand > GeneratorExit. 
Your PEP currently does not demand that other iterators "pretend they're generators and understand GeneratorExit". Non-generator iterators don't have throw or close methods and will remain blissfully ignorant of these finer points as the PEP stands now. So this is not a problem.

> For consistency, this implies that a GeneratorExit
> explicitly thrown in using throw() shouldn't be
> forwarded to the subiterator either, even if it has
> a throw() method.
>
> To do otherwise would require making a distinction that
> can't be expressed in the Python expansion. Also, it
> seems elegant to preserve the property that if g is a
> generator then g.close() and g.throw(GeneratorExit) are
> exactly equivalent.

Yes, g.close and g.throw(GeneratorExit) are equivalent. So you should be able to translate a close into a thrown GeneratorExit or vice versa. But if the subiterator doesn't have the first method that you look for (let's say you pick throw), then you should call the other method (if it has that one instead).

Finally, on your previous post, you say:

> It would also avoid the problem of a partially exhausted
> iterator that's still in use by something else getting
> prematurely finalized, which is another thing that's been
> bothering me.

This is a valid point. But consider:

1. The delegating generator has no way to stop the subgenerator prematurely when it uses the yield from. So the yield from can only be stopped prematurely by the delegating generator's caller. And then the subgenerator would have to be communicated between the caller and the delegating generator somehow (e.g., passed in as a parameter) so that the caller could continue to use it. (And the subgenerator has to be a generator, not a plain iterator.) Though possible, this kind of a use case would be used very rarely compared to the use case of the yield from being the final place the subgenerator is used.

2.
If finalization of the subgenerator needs to be prevented, it can be wrapped in a plain iterator wrapper that doesn't define throw or close:

    class no_finalize:
        def __init__(self, gen):
            self.gen = gen
        def __iter__(self):
            return self
        def __next__(self):
            return next(self.gen)
        def send(self, x):
            return self.gen.send(x)

    g = subgen(...)
    yield from no_finalize(g)
    ... use g

As I see it, you are faced with two options:

1. Define "yield from" in a way that it will work the same in all implementations of Python and will work for the 98% use case without any extra boilerplate code, and only require extra boilerplate (as above) for the 2% use case. or

2. Define "yield from" in a way that will have quite different behavior (for reasons very obscure to most programmers) on the different implementations of Python (due to the different implementations of garbage collectors), require boilerplate code to be portable for the 98% use case (e.g., adding a "with closing(subgen())" around the yield from); but not require any boilerplate code for portability in the 2% use case.

The only argument I can think of in favor of option 2 is that that's what the "for" statement ended up with. But that was only because changing the "for" statement to option 1 would break the legacy 2% use cases...

IMHO option 1 is the better choice.

-bruce frederiksen

From denis.spir at free.fr  Mon Mar 23 13:19:06 2009
From: denis.spir at free.fr (spir)
Date: Mon, 23 Mar 2009 13:19:06 +0100
Subject: [Python-ideas] CapPython's use of unbound methods
In-Reply-To: References: <20090312.202410.846948621.mrs@localhost.localdomain> <20090319.231249.343185657.mrs@localhost.localdomain>
Message-ID: <20090323131906.6da1e6ad@o>

Le Sun, 22 Mar 2009 15:31:15 -0700,
Guido van Rossum s'exprima ainsi:

> > To some extent the verifier's check of only accessing private
> > attributes through self is just checking a coding style that I already
> > follow when writing Python code (except sometimes for writing test
> > cases).
>
> You might wish this to be true, but for most Python programmers, it
> isn't. Introspection is a commonly-used part of the language (probably
> more so than in Java). So is the use of attribute names starting with
> a single underscore outside the class tree, e.g. by "friend"
> functions.

Just a side note. In a language that does not hold a notion of private attribute as a core feature, a "morphologic" (name-forming) convention is a great help.

I have long thought a more formal way of introducing a public interface -- if only a simple declarative line at the top of a class def -- would be better, but I recently changed my mind. I think now the privacy vs "publicity" opposition is rather relative, vague, changing. Let's take the case of any toolset/library code with several classes communicating with each other. In most cases, some attributes will be both hidden from client code and exposed to other objects of the toolset. So there are already 3 levels of privacy. If we now introduce tools of the toolset and pure client interface classes, we add two levels...

Privacy is relative, conventional so to say; in addition to relative levels, there are also qualitative differences in privacy. Some languages (esp. Java) invent hardcoded language features in a hopeless attempt to formalize all of these distinctions. The Python way of just saying "leave this alone, unless you really know what you intend to do" is probably better to cope with such unclear and variable notions.

Denis
------
la vita e estrany

From jh at improva.dk  Mon Mar 23 14:09:07 2009
From: jh at improva.dk (Jacob Holm)
Date: Mon, 23 Mar 2009 14:09:07 +0100
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C6F695.1050100@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> Message-ID: <49C789F3.30301@improva.dk> Bruce Frederiksen wrote: > Greg Ewing wrote: > [...] >> If the subiterator happens to be another generator, >> dropping the last reference to it will cause it to >> be closed, [...] > NO, NO, NO. Unless you are prepared to say that programs written to > this spec are *not* expected to run on any other version of Python > other than CPython. CPython is the *only* version with a reference > counting collector. And encouraging Python programmers to rely on this > invites trouble when they try to port to any other version of Python. > I know. I've been there, and have the T-shirt. And it's not pretty. > The errors that you get when your finally clauses and context managers > aren't run can be quite mysterious. And God help that person if they > haven't slept with PEP 342 under their pillow! Ok, got it. Relying on refcounting is bad. > [...] >> It would also avoid the problem of a partially exhausted >> iterator that's still in use by something else getting >> prematurely finalized, which is another thing that's been >> bothering me. > This is a valid point. But consider: > > 1. The delegating generator has no way to stop the subgenerator > prematurely when it uses the yield from. So the yield from can only be > stopped prematurely by the delegating generator's caller. 
And then the > subgenerator would have to be communicated between the caller to the > delegating generator somehow (e.g, passed in as a parameter) so that > the caller could continue to use it. (And the subgenerator has to be a > generator, not a plain iterator). "...subgenerator has to be a generator" is not entirely true. For example, if the subiterator doesn't have send, you can send a non-None value to the generator and that will raise an AttributeError at the yield from. If it doesn't have throw, you can even throw a StopIteration with a value to get that value as the result of the yield-from expression, which might be useful in a twisted sort of way. In both cases, the subiterator will only be closed if the yield-from expression actually closes it. So it is definitely possible to get a non-generator prematurely finalized. > Though possible, this kind of a use case would be used very rarely > compared to the use case of the yield from being the final place the > subgenerator is used. That I agree with. > > 2. If finalization of the subgenerator needs to be prevented, it can > be wrapped in a plain iterator wrapper that doesn't define throw or > close. > > class no_finalize: > def __init__(self, gen): > self.gen = gen > def __iter__(self): > return self > def __next__(self): > return next(self.gen) > def send(self, x): > return self.gen.send(x) > > g = subgen(...) > yield from no_finalize(g) > ... use g Well, if the subiterator is a generator that itself uses yield-from, the need to wrap it would destroy all possible speed benefits of using yield-from. So if there *is* a valid use case for yielding from a shared generator, this is not really a solution unless you don't care about speed. > > As I see it, you are faced with two options: > > 1. 
Define "yield from" in a way that it will work the same in all > implementations of Python and will work for the 98% use case without > any extra boilerplate code, and only require extra boilerplate (as > above) for the 2% use case. or I can live with that. This essentially means using the expansion in the PEP (with "except Exception, _e" replaced by "except BaseException, _e", to get the inlining property we all want). The decision to use explicit close will make what could have been a 2% use case much less attractive. Note that with explicit close, my argument for special-casing GeneratorExit by adding "except GeneratorExit: raise" weakens. The GeneratorExit will be delegated to the deepest generator/iterator with a throw method. As long as the iterators don't swallow the exception, they will be closed from the finally clause in the expansion. If one of them *does* swallow the exception, the outermost generator will raise a RuntimeError. The only difference that special-casing GeneratorExit would make is that 1) if the final iterator is not a generator, it won't see a GeneratorExit, and 2) if one of the iterators swallow the exception, the rest would still be closed and you might get a better traceback for the RuntimeError. > > 2. Define "yield from" in a way that will have quite different > behavior (for reasons very obscure to most programmers) on the > different implementations of Python (due to the different > implementation of garbage collectors), require boilerplate code to be > portable for the 98% use case (e.g., adding a "with closing(subgen())" > around the yield from); but not require any boilerplate code for > portability in the 2% use case. > > The only argument I can think in favor of option 2, is that's what the > "for" statement ended up with. But that was only because changing the > "for" statement to option 1 would break the legacy 2% use cases... There is also the question of speed as mentioned above, but that argument is not all that strong... 
> > IMHO option 1 is the better choice. If relying on refcounting is as bad as you say, then I agree. - Jacob From dangyogi at gmail.com Mon Mar 23 21:27:47 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Mon, 23 Mar 2009 16:27:47 -0400 Subject: [Python-ideas] Yield-From: GeneratorExit? In-Reply-To: <49C789F3.30301@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> Message-ID: <49C7F0C3.10904@gmail.com> Jacob Holm wrote: > Bruce Frederiksen wrote: >> This is a valid point. But consider: >> >> 1. The delegating generator has no way to stop the subgenerator >> prematurely when it uses the yield from. So the yield from can only >> be stopped prematurely by the delegating generator's caller. And then >> the subgenerator would have to be communicated between the caller to >> the delegating generator somehow (e.g, passed in as a parameter) so >> that the caller could continue to use it. (And the subgenerator has >> to be a generator, not a plain iterator). > "...subgenerator has to be a generator" is not entirely true. For > example, if the subiterator doesn't have send, you can send a non-None > value to the generator and that will raise an AttributeError at the > yield from. If it doesn't have throw, you can even throw a > StopIteration with a value to get that value as the result of the > yield-from expression, which might be useful in a twisted sort of way. 
> In both cases, the subiterator will only be closed if the yield-from
> expression actually closes it. So it is definitely possible to get a
> non-generator prematurely finalized.

But non-generators don't have a close (or throw) method. They lack the concept of "finalization". Only generators have these extra methods. So using a subiterator in yield from isn't an issue here. (Or am I missing something?)

> Well, if the subiterator is a generator that itself uses yield-from,
> the need to wrap it would destroy all possible speed benefits of using
> yield-from. So if there *is* a valid use case for yielding from a
> shared generator, this is not really a solution unless you don't care
> about speed.

Yes, there is a performance penalty in this case. If the wrapper were written in C, then I would think that the penalty would be negligible. Perhaps offer a C wrapper in the standard library?

> Note that with explicit close, my argument for special-casing
> GeneratorExit by adding "except GeneratorExit: raise" weakens. The
> GeneratorExit will be delegated to the deepest generator/iterator with
> a throw method. As long as the iterators don't swallow the exception,
> they will be closed from the finally clause in the expansion. If one
> of them *does* swallow the exception, the outermost generator will
> raise a RuntimeError.

Another case where close differs from throw(GeneratorExit). Close is defined in PEP 342 to raise RuntimeError if GeneratorExit is swallowed. Should the delegating generator, then, be calling close rather than throw for GeneratorExit, so that the RuntimeError is raised closer to the cause of the exception? Or does this violate the "inlining" goal of the current PEP?

-bruce frederiksen

From greg.ewing at canterbury.ac.nz  Tue Mar 24 00:07:13 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 24 Mar 2009 11:07:13 +1200
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C7F0C3.10904@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> Message-ID: <49C81621.9040600@canterbury.ac.nz> Bruce Frederiksen wrote: > But non-generators don't have a close (or throw) method. They lack the > concept of "finalization". Any object could require explicit finalization in the absence of refcounting, so "close" isn't peculiar to generators. > Should the delegating generator, then, be calling close rather throw for > GeneratorExit so that the RuntimeError is raised closer to cause of the > exception? Or does this violate the "inlining" goal of the current PEP? Yes, it would violate the inlining principle. 
-- Greg From greg.ewing at canterbury.ac.nz Tue Mar 24 00:24:53 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 24 Mar 2009 11:24:53 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C81621.9040600@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> Message-ID: <49C81A45.1070803@canterbury.ac.nz> We have a decision to make. It appears we can have *one* of the following, but not both: (1) In non-refcounting implementations, subiterators are finalized promptly when the delegating generator is explicitly closed. (2) Subiterators are not prematurely finalized when other references to them exist. Since in the majority of intended use cases the subiterator won't be shared, (1) seems like the more important guarantee to uphold. Does anyone disagree with that? Guido, what do you think? -- Greg From python at rcn.com Tue Mar 24 17:03:31 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 24 Mar 2009 09:03:31 -0700 Subject: [Python-ideas] [Python-Dev] About adding a new iterator methodcalled "shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> Message-ID: > On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: >> >> I know that Python has iterator methods called "sorted" and "reversed" and >> these are handy shortcuts. >> >> Why not add a new iterator method called "shuffled"? 
You can already write: sorted(s, key=lambda x: random()) But nobody does that. So you have a good indication that the proposed method isn't need. Raymond From python at rcn.com Tue Mar 24 17:25:14 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 24 Mar 2009 09:25:14 -0700 Subject: [Python-ideas] [Python-Dev] About adding a new iteratormethodcalled "shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><20090324155828.GA15670@panix.com> Message-ID: <7D86CAF9592C415EBA74EAD5320439CC@RaymondLaptop1> > You can already write: > > sorted(s, key=lambda x: random()) > > But nobody does that. So you have a good > indication that the proposed method isn't need. s/need/needed From solipsis at pitrou.net Tue Mar 24 19:29:56 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 Mar 2009 18:29:56 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?=5BPython-Dev=5D_About_adding_a_new_iter?= =?utf-8?q?ator=09methodcalled=09=22shuffled=22?= References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> Message-ID: Raymond Hettinger writes: > > You can already write: > > sorted(s, key=lambda x: random()) > > But nobody does that. So you have a good > indication that the proposed method isn't need. On the other hand, sorting is O(n.log(n)), which is probably sub-optimal for shuffling (I don't know how shuffle() is internally implemented, but ISTM that it shouldn't take more than O(n)). Note that I'm not supporting the original proposal: shuffle() is not used enough to warrant such a shortcut. Regards Antoine. 
From bmintern at gmail.com  Tue Mar 24 19:34:35 2009
From: bmintern at gmail.com (Brandon Mintern)
Date: Tue, 24 Mar 2009 14:34:35 -0400
Subject: [Python-ideas] [Python-Dev] About adding a new iterator methodcalled "shuffled"
In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com>
Message-ID: <4c0fccce0903241134i43b7a77ds2375eaf4b485d323@mail.gmail.com>

    from random import shuffle
    shuffle(s)

I think it's convenient enough as is.

Brandon

From dickinsm at gmail.com  Tue Mar 24 19:44:51 2009
From: dickinsm at gmail.com (Mark Dickinson)
Date: Tue, 24 Mar 2009 18:44:51 +0000
Subject: [Python-ideas] [Python-Dev] About adding a new iterator methodcalled "shuffled"
In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com>
Message-ID: <5c6f2a5d0903241144v68baa9dcp16275b3c88fc6d53@mail.gmail.com>

On Tue, Mar 24, 2009 at 6:29 PM, Antoine Pitrou wrote:
> On the other hand, sorting is O(n.log(n)), which is probably sub-optimal for
> shuffling (I don't know how shuffle() is internally implemented, but ISTM that
> it shouldn't take more than O(n)).

I assumed that the OP was suggesting something of the form:

    def shuffled(L):
        while L:
            i = random.randrange(len(L))
            yield L[i]
            L.pop(i)

fixed up somehow so that it's only O(1) to yield each element; in effect, an itertools version of random.sample. I could see uses for this in cases where you only want a few randomly chosen elements from a large list, but don't necessarily know in advance how many elements you need.
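One way to do that fix-up (a rough sketch, not a polished proposal): swap the chosen element to the end of the list so each removal is an O(1) pop instead of an O(n) middle delete.

```python
import random

def shuffled(L):
    # Lazily yield the elements of L in random order, O(1)
    # amortized work per element (an incremental Fisher-Yates).
    # Note: this consumes L in place; pass list(L) to keep the
    # original intact.
    while L:
        i = random.randrange(len(L))
        L[i], L[-1] = L[-1], L[i]   # move the chosen item to the end...
        yield L.pop()               # ...so removing it is O(1)
```

Stopping after k elements then costs O(k), like random.sample, but without having to pick k up front.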
Mark From steve at pearwood.info Tue Mar 24 21:20:00 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Mar 2009 07:20:00 +1100 Subject: [Python-ideas] =?iso-8859-1?q?About_adding_a_new_iterator_methodc?= =?iso-8859-1?q?alled_=22shuffled=22?= In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> Message-ID: <200903250720.00433.steve@pearwood.info> On Wed, 25 Mar 2009 03:03:31 am Raymond Hettinger wrote: > > On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: > >> I know that Python has iterator methods called "sorted" and > >> "reversed" and these are handy shortcuts. > >> > >> Why not add a new iterator method called "shuffled"? > > You can already write: > > sorted(s, key=lambda x: random()) > > But nobody does that. So you have a good > indication that the proposed method isn't needed. That's nice -- not as readable as random.shuffle(s) but still nice. And fast too: on my PC, it is about twice as fast as random.shuffle() for "reasonable" sized lists (tested up to one million items). I don't think randomly shuffling a list is anywhere near common enough a task that it should be a built-in, so -1 on the OP's request, but since we're on the topic, I wonder whether the random.shuffle() implementation should use Raymond's idiom rather than the current Fisher-Yates shuffle? The advantage of F-Y is that it is O(N) instead of O(N*log N) for sorting, but the constant factor makes it actually significantly slower in practice. In addition, the F-Y shuffle is limited by the period of the random number generator: given a period P, it can randomize lists of length n where n! < P. For lists larger than n items, some permutations are unreachable. In the current implementation of random, n equals 2080. I *think* Raymond's idiom suffers from the same limitation, it's hard to imagine that it doesn't, but can anyone confirm this? 
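For what it's worth, the n = 2080 figure is easy to check directly: find the largest n whose factorial stays below the Mersenne Twister's period of 2**19937 - 1. A quick sketch using exact long arithmetic:

```python
# Fisher-Yates driven by the Mersenne Twister can only reach every
# permutation of an n-item list if n! is below the generator's
# period of 2**19937 - 1.  Python's arbitrary-precision integers
# make the exact boundary cheap to compute.
bound = 2 ** 19937
fact, n = 1, 0
while fact < bound:
    n += 1
    fact *= n          # fact == n! after this line
largest = n - 1        # largest n with n! < 2**19937
print(largest)         # -> 2080
```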
(In any case, if you're shuffling lists with more than 2080 items, and you care about the statistical properties of the result (as opposed to just "make it somewhat mixed up"), then the current implementation isn't good enough and you'll need to use your own shuffle routine.) Are there any advantages of the current F-Y implementation? It seems to me that Raymond's idiom is no worse statistically, and significantly faster in practice, so it should be the preferred implementation. Thoughts? -- Steven D'Aprano From guido at python.org Tue Mar 24 22:05:13 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 24 Mar 2009 14:05:13 -0700 Subject: [Python-ideas] Builtin test function In-Reply-To: <3d0cebfb0903201303q12b27e2atc5597a4b012286ee@mail.gmail.com> References: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> <200903192148.54461.steve@pearwood.info> <3d0cebfb0903201303q12b27e2atc5597a4b012286ee@mail.gmail.com> Message-ID: I think what you are really looking for is a standard API for finding the tests associated with a module, given the module object (or perhaps its full name), perhaps combined with a standard API for running the tests found. I don't think running tests is of such all-importance to warrant adding a built-in function that wraps both the test finding and test running APIs. But whatever you do, don't call it 'test' -- that name is overloaded too much as it is. --Guido On Fri, Mar 20, 2009 at 1:03 PM, Fredrik Johansson wrote: > On Thu, Mar 19, 2009 at 11:48 AM, Steven D'Aprano wrote: >> On Thu, 19 Mar 2009 08:59:08 pm Fredrik Johansson wrote: >>> There's been some discussion about automatic test discovery lately. >>> Here's a random (not in any way thought through) idea: add a builtin >>> function test() that runs tests associated with a given function, >>> class, module, or object. >> >> Improved testing is always welcome, but why a built-in? 
>> >> I know testing is important, but is it so common and important that we >> need it at our fingertips, so to speak, and can't even import a module >> first before running tests? What's the benefit to making it a built-in >> instead of part of a test module? > > It would just be a convenience, and I'm just throwing the idea out. > > The advantage would be a uniform and very simple interface for testing any > module, without having to know whether I should import doctest, > unittest or something else (and having to remember the commands > used by each framework). It would certainly not be a replacement for more > advanced test frameworks. > > Fredrik > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Tue Mar 24 22:13:26 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Mar 2009 07:13:26 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C81A45.1070803@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> Message-ID: <49C94CF6.5070301@gmail.com> Greg Ewing wrote: > We have a decision to make. 
It appears we can have > *one* of the following, but not both: > > (1) In non-refcounting implementations, subiterators > are finalized promptly when the delegating generator > is explicitly closed. > > (2) Subiterators are not prematurely finalized when > other references to them exist. > > Since in the majority of intended use cases the > subiterator won't be shared, (1) seems like the more > important guarantee to uphold. Does anyone disagree > with that? If you choose (2), then (1) is trivial to implement in code that uses the new expression in combination with existing support for deterministic finalisation. For example: with contextlib.closing(make_subiter()) as subiter: yield from subiter On the other hand, if you choose (1), then it is impossible to use that construct in combination with any other existing constructs to avoid finalisation - you have to write out the equivalent code from the PEP by hand, leaving out the finalisation parts. So I think dropping the implicit finalisation is the better option - it simplifies the new construct, and plays well with explicit finalisation when that is what people want. However, I would also recommend *not* special casing GeneratorExit in that case: just pass it down using throw. Note that non-generator iterators that want "throw" to mean the same thing as "close" can do that easily enough: def throw(self, *args): self.close() reraise(*args) (reraise itself would just do the dance to check how many arguments there were and use the appropriate form of "raise" to reraise the exception) Hmm, that does suggest another issue with the PEP however: it only calls the subiterator's throw with the value of the thrown in exception. It should be using the 3 argument form to avoid losing any passed in traceback information. Cheers, Nick. 
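The `reraise` helper in that sketch is never spelled out. A minimal version of the "dance" it describes — written here in Python 3 terms, which is an assumption since the thread is 2.x-era (in 3.x the traceback rides on the exception object instead of being a third argument to `raise`) — might look like:

```python
import sys

def reraise(tp, value=None, tb=None):
    # Re-raise an exception from a (type, value, traceback) triple, as
    # returned by sys.exc_info().  This is the Python 3 spelling of the
    # old three-argument raise statement.
    if value is None:
        value = tp()
    if tb is not None and value.__traceback__ is not tb:
        raise value.with_traceback(tb)
    raise value

# Quick self-check: the traceback from the original raise survives.
try:
    try:
        raise ValueError("boom")
    except ValueError:
        reraise(*sys.exc_info())
except ValueError as exc:
    caught = exc

assert str(caught) == "boom" and caught.__traceback__ is not None
```

With that helper, a non-generator iterator gets the close-on-throw behaviour from the `def throw(self, *args)` snippet above.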
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From steve at pearwood.info Tue Mar 24 22:40:11 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Mar 2009 08:40:11 +1100 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: <200903250720.00433.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903250720.00433.steve@pearwood.info> Message-ID: <200903250840.11674.steve@pearwood.info> On Wed, 25 Mar 2009 07:20:00 am Steven D'Aprano wrote: > On Wed, 25 Mar 2009 03:03:31 am Raymond Hettinger wrote: > > > On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: > > >> I know that Python has iterator methods called "sorted" and > > >> "reversed" and these are handy shortcuts. > > >> > > >> Why not add a new iterator method called "shuffled"? > > > > You can already write: > > > > sorted(s, key=lambda x: random()) > > > > But nobody does that. So you have a good > > indication that the proposed method isn't needed. > > That's nice -- not as readable as random.shuffle(s) but still nice. > And fast too: on my PC, it is about twice as fast as random.shuffle() > for "reasonable" sized lists (tested up to one million items). Ah crap. Ignore the above: I made an embarrassing error in my test (neglected to actually call random inside the lambda) and so my timings were completely wrong. The current random.shuffle() is marginally faster even for small lists (500 items) so I withdraw my suggestion that it be replaced. 
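For anyone repeating the measurement (with the lambda actually *calling* random this time), a sketch using `timeit`; the absolute numbers depend on the machine, so no winner is hard-coded here:

```python
import random
import timeit

s = list(range(500))

# Raymond's one-liner: sort by a fresh random key per element.
keyed = timeit.timeit(lambda: sorted(s, key=lambda x: random.random()),
                      number=200)

# The stdlib shuffle for comparison (shuffles in place).
fy = timeit.timeit(lambda: random.shuffle(s), number=200)

print(f"sorted w/ random key: {keyed:.4f}s   random.shuffle: {fy:.4f}s")
```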
-- Steven D'Aprano From jh at improva.dk Tue Mar 24 22:44:56 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 24 Mar 2009 22:44:56 +0100 Subject: [Python-ideas] [Python-Dev] About adding a new iterator methodcalled "shuffled" In-Reply-To: <5c6f2a5d0903241144v68baa9dcp16275b3c88fc6d53@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <5c6f2a5d0903241144v68baa9dcp16275b3c88fc6d53@mail.gmail.com> Message-ID: <49C95458.1040403@improva.dk> Mark Dickinson wrote: > On Tue, Mar 24, 2009 at 6:29 PM, Antoine Pitrou wrote: > >> On the other hand, sorting is O(n.log(n)), which is probably sub-optimal for >> shuffling (I don't know how shuffle() is internally implemented, but ISTM that >> it shouldn't take more than O(n)). >> It doesn't. > > I assumed that the OP was suggesting something of the form: > > def shuffled(L): > while L: > i = random.randrange(len(L)) > yield L[i] > L.pop(i) > > fixed up somehow so that it's only O(1) to yield each element; in effect, > an itertools version of random.sample. Like this, for example: def shuffled(L): L = list(L) # make a copy, so we don't mutate the argument while L: i = random.randrange(len(L)) yield L[i] L[i] = L[-1] L.pop() Or this: def shuffled(L): D = {} # use a dict to store modified values so we don't have to mutate the argument for j in xrange(len(L)-1, -1, -1): i = random.randrange(j+1) yield D.get(i, L[i]) D[i] = D.get(j, L[j]) The second is a bit slower but avoids copying the whole list up front, which should be better for the kind of uses you mention. And yes, I think it is necessary that it doesn't modify its argument. > I could see uses for this > in cases where you only want a few randomly chosen elements > from a large list, but don't necessarily know in advance how many > elements you need So could I, but I don't mind too much having to write it myself when I need it. 
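A quick sanity check of the second (dict-based) version, restated for Python 3 with `range` in place of `xrange` — an assumption about the intended semantics: it yields a permutation and never touches its argument.

```python
import random

def shuffled(L):
    # Lazy Fisher-Yates: displaced values are parked in a dict so the
    # input sequence itself is never mutated.
    D = {}
    for j in range(len(L) - 1, -1, -1):
        i = random.randrange(j + 1)
        yield D.get(i, L[i])
        D[i] = D.get(j, L[j])

original = list(range(100))
result = list(shuffled(original))
assert sorted(result) == original       # every element appears exactly once
assert original == list(range(100))     # the argument was not mutated
```

Stopping early — e.g. `list(itertools.islice(shuffled(L), k))` — gives k random elements without paying for the rest, the `random.sample`-like use mentioned above.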
- Jacob From terry at jon.es Tue Mar 24 22:55:31 2009 From: terry at jon.es (Terry Jones) Date: Tue, 24 Mar 2009 22:55:31 +0100 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: Your message at 08:40:11 on Wednesday, 25 March 2009 References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903250720.00433.steve@pearwood.info> <200903250840.11674.steve@pearwood.info> Message-ID: <18889.22227.482403.145267@jon.es> >>>>> "Steven" == Steven D'Aprano writes: Steven> On Wed, 25 Mar 2009 07:20:00 am Steven D'Aprano wrote: >> On Wed, 25 Mar 2009 03:03:31 am Raymond Hettinger wrote: >> > > On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: >> > >> I know that Python has iterator methods called "sorted" and >> > >> "reversed" and these are handy shortcuts. >> > >> >> > >> Why not add a new iterator method called "shuffled"? >> > >> > You can already write: >> > >> > sorted(s, key=lambda x: random()) >> > >> > But nobody does that. So you have a good >> > indication that the proposed method isn't needed. >> >> That's nice -- not as readable as random.shuffle(s) but still nice. And >> fast too: on my PC, it is about twice as fast as random.shuffle() for >> "reasonable" sized lists (tested up to one million items). Note that using sorting to shuffle is likely very inefficient. The sort takes O(n lg n) comparisons whereas you can do a perfect Fischer-Yates (aka Knuth) shuffle with <= n swaps. The model of computation here is different (comparisons vs swaps), but there is a vast literature on number of swaps done by sorting algorithms. In any case there's almost certainly no reason to use anything other than the standard Knuth shuffle, which is presumably what random.shuffle implements. 
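For concreteness, a textbook sketch of that shuffle — one pass from the back, at most n-1 swaps, every permutation equally likely given a perfect random source:

```python
import random

def fisher_yates(seq):
    # In-place Fisher-Yates (Knuth) shuffle: swap each position with a
    # uniformly chosen position at or before it.  O(n) swaps total.
    for j in range(len(seq) - 1, 0, -1):
        i = random.randrange(j + 1)   # 0 <= i <= j
        seq[i], seq[j] = seq[j], seq[i]
    return seq

deck = fisher_yates(list(range(52)))
assert sorted(deck) == list(range(52))
```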
Terry From python at rcn.com Tue Mar 24 23:20:39 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 24 Mar 2009 15:20:39 -0700 Subject: [Python-ideas] About adding a new iterator methodcalled"shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><200903250720.00433.steve@pearwood.info><200903250840.11674.steve@pearwood.info> <18889.22227.482403.145267@jon.es> Message-ID: > Note that using sorting to shuffle is likely very inefficient. Who cares? The OP's goal was to save a few programmer clock cycles so he could in-line what we already get from random.shuffle(). His request is use case challenged (very few programs would benefit and those would only save a line or two). If he actually cares about O(n) time then it's trivial to write: s = list(iterable) random.shuffle(s) for elem in s: . . . But if he wants to mush it onto one line, I gave a workable alternative. Raymond From jh at improva.dk Tue Mar 24 23:22:11 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 24 Mar 2009 23:22:11 +0100 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: <18889.22227.482403.145267@jon.es> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903250720.00433.steve@pearwood.info> <200903250840.11674.steve@pearwood.info> <18889.22227.482403.145267@jon.es> Message-ID: <49C95D13.5030201@improva.dk> Terry Jones wrote: > Steven> On Wed, 25 Mar 2009 07:20:00 am Steven D'Aprano wrote: > >>> On Wed, 25 Mar 2009 03:03:31 am Raymond Hettinger wrote: >>> >>>>> On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: >>>>> >>>>>> I know that Python has iterator methods called "sorted" and >>>>>> "reversed" and these are handy shortcuts. >>>>>> >>>>>> Why not add a new iterator method called "shuffled"? >>>>>> >>>> You can already write: >>>> >>>> sorted(s, key=lambda x: random()) >>>> >>>> But nobody does that. 
So you have a good >>>> indication that the proposed method isn't needed. >>>> >>> That's nice -- not as readable as random.shuffle(s) but still nice. And >>> fast too: on my PC, it is about twice as fast as random.shuffle() for >>> "reasonable" sized lists (tested up to one million items). >>> > > Note that using sorting to shuffle is likely very inefficient. > > The sort takes O(n lg n) comparisons whereas you can do a perfect > Fischer-Yates (aka Knuth) shuffle with <= n swaps. The model of > computation here is different (comparisons vs swaps), but there is a vast > literature on number of swaps done by sorting algorithms. In any case > there's almost certainly no reason to use anything other than the standard > Knuth shuffle, which is presumably what random.shuffle implements. > > It is, I just checked. Other than implementing it in C, I don't see any way of significantly speeding this up. - Jacob From terry at jon.es Tue Mar 24 23:46:22 2009 From: terry at jon.es (Terry Jones) Date: Tue, 24 Mar 2009 23:46:22 +0100 Subject: [Python-ideas] About adding a new iterator methodcalled"shuffled" In-Reply-To: Your message at 15:20:39 on Tuesday, 24 March 2009 References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903250720.00433.steve@pearwood.info> <200903250840.11674.steve@pearwood.info> <18889.22227.482403.145267@jon.es> Message-ID: <18889.25278.144509.586820@jon.es> >>>>> "Raymond" == Raymond Hettinger writes: >> Note that using sorting to shuffle is likely very inefficient. Raymond> Who cares? The OP's goal was to save a few programmer clock Raymond> cycles so he could in-line what we already get from Raymond> random.shuffle(). Who cares? Jeez... did I say something to get your hackles up? I'm not sure if I see the original posting, but the one you first reference in the mailing list archives doesn't say anything about saving clock cycles. 
Supposing that is what he was after, posting a cute but O(n lg n) alternative without saying it's highly inefficient is directly counter to what you say he was looking for. The reason I even said anything was because someone (Roy?) then said "that's nice". That's like someone saying oh, you could do it like this with bubblesort, someone else saying "that's nice", and there the record stands, awaiting future generations of uneducated programmers. Anyway, apologies if you don't care or for commenting out loud on something that was perhaps obvious to everyone. BTW, I hadn't noticed Antoine's earlier message amounting to the same thing. He seems to care too :-) Terry From greg.ewing at canterbury.ac.nz Wed Mar 25 01:48:14 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 25 Mar 2009 12:48:14 +1200 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: <200903250720.00433.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <200903250720.00433.steve@pearwood.info> Message-ID: <49C97F4E.8010200@canterbury.ac.nz> Steven D'Aprano wrote: > In addition, the F-Y shuffle is limited by the period of the random > number generator: *All* shuffling algorithms are limited by that. Think about it: A shuffling algorithm is a function from a random number to a permutation. There's no way you can get more permutations out than there are random numbers to put in. 
-- Greg From terry at jon.es Wed Mar 25 02:06:10 2009 From: terry at jon.es (Terry Jones) Date: Wed, 25 Mar 2009 02:06:10 +0100 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: Your message at 12:48:14 on Wednesday, 25 March 2009 References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <200903250720.00433.steve@pearwood.info> <49C97F4E.8010200@canterbury.ac.nz> Message-ID: <18889.33666.764025.595818@jon.es> >>>>> "Greg" == Greg Ewing writes: Greg> *All* shuffling algorithms are limited by that. Greg> Think about it: A shuffling algorithm is a function from a random Greg> number to a permutation. There's no way you can get more permutations Greg> out than there are random numbers to put in. Hi Greg Maybe we should put a note to that effect in random.shuffle.__doc__ :-) http://mail.python.org/pipermail/python-dev/2006-June/065815.html Regards, Terry From python at rcn.com Wed Mar 25 02:59:27 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 24 Mar 2009 18:59:27 -0700 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><20090324155828.GA15670@panix.com><200903250720.00433.steve@pearwood.info><49C97F4E.8010200@canterbury.ac.nz> <18889.33666.764025.595818@jon.es> Message-ID: <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> > Greg> Think about it: A shuffling algorithm is a function from a random > Greg> number to a permutation. There's no way you can get more permutations > Greg> out than there are random numbers to put in. If our random number generator can produce more possible shuffles than there are atoms in the universe, I say you don't worry about it. 
Raymond From greg.ewing at canterbury.ac.nz Wed Mar 25 07:38:26 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 25 Mar 2009 18:38:26 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C94CF6.5070301@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> Message-ID: <49C9D162.5040907@canterbury.ac.nz> Nick Coghlan wrote: > Greg Ewing wrote: > >>(1) In non-refcounting implementations, subiterators >>are finalized promptly when the delegating generator >>is explicitly closed. >> >>(2) Subiterators are not prematurely finalized when >>other references to them exist. > > If you choose (2), then (1) is trivial to implement > > with contextlib.closing(make_subiter()) as subiter: > yield from subiter That's a fairly horrendous thing to expect people to write around all their yield-froms, though. It also means we would have to say that the inlining principle only holds for refcounting implementations. Maybe we should just give up trying to accommodate shared subiterators. Is it worth complicating everything for the sake of something that's not really part of the intended set of use cases? > Hmm, that does suggest another issue with the PEP however: it only calls > the subiterator's throw with the value of the thrown in exception. 
It > should be using the 3 argument form to avoid losing any passed in > traceback information. Good point, I'll update the expansion accordingly. -- Greg From solipsis at pitrou.net Wed Mar 25 12:06:06 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 25 Mar 2009 11:06:06 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?About_adding_a_new_iterator_methodcalled?= =?utf-8?b?CSJzaHVmZmxlZCI=?= References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <200903250720.00433.steve@pearwood.info> <49C97F4E.8010200@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > > In addition, the F-Y shuffle is limited by the period of the random > > number generator: > > *All* shuffling algorithms are limited by that. > > Think about it: A shuffling algorithm is a function > from a random number to a permutation. There's no > way you can get more permutations out than there are > random numbers to put in. The period of the generator should be (much) larger than the number of possible random numbers, because of the generator's internal state. 
(I'm not sure I understood your sentence as you meant it though) From ncoghlan at gmail.com Wed Mar 25 13:17:54 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Mar 2009 22:17:54 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C9D162.5040907@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> Message-ID: <49CA20F2.7040207@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: >> Greg Ewing wrote: >> >>> (1) In non-refcounting implementations, subiterators >>> are finalized promptly when the delegating generator >>> is explicitly closed. >>> >>> (2) Subiterators are not prematurely finalized when >>> other references to them exist. >> >> If you choose (2), then (1) is trivial to implement >> >> with contextlib.closing(make_subiter()) as subiter: >> yield from subiter > > That's a fairly horrendous thing to expect people to > write around all their yield-froms, though. It also > means we would have to say that the inlining principle > only holds for refcounting implementations. > > Maybe we should just give up trying to accommodate > shared subiterators. Is it worth complicating > everything for the sake of something that's not > really part of the intended set of use cases? 
Consider what happens if you replace the 'yield from' with the basic form of iterator delegation that exists now: for x in make_subiter(): yield x Is such code wrong in any way? No it isn't. Failing to finalise the object of iteration is the *normal* case. If for some reason it is important in a given application to finalise it properly (e.g. the subiter opens a database connection or file and we want to ensure they are closed promptly no matter what else happens), only *then* does deterministic finalisation come into play: with closing(make_subiter()) as subiter: for x in subiter: yield x That is, I now believe the 'normal' case for 'yield from' should be modelled on basic iteration, which means no implicit finalisation. Now, keep in mind that in parallel with this I am now saying that *all* exceptions, *including GeneratorExit* should be passed down to the subiterator if it has a throw() method. So even without implicit finalisation you can use "yield from" to nest generators to your heart's content and an explicit close on the outermost generator will be passed down to the innermost generator and unwind the generator stack from there. 
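That unwinding behaviour can be sketched with the proposed syntax itself — runnable only on a Python that implements the PEP's "yield from", so this illustrates the intended semantics rather than any current patch:

```python
events = []

def inner():
    try:
        yield 1
        yield 2
    except GeneratorExit:
        events.append('inner finalised')
        raise   # let the close complete normally

def outer():
    yield from inner()
    events.append('never reached')

g = outer()
assert next(g) == 1   # suspended inside inner()
g.close()             # GeneratorExit travels down to inner()
assert events == ['inner finalised']
```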
Using your "no finally clause" version from earlier in this thread as the base for the exact semantic description: _i = iter(EXPR) try: _u = _i.next() except StopIteration, _e: _r = _e.value else: while 1: try: _v = yield _u except BaseException, _e: _m = getattr(_i, 'throw', None) if _m is not None: _u = _m(_e) else: raise else: try: if _v is None: _u = _i.next() else: _u = _i.send(_v) except StopIteration, _e: _r = _e.value break RESULT = _r With an expansion of that form, you can easily make arbitrary iterators (including generators) shareable by wrapping them in an iterator with no throw or send methods: class ShareableIterator(object): def __init__(self, itr): self.itr = itr def __iter__(self): return self def __next__(self): return self.itr.next() next = __next__ # Be 2.x friendly def close(self): # Still support explicit finalisation of the # shared iterator, just not throw() or send() try: close_itr = self.itr.close except AttributeError: pass else: close_itr() # Decorator to use the above on a generator function def shareable(g): @functools.wraps(g) def wrapper(*args, **kwds): return ShareableIterator(g(*args, **kwds)) return wrapper Iterators that need finalisation can either make themselves implicitly closable in yield from expressions by defining a throw() method that delegates to close() and then reraises the exception appropriately, or else they can recommend explicit closure regardless of the means of iteration (be it a for loop, a generator expression or container comprehension, manual iteration or the new yield from expression). Cheers, Nick. 
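Restating the wrapper in Python 3 terms (an assumption — the sketch above is 2.x-flavoured), it can be exercised by sharing one generator between two consumers:

```python
import functools

class ShareableIterator:
    # Exposes only __next__ and close(); with no throw()/send(), a
    # delegating construct cannot finalise the iterator prematurely.
    def __init__(self, itr):
        self.itr = itr
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.itr)
    def close(self):
        closer = getattr(self.itr, 'close', None)
        if closer is not None:
            closer()

def shareable(g):
    @functools.wraps(g)
    def wrapper(*args, **kwds):
        return ShareableIterator(g(*args, **kwds))
    return wrapper

@shareable
def numbers():
    for i in range(6):
        yield i

shared = numbers()
first = [next(shared) for _ in range(3)]   # one consumer takes 0, 1, 2
rest = list(shared)                        # another drains 3, 4, 5
assert (first, rest) == ([0, 1, 2], [3, 4, 5])
shared.close()                             # explicit finalisation still works
```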
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From steve at pearwood.info Wed Mar 25 13:28:48 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Mar 2009 23:28:48 +1100 Subject: [Python-ideas] =?iso-8859-1?q?About_adding_a_new_iteratormethodca?= =?iso-8859-1?q?lled_=22shuffled=22?= In-Reply-To: <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <18889.33666.764025.595818@jon.es> <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> Message-ID: <200903252328.49177.steve@pearwood.info> On Wed, 25 Mar 2009 12:59:27 pm Raymond Hettinger wrote: > > Greg> Think about it: A shuffling algorithm is a function from a > > random Greg> number to a permutation. There's no way you can get > > more permutations Greg> out than there are random numbers to put > > in. > > If our random number generator can produce more possible shuffles > than there are atoms in the universe, I say you don't worry about it. No, I'm afraid that is a fallacy, because what is important is the number of permutations in the list, and that grows as the factorial of the number of items. The Mersenne Twister has a period of 2**19937-1, which sounds huge, but it takes a list of only 2081 items for the number of permutations to exceed that. To spell it out in tedious detail: that means that random.shuffle() can produce every permutation of a list of 2080 items, but for 2081 items approximately 98% of the possibilities can't be reached. For 2082 items, approx 99.999% will never be reached. And so on. Don't get me wrong, random.shuffle() is perfectly adequate for any use-case I can think of. But beyond 2080 items in the list, it becomes greatly biased, and I think that's important to note in the docs. Those who need to know about it will be told, and those who don't care can continue to not care. 
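The 2080/2081 threshold is easy to verify with exact integer arithmetic — 2081! is the first factorial past the Mersenne Twister period:

```python
import math

period = 2 ** 19937 - 1   # Mersenne Twister period

assert math.factorial(2080) < period   # 2080! permutations fit in the period
assert math.factorial(2081) > period   # 2081! permutations no longer do

# Upper bound on the reachable fraction of 2081-item orderings.
# (CPython's int/int true division is correctly rounded even when the
# operands themselves are far too large for a float.)
reachable = period / math.factorial(2081)
print(f"at most {reachable:.2%} of 2081! orderings can occur")
```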
-- Steven D'Aprano From ncoghlan at gmail.com Wed Mar 25 13:34:49 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Mar 2009 22:34:49 +1000 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><20090324155828.GA15670@panix.com><200903250720.00433.steve@pearwood.info><49C97F4E.8010200@canterbury.ac.nz> <18889.33666.764025.595818@jon.es> <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> Message-ID: <49CA24E9.3020706@gmail.com> Raymond Hettinger wrote: > >> Greg> Think about it: A shuffling algorithm is a function from a random >> Greg> number to a permutation. There's no way you can get more >> permutations >> Greg> out than there are random numbers to put in. > > If our random number generator can produce more possible shuffles than > there are atoms in the universe, I say you don't worry about it. The "long int too large to convert to float" error that I got on my first attempt at printing 2080! in scientific notation is also something of a hint :) The decimal module came to my rescue though: >>> +Decimal(math.factorial(2080)) Decimal('1.983139957541900373849131897E+6000') That's one heck of a big number! Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jh at improva.dk Wed Mar 25 15:31:05 2009 From: jh at improva.dk (Jacob Holm) Date: Wed, 25 Mar 2009 15:31:05 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CA20F2.7040207@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> Message-ID: <49CA4029.6050703@improva.dk> Nick Coghlan wrote: > [snip arguments for modelling on basic iteration] > That is, I now believe the 'normal' case for 'yield from' should be > modelled on basic iteration, which means no implicit finalisation. > > Now, keep in mind that in parallel with this I am now saying that *all* > exceptions, *including GeneratorExit* should be passed down to the > subiterator if it has a throw() method. > I still think that is less useful than catching it and just dropping the reference, see below. > So even without implicit finalisation you can use "yield from" to nest > generators to your heart's content and an explicit close on the > outermost generator will be passed down to the innermost generator and > unwind the generator stack from there. 
> The same would happen with the *implicit* close caused by the last reference to the outermost generator going away. Delegating the GeneratorExit is a sure way to premature finalization when using shared generators, but only in a refcounting implementation like C-Python. That makes this the only feature I know of that would be *more* useful in a non-refcounting implementation. > Using your "no finally clause" version from earlier in this thread as > the base for the exact semantic description: > > _i = iter(EXPR) > try: > _u = _i.next() > except StopIteration, _e: > _r = _e.value > else: > while 1: > try: > _v = yield _u > except BaseException, _e: > _m = getattr(_i, 'throw', None) > if _m is not None: > _u = _m(_e) > else: > raise > else: > try: > if _v is None: > _u = _i.next() > else: > _u = _i.send(_v) > except StopIteration, _e: > _r = _e.value > break > RESULT = _r > > I know I didn't comment on that expansion earlier, but should have. It fails to handle the case where the throw raises a StopIteration (or there is no throw method and the thrown exception is a StopIteration). You need something like: _i = iter(EXPR) try: _u = _i.next() while 1: try: _v = yield _u # except GeneratorExit: # raise except BaseException: _m = getattr(_i, 'throw', None) if _m is not None: _u = _m(*sys.exc_info()) else: raise else: if _v is None: _u = _i.next() else: _u = _i.send(_v) except StopIteration, _e: RESULT = _e.value finally: _i = _u = _v = _e = _m = None del _i, _u, _v, _e, _m This is independent of the GeneratorExit issue, but I put it in there as a comment just to make it clear what *I* think it should be if we are not putting a close in the finally clause. If we *do* put a call to close in the finally clause, the premature finalization of shared generators is guaranteed anyway, so there is not much point in specialcasing GeneratorExit. 
> With an expansion of that form, you can easily make arbitrary iterators > (including generators) shareable by wrapping them in an iterator with no > throw or send methods: > > class ShareableIterator(object): > def __init__(self, itr): > self.itr = itr > def __iter__(self): > return self > def __next__(self): > return self.itr.next() > next = __next__ # Be 2.x friendly > def close(self): > # Still support explicit finalisation of the > # shared iterator, just not throw() or send() > try: > close_itr = self.itr.close > except AttributeError: > pass > else: > close_itr() > > # Decorator to use the above on a generator function > def shareable(g): > @functools.wraps(g) > def wrapper(*args, **kwds): > return ShareableIterator(g(*args, **kwds)) > return wrapper > With this wrapper, you will not be able to throw *any* exceptions to the shared iterator. Even if you fix the wrapper to pass through all other exceptions than GeneratorExit, you will still completely lose the speed benefits of yield-from when doing so. (For next, send, and throw it is possible to completely bypass all the intervening generators, so the call overhead becomes independent of the number of generators in the yield-from chain. I have a patch that does exactly this, working except for details related to this discussion). It is not possible to write such a wrapper efficiently without making it a builtin and special-casing it in the yield-from implementation, and I don't think that is a good idea. > Iterators that need finalisation can either make themselves implicitly > closable in yield from expressions by defining a throw() method that > delegates to close() and then reraises the exception appropriately, or > else they can recommend explicit closure regardless of the means of > iteration (be it a for loop, a generator expression or container > comprehension, manual iteration or the new yield from expression). 
> A generator or iterator that needs closing should recommend explicit closing *anyway* to work correctly in other contexts on platforms other than C-Python. Not delegating GeneratorExit just happens to make it much simpler and faster to use shared generators/iterators that *don't* need immediate finalization. In C-Python you even get the finalization for free due to the refcounting, but of course relying on that is generally considered a bad idea. - Jacob From qrczak at knm.org.pl Wed Mar 25 22:41:27 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Wed, 25 Mar 2009 22:41:27 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <200903252328.49177.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <18889.33666.764025.595818@jon.es> <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> <200903252328.49177.steve@pearwood.info> Message-ID: <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> On Wed, Mar 25, 2009 at 13:28, Steven D'Aprano wrote: > Don't get me wrong, random.shuffle() is perfectly adequate for any > use-case I can think of. But beyond 2080 items in the list, it becomes > greatly biased, and I think that's important to note in the docs. Those > who need to know about it will be told, and those who don't care can > continue to not care. Why anyone would care? Orderings possible to obtain from a given good random number generator are quite uniformly distributed among all orderings. I bet you can't even predict any particular ordering which is impossible to obtain. There is no time to generate all orderings. The factorial of large numbers is just huge. 
-- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From greg.ewing at canterbury.ac.nz Wed Mar 25 22:47:18 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 09:47:18 +1200 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <200903250720.00433.steve@pearwood.info> <49C97F4E.8010200@canterbury.ac.nz> Message-ID: <49CAA666.2020606@canterbury.ac.nz> Antoine Pitrou wrote: > The period of the generator should be (much) larger than the number of possible > random numbers, because of the generator's internal state. Hm, yes, I should have said a function from an RNG state to a permutation. The initial state of the RNG completely determines the permutation generated, so there can't be more permutations than states. -- Greg From greg.ewing at canterbury.ac.nz Wed Mar 25 23:02:55 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 10:02:55 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CA20F2.7040207@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> Message-ID: <49CAAA0F.7090507@canterbury.ac.nz> Nick Coghlan wrote: 
> That is, I now believe the 'normal' case for 'yield from' should be > modelled on basic iteration, which means no implicit finalisation. > > Now, keep in mind that in parallel with this I am now saying that *all* > exceptions, *including GeneratorExit* should be passed down to the > subiterator if it has a throw() method. But those two things are contradictory. In a refcounting Python implementation, dropping the last reference to the delegating generator will cause it to close() itself, thus throwing a GeneratorExit into the subiterator. If other references to the subiterator still exist, this means it gets prematurely finalized. > With an expansion of that form, you can easily make arbitrary iterators > (including generators) shareable by wrapping them in an iterator with no > throw or send methods: But if you need explicit wrappers to prevent finalization, then you hardly have "no implicit finalization". So I'm a bit confused about what behaviour you're really asking for. -- Greg From greg.ewing at canterbury.ac.nz Thu Mar 26 00:35:34 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 11:35:34 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CA4029.6050703@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> 
<49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> Message-ID: <49CABFC6.1080207@canterbury.ac.nz> Jacob Holm wrote: > It > fails to handle the case where the throw raises a StopIteration (or > there is no throw method and the thrown exception is a StopIteration). No, I think it does the right thing in that case. By the inlining principle, the StopIteration should be thrown in like anything else, and if it propagates back out, it should stop the delegating generator, *not* the subiterator. -- Greg From jh at improva.dk Thu Mar 26 00:40:46 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 26 Mar 2009 00:40:46 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CABFC6.1080207@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> Message-ID: <49CAC0FE.5010305@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> It fails to handle the case where the throw raises a StopIteration >> (or there is no throw method and the thrown exception is a >> StopIteration). > > No, I think it does the right thing in that case. 
By the > inlining principle, the StopIteration should be thrown > in like anything else, and if it propagates back out, > it should stop the delegating generator, *not* the > subiterator. > But if you throw another exception and it is converted to a StopIteration by the subiterator, this should definitely stop the subiterator and get a return value. Or? - Jacob From steve at pearwood.info Thu Mar 26 00:58:58 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 26 Mar 2009 10:58:58 +1100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> Message-ID: <200903261058.59164.steve@pearwood.info> On Thu, 26 Mar 2009 08:41:27 am Marcin 'Qrczak' Kowalczyk wrote: > On Wed, Mar 25, 2009 at 13:28, Steven D'Aprano wrote: > > Don't get me wrong, random.shuffle() is perfectly adequate for any > > use-case I can think of. But beyond 2080 items in the list, it > > becomes greatly biased, and I think that's important to note in the > > docs. Those who need to know about it will be told, and those who > > don't care can continue to not care. > > Why anyone would care? Orderings possible to obtain from a given good > random number generator are quite uniformly distributed among all > orderings. Yes, that holds true for n <= 2080, since Fisher-Yates is an unbiased shuffler. But I don't think it remains true for n > 2080 since the vast majority of possible permutations have probability zero. I'm not saying that this absolutely *will* introduce statistical bias into the shuffled lists, but it could, and those who care about that risk shouldn't have to read the source code to learn this. > I bet you can't even predict any particular ordering which > is impossible to obtain. A moral dilemma... 
should I take advantage of your innumeracy by taking you up on that bet, or should I explain why that bet is a sure thing for me? *wink* Since the chances of me collecting on the bet is essentially near zero, I'll explain. For a list with 2082 items, shuffle() chooses from a subset of approximately 0.001% of all possible permutations. This means that if I give you a list of 2082 items and tell you to shuffle it, and then guess that such-and-such a permutation of it will never be reached, I can only lose if by chance I guessed on the 1 in 100,000 permutations that shuffle() can reach. I have 99,999 chances to win versus 1 to lose: that's essentially a sure thing. In practical terms, beyond (say) 2085 or so, it would be a bona fide miracle if I didn't win such a bet. -- Steven D'Aprano From greg.ewing at canterbury.ac.nz Thu Mar 26 01:24:25 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 12:24:25 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAC0FE.5010305@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> Message-ID: <49CACB39.3020708@canterbury.ac.nz> Jacob Holm wrote: > But if you throw another 
exception and it is converted to a > StopIteration by the subiterator, this should definitely stop the > subiterator and get a return value. Not if it simply raises a StopIteration from the throw call. It would have to mark itself as completed, return normally from the throw and then raise StopIteration on the next call to next() or send(). -- Greg From jh at improva.dk Thu Mar 26 01:50:37 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 26 Mar 2009 01:50:37 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CACB39.3020708@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> Message-ID: <49CAD15D.2090008@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: > >> But if you throw another exception and it is converted to a >> StopIteration by the subiterator, this should definitely stop the >> subiterator and get a return value. > > Not if it simply raises a StopIteration from the > throw call. It would have to mark itself as > completed, return normally from the throw and > then raise StopIteration on the next call to > next() or send(). > One of us must be missing something... 
If the subiterator is exhausted before the throw, there won't *be* a value to return from the call, so the only options for the throw method are to raise StopIteration, or to raise some other exception. Example:

    def inner():
        try:
            yield 1
        except ValueError:
            pass
        return 2

    def outer():
        v = yield from inner()
        yield v

    g = outer()
    print g.next()             # prints 1
    print g.throw(ValueError)  # prints 2

In your expansion, the StopIteration raised by inner escapes the outer generator as well, so we get a StopIteration instead of the second print that I would expect. Can you explain in a little more detail how the inlining argument makes you want to not catch a StopIteration escaping from throw?

- Jacob

From ncoghlan at gmail.com Thu Mar 26 02:56:43 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Mar 2009 11:56:43 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAAA0F.7090507@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> Message-ID: <49CAE0DB.3090104@gmail.com>

Greg Ewing wrote:
> But if you need explicit wrappers to prevent finalization,
> then you hardly have "no implicit finalization". So I'm a
> bit confused about what behaviour you're really asking for.
I should have said no *new* mechanism for implicit finalisation. Deletion of the outer generator would, as you say, still call close() and throw GeneratorExit in. I like it because the rules are simple: either an exception is thrown in and passed down to the subiterator (which may have the effect of finalising it), or else the subiterator is left alone (to be finalised either explicitly or implicitly when it is deleted). There's then no special case along the lines of "if GeneratorExit is passed in we just drop our reference to the subiterator instead of passing the exception down", or "if you iterate over a subiterator using 'yield from' instead of a for loop then the subiterator will automatically be closed at the end of the expression". No matter what you do with regards to finalisation, you're going to demand extra work from somebody. The simple rule means that subiterators will see all exceptions (even GeneratorExit), allowing them to handle their own finalisation needs, while shareable subiterators are also possible so long as they don't have throw() methods. The idea of a shareable iterator that *does* support send() or throw() just doesn't make any sense to me. Splitting up a data feed amongst multiple peer consumers, OK, that's fairly straightforward and I can easily imagine uses for it in a generator based coding style (e.g. having multiple clients pulling requests from a job queue). But having multiple peer writers attempting to feed values or exceptions back into that single iterator that can neither tell which writer a particular value or exception came from, nor direct results to particular consumers? That sounds like utter insanity. If you want to create a shareable iterator, preventing use of send() and throw() strikes me as a *very* good idea. Cheers, Nick. 
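As a toy sketch of the "multiple clients pulling requests from a job queue" pattern mentioned above (hypothetical names; plain iteration only, with no send() or throw() involved):

```python
jobs = iter(range(6))  # a single shared source of work

def worker(name, source):
    # Each worker pulls lazily from the *same* underlying iterator,
    # so every job is handed out exactly once.
    for job in source:
        yield "%s:%s" % (name, job)

a = worker("a", jobs)
b = worker("b", jobs)

out = []
for _ in range(3):
    out.append(next(a))
    out.append(next(b))
print(out)  # ['a:0', 'b:1', 'a:2', 'b:3', 'a:4', 'b:5']
```

Since neither consumer ever throws into the shared source, the question of which one "owns" its finalization never arises until both are done with it.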
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Thu Mar 26 03:17:24 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Mar 2009 12:17:24 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAD15D.2090008@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> Message-ID: <49CAE5B4.4080005@gmail.com> Jacob Holm wrote: > Greg Ewing wrote: >> Jacob Holm wrote: >> >>> But if you throw another exception and it is converted to a >>> StopIteration by the subiterator, this should definitely stop the >>> subiterator and get a return value. >> >> Not if it simply raises a StopIteration from the >> throw call. It would have to mark itself as >> completed, return normally from the throw and >> then raise StopIteration on the next call to >> next() or send(). >> > One of us must be missing something... 
If the subiterator is exhausted > before the throw, there won't *be* a value to return from the call so > the only options for the throw method are to raise StopIteraton, or to > raise some other exception. I agree with Jacob here - contextlib.contextmanager contains a similar check in its __exit__ method. The thing to check for is the throw method call raising StopIteration and that StopIteration instance being a *different* exception from the one that was thrown in. (This matters more in the contextmanager case, since it is quite legitimate for a generator to finish and raise StopIteration from inside a with statement, so the contextmanager needs to avoid accidentally suppressing that exception). Avoiding the problem of suppressing thrown in StopIteration instances means we still need multiple inner try/except blocks rather than a large outer one. There is also another special case to consider: since a permitted response to "throw(GeneratorExit)" is for the iterator to just terminate instead of reraising GeneratorExit, the thrown in exception should be reraised unconditionally in that situation. 
So the semantics would then become:

    _i = iter(EXPR)
    try:
        _u = _i.next()
    except StopIteration as _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _et, _ev, _tb = sys.exc_info()
                    try:
                        _u = _m(_et, _ev, _tb)
                    except StopIteration as _e:
                        if _e is _ev or _et is GeneratorExit:
                            # Don't suppress a thrown in
                            # StopIteration and handle the
                            # case where a subiterator
                            # handles GeneratorExit by
                            # terminating rather than
                            # reraising the exception
                            raise
                        # The thrown in exception
                        # terminated the iterator
                        # gracefully
                        _r = _e.value
                        break
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration as _e:
                    _r = _e.value
                    break
    RESULT = _r

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia ---------------------------------------------------------------

From guido at python.org Thu Mar 26 04:56:46 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 25 Mar 2009 20:56:46 -0700 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C81A45.1070803@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> Message-ID:

On Mon, Mar 23, 2009 at 4:24 PM, Greg Ewing wrote:
> We have a decision to make. It appears we can have
> *one* of the following, but not both:
>
> (1) In non-refcounting implementations, subiterators
> are finalized promptly when the delegating generator
> is explicitly closed.
>
> (2) Subiterators are not prematurely finalized when
> other references to them exist.
>
> Since in the majority of intended use cases the
> subiterator won't be shared, (1) seems like the more
> important guarantee to uphold. Does anyone disagree
> with that?
> > Guido, what do you think? Gee, I'm actually glad I waited a while, because the following discussion shows that this is a really hairy issue... I think (1) means propagating GeneratorExit into the subgenerator (and recursively if that's also waiting in a yield-from), while (2) would mean not propagating it, right? I agree that (1) seems to make more sense unless you can think of a use case for (2) -- and it seems from Nick's last post that such a use case would have to be rather horrendously outrageous... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Thu Mar 26 06:40:46 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 17:40:46 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAD15D.2090008@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> Message-ID: <49CB155E.4040504@canterbury.ac.nz> Jacob Holm wrote: > Can you explain in a little more detail how the > inlining argument makes you want to not catch a StopIteration escaping > from throw? 
It's easier to see if we use an example that doesn't involve a return value, since it's clearer what "inlining" means in that case.

    def inner():
        try:
            yield 1
        except ValueError:
            pass

    def outer():
        print "About to yield from inner"
        yield from inner()
        print "Finished yielding from inner"

Now if we inline that, we get:

    def outer_and_inner():
        print "About to yield from inner"
        try:
            yield 1
        except ValueError:
            pass
        print "Finished yielding from inner"

What would you expect that to do if you throw StopIteration into it while it's suspended at the yield?

However, thinking about the return value case has made me realize that it's not so obvious what "inlining" means then. To get the return value in your example, one way would be to perform the inlining like this:

    def outer():
        try:
            try:
                yield 1
            except ValueError:
                pass
            raise StopIteration(2)
        except StopIteration, e:
            v = e.value
        yield v

which results in the behaviour you are expecting. However, if you were inlining an ordinary function, that's not how you would handle a return value -- rather, you'd just replace the return by a statement that assigns the return value to wherever it needs to go. Using that strategy, we get

    def outer():
        try:
            yield 1
        except ValueError:
            pass
        v = 2
        yield v

That's closer to what I have in mind when I talk about "inlining" in the PEP. I realize that this is probably not exactly what the current expansion specifies. I'm working on a new one to fix issues like this.
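For reference, the semantics ultimately adopted in PEP 380 (and shipped in Python 3.3) resolve this case the way Jacob expected: the thrown-in ValueError is delegated to the subgenerator, which handles it and returns, and the return value appears at the delegating generator's next yield. His example, rendered in Python 3 syntax:

```python
def inner():
    try:
        yield 1
    except ValueError:
        pass
    return 2  # in Python 3 this becomes StopIteration(2)

def outer():
    v = yield from inner()  # v receives inner()'s return value
    yield v

g = outer()
print(next(g))              # 1
print(g.throw(ValueError))  # 2: the throw is delegated to inner(),
                            # which returns, and outer() yields v
```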
-- Greg From greg.ewing at canterbury.ac.nz Thu Mar 26 07:00:57 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 18:00:57 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAE0DB.3090104@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> Message-ID: <49CB1A19.2000305@canterbury.ac.nz> Nick Coghlan wrote: > I like it because the rules are simple: either an exception is thrown in > and passed down to the subiterator (which may have the effect of > finalising it), or else the subiterator is left alone (to be finalised > either explicitly or implicitly when it is deleted). Okay, so you're in favour of accepting the risk of prematurely finalizing shared subiterators, on the grounds that it can be prevented using a wrapper in the rare cases where it matters. I can live with that, and in fact it's more or less where my most recent thinking has been leading me. > I like it because the rules are simple: either an exception is thrown in > and passed down to the subiterator (which may have the effect of > finalising it), or else the subiterator is left alone (to be finalised > either explicitly or implicitly when it is deleted). 
We might still want one special case. If GeneratorExit is thrown and the subiterator has no throw() or the GeneratorExit propagates back out of the throw(), I think an attempt should be made to close() it. Otherwise, explicitly closing the delegating generator wouldn't be guaranteed to finalize the subiterator unless it had a throw() method, whereas one would expect having close() to be sufficient for this. -- Greg From ncoghlan at gmail.com Thu Mar 26 07:32:28 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Mar 2009 16:32:28 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB1A19.2000305@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> Message-ID: <49CB217C.6000603@gmail.com> Greg Ewing wrote: > We might still want one special case. If GeneratorExit is thrown > and the subiterator has no throw() or the GeneratorExit propagates > back out of the throw(), I think an attempt should be made to > close() it. 
Otherwise, explicitly closing the delegating generator > wouldn't be guaranteed to finalize the subiterator unless it had > a throw() method, whereas one would expect having close() to be > sufficient for this. I'm not so sure about that - we don't do it for normal iteration, so why would we do it for the new expression? However, I've been pondering the shareable iterator case a bit more, and in trying to come up with even a toy example, I couldn't think of anything that wouldn't be better handled just by actually *iterating* over the shared iterator with a for loop. Since the main advantage that the new expression has over simple iteration is delegating send() and throw() correctly, and I'm suggesting that shared iterators and those two methods don't mix, perhaps this whole issue can be set aside? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From rhamph at gmail.com Thu Mar 26 07:42:09 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 26 Mar 2009 00:42:09 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <200903261058.59164.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> Message-ID: On Wed, Mar 25, 2009 at 5:58 PM, Steven D'Aprano wrote: > On Thu, 26 Mar 2009 08:41:27 am Marcin 'Qrczak' Kowalczyk wrote: >> Why anyone would care? Orderings possible to obtain from a given good >> random number generator are quite uniformly distributed among all >> orderings. > > Yes, that holds true for n <= 2080, since Fisher-Yates is an unbiased > shuffler. But I don't think it remains true for n > 2080 since the vast > majority of possible permutations have probability zero. 
> I'm not saying that this absolutely *will* introduce statistical bias
> into the shuffled lists, but it could, and those who care about that
> risk shouldn't have to read the source code to learn this.

If random.shuffle() is broken for lists of more than 2080 items then it should raise an error. Claiming it "might" be broken in the docs for moderately sized lists, without researching such a claim, is pointless fear-mongering.

>> I bet you can't even predict any particular ordering which
>> is impossible to obtain.
>
> A moral dilemma... should I take advantage of your innumeracy by taking
> you up on that bet, or should I explain why that bet is a sure thing
> for me? *wink*
>
> Since the chances of me collecting on the bet is essentially near zero,
> I'll explain.
>
> For a list with 2082 items, shuffle() chooses from a subset of
> approximately 0.001% of all possible permutations. This means that if I
> give you a list of 2082 items and tell you to shuffle it, and then
> guess that such-and-such a permutation of it will never be reached, I
> can only lose if by chance I guessed on the 1 in 100,000 permutations
> that shuffle() can reach. I have 99,999 chances to win versus 1 to
> lose: that's essentially a sure thing.
>
> In practical terms, beyond (say) 2085 or so, it would be a bona fide
> miracle if I didn't win such a bet.

Go ahead, pick a combination, then iterate through all 2**19937-1 permutations to prove you're correct. Don't worry, we can wait. Of course a stronger analysis technique can prove it much quicker than brute force, but it's not a cryptographically secure PRNG, there's LOTS of information that can be found through such techniques. So far the 2080 limit is random trivia, nothing more. It has no real significance, imposes no new threats, and does not change how correct code is written.
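For concreteness, the Fisher-Yates shuffle under discussion is only a few lines; it is unbiased in the sense that every permutation is equally likely *provided* the underlying RNG is ideal. This is a sketch mirroring the structure of random.shuffle(), not CPython's exact code:

```python
import random

def fisher_yates(seq, rng=random):
    # In-place shuffle: each position i swaps with a uniformly
    # chosen position j <= i, working from the end down.
    for i in range(len(seq) - 1, 0, -1):
        j = rng.randrange(i + 1)
        seq[i], seq[j] = seq[j], seq[i]
    return seq

print(fisher_yates(list(range(5)), random.Random(42)))
```

Given a fixed RNG state the result is deterministic, which is why the number of reachable permutations can never exceed the number of RNG states.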
-- Adam Olsen, aka Rhamphoryncus

From arnodel at googlemail.com Thu Mar 26 09:31:24 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Thu, 26 Mar 2009 08:31:24 +0000 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <200903261058.59164.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> Message-ID: <9bfc700a0903260131o4d209ca7m4845d12fff10079b@mail.gmail.com>

2009/3/25 Steven D'Aprano :
> On Thu, 26 Mar 2009 08:41:27 am Marcin 'Qrczak' Kowalczyk wrote:
>> On Wed, Mar 25, 2009 at 13:28, Steven D'Aprano wrote:
>> I bet you can't even predict any particular ordering which
>> is impossible to obtain.
>
> A moral dilemma... should I take advantage of your innumeracy by taking
> you up on that bet, or should I explain why that bet is a sure thing
> for me? *wink*

Your challenge was to exhibit a particular permutation which the algorithm will not generate. For good measure I think you should also attach a proof that it won't be generated (since there isn't enough time or, probably, space, to test it).
-- Arnaud From greg.ewing at canterbury.ac.nz Thu Mar 26 09:36:58 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 20:36:58 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB217C.6000603@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> Message-ID: <49CB3EAA.7090909@canterbury.ac.nz> Nick Coghlan wrote: > I'm not so sure about that - we don't do it for normal iteration, so why > would we do it for the new expression? Because of the inlining principle. If you inline a subgenerator, the result is just a single generator, and closing it finalizes the whole thing. > Since the main advantage that the new expression has over simple > iteration is delegating send() and throw() correctly, and I'm suggesting > that shared iterators and those two methods don't mix, perhaps this > whole issue can be set aside? Sounds good to me. -- Greg From stephen at xemacs.org Thu Mar 26 11:31:16 2009 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Thu, 26 Mar 2009 19:31:16 +0900 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> Message-ID: <87iqlwy3rf.fsf@xemacs.org> Adam Olsen writes: > If random.shuffle() is broken for lists more than 2080 then it should > raise an error. Not really. Assuming the initial state is drawn from a uniform distribution on all possible states, if all 2080-shuffles are equiprobable, then any statistic you care to calculate based on that will come out the same as if you had 2080 statistically independent draws without replacement. Another way to put it is "if you need a random shuffle, this one is good enough for *any* such purpose". However, once you exceed that limit you have to ask whether it's good enough for the purpose at hand. For some purposes, the distribution of (2**19937-1)-shuffles might be good enough, even though they make up only 1/(2**19937-2) of the possible shuffles. (Yeah, I know, you can wait....) > Claiming it "might" be broken in the docs for moderately sized > lists, without researching such a claim, is pointless fear > mongering. How about if it's phrased the way I did above? Ie, this is good enough for any N-shuffle for *any purpose whatsoever*, for N < 2081. 
From ncoghlan at gmail.com Thu Mar 26 11:38:29 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Mar 2009 20:38:29 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB3EAA.7090909@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> Message-ID: <49CB5B25.4070105@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> I'm not so sure about that - we don't do it for normal iteration, so why >> would we do it for the new expression? > > Because of the inlining principle. If you inline a > subgenerator, the result is just a single generator, > and closing it finalizes the whole thing. That makes perfect sense to me as a justification for treating GeneratorExit the same as any other exception (i.e. delegating it to the subgenerator). It doesn't lead me to think that the semantics ever need to involve calling close(). Cheers, Nick. 
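For reference, the behaviour Greg argues for is what the `yield from` construct eventually standardized in Python 3.3 (PEP 380) does: closing the delegating generator does finalize the subgenerator. A minimal sketch in the final syntax, not the draft expansion being debated here:

```python
log = []

def sub():
    try:
        while True:
            yield
    finally:
        log.append("sub finalized")

def outer():
    yield from sub()

g = outer()
next(g)     # suspend inside sub()
g.close()   # GeneratorExit reaches the suspended sub(), running its finally
print(log)  # ['sub finalized']
```

The `finally` clause of the subgenerator runs even though only the outer generator was closed, which is the "inlining principle" in action.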
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jh at improva.dk Thu Mar 26 15:16:42 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 26 Mar 2009 15:16:42 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB155E.4040504@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> Message-ID: <49CB8E4A.3050108@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: > >> Can you explain in a little more detail how the inlining argument >> makes you want to not catch a StopIteration escaping from throw? > [snip explanation] Thank you very much for the clear explanation. It seems each of us were missing something. AFAICT your latest expansion (reproduced below) fixes this. I have a few (final, I hope) nits to pick about the finally clause. To start with there is no need for a separate "try". Just adding the finally clause to the next try..except..else has the exact same semantics. Then there is the contents of the finally clause. 
It is either too much or too little, depending on what it is you are trying to specify. If the intent is to show that the last reference from the expansion to _i disappears here, it fails because _m is likely to hold a reference as well. In any case I don't see a reason to single out _i for deletion. I suggest just dropping the finally clause altogether to make it clear that we are not promising any finalization beyond what is explicit in the rest of the code. - Jacob
------------------------------------------------------------------------
_i = iter(EXPR)
try:
    try:
        _y = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _s = yield _y
            except:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _x = sys.exc_info()
                    try:
                        _y = _m(*_x)
                    except StopIteration, _e:
                        if _e is _x[1]:
                            raise
                        else:
                            _r = _e.value
                            break
                else:
                    _m = getattr(_i, 'close', None)
                    if _m is not None:
                        _m()
                    raise
            else:
                try:
                    if _s is None:
                        _y = _i.next()
                    else:
                        _y = _i.send(_s)
                except StopIteration, _e:
                    _r = _e.value
                    break
finally:
    del _i
RESULT = _r
From grosser.meister.morti at gmx.net Thu Mar 26 16:07:25 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Thu, 26 Mar 2009 16:07:25 +0100 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: <49ABCF35.5030002@molden.no> References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> Message-ID: <49CB9A2D.4000905@gmx.net> Sturla Molden wrote: > On 3/1/2009 9:57 PM, Christian Heimes wrote: > >> with a, b as x, d as y: > > I'd like to add that parentheses improve readability here: > > with a, (b as x), (d as y): > > I am worried the proposed syntax could be a source of confusion and > errors. E.g. when looking at > > with a,b as c,d: > > my eyes read > > with nested(a,b) as c,d: > > when Python would read > > with a,(b as c),d: > > Good point.
Maybe that would be better: with a,b as c,d: reads as: with nested(a,b) as c,d: This means there can only be one "as" in a with statement with the further implication that even unneeded values have to be assigned: with a,b,c as x,unused,y: Not as nice, but much more unambiguous. Unambiguity is what we need, I think. You can always assign to _, which is very commonly used for unneeded values (well, or for the l10n hook - so using that name would not be very unambiguous). -panzi From guido at python.org Thu Mar 26 16:14:44 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 26 Mar 2009 08:14:44 -0700 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: <49CB9A2D.4000905@gmx.net> References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> <49CB9A2D.4000905@gmx.net> Message-ID: On Thu, Mar 26, 2009 at 8:07 AM, Mathias Panzenböck wrote: > Sturla Molden wrote: >> On 3/1/2009 9:57 PM, Christian Heimes wrote: >> >>> with a, b as x, d as y: >> >> I'd like to add that parentheses improve readability here: >> >> with a, (b as x), (d as y): >> >> I am worried the proposed syntax could be a source of confusion and >> errors. E.g. when looking at >> >> with a,b as c,d: >> >> my eyes read >> >> with nested(a,b) as c,d: >> >> when Python would read >> >> with a,(b as c),d: >> >> > > Good point. Maybe that would be better: > > with a,b as c,d: > > reads as: > > with nested(a,b) as c,d: > > This means there can only be one "as" in a with statement with the further > implication that even unneeded values have to be assigned: > > with a,b,c as x,unused,y: > > Not as nice, but much more unambiguous. Unambiguity is what we need, I > think. You can always assign to _, which is very commonly used for unneeded > values (well, or for the l10n hook - so using that name would not be very > unambiguous).
No, we should maintain the parallel with "import a, b as c, d". -- --Guido van Rossum (home page: http://www.python.org/~guido/) From sturla at molden.no Thu Mar 26 16:42:02 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 26 Mar 2009 16:42:02 +0100 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: <49CB9A2D.4000905@gmx.net> References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> <49CB9A2D.4000905@gmx.net> Message-ID: <49CBA24A.1050502@molden.no> On 3/26/2009 4:07 PM, Mathias Panzenböck wrote: > Good point. Maybe that would be better: > > with a,b as c,d: No. See Guido's reply. I was just trying to say that with nested(a,b) as c,d: is more readable than with a as c, b as d: which would argue against new syntax and better documentation of contextlib.nested. However, as the tuple (a,b) is built prior to the call to nested, new syntax is needed. It still does not hurt to put in parentheses for readability here: with (a as c), (b as d): Perhaps parentheses should be recommended in the documentation, even though they are syntactically superfluous here? Sturla Molden From guido at python.org Thu Mar 26 17:33:45 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 26 Mar 2009 09:33:45 -0700 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: <49CBA24A.1050502@molden.no> References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> <49CB9A2D.4000905@gmx.net> <49CBA24A.1050502@molden.no> Message-ID: On Thu, Mar 26, 2009 at 8:42 AM, Sturla Molden wrote: > On 3/26/2009 4:07 PM, Mathias Panzenböck wrote: >> Good point. Maybe that would be better: >> >> with a,b as c,d: > > No. See Guido's reply.
> > I was just trying to say that > > with nested(a,b) as c,d: > > is more readable than > > with a as c, b as d: > > which would argue against new syntax and better documentation of > contextlib.nested. However, as the tuple (a,b) is built prior to the call to > nested, new syntax is needed. > > It still does not hurt to put in parentheses for readability here: > > with (a as c), (b as d): > > > Perhaps parentheses should be recommended in the documentation, even though > they are syntactically superfluous here? No, the parens will be syntactically *illegal*. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Thu Mar 26 18:43:35 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 26 Mar 2009 11:43:35 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <87iqlwy3rf.fsf@xemacs.org> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: On Thu, Mar 26, 2009 at 4:31 AM, Stephen J. Turnbull wrote: > Adam Olsen writes: > > > If random.shuffle() is broken for lists more than 2080 then it should > > raise an error. > > Not really. Assuming the initial state is drawn from a uniform > distribution on all possible states, if all 2080-shuffles are > equiprobable, then any statistic you care to calculate based on that > will come out the same as if you had 2080 statistically independent > draws without replacement. Another way to put it is "if you need a > random shuffle, this one is good enough for *any* such purpose". > > However, once you exceed that limit you have to ask whether it's good > enough for the purpose at hand. For some purposes, the distribution > of (2**19937-1)-shuffles might be good enough, even though they make > up only 1/(2**19937-2) of the possible shuffles.
(Yeah, I know, you > can wait....) Is it or is it not broken? That's all I want to know. "maybe" isn't good enough. "Not broken for small lists" implies it IS broken for large lists. Disabling it (raising an exception for large lists) is of course just a stopgap measure. Better would be a PRNG with a much larger period.. but of course that'd require more CPU time and more seed. -- Adam Olsen, aka Rhamphoryncus From tim.peters at gmail.com Thu Mar 26 18:55:22 2009 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 26 Mar 2009 13:55:22 -0400 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <1f7befae0903261055l35a845acr17529d78efcf136e@mail.gmail.com> [Adam Olsen] > Is it or is it not broken? That's all I want to know. Then you first need to define what "broken" means to you. Anything short of a source of /true/ random numbers is "broken" for /some/ purposes. Python's current generator is not broken for any purposes I care about, so my answer to your question is "no" -- but only if I ask your question of myself ;-) > Better would be a PRNG with a much larger period.. Not really. A larger period is necessary but not sufficient /if/ you're concerned about generating all permutations of bigger lists with equal probability -- see the old thread someone else pointed to for more info on that. The Mersenne Twister's provably superb "high-dimensional equidistribution" properties are far more important than its long period in this respect (the former is sufficient; the latter is merely necessary).
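Tim's equidistribution point is easy to sanity-check empirically at small sizes, where every ordering should turn up with equal frequency (a quick experiment, not a substitute for the proofs):

```python
import random
from collections import Counter

# Count how often each ordering of a 3-element list appears.
counts = Counter()
for _ in range(60000):
    items = [0, 1, 2]
    random.shuffle(items)
    counts[tuple(items)] += 1

print(sorted(counts.values()))  # six counts, each close to 60000 / 6 = 10000
```

All 3! = 6 orderings appear with deviations consistent with chance; the dispute is only about list lengths where n! dwarfs the generator's state space.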
From phd at phd.pp.ru Thu Mar 26 19:01:33 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 26 Mar 2009 21:01:33 +0300 Subject: [Python-ideas] About adding a new iterator method called "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <20090326180133.GB7849@phd.pp.ru> On Thu, Mar 26, 2009 at 11:43:35AM -0600, Adam Olsen wrote: > Is it or is it not broken? That's all I want to know. "maybe" isn't > good enough. "Not broken for small lists" implies it IS broken for > large lists. Practicality beats purity, IMHO. If shuffle cannot process a list I cannot even fit into virtual memory - I don't care, really. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From jh at improva.dk Thu Mar 26 19:18:46 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 26 Mar 2009 19:18:46 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <49CBC706.3060504@improva.dk> Oleg Broytmann wrote: > On Thu, Mar 26, 2009 at 11:43:35AM -0600, Adam Olsen wrote: > >> Is it or is it not broken? That's all I want to know. "maybe" isn't >> good enough. "Not broken for small lists" implies it IS broken for >> large lists. >> > > Practicality beats purity, IMHO. If shuffle cannot process a list > I cannot even fit into virtual memory - I don't care, really. > > Oleg. > A list of 2081 items certainly fits into the memory of my machine. There is a very clear sense in which it is broken for lists > 2080 items. 
It *may* be broken in other ways and for certain use cases for shorter lists; I don't know enough about the properties of the PRNG to say anything about that. I *think* someone mentioned that it was equidistributed in 623 dimensions and that this should mean it is as good as possible for any PRNG for any list up to 623 items. If this is true, it would be nice to have a note about it in the docs. Documented limitations are always better than undocumented ones. (Explicit is better than implicit...) - Jacob From amcnabb at mcnabbs.org Thu Mar 26 18:54:17 2009 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Thu, 26 Mar 2009 11:54:17 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <20090326175417.GT2948@mcnabbs.org> On Thu, Mar 26, 2009 at 11:43:35AM -0600, Adam Olsen wrote: > > Is it or is it not broken? That's all I want to know. "maybe" isn't > good enough. "Not broken for small lists" implies it IS broken for > large lists. > > Disabling it (raising an exception for large lists) is of course just > a stopgap measure. Better would be a PRNG with a much larger period.. > but of course that'd require more CPU time and more seed. It's only broken in a theoretical sense. It's fun to think about, but I wouldn't lose any sleep over it.
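The theoretical sense in question can be quantified: distinguishing all n! orderings of an n-item list requires log2(n!) bits of generator state or seed entropy. A rough sketch of the numbers:

```python
import math

def seed_bytes(n):
    # log2(n!) bits are needed to index n! permutations uniformly
    bits = math.lgamma(n + 1) / math.log(2)  # ln(n!) = lgamma(n + 1)
    return bits / 8

print(seed_bytes(2081))           # about 2.5 KB, just past the threshold
print(seed_bytes(10**6) / 2**20)  # about 2.2 MiB for a million-item list
```

So even a PRNG with a much larger period only helps if the seed itself carries that much entropy.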
-- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From python at rcn.com Thu Mar 26 19:36:28 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 26 Mar 2009 11:36:28 -0700 Subject: [Python-ideas] About adding a newiteratormethodcalled "shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><200903252328.49177.steve@pearwood.info><3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com><200903261058.59164.steve@pearwood.info><87iqlwy3rf.fsf@xemacs.org> <20090326175417.GT2948@mcnabbs.org> Message-ID: <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> >> Is it or is it not broken? That's all I want to know. "maybe" isn't >> good enough. "Not broken for small lists" implies it IS broken for >> large lists. >> >> Disabling it (raising an exception for large lists) is of course just >> a stopgap measure. Better would be a PRNG with a much larger period.. >> but of course that'd require more CPU time and more seed. > > It's only broken in a theoretical sense. It's fun to think about, but I > wouldn't lose any sleep over it. It's not even broken in a theoretical sense. It does exactly what it says it does. Besides, this whole conversation is somewhat senseless. You can't get any more randomness out of a generator than you put into the seed in the first place. If you're not putting thousands of digits in your seed, then no PRNG is going to give you an equal chance of producing every possible shuffle for a large list. Raymond "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." 
-- John von Neumann From rhamph at gmail.com Thu Mar 26 19:48:20 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 26 Mar 2009 12:48:20 -0600 Subject: [Python-ideas] About adding a newiteratormethodcalled "shuffled" In-Reply-To: <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <20090326175417.GT2948@mcnabbs.org> <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> Message-ID: On Thu, Mar 26, 2009 at 12:36 PM, Raymond Hettinger wrote: > It's not even broken in a theoretical sense. ?It does exactly what it says > it does. > > Besides, this whole conversation is somewhat senseless. ?You can't get any > more randomness out of a generator than you put into the seed in the > first place. ?If you're not putting thousands of digits in your seed, then > no PRNG is going to give you an equal chance of producing every possible > shuffle for a large list. Indeed, a million item list requires over 2 megabytes of entropy. -- Adam Olsen, aka Rhamphoryncus From amcnabb at mcnabbs.org Thu Mar 26 19:50:45 2009 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Thu, 26 Mar 2009 12:50:45 -0600 Subject: [Python-ideas] About adding a newiteratormethodcalled "shuffled" In-Reply-To: <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> References: <20090326175417.GT2948@mcnabbs.org> <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> Message-ID: <20090326185045.GU2948@mcnabbs.org> On Thu, Mar 26, 2009 at 11:36:28AM -0700, Raymond Hettinger wrote: > >> It's only broken in a theoretical sense. It's fun to think about, but I >> wouldn't lose any sleep over it. > > It's not even broken in a theoretical sense. It does exactly what it says it does. > > Besides, this whole conversation is somewhat senseless. 
You can't get any > more randomness out of a generator than you put into the seed in the > first place. If you're not putting thousands of digits in your seed, then > no PRNG is going to give you an equal chance of producing every possible > shuffle for a large list. I agree with you--I just didn't want to make too strong of a statement. I certainly believe that any comment in the docs about this issue would be distracting and unhelpful. -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From jan.kanis at phil.uu.nl Thu Mar 26 21:19:16 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Thu, 26 Mar 2009 21:19:16 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <87iqlwy3rf.fsf@xemacs.org> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Just out of curiosity, would doing l = range(2082) random.shuffle(l) random.shuffle(l) give me (with a high probability) one of those permutations that is unreachable with a single shuffle? If so, I'd presume you could get any shuffle (in case you really cared) by calling random.shuffle repeatedly and reseeding the prng in between. 
From rrr at ronadam.com Thu Mar 26 21:26:13 2009 From: rrr at ronadam.com (Ron Adam) Date: Thu, 26 Mar 2009 15:26:13 -0500 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB3EAA.7090909@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> Message-ID: <49CBE4E5.6010305@ronadam.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> I'm not so sure about that - we don't do it for normal iteration, so why >> would we do it for the new expression? > > Because of the inlining principle. If you inline a > subgenerator, the result is just a single generator, > and closing it finalizes the whole thing. > >> Since the main advantage that the new expression has over simple >> iteration is delegating send() and throw() correctly, and I'm suggesting >> that shared iterators and those two methods don't mix, perhaps this >> whole issue can be set aside? > > Sounds good to me. Just a thought... If the subgenerator does not interact with the generator it is in after it is started, then wouldn't it be as if it replaces the calling generator for the life of the sub generator? 
So instead of in-lining, can it be thought of more like switching-to another generator? Ron From rhamph at gmail.com Thu Mar 26 22:25:58 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 26 Mar 2009 15:25:58 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: On Thu, Mar 26, 2009 at 2:19 PM, Jan Kanis wrote: > Just out of curiosity, would doing > > l = range(2082) > random.shuffle(l) > random.shuffle(l) > > give me (with a high probability) one of those permutations that is > unreachable with a single shuffle? If so, I'd presume you could get > any shuffle (in case you really cared) by calling random.shuffle > repeatedly and reseeding the prng in between. If you reseed, yes. That injects new entropy into the system. As I said though, you can end up needing megabytes of entropy. -- Adam Olsen, aka Rhamphoryncus From tjreedy at udel.edu Thu Mar 26 23:15:44 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 26 Mar 2009 18:15:44 -0400 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> <49CB9A2D.4000905@gmx.net> Message-ID: Guido van Rossum wrote: > No, we should maintain the parallel with" import a, b as c, d". Which should then be mentioned in the doc. From stephen at xemacs.org Fri Mar 27 04:17:47 2009 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Fri, 27 Mar 2009 12:17:47 +0900 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <87eiwjy7qc.fsf@xemacs.org> Adam Olsen writes: > Is it or is it not broken? What is so hard to understand about "depending on the statistical properties you demand, it may be broken and then again it may not?" > That's all I want to know. "maybe" isn't good enough. "If you have to ask, you can't afford it." Ie, you've defined your own answer: it's broken *for you*. The rest of us would like to be allowed to judge for ourselves, though. > "Not broken for small lists" implies it IS broken for large lists. You're being contentious. It logically implies no such thing, nor is it idiomatically an implication among consenting adults. And in any case, the phrasing I recommended is "guaranteed to have uniform distribution of shuffles up to N". The implication of "no guarantee" is "have a mechanic inspect it before you buy", not "this is a lemon". 
From greg.ewing at canterbury.ac.nz Fri Mar 27 05:28:30 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 17:28:30 +1300 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: <49CC55EE.4070208@canterbury.ac.nz> Jan Kanis wrote: > I'd presume you could get > any shuffle (in case you really cared) by calling random.shuffle > repeatedly and reseeding the prng in between. But how are you going to reseed the prng? To get an equal likelihood of any shuffle, you need another prng with a big enough state to generate seeds for your first prng. But then you might just as well shuffle based on that larger prng in the first place. 
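One "larger prng" already in the standard library is `random.SystemRandom`, which feeds the same Fisher-Yates shuffle from the operating system's entropy pool and so has no fixed period at all (at the cost of speed and kernel entropy):

```python
import random

# SystemRandom draws from the OS entropy pool (os.urandom), so the shuffle
# is no longer constrained by any fixed PRNG state size.
sysrand = random.SystemRandom()

items = list(range(5000))
sysrand.shuffle(items)
print(items[:5])
```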
-- Greg From greg.ewing at canterbury.ac.nz Fri Mar 27 06:00:53 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 18:00:53 +1300 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB8E4A.3050108@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> Message-ID: <49CC5D85.30409@canterbury.ac.nz> Jacob Holm wrote: > Just adding the > finally clause to the next try..except..else has the exact same semantics. True -- I haven't quite got used to the idea that you can do that yet! > In any case I don't see a reason to single out _i for deletion. That part seems to be a hangover from an earlier version. You're probably right that it can go. 
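The construction Jacob points out is the unified try statement of PEP 341 (Python 2.5): `except`, `else`, and `finally` can all attach to a single `try`. A small illustration:

```python
def first_plus_one(it):
    try:
        x = next(it)
    except StopIteration:
        x = None
    else:
        x += 1
    finally:
        print("cleanup")  # runs on every path; no separate try needed
    return x

print(first_plus_one(iter([41])))  # cleanup, then 42
print(first_plus_one(iter([])))    # cleanup, then None
```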
-- Greg From greg.ewing at canterbury.ac.nz Fri Mar 27 06:01:08 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 18:01:08 +1300 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB5B25.4070105@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> Message-ID: <49CC5D94.7000609@canterbury.ac.nz> Nick Coghlan wrote: > That makes perfect sense to me as a justification for treating > GeneratorExit the same as any other exception (i.e. delegating it to the > subgenerator). It doesn't lead me to think that the semantics ever need > to involve calling close(). I'm also treating close() and throw(GeneratorExit) on the delegating generator as equivalent for finalization purposes. So if throw(GeneratorExit) doesn't fall back to close() on the subiterator, closing the delegating generator won't finalize the subiterator unless it pretends to be a generator by implementing throw(). Since the inlining principle strictly only applies to subgenerators, it doesn't *require* this behaviour, but to my mind it strongly suggests it. 
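In rough code, the behaviour being argued for is something like this (an added sketch for illustration only -- not the PEP's reference expansion, and the helper name is made up):

```python
def finalize_subiterator(it):
    # Deliver GeneratorExit to a subiterator: prefer throw() if the
    # subiterator has one, otherwise fall back to close() for
    # iterators (like files) that only know how to be closed.
    try:
        throw = it.throw
    except AttributeError:
        close = getattr(it, "close", None)
        if close is not None:
            close()          # plain close()-able iterator: just close it
        raise GeneratorExit  # then let the exception propagate
    throw(GeneratorExit)     # generator-like: delegate the exception

class CloseOnly:
    """An iterator that, like a file, has close() but no throw()."""
    def __init__(self):
        self.closed = False
    def __iter__(self):
        return self
    def __next__(self):
        return 1
    def close(self):
        self.closed = True

it = CloseOnly()
try:
    finalize_subiterator(it)
except GeneratorExit:
    pass
print(it.closed)  # -> True
```
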
-- Greg From lorgandon at gmail.com Fri Mar 27 07:59:34 2009 From: lorgandon at gmail.com (Imri Goldberg) Date: Fri, 27 Mar 2009 09:59:34 +0300 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: On Thu, Mar 26, 2009 at 11:19 PM, Jan Kanis wrote: > Just out of curiosity, would doing > > l = range(2082) > random.shuffle(l) > random.shuffle(l) > > give me (with a high probability) one of those permutations that is > unreachable with a single shuffle? If so, I'd presume you could get > any shuffle (in case you really cared) by calling random.shuffle > repeatedly and reseeding the prng in between. I'm a bit rusty on the math, but that doesn't have to be the case. If all the permutations produced by random.shuffle() form a subgroup, or lie in a subgroup, then what you'll get is just another permutation from that subgroup, regardless of the randomness you put inside. -- Imri Goldberg -------------------------------------- www.algorithm.co.il/blogs/ -------------------------------------- -- insert signature here ---- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arnodel at googlemail.com Fri Mar 27 08:24:23 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 27 Mar 2009 07:24:23 +0000 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: On 27 Mar 2009, at 06:59, Imri Goldberg wrote: > > > On Thu, Mar 26, 2009 at 11:19 PM, Jan Kanis > wrote: > Just out of curiosity, would doing > > l = range(2082) > random.shuffle(l) > random.shuffle(l) > > give me (with a high probability) one of those permutations that is > unreachable with a single shuffle? If so, I'd presume you could get > any shuffle (in case you really cared) by calling random.shuffle > repeatedly and reseeding the prng in between. > > I'm a bit rusty on the math, but that doesn't have to be the case. > If all the permutations produced by random.shuffle() form a > subgroup, or lie in a subgroup, then what you'll get is just another > permutation from that subgroup, regardless of the randomness you put > inside. There is no reason that the set of shuffled permutations (S_n) will form a subgroup of the set of permutations (P_n) and it may well generate the whole of P_n. In fact you only need n transpositions to generate the whole of P_n. However, any function that generates a random permutation is a function from the set of possible states of the PRNG to the set of permutations. Whatever tricks you use, if there are fewer states in the PRNG than there are permutations, you won't be able to reach them all. 
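To put rough numbers on that counting argument (an added illustration; the helper name is made up, and the practical cut-off for random.shuffle depends on how the PRNG is actually seeded):

```python
def max_coverable_list_size(state_bits):
    """Largest n with n! <= 2**state_bits.

    Beyond this n, a PRNG with `state_bits` bits of state has fewer
    states than there are permutations, so some shuffles of an
    n-element list can never be produced in a single call.
    """
    states = 2 ** state_bits
    n, fact = 1, 1          # invariant: fact == n!
    while fact * (n + 1) <= states:
        n += 1
        fact *= n
    return n

# Mersenne Twister's full state is 19937 bits; a 128-bit seed
# covers far fewer states.
print(max_coverable_list_size(19937))  # -> 2080
print(max_coverable_list_size(128))    # -> 34
```

This is why the earlier example used a list of 2082 elements: it is just past the point where even the full 19937-bit state runs out of permutations.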
-- Arnaud From jh at improva.dk Fri Mar 27 11:44:18 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 27 Mar 2009 11:44:18 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: <49CCAE02.8@improva.dk> Arnaud Delobelle wrote: > > On 27 Mar 2009, at 06:59, Imri Goldberg wrote: > >> >> >> On Thu, Mar 26, 2009 at 11:19 PM, Jan Kanis >> wrote: >> Just out of curiosity, would doing >> >> l = range(2082) >> random.shuffle(l) >> random.shuffle(l) >> >> give me (with a high probability) one of those permutations that is >> unreachable with a single shuffle? If so, I'd presume you could get >> any shuffle (in case you really cared) by calling random.shuffle >> repeatedly and reseeding the prng in between. >> >> I'm a bit rusty on the math, but that doesn't have to be the case. If >> all the permutations produced by random.shuffle() form a subgroup, or >> lie in a subgroup, then what you'll get is just another permutation >> from that subgroup, regardless of the randomness you put inside. > > There is no reason that the set of shuffled permutations(S_n) will > form a subgroup of the set of permutations (P_n) and it may well > generate the whole of P_n. In fact you only need n transpositions to > generate the whole of P_n. True, it is extremely likely that the group G_n generated by S_n is P_n and not a subgroup. > > However, any function generates a random permutation is a function > from the set of possible states of the PRNG to the set of > permutations. Whatever tricks you use, if there are fewer states in > the PRNG than there are permutations, you won't be able to reach them > all. 
> You are right, you won't be able to reach them all in a single call to shuffle. However, by repeated shuffling and reseeding like the OP suggested, you can in theory get to all elements of G_n *if you keep shuffling long enough*. Unfortunately you will need at least |G_n|/|S_n| shuffles which means it is not even remotely practical. - Jacob From ncoghlan at gmail.com Fri Mar 27 11:51:37 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 27 Mar 2009 20:51:37 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CC5D94.7000609@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> Message-ID: <49CCAFB9.1090800@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> That makes perfect sense to me as a justification for treating >> GeneratorExit the same as any other exception (i.e. delegating it to the >> subgenerator). It doesn't lead me to think that the semantics ever need >> to involve calling close(). 
> > I'm also treating close() and throw(GeneratorExit) on > the delegating generator as equivalent for finalization > purposes. So if throw(GeneratorExit) doesn't fall back > to close() on the subiterator, closing the delegating > generator won't finalize the subiterator unless it > pretends to be a generator by implementing throw(). > > Since the inlining principle strictly only applies to > subgenerators, it doesn't *require* this behaviour, > but to my mind it strongly suggests it. I believe I already said this at some point, but after realising that shareable subiterators are almost always still going to be better handled by iterating over them rather than delegating to them, I'm actually not too worried one way or the other. While I do still have a slight preference for limiting the methods involved in generator delegation to just next(), send() and throw(), I won't object strenuously to accepting close() as an alternative spelling of throw(exc) that will always reraise the passed in exception. As you say, it does make it easier to write a non-generator delegation target, since implementing close() for finalisation means not having to deal with the vagaries of correctly reraising exceptions. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From greg.ewing at canterbury.ac.nz Fri Mar 27 12:05:01 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 23:05:01 +1200 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CCAE02.8@improva.dk> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> Message-ID: <49CCB2DD.30706@canterbury.ac.nz> Jacob Holm wrote: > However, by repeated shuffling and reseeding like the OP > suggested, you can in theory get to all elements of G_n But then you need a sufficient number of distinct seed values, so you're back to the original problem. -- Greg From greg.ewing at canterbury.ac.nz Fri Mar 27 12:08:25 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 23:08:25 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CCAFB9.1090800@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> 
<49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> Message-ID: <49CCB3A9.40300@canterbury.ac.nz> Nick Coghlan wrote: > As you say, it does make it easier to write a non-generator delegation > target, since implementing close() for finalisation means not having to > deal with the vagaries of correctly reraising exceptions. It also means that existing things with a close method, such as files, can be used without change. Having a close method is a fairly well-established way to make an iterator explicitly finalizable, whereas having a throw method isn't. -- Greg From jh at improva.dk Fri Mar 27 12:45:53 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 27 Mar 2009 12:45:53 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CCB2DD.30706@canterbury.ac.nz> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> <49CCB2DD.30706@canterbury.ac.nz> Message-ID: <49CCBC71.7070905@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> However, by repeated shuffling and reseeding like the OP suggested, >> you can in theory get to all elements of G_n > > But then you need a sufficient number of distinct seed > values, so you're back to the original problem. > Ehr, no. Suppose my PRNG only has period two and the shuffle based on it can only generate the permutations [1, 0, 2] and [2, 1, 0] from [0, 1, 2]. Each time I reseed from a truly random source, the next shuffle will use one of those permutations at random. 
By shuffling and reseeding enough times I can get all combinations of those two permutations. This happens to be all 6 possible permutations of 3 elements. - Jacob From jh at improva.dk Fri Mar 27 12:49:39 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 27 Mar 2009 12:49:39 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CCBC71.7070905@improva.dk> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> <49CCB2DD.30706@canterbury.ac.nz> <49CCBC71.7070905@improva.dk> Message-ID: <49CCBD53.8000903@improva.dk> Jacob Holm wrote: > Greg Ewing wrote: >> Jacob Holm wrote: >>> However, by repeated shuffling and reseeding like the OP suggested, >>> you can in theory get to all elements of G_n >> >> But then you need a sufficient number of distinct seed >> values, so you're back to the original problem. >> > Ehr, no. Suppose my PRNG only has period two and the shuffle based on > it can only generate the permutations [1, 0, 2] and [2, 1, 0] from [0, > 1, 2]. Each time I reseed from a truly random source, the next shuffle > will use one of those permutations at random. By shuffling and > reseeding enough times I can get all combinations of those two > permutations. This happens to be all 6 possible permutations of 3 > elements. > Ok, I may have misinterpreted your statement. Yes, you need to reseed a lot. You just don't need the seeds to be different. 
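This is easy to check by brute force (an added illustration; the two index lists are exactly the permutations from the example above):

```python
from itertools import permutations

def apply_perm(perm, seq):
    # result[i] = seq[perm[i]], i.e. rearrange seq according to perm
    return tuple(seq[i] for i in perm)

shuffles = [(1, 0, 2), (2, 1, 0)]   # the two permutations in the example
reachable = {(0, 1, 2)}             # start from the identity arrangement
frontier = [(0, 1, 2)]
while frontier:                     # closure under repeated shuffling
    seq = frontier.pop()
    for s in shuffles:
        nxt = apply_perm(s, seq)
        if nxt not in reachable:
            reachable.add(nxt)
            frontier.append(nxt)

print(len(reachable))  # -> 6, i.e. all of the permutations of 3 elements
assert reachable == set(permutations((0, 1, 2)))
```
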
- Jacob From ncoghlan at gmail.com Fri Mar 27 13:17:07 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 27 Mar 2009 22:17:07 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CCB3A9.40300@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> <49CCB3A9.40300@canterbury.ac.nz> Message-ID: <49CCC3C3.4050100@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> As you say, it does make it easier to write a non-generator delegation >> target, since implementing close() for finalisation means not having to >> deal with the vagaries of correctly reraising exceptions. > > It also means that existing things with a close > method, such as files, can be used without change. > > Having a close method is a fairly well-established > way to make an iterator explicitly finalizable, > whereas having a throw method isn't. But then we're back to the point that if someone *wants* deterministic finalisation, then that's why the with statement exists. 
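For instance (an added sketch, not from the original message), deterministic finalisation of a file-backed iterator needs nothing beyond the existing tools:

```python
import io
from contextlib import closing

def read_all(f):
    # Explicit, deterministic finalisation with the construct designed
    # for it, instead of relying on the delegation expression to
    # close the file implicitly.
    with closing(f):
        for line in f:
            yield line

f = io.StringIO("spam\neggs\n")
print(list(read_all(f)))  # -> ['spam\n', 'eggs\n']
print(f.closed)           # -> True
```
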
The part that isn't clicking for me is that I still don't understand *why* 'yield from' should include implicit finalisation as part of its definition. The full delegation of next(), send() and throw() I get completely (since that's the whole point of the new expression). The fact that that *also* ends up delegating the close() method of generators in particular also makes sense (as it's a natural consequence of delegating the first three methods). It's the generalisation of that to all other iterators that happen to offer a close() method that seems somewhat arbitrary. Other than the fact that generators happen to provide a close() method that invokes throw(), it appears to have nothing to do with generator delegation and hence seems like a fairly random addition to the PEP. Using a file as the subiterator is an interesting case in point (and perhaps an interesting exploration as to when a shareable subiterator may make sense: if a subiterator offers separate reading and writing APIs, then those can be exposed as separate generators):

class YieldingFile:
    # Mixing reads and writes with this strawman
    # version would be a rather bad idea :)
    EOF = object()

    def __init__(self, f):
        self.f = f

    def read_all(self):
        self.f.seek(0)
        yield from self.f

    def append_lines(self):
        self.f.seek(0, 2)
        lines_written = 0
        while 1:
            line = yield
            if line == self.EOF:
                break
            self.f.write(line)
            lines_written += 1
        return lines_written

The problem I see with the above is that with the current specification in the PEP, the read_all() implementation is outright broken rather than merely redundant (it is obviously wasteful, since it could just return self.f instead of yielding from it - but it is far from clear that it should be broken rather than just pointlessly slow). The first use of read_all() will implicitly close the file when it is finished - that seems totally nonobvious to me. 
It strikes me as simpler all round to leave the deterministic finalisation to the tool that was designed for the task, and let the new expression focus solely on correct delegation to subgenerators without worrying too much about other iterators. Sure, there are plenty of ways to avoid the implicit finalisation if you want to, but I'm still not convinced the "oh, you don't support throw() so I will fall back to close() instead" fallback behaviour is a particularly good idea. (It isn't a dealbreaker for me though - I still support the PEP overall, even though I'm -0 on this particular aspect of it). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From mrts.pydev at gmail.com Fri Mar 27 16:26:52 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Fri, 27 Mar 2009 17:26:52 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: Appending query parameters to a URL is a very common need. However, there's nothing in urllib.parse (and older urlparse) that caters for that need. Therefore, I propose adding the following to 2.7 and 3.1 in the respective libs: def add_query_params(url, **params): """ Adds additional query parameters to the given url, preserving original parameters. Usage: >>> add_query_params('http://foo.com', a='b') 'http://foo.com?a=b' >>> add_query_params('http://foo.com?a=b', b='c', d='q') 'http://foo.com?a=b&b=c&d=q' The real implementation should be more strict, e.g. 
raise on the following: >>> add_query_params('http://foo.com?a=b', a='b') 'http://foo.com?a=b&a=b' """ if not params: return url encoded = urllib.urlencode(params) url = urlparse.urlparse(url) return urlparse.urlunparse((url.scheme, url.netloc, url.path, url.params, (encoded if not url.query else url.query + '&' + encoded), url.fragment)) -------------- next part -------------- An HTML attachment was scrubbed... URL: From venkat83 at gmail.com Fri Mar 27 16:55:02 2009 From: venkat83 at gmail.com (Venkatraman S) Date: Fri, 27 Mar 2009 21:25:02 +0530 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: On Fri, Mar 27, 2009 at 8:56 PM, Mart S?mermaa wrote: > > > Usage: > >>> add_query_params('http://foo.com', a='b') > 'http://foo.com?a=b' > >>> add_query_params('http://foo.com?a=b', b='c', d='q') > 'http://foo.com?a=b&b=c&d=q' > > The real implementation should be more strict, e.g. raise on the > following: > >>> add_query_params('http://foo.com?a=b', a='b') > 'http://foo.com?a=b&a=b' > Well, this is not 'generic' - for eg. in Django sites the above would not be applicable. -V- http://twitter.com/venkasub -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrts.pydev at gmail.com Fri Mar 27 17:00:43 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Fri, 27 Mar 2009 18:00:43 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: Why not? 2009/3/27 Venkatraman S > > On Fri, Mar 27, 2009 at 8:56 PM, Mart S?mermaa wrote: > >> >> >> Usage: >> >>> add_query_params('http://foo.com', a='b') >> 'http://foo.com?a=b' >> >>> add_query_params('http://foo.com?a=b', b='c', d='q') >> 'http://foo.com?a=b&b=c&d=q' >> >> The real implementation should be more strict, e.g. 
raise on the >> following: >> >>> add_query_params('http://foo.com?a=b', a='b') >> 'http://foo.com?a=b&a=b' >> > > Well, this is not 'generic' - for eg. in Django sites the above would not > be applicable. > > -V- > http://twitter.com/venkasub > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From venkat83 at gmail.com Fri Mar 27 17:05:51 2009 From: venkat83 at gmail.com (Venkatraman S) Date: Fri, 27 Mar 2009 21:35:51 +0530 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: On Fri, Mar 27, 2009 at 9:30 PM, Mart S?mermaa wrote: > Why not? > > 2009/3/27 Venkatraman S > >> >> On Fri, Mar 27, 2009 at 8:56 PM, Mart S?mermaa wrote: >> >>> >>> >>> Usage: >>> >>> add_query_params('http://foo.com', a='b') >>> 'http://foo.com?a=b' >>> >>> add_query_params('http://foo.com?a=b', b='c', d='q') >>> 'http://foo.com?a=b&b=c&d=q' >>> >>> The real implementation should be more strict, e.g. raise on the >>> following: >>> >>> add_query_params('http://foo.com?a=b', a='b') >>> 'http://foo.com?a=b&a=b' >>> >> >> Well, this is not 'generic' - for eg. in Django sites the above would not >> be applicable. >> > http://foo.com?a=b != http://foo.com/a/b . Semantically , both are same,but the framework rules are different. Not sure how you would this - by telling urllib that it is a 'pretty' django URL? (or am i missing out something?) -V- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From janssen at parc.com Fri Mar 27 17:13:57 2009 From: janssen at parc.com (Bill Janssen) Date: Fri, 27 Mar 2009 09:13:57 PDT Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: <19919.1238170437@parc.com> Mart S?mermaa wrote: > Appending query parameters to a URL is a very common need. However, there's > nothing in urllib.parse (and older urlparse) that caters for that need. > > Therefore, I propose adding the following to 2.7 and 3.1 in the respective > libs: > >>> add_query_params('http://foo.com?a=b', b='c', d='q') To begin with, I wouldn't use keyword params. They're syntactically more restrictive than the rules for application/x-www-form-urlencoded allow, so you start by ruling out whole classes of URLs. Bill From mrts.pydev at gmail.com Fri Mar 27 17:16:02 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Fri, 27 Mar 2009 18:16:02 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: You are definitely "missing out something". For the use case you describe, there's already ulrjoin(). add_query_params() is for a different use case, i.e. it *complements* urljoin(). 2009/3/27 Venkatraman S > > On Fri, Mar 27, 2009 at 9:30 PM, Mart S?mermaa wrote: > >> Why not? >> >> 2009/3/27 Venkatraman S >> >>> >>> On Fri, Mar 27, 2009 at 8:56 PM, Mart S?mermaa wrote: >>> >>>> >>>> >>>> Usage: >>>> >>> add_query_params('http://foo.com', a='b') >>>> 'http://foo.com?a=b' >>>> >>> add_query_params('http://foo.com?a=b', b='c', d='q') >>>> 'http://foo.com?a=b&b=c&d=q' >>>> >>>> The real implementation should be more strict, e.g. raise on the >>>> following: >>>> >>> add_query_params('http://foo.com?a=b', a='b') >>>> 'http://foo.com?a=b&a=b' >>>> >>> >>> Well, this is not 'generic' - for eg. in Django sites the above would >>> not be applicable. >>> >> > > http://foo.com?a=b != http://foo.com/a/b > . 
> Semantically , both are same,but the framework rules are different. Not > sure how you would this - by telling urllib that it is a 'pretty' django > URL? (or am i missing out something?) > > -V- > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrts.pydev at gmail.com Fri Mar 27 17:17:52 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Fri, 27 Mar 2009 18:17:52 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <19919.1238170437@parc.com> References: <19919.1238170437@parc.com> Message-ID: On Fri, Mar 27, 2009 at 6:13 PM, Bill Janssen wrote: > Mart S?mermaa wrote: > > > Appending query parameters to a URL is a very common need. However, > there's > > nothing in urllib.parse (and older urlparse) that caters for that need. > > > > Therefore, I propose adding the following to 2.7 and 3.1 in the > respective > > libs: > > > >>> add_query_params('http://foo.com?a=b', b='c', d='q') > > To begin with, I wouldn't use keyword params. They're syntactically > more restrictive than the rules for application/x-www-form-urlencoded > allow, so you start by ruling out whole classes of URLs. > > Bill > Valid point, using an ordinary dict instead would resolve that (i.e. def add_query_params(url, param_dict)). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rhamph at gmail.com Fri Mar 27 19:08:46 2009 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 27 Mar 2009 12:08:46 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <87eiwjy7qc.fsf@xemacs.org> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <87eiwjy7qc.fsf@xemacs.org> Message-ID: On Thu, Mar 26, 2009 at 9:17 PM, Stephen J. Turnbull wrote: > Adam Olsen writes: > ?> "Not broken for small lists" implies it IS broken for large lists. > > You're being contentious. ?It logically implies no such thing, nor is > it idiomatically an implication among consenting adults. ?And in any > case, the phrasing I recommended is "guaranteed to have uniform > distribution of shuffles up to N". ?The implication of "no guarantee" > is "have a mechanic inspect it before you buy", not "this is a lemon". We'll have to agree to disagree there. The irony is that we only seed with 128 bits, so rather than 2**19937 combinations, there's just 2**128. That drops our "safe" list size down to 34. Weee! -- Adam Olsen, aka Rhamphoryncus From arnodel at googlemail.com Fri Mar 27 19:49:09 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 27 Mar 2009 18:49:09 +0000 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> Message-ID: On 27 Mar 2009, at 16:17, Mart S?mermaa wrote: > On Fri, Mar 27, 2009 at 6:13 PM, Bill Janssen > wrote: > Mart S?mermaa wrote: > > > Appending query parameters to a URL is a very common need. > However, there's > > nothing in urllib.parse (and older urlparse) that caters for that > need. 
> > > > Therefore, I propose adding the following to 2.7 and 3.1 in the > respective > > libs: > > > >>> add_query_params('http://foo.com?a=b', b='c', d='q') > > To begin with, I wouldn't use keyword params. They're syntactically > more restrictive than the rules for application/x-www-form-urlencoded > allow, so you start by ruling out whole classes of URLs. > > Bill > > Valid point, using an ordinary dict instead would resolve that (i.e. > def add_query_params(url, param_dict)). Note that it's still not general enough as query fields can be repeated, e.g. http://foo.com/search/?q=spam&q=eggs -- Arnaud From eric at trueblade.com Fri Mar 27 19:54:56 2009 From: eric at trueblade.com (Eric Smith) Date: Fri, 27 Mar 2009 13:54:56 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> Message-ID: <49CD2100.3070502@trueblade.com> Arnaud Delobelle wrote: > > On 27 Mar 2009, at 16:17, Mart S?mermaa wrote: > >> On Fri, Mar 27, 2009 at 6:13 PM, Bill Janssen wrote: >> Mart S?mermaa wrote: >> >> > Appending query parameters to a URL is a very common need. However, >> there's >> > nothing in urllib.parse (and older urlparse) that caters for that need. >> > >> > Therefore, I propose adding the following to 2.7 and 3.1 in the >> respective >> > libs: >> >> > >>> add_query_params('http://foo.com?a=b', b='c', d='q') >> >> To begin with, I wouldn't use keyword params. They're syntactically >> more restrictive than the rules for application/x-www-form-urlencoded >> allow, so you start by ruling out whole classes of URLs. >> >> Bill >> >> Valid point, using an ordinary dict instead would resolve that (i.e. >> def add_query_params(url, param_dict)). > > Note that it's still not general enough as query fields can be repeated, > e.g. > > http://foo.com/search/?q=spam&q=eggs > It's also possible that the order matters. 
I think an iterable of tuples (such as returned by dict.items(), but any iterable will do) would be an okay interface. From jjb5 at cornell.edu Fri Mar 27 20:29:52 2009 From: jjb5 at cornell.edu (Joel Bender) Date: Fri, 27 Mar 2009 15:29:52 -0400 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49CD2100.3070502@trueblade.com> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> Message-ID: <49CD2930.4080307@cornell.edu> > It's also possible that the order matters. I think an iterable of tuples > (such as returned by dict.items(), but any iterable will do) would be an > okay interface. Ordered dict then :-) From jared.grubb at gmail.com Fri Mar 27 21:46:18 2009 From: jared.grubb at gmail.com (Jared Grubb) Date: Fri, 27 Mar 2009 13:46:18 -0700 Subject: [Python-ideas] [Python-Dev] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: <2D10F39F-D5E3-4C72-B5A5-89ED4A351610@gmail.com> (This is a reply to Joe's post on python-dev) That looks like a good solution. The downside I see with your rules is that combinations like "~+~-~ +~-" would still be valid, but if people want to write obfuscated code, there are always ways to do it. Forbidding the examples that you gave (and the ones I gave) is still a positive move, in my opinion. Jared On 27 Mar 2009, at 12:15, Joe Smith wrote: > Jared Grubb wrote: >> I'm not a EBNF expert, but it seems that we could modify the >> grammar to be more restrictive so the above code would not be >> silently valid. E.g., "++5" and "1+++5" and "1+-+5" are syntax >> errors, but still keep "1++5", "1+-5", "1-+5" as valid. (Although, >> '~' throws in a kink... should '~-5' be legal? Seems so...) 
> > So you want something like > u_expr ::= > power | "-" xyzzy_expr | "+" xyzzy_expr | "~" u_expr > xyzzy_expr ::= > power | "~" u_expr > > Such that: > 5 # valid u_expr > +5 # valid u_expr > -5 # valid u_expr > ~5 # valid u_expr > ~~5 # valid u_expr > ~+5 # valid u_expr > +~5 # valid u_expr > ~-5 # valid u_expr > -~5 # valid u_expr > +~-5# valid u_expr > > ++5 # not valid u_expr > +-5 # not valid u_expr > -+5 # not valid u_expr > --5 # not valid u_expr > > While, I'm not a python developer, (just a python user) that sounds > reasonable to me, as long as this does not silently change the > meaning of any expression, but only noisily breaks programs, and > that the broken constructs are not used frequently. > > Can anybody come up with any expressions that would silently change > in meaning if the above were applied? > > Obviously a sane name would need to be chosen to replace xyzzy_expr. > From mrts.pydev at gmail.com Fri Mar 27 22:00:47 2009 From: mrts.pydev at gmail.com (Mart Sõmermaa) Date: Fri, 27 Mar 2009 23:00:47 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49CD2930.4080307@cornell.edu> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: As far as I can see, people tend to agree that this is useful. So, unless someone steps up to oppose this, I'll file a feature request to the Python bug tracker. Will propose an implementation based on ordered dict (that will be in 2.7/3.1 anyway). On Fri, Mar 27, 2009 at 9:29 PM, Joel Bender wrote: > It's also possible that the order matters. I think an iterable of tuples >> (such as returned by dict.items(), but any iterable will do) would be an >> okay interface.
>> > Ordered dict then :-) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jared.grubb at gmail.com Fri Mar 27 22:07:16 2009 From: jared.grubb at gmail.com (Jared Grubb) Date: Fri, 27 Mar 2009 14:07:16 -0700 Subject: [Python-ideas] Grammar for plus and minus unary ops Message-ID: (Originally posted to python-dev, discussion moved here per request by GvR; also fixed pseudo-code to not use a keyword as local var) Begin forwarded message: > > I was recently reviewing some Python code for a friend who is a C++ > programmer, and he had code something like this: > > def foo(): > attempt = 0 > while attempt < MAX_TRIES: > ret = bar() > if ret: break > ++attempt > > I was a bit surprised that this was syntactically valid, and because > the timeout condition only occurred in exceptional cases, the error > has not yet caused any problems. > > It appears that the grammar treats the above example as the unary + > op applied twice: > > u_expr ::= > power | "-" u_expr > | "+" u_expr | "~" u_expr > > Playing in the interpreter, expressions like "1+++++++++5" and "1+- > +-+-+-+-+-5" evaluate to 6. > > I'm not an EBNF expert, but it seems that we could modify the grammar > to be more restrictive so the above code would not be silently > valid. E.g., "++5" and "1+++5" and "1+-+5" are syntax errors, but > still keep "1++5", "1+-5", "1-+5" as valid. (Although, '~' throws in > a kink... should '~-5' be legal? Seems so...) > > Jared > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Fri Mar 27 22:27:43 2009 From: guido at python.org (Guido van Rossum) Date: Fri, 27 Mar 2009 16:27:43 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: 2009/3/27 Mart S?mermaa : > As far as I can see, people tend to agree that this is useful. So, unless > someone steps up to oppose this, I'll file a feature request to the Python > bug tracker. > > Will propose an implementation based on ordered dict (that will be in > 2.7/3.1 anyway). I hope by this you mean you'll provide a patch! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Sat Mar 28 00:36:20 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 27 Mar 2009 19:36:20 -0400 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49CD2930.4080307@cornell.edu> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: Joel Bender wrote: >> It's also possible that the order matters. I think an iterable of >> tuples (such as returned by dict.items(), but any iterable will do) >> would be an okay interface. > > Ordered dict then :-) But that, unlike iterable of tuples, would exclude repeated fields, as in Arnaud's example >Note that it's still not general enough as query fields can be repeated, e.g. 
>http://foo.com/search/?q=spam&q=eggs tjr From greg.ewing at canterbury.ac.nz Sat Mar 28 00:46:57 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 28 Mar 2009 11:46:57 +1200 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CCBC71.7070905@improva.dk> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> <49CCB2DD.30706@canterbury.ac.nz> <49CCBC71.7070905@improva.dk> Message-ID: <49CD6571.1010408@canterbury.ac.nz> Jacob Holm wrote: > Each time I reseed from a truly random source, If you have a "truly random source" on hand, then you have an infinite amount of entropy available and there is no problem. Just feed your truly random numbers straight into the shuffling algorithm. We're talking about the case where you *don't* have truly random numbers, but only a PRNG with a limited amount of internal state. -- Greg From tjreedy at udel.edu Sat Mar 28 00:49:45 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 27 Mar 2009 19:49:45 -0400 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: >> I was recently reviewing some Python code for a friend who is a C++ >> programmer, and he had code something like this: >> >> def foo(): >> attempt = 0 >> while attempt < MAX_TRIES: >> ret = bar() >> if ret: break >> ++attempt >> >> I was a bit surprised that this was syntactically valid, and because >> the timeout condition only occurred in exceptional cases, the error >> has not yet caused any problems. A complete test suite would include such a case ;-).
>> It appears that the grammar treats the above example as the unary + op >> applied twice: >> >> u_expr ::= >> power | "-" u_expr >> | "+" u_expr | "~" u_expr >> >> Playing in the interpreter, expressions like "1+++++++++5" and >> "1+-+-+-+-+-+-5" evaluate to 6. >> >> I'm not an EBNF expert, but it seems that we could modify the grammar >> to be more restrictive so the above code would not be silently valid. >> E.g., "++5" and "1+++5" and "1+-+5" are syntax errors, but still keep >> "1++5", "1+-5", "1-+5" as valid. (Although, '~' throws in a kink... >> should '~-5' be legal? Seems so...) -1 1) This would be a petty, gratuitous restriction that would only complicate the language and make it harder to learn for no real gain. 2) It could break code. + ob maps to type(ob).__pos__(ob), which could do anything, and not necessarily just return ob as you are assuming. 3) Consider eval('+' + somecode). Suppose somecode happens to start with '+'. Currently the redundancy is harmless if not meaningful. In summary, I think the following applies here: "Special cases aren't special enough to break the rules."
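Terry's second point is easy to demonstrate: unary plus is a real operation that dispatches to `__pos__`, not a no-op. A small illustrative sketch (the `Count` class is mine, not from the thread):

```python
class Count:
    """Counts how many times unary + is applied to an instance."""
    def __init__(self):
        self.pos_calls = 0

    def __pos__(self):
        # each unary + dispatches here; it need not return self unchanged
        self.pos_calls += 1
        return self

c = Count()
++c                  # parsed as +(+c): __pos__ runs twice
print(c.pos_calls)   # 2
```

So a grammar change forbidding `++` would silently outlaw a construct that today has well-defined, overridable semantics.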
Terry Jan Reedy From greg.ewing at canterbury.ac.nz Sat Mar 28 00:58:05 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 28 Mar 2009 11:58:05 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CCC3C3.4050100@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> <49CCB3A9.40300@canterbury.ac.nz> <49CCC3C3.4050100@gmail.com> Message-ID: <49CD680D.2020502@canterbury.ac.nz> Nick Coghlan wrote: > The part that > isn't clicking for me is that I still don't understand *why* 'yield > from' should include implicit finalisation as part of its definition. > > It's the generalisation of that to all other iterators that happen to > offer a close() method that seems somewhat arbitrary. It's a matter of opinion. I would find it surprising if generators behaved differently from all other iterators in this respect. It would be un-ducktypish. I think we need a BDFL opinion to settle this one. 
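The asymmetry under debate, generators having a close() method that triggers finalisation while ordinary iterators typically have no close() at all, can be seen directly (a minimal sketch):

```python
def gen():
    try:
        yield 1
    finally:
        print("finalised")           # runs when the generator is closed

g = gen()
next(g)
g.close()                            # generators always provide close()

it = iter([1, 2, 3])
print(hasattr(it, "close"))          # False: list iterators have no close()
```

Making "yield from" call close() on any sub-iterator that happens to have one would extend the generator convention to arbitrary iterators by duck typing, which is exactly the point of disagreement.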
-- Greg From george.sakkis at gmail.com Sat Mar 28 01:28:59 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Fri, 27 Mar 2009 20:28:59 -0400 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> On Fri, Mar 27, 2009 at 7:36 PM, Terry Reedy wrote: > Joel Bender wrote: >>> >>> It's also possible that the order matters. I think an iterable of tuples >>> (such as returned by dict.items(), but any iterable will do) would be an >>> okay interface. >> >> Ordered dict then :-) >> But that, unlike iterable of tuples, would exclude repeated fields, as in >> Arnaud's example >> >>Note that it's still not general enough as query fields can be repeated, >> e.g. >>http://foo.com/search/?q=spam&q=eggs Repeated fields can be packed together in a tuple/list: add_query_params('http://foo.com', dict(q=('spam', 'eggs'))) To which one might reply that this would exclude non-consecutive repeated fields, e.g. '?q=spam&foo=bar&q=eggs'.
To which I would reply that for this 0.01% of cases that require this (a) do it by hand as now or (b) use the same signature as dict() (plus the host in the beginning): add_query_params(host, mapping_or_iterable=None, **params) George From jh at improva.dk Sat Mar 28 01:36:25 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 28 Mar 2009 01:36:25 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CD6571.1010408@canterbury.ac.nz> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> <49CCB2DD.30706@canterbury.ac.nz> <49CCBC71.7070905@improva.dk> <49CD6571.1010408@canterbury.ac.nz> Message-ID: <49CD7109.9040704@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> Each time I reseed from a truly random source, > > If you have a "truly random source" on hand, then > you have an infinite amount of entropy available > and there is no problem. Just feed your truly > random numbers straight into the shuffling > algorithm. Of course. > > We're talking about the case where you *don't* > have truly random numbers, but only a PRNG with a > limited amount of internal state. > As it happens, you don't really need the random source. As long as the set of shuffles you can get after reseeding generates the full set of permutations, all you need is to reseed in a way that will eventually have used all long enough sequences of possible seed values. No this is not even remotely practical, and it has very little to do with randomness, but I think I said that right from the start. I was just reacting to the statement that you wouldn't be able to generate all permutations using shuffle+reseed. You almost certainly can, but it is a silly thing to do. 
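The arithmetic behind these period limits is easy to check. With B bits of state there are at most 2**B reachable shuffles, so every permutation of an n-element list can be reachable only if n! <= 2**B; a quick sketch (the helper name is mine):

```python
def max_fully_shuffleable(period_bits):
    """Largest n such that n! <= 2**period_bits, i.e. the longest
    sequence whose permutations the state space can even cover."""
    limit = 2 ** period_bits
    n, fact = 1, 1
    while fact * (n + 1) <= limit:
        n += 1
        fact *= n
    return n

print(max_fully_shuffleable(128))    # 34, Adam's "safe" size for a 128-bit seed
print(max_fully_shuffleable(19937))  # 2080 for the full Mersenne Twister period
```

This is only a counting bound, of course; it says nothing about the uniformity of the permutations that are reachable.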
If you want large random permutations, you need a PRNG with an *extremely* long period, and if you have that there is no need for repeated shuffles. - Jacob From guido at python.org Sat Mar 28 03:26:44 2009 From: guido at python.org (Guido van Rossum) Date: Fri, 27 Mar 2009 21:26:44 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: There's way too much bikeshedding in this thread (not picking on you specifically). I think the originally proposed API is fine, except it should *not* reject duplicates. To add duplicates you'd just call it multiple times, e.g. add_query_params(add_query_params(url, a='x'), a='y'). It's a pretty minor use case anyways. --Guido On Fri, Mar 27, 2009 at 7:28 PM, George Sakkis wrote: > On Fri, Mar 27, 2009 at 7:36 PM, Terry Reedy wrote: > >> Joel Bender wrote: >>>> >>>> It's also possible that the order matters. I think an iterable of tuples >>>> (such as returned by dict.items(), but any iterable will do) would be an >>>> okay interface. >>> >>> Ordered dict then :-) >> >> But that, unlike iterable of tuples, would exclude repeated fields, as in >> Arnaud's example >> >>>Note that it's still not general enough as query fields can be repeated, >>> e.g. >>>http://foo.com/search/?q=spam&q=eggs > > Repeated fields can be packed together in a tuple/list: > > add_query_params('http://foo.com', dict(q=('spam', 'eggs'))) > > To which one might reply that this would exclude non-consecutive > repeated fields,e g. '?q=spam&foo=bar&q=eggs. 
> > To which I would reply that for this 0.01% of cases that require this > (a) do it by hand as now or (b) use the same signature as dict() (plus > the host in the beginning): > > add_query_params(host, mapping_or_iterable=None, **params) > > George > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dreamingforward at gmail.com Sat Mar 28 03:33:49 2009 From: dreamingforward at gmail.com (average) Date: Fri, 27 Mar 2009 19:33:49 -0700 Subject: [Python-ideas] Builtin test function Message-ID: <913f9f570903271933s44b0b646ge0f37e8ce97d028@mail.gmail.com> >>> There's been some discussion about automatic test discovery lately. >>> Here's a random (not in any way thought through) idea: add a builtin >>> function test() that runs tests associated with a given function, >>> class, module, or object. >> >> Improved testing is always welcome, but why a built-in? >> >> I know testing is important, but is it so common and important that we >> need it at our fingertips, so to speak, and can't even import a module >> first before running tests? What's the benefit to making it a built-in >> instead of part of a test module? > > The advantage would be a uniform and very simple interface for testing any > module, without having to know whether I should import doctest, > unittest or something else (and having to remember the commands > used by each framework). It would certainly not be a replacement for more > advanced test frameworks. By making it a builtin it's also pointing out to users that code-testing is an important part of the python culture (as well as good development practice). It may seem easy "just to do a module import and then run the imported test function", but such a construct says that testing is just an optional thing among many dozens of modules within python. 
As for a name, Guido's criticism aside, I do like it spelled test() with usage very much similar to the builtin help() function--both would be accessing the same docstrings but for two different purposes. I think it would add a lot of encouragement for the use of doctest (one of my favorites) as well as facilitate good test-driven development. And, regarding the name, if any function deserves the name test() it would be this builtin--all others would necessarily be secondary. But if there's rancor regarding the name, call it testdoc() or something. Personally, I'm +2 on the idea, but that may only be in cents.... marcos PS. Add test() to the GSoC suggestion of improving doctest with scope-aware doc-test variables (for easing setup code between module->class->method docs). From bruce at leapyear.org Sat Mar 28 04:16:05 2009 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 27 Mar 2009 20:16:05 -0700 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: If you want to make sure that you can't use ++ or --, then target that directly: add ++ and -- as tokens to the language and make them always illegal. While that might be a bit of a kludge, I think it's far better than adding a complicated rule that you can't put two + or two - in a row. And if you're worried about eval('+' + somecode), you've got three choices: (1) leave out the '+' because it has no effect; (2) write eval('+ ' + somecode) and (3) are you sure you really want to use eval? --- Bruce On Fri, Mar 27, 2009 at 4:49 PM, Terry Reedy wrote: > > I was recently reviewing some Python code for a friend who is a C++ >>> programmer, and he had code something like this: >>> >>> def foo(): >>> attempt = 0 >>> while attempt < MAX_TRIES: >>> ret = bar() >>> if ret: break >>> ++attempt >>> >>> I was a bit surprised that this was syntactically valid, and because the >>> timeout condition only occurred in exceptional cases, the error has not yet >>> caused any problems.
>>> > A complete test suite would include such a case ;-). > > It appears that the grammar treats the above example as the unary + op >>> applied twice: >>> >>> u_expr ::= >>> power | "-" u_expr >>> | "+" u_expr | "~" u_expr >>> >>> Playing in the interpreter, expressions like "1+++++++++5" and >>> "1+-+-+-+-+-+-5" evaluate to 6. >>> >>> I'm not an EBNF expert, but it seems that we could modify the grammar to >>> be more restrictive so the above code would not be silently valid. E.g., >>> "++5" and "1+++5" and "1+-+5" are syntax errors, but still keep "1++5", >>> "1+-5", "1-+5" as valid. (Although, '~' throws in a kink... should '~-5' be >>> legal? Seems so...) >>> >> > -1 > > 1) This would be a petty, gratuitous restriction that would only complicate > the language and make it harder to learn for no real gain. > > 2) It could break code. + ob maps to type(ob).__pos__(ob), which could do > anything, and not necessarily just return ob as you are assuming. > > 3) Consider eval('+' + somecode). Suppose somecode happens to start with > '+'. Currently the redundancy is harmless if not meaningful. > > In summary, I think the following applies here: > "Special cases aren't special enough to break the rules." > > Terry Jan Reedy > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Mar 28 05:13:20 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Mar 2009 15:13:20 +1100 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: <200903281513.21029.steve@pearwood.info> Moved from python-dev to python-ideas.
On Sat, 28 Mar 2009 04:19:46 am Jared Grubb wrote: > I was recently reviewing some Python code for a friend who is a C++ > programmer, and he had code something like this: > > def foo(): > try = 0 > while try < MAX_TRIES: > ret = bar() > if ret: break > ++try > > I was a bit surprised that this was syntactically valid, You shouldn't be. Unary operators are inspired by the equivalent mathematical unary operators. ... > It appears that the grammar treats the above example as the unary + > op applied twice: As it should. ... > I'm not an EBNF expert, but it seems that we could modify the grammar > to be more restrictive so the above code would not be silently valid. > E.g., "++5" and "1+++5" and "1+-+5" are syntax errors, but still keep > "1++5", "1+-5", "1-+5" as valid. (Although, '~' throws in a kink... > should '~-5' be legal? Seems so...) Why would we want to do this? I'm sure there are plenty of other syntax constructions in Python which just happen to look like something from other languages, but have a different meaning. Do we have to chase our tails removing every possible syntactically valid string in Python that has a different meaning in some other language? Or is C++ somehow special that we treat it differently from all the other languages? Not only is this a self-inflicted error (writing C++ code in a Python program is a PEBCAK error), but it's rare: it only affects a minority of C++ programmers, and they are only a minority of Python programmers. There's no need to complicate the grammar to prevent this sort of error. Keep it simple. ---1 on the proposal (*grin*).
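How the parser actually reads these strings can be inspected from Python itself:

```python
import ast

# 1+++5 parses as (1) + (+(+5)): one binary +, then two unary +
tree = ast.parse("1+++5", mode="eval").body
print(type(tree).__name__)        # BinOp
print(type(tree.right).__name__)  # UnaryOp (wrapping another UnaryOp)
print(eval("1+++5"))              # 6
```

So there is no ambiguity in the grammar today; the question is purely whether the surprising-but-well-defined reading should be forbidden.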
-- Steven D'Aprano From dickinsm at gmail.com Sat Mar 28 05:26:59 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Sat, 28 Mar 2009 04:26:59 +0000 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <200903281513.21029.steve@pearwood.info> References: <200903281513.21029.steve@pearwood.info> Message-ID: <5c6f2a5d0903272126n4eb6c47eqabff333991c4487b@mail.gmail.com> Does PyChecker check for uses of '--' and '++'? That would seem like the obvious place to have such a check. Mark From aahz at pythoncraft.com Sat Mar 28 05:59:44 2009 From: aahz at pythoncraft.com (Aahz) Date: Fri, 27 Mar 2009 21:59:44 -0700 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <200903281513.21029.steve@pearwood.info> References: <200903281513.21029.steve@pearwood.info> Message-ID: <20090328045944.GA14415@panix.com> On Sat, Mar 28, 2009, Steven D'Aprano wrote: > > Not only is this a self-inflicted error (writing C++ code in a Python > program is a PEBCAK error), but it's rare: it only affects a minority > of C++ programmers, and they are only a minority of Python programmers. > There's no need to complicate the grammar to prevent this sort of > error. Keep it simple. ---1 on the proposal (*grin*). In all fairness, "++" is valid in many C-derived languages, so it hits C and C++ programmers, plus Ruby and Perl programmers. I'm not in favor of this restriction, but I'm not opposed, either, and I think your thesis is invalid. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "At Resolver we've found it useful to short-circuit any doubt and just refer to comments in code as 'lies'. 
:-)" --Michael Foord paraphrases Christian Muirhead on python-dev, 2009-3-22 From leif.walsh at gmail.com Sat Mar 28 06:12:02 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Sat, 28 Mar 2009 01:12:02 -0400 (EDT) Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <5c6f2a5d0903272126n4eb6c47eqabff333991c4487b@mail.gmail.com> Message-ID: 2009/3/28 Mark Dickinson : > Does PyChecker check for uses of '--' and '++'? That > would seem like the obvious place to have such a check. +--1 ;-) -- Cheers, Leif -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: OpenPGP digital signature URL: From arnodel at googlemail.com Sat Mar 28 08:54:03 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Sat, 28 Mar 2009 07:54:03 +0000 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49CD2930.4080307@cornell.edu> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: <3230F12B-2FC3-42C9-B8B5-70829BC9C5C5@googlemail.com> On 27 Mar 2009, at 19:29, Joel Bender wrote: >> It's also possible that the order matters. I think an iterable of >> tuples (such as returned by dict.items(), but any iterable will do) >> would be an okay interface. > > Ordered dict then :-) Why not use the same signature as dict.update()? update(...) D.update(E, **F) -> None.
Update D from E and F: for k in E: D[k] = E[k] (if E has keys else: for (k, v) in E: D[k] = v) then: for k in F: D[k] = F[k] -- Arnaud From ncoghlan at gmail.com Sat Mar 28 09:16:07 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Mar 2009 18:16:07 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CD680D.2020502@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> <49CCB3A9.40300@canterbury.ac.nz> <49CCC3C3.4050100@gmail.com> <49CD680D.2020502@canterbury.ac.nz> Message-ID: <49CDDCC7.6050105@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: >> The part that >> isn't clicking for me is that I still don't understand *why* 'yield >> from' should include implicit finalisation as part of its definition. >> >> It's the generalisation of that to all other iterators that happen to >> offer a close() method that seems somewhat arbitrary. > > It's a matter of opinion. I would find it surprising if > generators behaved differently from all other iterators > in this respect. It would be un-ducktypish. > > I think we need a BDFL opinion to settle this one. 
It's still your PEP, so unless Guido objects to your preference, I'll cope - I suspect either approach can be explained easily enough in the documentation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sat Mar 28 09:19:40 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Mar 2009 18:19:40 +1000 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: <49CDDD9C.3050102@gmail.com> Bruce Leban wrote: > And if you're worried about eval('+' + somecode), you've got three > choices: (1) leave out the '+' because it has no effect That's not true for all data types - Decimal is the one that comes to mind as having a significant use for unary '+' (specifically, it is used to say "round to currently defined precision, but otherwise leave the value alone") Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sat Mar 28 10:03:40 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Mar 2009 19:03:40 +1000 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <5c6f2a5d0903272126n4eb6c47eqabff333991c4487b@mail.gmail.com> References: <200903281513.21029.steve@pearwood.info> <5c6f2a5d0903272126n4eb6c47eqabff333991c4487b@mail.gmail.com> Message-ID: <49CDE7EC.2090706@gmail.com> Mark Dickinson wrote: > Does PyChecker check for uses of '--' and '++'? That > would seem like the obvious place to have such a check. Yep, sounds like a pychecker/pylint kind of problem to me as well. Cheers, Nick. 
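Nick's Decimal example is concrete: unary plus is the documented way to apply the current context's rounding without otherwise changing the value:

```python
from decimal import Decimal, getcontext

getcontext().prec = 4            # 4 significant digits of context precision
x = Decimal("3.14159265")
print(+x)    # 3.142 -- unary + rounds to the context precision
print(x)     # 3.14159265 -- the operand itself is untouched
```

So stripping a leading '+' as "having no effect" would silently change the meaning of existing Decimal code.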
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From denis.spir at free.fr Sat Mar 28 10:18:26 2009 From: denis.spir at free.fr (spir) Date: Sat, 28 Mar 2009 10:18:26 +0100 Subject: [Python-ideas] Builtin test function In-Reply-To: <913f9f570903271933s44b0b646ge0f37e8ce97d028@mail.gmail.com> References: <913f9f570903271933s44b0b646ge0f37e8ce97d028@mail.gmail.com> Message-ID: <20090328101826.337907bc@o> Le Fri, 27 Mar 2009 19:33:49 -0700, average s'exprima ainsi: > >>> There's been some discussion about automatic test discovery lately. > >>> Here's a random (not in any way thought through) idea: add a builtin > >>> function test() that runs tests associated with a given function, > >>> class, module, or object. > >> > >> Improved testing is always welcome, but why a built-in? > >> > >> I know testing is important, but is it so common and important that we > >> need it at our fingertips, so to speak, and can't even import a module > >> first before running tests? What's the benefit to making it a built-in > >> instead of part of a test module? > > > > The advantage would be a uniform and very simple interface for testing any > > module, without having to know whether I should import doctest, > > unittest or something else (and having to remember the commands > > used by each framework). It would certainly not be a replacement for more > > advanced test frameworks. > > By making it a builtin it's also pointing out to users that > code-testing is an important part of the python culture (as well as > good development practice). It may seem easy "just to do a module > import and then run the imported test function", but such a construct > says that testing is just an optional thing among many dozens of > modules within python. Really true for me. Also, I think python needs a standard method for testing. As well as for doc-ing. 
[But I'm not sure that pseudo-strings are the best format to store test information (idem for doc). I would prefer specialized types -- maybe subtype of string.] I really support the idea because I feel personally concerned: would probably do a more systematic use of tests if there were a (well thought / straightforward / *clear*) builtin standard. > As for a name, Guido's criticism aside, I do like like it spelled > test() with usage very much similar to the builtin help() > function--both would be accessing the same docstrings but for two > different purposes. I think it would add a lot of encouragement for > the use of doctest (one of my favorites) as well as facilitate good > test-driven development. And, regarding the name, if any function > deserves the name test() it would be this builtin--all others would > necessarily be secondary. But if there's rancor regarding the name, > call it testdoc() or something. The analogy with help() sounds sensible. A builtin/standard testing func should definitely be called test(). *Other* test methods should use another name or be prefixed with a module name. Now, we must also cope with existing code. The name should not imply that it's a special method. Maybe runtest() or check()? > Personally, I'm +2 on the idea, but that may only be in cents.... > > marcos Denis ------ la vita e estrany From foobarmus at gmail.com Sat Mar 28 11:06:13 2009 From: foobarmus at gmail.com (Mark Donald) Date: Sat, 28 Mar 2009 18:06:13 +0800 Subject: [Python-ideas] suggestion for try/except program flow Message-ID: In this situation, which happens to me fairly frequently...
try: try: raise Cheese except Cheese, e: # handle cheese raise except: # handle all manner of stuff, including cheese ...it would be nice (& more readable) if one were able to recatch a named exception with the generic (catch-all) except clause of its own try, something like this: try: raise Cheese except Cheese, e: # handle cheese recatch except: # handle all manner of stuff, including cheese From guido at python.org Sat Mar 28 12:56:50 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 06:56:50 -0500 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: Message-ID: On Sat, Mar 28, 2009 at 5:06 AM, Mark Donald wrote: > In this situation, which happens to me fairly frequently... > > try: >   try: >     raise Cheese >   except Cheese, e: >     # handle cheese >     raise > except: >   # handle all manner of stuff, including cheese > > ...it would be nice (& more readable) if one were able to recatch a > named exception with the generic (catch-all) except clause of its own > try, something like this: > > try: >   raise Cheese > except Cheese, e: >   # handle cheese >   recatch > except: >   # handle all manner of stuff, including cheese I'm not sure recatch is all that more reasonable -- it's another fairly obscure control flow verb. I think the current situation isn't so bad.
Nick already pointed out an idiom for doing this without two try clauses: except BaseException as e: if isinstance(e, Cheese): # handle cheese # handle all manner of stuff, including cheese -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jh at improva.dk Sat Mar 28 14:44:31 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 28 Mar 2009 14:44:31 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CC5D85.30409@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> Message-ID: <49CE29BF.3040502@improva.dk> Hi Greg There seems to be another issue with GeneratorExit in the latest expansion (reproduced below). 
Based on the inlining/refactoring principle, I would expect the following code: def inner(): try: yield 1 yield 2 yield 3 except GeneratorExit: val = 'closed' else: val = 'exhausted' return val.upper() def outer(): val = yield from inner() print val To be equivalent to this: def outer(): try: yield 1 yield 2 yield 3 except GeneratorExit: val = 'closed' else: val = 'exhausted' val = val.upper() print val However, with the current expansion they are different. Only the version not using "yield from" will print "CLOSED" in this case: g = outer() g.next() # prints 1 g.close() # should print "CLOSED", but doesn't because the GeneratorExit is reraised by yield-from I currently don't think that a special case for GeneratorExit is needed. Can you give me an example showing that it is? - Jacob ------------------------------------------------------------------------ _i = iter(EXPR) try: _y = _i.next() except StopIteration, _e: _r = _e.value else: while 1: try: _s = yield _y except: _m = getattr(_i, 'throw', None) if _m is not None: _x = sys.exc_info() try: _y = _m(*_x) except StopIteration, _e: if _e is _x[1] or isinstance(_x[1], GeneratorExit): raise else: _r = _e.value break else: _m = getattr(_i, 'close', None) if _m is not None: _m() raise else: try: if _s is None: _y = _i.next() else: _y = _i.send(_s) except StopIteration, _e: _r = _e.value break RESULT = _r From ptspts at gmail.com Sat Mar 28 15:44:52 2009 From: ptspts at gmail.com (=?ISO-8859-1?Q?P=E9ter_Szab=F3?=) Date: Sat, 28 Mar 2009 15:44:52 +0100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 Message-ID: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Hi, If Python had method decorators @final (meaning: it is an error to override this method in any subclass) and @override (meaning: it is an error not having this method in a superclass), I would use them in my projects (some of them approaching 20 000 lines of Python code) and I'll feel more confident writing 
object-oriented Python code. Java already has similar decorators or specifiers. Do you think it is a good idea to have these in Python? I've created a proof-of-concept implementation, which uses metaclasses, and it works in Python 2.4 and Python 2.5. See http://www.math.bme.hu/~pts/pobjects.py and http://www.math.bme.hu/~pts/pobjects_example.py Best regards, Péter From foobarmus at gmail.com Sat Mar 28 15:54:33 2009 From: foobarmus at gmail.com (Mark Donald) Date: Sat, 28 Mar 2009 22:54:33 +0800 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: Message-ID: Unless something has changed in Python 3+, I believe Nick's idiom requires the generic handler code to be copied into a second except clause to achieve identical behaviour, as follows... except BaseException as e: if isinstance(e, Cheese): # handle cheese # handle all manner of stuff, including cheese except: # handle all manner of OTHER stuff in the same way ...which makes the nested try block more semantic. This can be tested by putting > raise "this is deprecated..." into the try block, which generates an error that is NOT caught by "except BaseException" however IS caught by "except". I realise that "recatch" may seem like a trivial suggestion, but I believe it would be a slight improvement, which means it's not trivial. Try/except statements are supposedly preferential to "untried" if statements that handle isolated cases, but if statements are a lot easier to imagine, so people use them all the time - style be damned. Each slight improvement in try/except is liable to make it more attractive to coders, the end result being better code (for example, the implementation of PEP 341 increased my team's use of try/except significantly, causing a huge improvement to code readability).
Imagine if you could do this: runny = True runnier_than_you_like_it = False raise Cheese except Cheese, e # handle cheese if runny: recatch as Camembert except Camembert, e # handle camembert if runnier_than_you_like_it: recatch else: uncatch # ie, else clause will be effective... except: # not much of a cheese shop, really is it? else: # negotiate vending of cheesy comestibles finally: # sally forth And, I'm not trying to be belligerent here, but before somebody says... Cheese(Exception) Camembert(Cheese) ...just please have a little think about it. Mark 2009/3/28 Guido van Rossum : > On Sat, Mar 28, 2009 at 5:06 AM, Mark Donald wrote: >> In this situation, which happens to me fairly frequently... >> >> try: >>   try: >>     raise Cheese >>   except Cheese, e: >>     # handle cheese >>     raise >> except: >>   # handle all manner of stuff, including cheese >> >> ...it would be nice (& more readable) if one were able to recatch a >> named exception with the generic (catch-all) except clause of its own >> try, something like this: >> >> try: >>   raise Cheese >> except Cheese, e: >>   # handle cheese >>   recatch >> except: >>   # handle all manner of stuff, including cheese > > I'm not sure recatch is all that more reasonable -- it's another > fairly obscure control flow verb. I think the current situation isn't > so bad. Nick already pointed out an idiom for doing this without two > try clauses: > > except BaseException as e: >   if isinstance(e, Cheese): >
# handle cheese >   # handle all manner of stuff, including cheese > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > From grosser.meister.morti at gmx.net Sat Mar 28 16:21:51 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Sat, 28 Mar 2009 16:21:51 +0100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: <49CE408F.1030005@gmx.net> Péter Szabó wrote: > Hi, > > If Python had method decorators @final (meaning: it is an error to > override this method in any subclass) and @override (meaning: it is an > error not having this method in a superclass), I would use them in my > projects (some of them approaching 20 000 lines of Python code) and > I'll feel more confident writing object-oriented Python code. Java > already has similar decorators or specifiers. Do you think it is a > good idea to have these in Python? > > I've created a proof-of-concept implementation, which uses > metaclasses, and it works in Python 2.4 an Python 2.5. See > http://www.math.bme.hu/~pts/pobjects.py and > http://www.math.bme.hu/~pts/pobjects_example.py > > Best regards, > > Péter +1 on the idea. however, using a metaclass would be too limiting imho. can you implement it in a different way? a lot of things people use metaclasses for work perfectly fine without them (instead use a superclass that overrides __new__ or similar).
-apnzi From aahz at pythoncraft.com Sat Mar 28 16:40:09 2009 From: aahz at pythoncraft.com (Aahz) Date: Sat, 28 Mar 2009 08:40:09 -0700 Subject: [Python-ideas] About adding a new iterator method called "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <87eiwjy7qc.fsf@xemacs.org> Message-ID: <20090328154008.GB7421@panix.com> On Fri, Mar 27, 2009, Adam Olsen wrote: > > The irony is that we only seed with 128 bits, so rather than 2**19937 > combinations, there's just 2**128. That drops our "safe" list size > down to 34. Weee! That's probably worth a bug report or RFE if one doesn't already exist. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "At Resolver we've found it useful to short-circuit any doubt and just refer to comments in code as 'lies'. :-)" --Michael Foord paraphrases Christian Muirhead on python-dev, 2009-3-22 From Scott.Daniels at Acm.Org Sat Mar 28 19:32:19 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat, 28 Mar 2009 11:32:19 -0700 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: Péter Szabó wrote: > Hi, > > If Python had method decorators @final (meaning: it is an error to > override this method in any subclass) and @override (meaning: it is an > error not having this method in a superclass), I would use them in my > projects (some of them approaching 20 000 lines of Python code) and > I'll feel more confident writing object-oriented Python code. Java > already has similar decorators or specifiers. Do you think it is a > good idea to have these in Python?
> > I've created a proof-of-concept implementation, which uses > metaclasses, and it works in Python 2.4 an Python 2.5. See > http://www.math.bme.hu/~pts/pobjects.py and > http://www.math.bme.hu/~pts/pobjects_example.py I have no idea why you want these, and severe trepidation about dealing with code that uses them "just to be safe." It smacks of the over-use I see of doubled underscores. For @override, just because you've built a base class for one kind of object does not mean I have not thought of an interesting way to use 40% of your code to accomplish my own end. Why make me cut and paste? You are not responsible for the correctness of my flea-brained idea whether I inherit from your class or not. For @final, "how dare you" for similar reasons. Java at least has an excuse (compilation can proceed differently). --Scott David Daniels Scott.Daniels at Acm.Org From jared.grubb at gmail.com Sat Mar 28 20:26:33 2009 From: jared.grubb at gmail.com (Jared Grubb) Date: Sat, 28 Mar 2009 12:26:33 -0700 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <200903281513.21029.steve@pearwood.info> References: <200903281513.21029.steve@pearwood.info> Message-ID: On 27 Mar 2009, at 21:13, Steven D'Aprano wrote: >> >> I was a bit surprised that this was syntactically valid, > > You shouldn't be. Unary operators are inspired by the equivalent > mathematical unary operators. > ..... > Why would we want to do this? I'm sure there are plenty of other > syntax > constructions in Python which just happen to look like something from > other languages, but have a different meaning. Do we have to chase our > tails removing every possible syntactically valid string in Python > that > has a different meaning in some other language? Or is C++ somehow > special that we treat it differently from all the other languages? 
> > Not only is this a self-inflicted error (writing C++ code in a Python > program is a PEBCAK error), but it's rare: it only affects a minority > of C++ programmers, and they are only a minority of Python > programmers. > There's no need to complicate the grammar to prevent this sort of > error. Keep it simple. ---1 on the proposal (*grin*). It *was* a surprise. Of the languages I've used in my life (BASIC, C, C++, Java, Javascript, Perl, PHP, and Python), only two would treat prefix ++ as double unary plus (and I try to forget BASIC as best I can :) ). I remember when I first picked up Python, I wrote "i++" once (I think many beginning Python programmers do), and I was grateful that a syntax error popped up (rather than silently doing nothing) and I never did it again... So, now, a few years later I was reviewing code that had "++i" in it (from a new Python developer), and did a double-take on the code and had a moment of surprise that it had even run at all. As a devil's advocate: any code that requires double-unary plus is probably either abusing operator overloading or is abusing the eval keyword. It seems that adding a restriction to the grammar would probably be more helpful than harmful (the workaround for the alien case, if there is one, of needing double-unary plus would be to use parens: "+(+x)"). In any case, I understand that dynamic languages are going to allow for side effects to occur anywhere, so it's tough to remove it. I'm actually only +0 on it as it is... Just a "nice" feature I thought I'd throw out there....
:) Jared From guido at python.org Sat Mar 28 22:24:11 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 16:24:11 -0500 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: Message-ID: On Sat, Mar 28, 2009 at 9:54 AM, Mark Donald wrote: > Unless something has changed in Python 3+, I believe Nick's idiom > requires the generic handler code to be copied into a second except > clause to achieve identical behaviour, as follows... > > except BaseException as e: >   if isinstance(e, Cheese): >     # handle cheese >   # handle all manner of stuff, including cheese > except: >   # handle all manner of OTHER stuff in the same way In 3.0, the second except clause is unreachable because all exceptions inherit from BaseException. > ...which makes the nested try block more semantic. This can be tested > by putting > raise "this is deprecated..." into the try block, which > generates an error that is NOT caught by "except BaseException" > however IS caught by "except". > > I realise that "recatch" may seem like a trivial suggestion, but I > believe it would be a slight improvement, which means it's not > trivial. Try/except statements are supposedly preferential to > "untried" if statements that handle isolated cases, but if statements > are a lot easier to imagine, so people use them all them time - style > be damned. Each slight improvement in try/except is liable to make it > more attractive to coders, the end result being better code (for > example, the implementation of PEP 341 increased my team's use of > try/except significantly, causing a huge improvement to code > readability). > > Imagine if you could do this: > >   runny = True >   runnier_than_you_like_it = False >   raise Cheese > except Cheese, e >   # handle cheese >   if runny: recatch as Camembert > except Camembert, e >   # handle camembert >   if runnier_than_you_like_it: >     recatch >   else: >
uncatch # ie, else clause will be effective... > except: >   # not much of a cheese shop, really is it? > else: >   # negotiate vending of cheesy comestibles > finally: >   # sally forth > > And, I'm not trying to be belligerent here, but before somebody says... > > Cheese(Exception) > Camembert(Cheese) > > ...just please have a little think about it. > > Mark > > 2009/3/28 Guido van Rossum : >> On Sat, Mar 28, 2009 at 5:06 AM, Mark Donald wrote: >>> In this situation, which happens to me fairly frequently... >>> >>> try: >>>   try: >>>     raise Cheese >>>   except Cheese, e: >>>     # handle cheese >>>     raise >>> except: >>>   # handle all manner of stuff, including cheese >>> >>> ...it would be nice (& more readable) if one were able to recatch a >>> named exception with the generic (catch-all) except clause of its own >>> try, something like this: >>> >>> try: >>>   raise Cheese >>> except Cheese, e: >>>   # handle cheese >>>   recatch >>> except: >>>   # handle all manner of stuff, including cheese >> >> I'm not sure recatch is all that more reasonable -- it's another >> fairly obscure control flow verb. I think the current situation isn't >> so bad. Nick already pointed out an idiom for doing this without two >> try clauses: >> >> except BaseException as e: >>   if isinstance(e, Cheese): >>     # handle cheese >>   # handle all manner of stuff, including cheese >> >> -- >> --Guido van Rossum (home page: http://www.python.org/~guido/) >> > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Mar 28 23:03:50 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 17:03:50 -0500 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <49CE408F.1030005@gmx.net> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <49CE408F.1030005@gmx.net> Message-ID: On Sat, Mar 28, 2009 at 10:21 AM, Mathias Panzenböck wrote: > Péter Szabó
wrote: >> If Python had method decorators @final (meaning: it is an error to >> override this method in any subclass) and @override (meaning: it is an >> error not having this method in a superclass), I would use them in my >> projects (some of them approaching 20 000 lines of Python code) and >> I'll feel more confident writing object-oriented Python code. Java >> already has similar decorators or specifiers. Do you think it is a >> good idea to have these in Python? >> >> I've created a proof-of-concept implementation, which uses >> metaclasses, and it works in Python 2.4 an Python 2.5. See >> http://www.math.bme.hu/~pts/pobjects.py and >> http://www.math.bme.hu/~pts/pobjects_example.py >> >> Best regards, >> >> Péter > > +1 on the idea. > however, using a metaclass would be to limiting imho. can you implement it > in a different way? a lot of things people use metaclasses for work > perfectly fine without them (instead use a superclass that overrides __new__ > or similar). While it could be done by overriding __new__ in a superclass I'm not sure how that would make it easier to use, and it would make it harder to implement efficiently: this is a check that you would like to happen once at class definition time rather than on each instance creation. Of course you could do some caching to do it at the first instantiation only, but that still sounds clumsy; the metaclass is the obvious place to put this, and gives better error messages (at import instead of first use). But I don't think this idea is ripe for making it into a set of builtins yet, at least, I would prefer if someone coded this up as a 3rd party package and got feedback from a community of early adopters first. Or maybe one of the existing frameworks would be interested in adding this? While this may not be everyone's cup of tea (e.g. Scott David Daniels' reply), some frameworks cater to users who do like to be told when they're making this kind of errors.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From castironpi-ng at comcast.net Sat Mar 28 23:04:28 2009 From: castironpi-ng at comcast.net (castironpi-ng at comcast.net) Date: Sat, 28 Mar 2009 22:04:28 +0000 (UTC) Subject: [Python-ideas] python-like garbage collector & workaround In-Reply-To: <1398015503.825821238277798169.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> Message-ID: <1652938876.826141238277868188.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> I am writing a garbage collector that is similar to Python's. I want to know what you think, what problems I may encounter, and what kind of value I'm looking at. For my review of literature, I have read excerpts from, and stepped through, Python's GC. I'm picturing it as a specialized breadth-first search. I am concerned by the inability to call user-defined finalization methods. I'm considering a workaround that performs GC in two steps. First, it requests the objects to drop their references that participate in the cycle. Then, it enqueues the decref'ed object for an unnested destruction. Here is a proof-of-concept implementation. http://groups.google.com/group/comp.lang.python/browse_thread/thread/d3bb410cc6dcae54/f4b282e545335c30 From tleeuwenburg at gmail.com Sat Mar 28 23:29:30 2009 From: tleeuwenburg at gmail.com (Tennessee Leeuwenburg) Date: Sun, 29 Mar 2009 09:29:30 +1100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: <43c8685c0903281529qe2c98abn1f22aaa0d92ae086@mail.gmail.com> Just thinking... this sounds rather like trying to bolt interfaces into Python. In the 'consenting adults' view, shouldn't you be able to override a method that you inherit if you would like to? 
I can well imagine some well-meaning library author protecting some method with @final, then me spending hours cursing under my breath because I am unable to tweak the functionality in some new direction. If I understand what you are suggesting correctly, then I'm -1 on the idea. I would suggest that a good docstring could do the job just as well -- "Don't override this method in subclasses!". Do you have any use cases to highlight the problem you are trying to fix with this suggestion? Cheers, -T 2009/3/29 Péter Szabó > Hi, > > If Python had method decorators @final (meaning: it is an error to > override this method in any subclass) and @override (meaning: it is an > error not having this method in a superclass), I would use them in my > projects (some of them approaching 20 000 lines of Python code) and > I'll feel more confident writing object-oriented Python code. Java > already has similar decorators or specifiers. Do you think it is a > good idea to have these in Python? > > I've created a proof-of-concept implementation, which uses > metaclasses, and it works in Python 2.4 an Python 2.5. See > http://www.math.bme.hu/~pts/pobjects.py and > http://www.math.bme.hu/~pts/pobjects_example.py > > Best regards, > > Péter > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- -------------------------------------------------- Tennessee Leeuwenburg http://myownhat.blogspot.com/ "Don't believe everything you think" -------------- next part -------------- An HTML attachment was scrubbed...
URL: From greg.ewing at canterbury.ac.nz Sat Mar 28 23:32:27 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Mar 2009 10:32:27 +1200 Subject: [Python-ideas] python-like garbage collector & workaround In-Reply-To: <1652938876.826141238277868188.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> References: <1652938876.826141238277868188.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> Message-ID: <49CEA57B.5070004@canterbury.ac.nz> castironpi-ng at comcast.net wrote: > I'm considering a workaround that performs GC in two steps. First, it > requests the objects to drop their references that participate in the > cycle. Then, it enqueues the decref'ed object for an unnested > destruction. I don't see how that solves anything. The problem is that the destructors might depend on other objects in the cycle that have already been deallocated. Deferring the calling of the destructors doesn't help with that. The only thing that will help is decoupling the destructor from the object being destroyed. You can do that now by storing a weak reference to the object with the destructor as a callback. But the destructor needs to be designed so that it can work without holding any reference to the object being destroyed, since it will no longer exist by the time the destructor is called. -- Greg From ben+python at benfinney.id.au Sat Mar 28 23:41:44 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 29 Mar 2009 09:41:44 +1100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: <874oxds21j.fsf@benfinney.id.au> Péter Szabó writes: > If Python had method decorators @final (meaning: it is an error to > override this method in any subclass) What use case is there for this?
It would have to be quite strong to override the Python philosophy that "we're all consenting adults here", and that the programmer of the subclass is the one who knows best whether a method needs overriding. > and @override (meaning: it is an error not having this method in a > superclass) I'm not sure I understand this one, but if I'm right this is supported now with: class FooABC(object): def frobnicate(self): raise NotImplementedError("Must be implemented in derived class") Or perhaps: class FooABC(object): def __init__(self): if self.frobnicate is NotImplemented: raise ValueError("Must override 'frobnicate' in derived class") frobnicate = NotImplemented But, again, what is the use case? Is it strong enough to take away the ability of the derived class's implementor (who is, remember, a consenting adult) to take what they want from a class and leave the rest? -- \ "We can't depend for the long run on distinguishing one | `\ bitstream from another in order to figure out which rules | _o__) apply."
--Eben Moglen, _Anarchism Triumphant_, 1999 | Ben Finney From greg.ewing at canterbury.ac.nz Sun Mar 29 00:55:15 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Mar 2009 11:55:15 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CE29BF.3040502@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> Message-ID: <49CEB8E3.20407@canterbury.ac.nz> Jacob Holm wrote: > I currently don't think that a special case for GeneratorExit is > needed. Can you give me an example showing that it is? Someone said something that made me think it was needed, but I think you're right, it shouldn't be there.
-- Greg From ncoghlan at gmail.com Sun Mar 29 00:55:10 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 09:55:10 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CE29BF.3040502@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> Message-ID: <49CEB8DE.8060200@gmail.com> > However, with the current expansion they are different. Only the > version not using "yield from" will print "CLOSED" in this case: > > g = outer() > g.next() # prints 1 > g.close() # should print "CLOSED", but doesn't because the > GeneratorExit is reraised by yield-from > > > I currently don't think that a special case for GeneratorExit is > needed. Can you give me an example showing that it is? Take your example, replace the "print val" with a "yield val" and you get a broken generator that will yield again when close() is called. 
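A minimal sketch of that breakage, independent of the yield-from machinery (plain generators are already guarded against it: yielding while a GeneratorExit is pending turns close() into a RuntimeError):

```python
def broken():
    try:
        yield 1
    except GeneratorExit:
        # Yielding here, instead of returning or reraising,
        # is exactly the "broken generator" case.
        yield 'still here'

g = broken()
next(g)  # -> 1
try:
    g.close()
except RuntimeError as e:
    print('close() failed:', e)
```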
Generators that catch and do anything with GeneratorExit other than turn it into StopIteration are almost always going to be broken - the new expression needs to avoid making it easy to do that accidentally. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From g.brandl at gmx.net Sun Mar 29 01:03:15 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 28 Mar 2009 19:03:15 -0500 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <874oxds21j.fsf@benfinney.id.au> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> Message-ID: Ben Finney schrieb: > P?ter Szab? writes: > >> If Python had method decorators @final (meaning: it is an error to >> override this method in any subclass) > > What use case is there for this? It would have to be quite strong to > override the Python philosophy that "we're all consenting adults > here", and that the programmer of the subclass is the one who knows > best whether a method needs overriding. I agree. This goes in the same direction as suggesting private attributes. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
From greg.ewing at canterbury.ac.nz Sun Mar 29 01:12:05 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Mar 2009 12:12:05 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CEB8DE.8060200@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> Message-ID: <49CEBCD5.7020107@canterbury.ac.nz> Nick Coghlan wrote: > Generators that catch and do anything with GeneratorExit other than turn > it into StopIteration are almost always going to be broken - the new > expression needs to avoid making it easy to do that accidentally. However, as this example shows, the suggested solution of reraising GeneratorExit is not viable because it violates the inlining principle. The basic problem is that there's no way of telling the difference between a StopIteration that means "it's okay, I've finalized myself" and "I really mean to return normally here". 
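The ambiguity Greg describes is easy to demonstrate with throw() on plain generators (a Python 3 sketch; the names are illustrative): both generators below surface an identical StopIteration to their caller.

```python
def self_finalizing():
    try:
        yield 1
    except GeneratorExit:
        return        # "it's okay, I've finalized myself"

def finished_normally():
    yield 1
    return            # "I really mean to return normally here"

g = self_finalizing()
next(g)
try:
    g.throw(GeneratorExit)
except StopIteration:
    # Indistinguishable from the StopIteration raised when
    # finished_normally() simply runs off the end.
    print("subiterator looks as if it returned normally")
```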
-- Greg From ncoghlan at gmail.com Sun Mar 29 01:24:01 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 10:24:01 +1000 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: Message-ID: <49CEBFA1.2050109@gmail.com> Mark Donald wrote: > Unless something has changed in Python 3+, I believe Nick's idiom > requires the generic handler code to be copied into a second except > clause to achieve identical behaviour, as follows... Guido already said this, but yes, something did change in 3.0: unlike the 2.x series, the raise statement in 3.x only accepts instances of BaseException, so having both an "except BaseException:" clause and a bare "except:" clause becomes redundant. Running 2.x code with the -3 flag to enable Py3k deprecation warnings actually points this out whenever a non-instance of BaseException is raised. 'Normal' exceptions are encouraged to inherit from Exception, with only 'terminal' exceptions (currently only SystemExit, GeneratorExit, KeyboardInterrupt) outside that hierarchy. I agree that in 2.x, this means that if you want to handle non-Exception exceptions along with well-behaved exceptions, you need to use sys.exc_info() to adapt my previous example:

except:
    _et, e, _tb = sys.exc_info()
    if isinstance(e, Cheese):
        # handle cheese
    # handle all manner of stuff, including cheese

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 29 01:40:38 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 10:40:38 +1000 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <874oxds21j.fsf@benfinney.id.au> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> Message-ID: <49CEC386.4050204@gmail.com> Ben Finney wrote: > P?ter Szab?
writes: > >> If Python had method decorators @final (meaning: it is an error to >> override this method in any subclass) > > What use case is there for this? It would have to be quite strong to > override the Python philosophy that "we're all consenting adults > here", and that the programmer of the subclass is the one who knows > best whether a method needs overriding. Agreed - the base class author has no right to tell subclass authors that they *can't* do something. They can give hints that something shouldn't be messed with by using a leading underscore and leaving it undocumented (or advising against overriding it in the documentation). That said, if a @suggest_final decorator was paired with an @override_final decorator, I could actually see the point: one thing that can happen with undocumented private methods and attributes in a large class hierarchy is a subclass *accidentally* overriding them, which can then lead to bugs which are tricky to track down (avoiding such conflicts is actually one of the legitimate use cases for name mangling). A suggest_final/override_final decorator pair would flag accidental naming conflicts in complicated hierarchies at class definition time, while still granting the subclass author the ability to replace the nominally 'final' methods if they found it necessary. >> and @override (meaning: it is an error not having this method in a >> superclass) > > I'm not sure I understand this one, but if I'm right this is supported > now with: >
> class FooABC(object):
>     def frobnicate(self):
>         raise NotImplementedError("Must be implemented in derived class")

Even better:

>>> import abc
>>> class FooABC():  # use metaclass keyword arg in Py3k
...     __metaclass__ = abc.ABCMeta
...     @abc.abstractmethod
...     def must_override(self):
...         raise NotImplementedError
...
>>> x = FooABC()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class FooABC with abstract methods must_override
>>> class Fail(FooABC): pass
...
>>> x = Fail()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Fail with abstract methods must_override
>>> class Succeed(FooABC):
...     def must_override(self):
...         print "Overridden!"
...
>>> x = Succeed()
>>> x.must_override()
Overridden!

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From greg.ewing at canterbury.ac.nz Sun Mar 29 01:47:01 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Mar 2009 12:47:01 +1200 Subject: [Python-ideas] Yield-From: GeneratorReturn exception In-Reply-To: <49CEBCD5.7020107@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> Message-ID: <49CEC505.4060906@canterbury.ac.nz> While attempting to update the PEP to incorporate a GeneratorReturn exception, I've
thought of a potential difficulty in making the exception type depend on whether the return statement had a value. Currently the StopIteration exception is created after the return statement has unwound the stack frame, by which time we've lost track of whether it had an expression. -- Greg From ptspts at gmail.com Sun Mar 29 01:48:00 2009 From: ptspts at gmail.com (=?ISO-8859-1?Q?P=E9ter_Szab=F3?=) Date: Sun, 29 Mar 2009 01:48:00 +0100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> Message-ID: <4fa38b910903281748u6efd745ai22c55fc97187e034@mail.gmail.com> Hi, Thanks for pointing out that a @final decorator in the superclass can be an obstacle for code reuse if we assume that the author of the subclass has no (or has only limited) control over the code of the superclass. I'll come up with features with which the subclass can bypass some or all decorators imposed in the superclass. (One easy way to do this right now is saying ``__metaclass__ = type'' in the subclass.) I agree that the programmer of the subclass is the one who knows best whether a method needs overriding. We have to give the programmer the power to enforce his will. But I think the errors coming from the decorators are very useful for notifying the programmer of the subclass that he is trying to do something unexpected -- then he should make his decision to reconsider or enforce (e.g. @override_final as suggested by Nick Coghlan). By raising an error we inform the programmer that there is a decision he has to make. I think using a metaclass for implementing the checks based on the decorator is more appropriate than just overriding __new__ -- because we want to hook class creation (for which a metaclass is the adequate choice), not instance creation (for which overriding __new__ is the right choice).
I definitely don't want any check at instance creation time, not even once. If I managed to create the class, it should be possible to create instances from it without decorator checks. By the way, as far as I can imagine, using __new__ instead of the metaclass wouldn't make the implementations I can come up with simpler or shorter. A nice English docstring saying ``please don't override this method'' wouldn't make me happy. In my use case a few programmers including me are co-developing a fairly complex system in Python. There are tons of classes, tons of methods, each of them with docstrings. When I add some methods, I sometimes assume @final or @override, and I'm sure the system would break or function incorrectly if somebody added a subclass or changed a superclass ignoring my assumptions. Let's suppose this happens, but we don't notice it early enough; it becomes obvious only days or weeks later that the system cannot work this way, and the original reason of the problem was that somebody ignored a @final or @override assumption, because he didn't pay close attention to the thousands of docstrings. So we waste hours or days fixing the system. How can we avoid this problem in the future?

Option A. Rely on writing and reading docstrings, everybody always correctly.
Option B. Get an exception if a @final or @override assumption is violated.

Option B is acceptable for me, Option A is not, because with option A there is no guarantee that the overlooking won't happen again. With Option B the programmer gets notified early, and he can reconsider his code or refactor my code early, much faster than fixing it weeks later.
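A check of this kind can be hooked into class creation with a metaclass, as described above. The following is a minimal Python 3 sketch, not an implementation from this thread; `final` and `EnforceFinal` are names invented for the example:

```python
def final(method):
    # Mark the method; the metaclass below looks for this attribute.
    method.__final__ = True
    return method

class EnforceFinal(type):
    def __new__(mcls, name, bases, ns):
        # Walk every ancestor of every base, looking for marked methods
        # that the new class namespace tries to redefine.
        for base in bases:
            for klass in base.__mro__:
                for attr, value in vars(klass).items():
                    if getattr(value, "__final__", False) and attr in ns:
                        raise TypeError(
                            "%s overrides final method %s.%s"
                            % (name, klass.__name__, attr))
        return super().__new__(mcls, name, bases, ns)

class Base(metaclass=EnforceFinal):
    @final
    def frobnicate(self):
        return 42

try:
    class Careless(Base):
        def frobnicate(self):          # flagged at class definition time
            return 0
except TypeError as e:
    print(e)                           # Careless overrides final method Base.frobnicate
```

The check runs exactly once, when the subclass is defined, so instance creation stays untouched; an @override_final escape hatch could simply clear the marker before the class body is executed.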
Best regards, Péter From ncoghlan at gmail.com Sun Mar 29 01:56:44 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 10:56:44 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CEBCD5.7020107@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> Message-ID: <49CEC74C.4030508@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> Generators that catch and do anything with GeneratorExit other than turn >> it into StopIteration are almost always going to be broken - the new >> expression needs to avoid making it easy to do that accidentally. > > However, as this example shows, the suggested solution > of reraising GeneratorExit is not viable because it > violates the inlining principle. > > The basic problem is that there's no way of telling the > difference between a StopIteration that means "it's okay, > I've finalized myself" and "I really mean to return > normally here".
Well, there is a way to tell the difference - if we just threw GeneratorExit in, then it finalised itself, otherwise it is finishing normally. The only question is what to do in the outer scope in the first case.

1. Accept the StopIteration as a normal termination of the subiterator and continue execution of the delegating generator instead of finalising it. This is very bad as it will lead to any generator that yields again after a yield from expression almost certainly being broken [1].
2. Reraise the original GeneratorExit.
3. Reraise the subiterator's StopIteration exception.
4. Return immediately from the delegating generator.

I actually quite like option 4, as I believe it best reflects what the subiterator has done by trapping GeneratorExit and turning it into "normal" termination of the subiterator, without creating a situation where generators that use yield from are likely to accidentally ignore GeneratorExit. Cheers, Nick. [1] By "broken" in this context, I mean "close() will raise RuntimeError", as would occur if Jacob's example used "yield val" instead of "print val", or as occurs in the following normal generator:

>>> def gen():
...     try:
...         yield
...     except GeneratorExit:
...         pass
...     yield
...
>>> g = gen()
>>> g.next()
>>> g.close()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: generator ignored GeneratorExit

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 29 03:00:24 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 11:00:24 +1000 Subject: [Python-ideas] Yield-From: GeneratorReturn exception In-Reply-To: <49CEC505.4060906@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CEC505.4060906@canterbury.ac.nz> Message-ID: <49CEC828.9060608@gmail.com> Greg Ewing wrote: > While attempting to update the PEP to incorporate a > GeneratorReturn exception, I've thought of a potential > difficulty in making the exception type depend on > whether the return statement had a value. > > Currently the StopIteration exception is created after > the return statement has unwound the stack frame, by > which time we've lost track of whether it had an > expression.
Does it become easier if "return None" raises StopIteration instead of raising GeneratorReturn(None)? I think I'd prefer that to having to perform major surgery on the eval loop to make it do something else... (Guido may have other ideas, obviously). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 29 03:09:44 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 11:09:44 +1000 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <4fa38b910903281748u6efd745ai22c55fc97187e034@mail.gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> <4fa38b910903281748u6efd745ai22c55fc97187e034@mail.gmail.com> Message-ID: <49CECA58.9030204@gmail.com> Péter Szabó wrote: > (One easy way > to do this right now is saying ``__metaclass__ = type'' in the > subclass.) Actually, that doesn't work as you might think...

>>> class TryNotToBeAnABC(FooABC):
...     __metaclass__ = type
...
>>> type(TryNotToBeAnABC)
<class 'abc.ABCMeta'>

The value assigned to '__metaclass__' (or the metaclass keyword argument in Py3k) is only one candidate metaclass that the metaclass determination algorithm considers - the metaclasses of all base classes are also candidates, and the algorithm picks the one which is a subclass of all of the candidate classes. If none of the candidates meets that criterion, then it complains loudly:

>>> class OtherMeta(type): pass
...
>>> class TryNotToBeAnABC(FooABC):
...     __metaclass__ = OtherMeta
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Error when calling the metaclass bases
    metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

And remember, as far as @overrides goes, I believe @abc.abstractmethod already does what you want - it's only the @suggest_final/@override_final part of the idea that doesn't exist. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From guido at python.org Sun Mar 29 04:50:16 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 21:50:16 -0500 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CD680D.2020502@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> <49CCB3A9.40300@canterbury.ac.nz> <49CCC3C3.4050100@gmail.com> <49CD680D.2020502@canterbury.ac.nz> Message-ID: On Fri, Mar 27, 2009 at 6:58 PM, Greg Ewing wrote: > Nick Coghlan wrote: >> >> The part that >> isn't clicking for me is that I still don't understand *why* 'yield >> from' should include implicit finalisation as part of its definition. > >> >> It's the generalisation of that to all other iterators that happen to >> offer a close() method that seems somewhat arbitrary. > > It's a matter of opinion. I would find it surprising if > generators behaved differently from all other iterators > in this respect. It would be un-ducktypish. > > I think we need a BDFL opinion to settle this one. To be honest, I don't follow this in detail yet, but I believe I don't really care that much either way, and I'd like to recommend that you do whatever makes the specification (and hence hopefully the implementation) have the least special cases.
There are several Python Zen rules about this. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Mar 29 04:52:44 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 21:52:44 -0500 Subject: [Python-ideas] Yield-From: GeneratorReturn exception In-Reply-To: <49CEC828.9060608@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CEC505.4060906@canterbury.ac.nz> <49CEC828.9060608@gmail.com> Message-ID: On Sat, Mar 28, 2009 at 8:00 PM, Nick Coghlan wrote: > Greg Ewing wrote: >> While attempting to update the PEP to incorporate a >> GeneratorReturn exception, I've thought of a potential >> difficulty in making the exception type depend on >> whether the return statement had a value. >> >> Currently the StopIteration exception is created after >> the return statement has unwound the stack frame, by >> which time we've lost track of whether it had an >> expression. > > Does it become easier if "return None" raises StopIteration instead of > raising GeneratorReturn(None)? > > I think I'd prefer that to having to perform major surgery on the eval > loop to make it do something else... (Guido may have other ideas, > obviously). I think my first response on this (yesterday?) already mentioned that I didn't mind so much whether "return None" was treated more like "return" or more like "return <value>". So please do whatever can be implemented easily.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Mar 29 04:55:23 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 21:55:23 -0500 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <49CEC386.4050204@gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> <49CEC386.4050204@gmail.com> Message-ID: On Sat, Mar 28, 2009 at 7:40 PM, Nick Coghlan wrote: > Ben Finney wrote: >> Péter Szabó writes: >> >>> If Python had method decorators @final (meaning: it is an error to >>> override this method in any subclass) >> >> What use case is there for this? It would have to be quite strong to >> override the Python philosophy that "we're all consenting adults >> here", and that the programmer of the subclass is the one who knows >> best whether a method needs overriding. > > Agreed - the base class author has no right to tell subclass authors > that they *can't* do something. I'm sorry, but this is going too far. There are plenty of situations where, indeed, this ought to be only a hint, but I think it goes too far to say that a base class can never have the last word about something. Please note that I already suggested this be put in a 3rd party package -- I'm not about to make these builtins.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Sun Mar 29 05:14:42 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 13:14:42 +1000 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> <49CEC386.4050204@gmail.com> Message-ID: <49CEE7A2.8090604@gmail.com> Guido van Rossum wrote: > On Sat, Mar 28, 2009 at 7:40 PM, Nick Coghlan wrote: >> Agreed - the base class author has no right to tell subclass authors >> that they *can't* do something. > > I'm sorry, but this is going too far. There are plenty of situations > where, indeed, this ought to be only a hint, but I think it goes too > far to say that a base class can never have the last word about > something. Sorry, what I wrote was broader in scope than what I actually meant. I only intended to refer to otherwise arbitrary non-functional constraints like marking elements of the base as "private" or "final" without giving a subclass author a way to override them (after all, even name mangling can be reversed with sufficient motivation). A base class obviously needs to impose some real constraints on subclasses in practice, or it isn't going to be very useful (if nothing else, it needs to set down the details of the shared API). Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From guido at python.org Sun Mar 29 05:18:38 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 22:18:38 -0500 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <49CEE7A2.8090604@gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> <49CEC386.4050204@gmail.com> <49CEE7A2.8090604@gmail.com> Message-ID: On Sat, Mar 28, 2009 at 10:14 PM, Nick Coghlan wrote: > Guido van Rossum wrote: >> On Sat, Mar 28, 2009 at 7:40 PM, Nick Coghlan wrote: >>> Agreed - the base class author has no right to tell subclass authors >>> that they *can't* do something. >> >> I'm sorry, but this is going too far. There are plenty of situations >> where, indeed, this ought to be only a hint, but I think it goes too >> far to say that a base class can never have the last word about >> something. > > Sorry, what I wrote was broader in scope than what I actually meant. I > only intended to refer to otherwise arbitrary non-functional constraints > like marking elements of the base as "private" or "final" without giving > a subclass author a way to override them (after all, even name mangling > can be reversed with sufficient motivation). To paraphrase a cliché: "Having 'private' (or 'final') in a language doesn't cause unusable software. People using 'private' (or 'final') indiscriminately cause unusable software." :-) > A base class obviously needs to impose some real constraints on > subclasses in practice, or it isn't going to be very useful (if > nothing else, it needs to set down the details of the shared API).
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From foobarmus at gmail.com Sun Mar 29 07:06:29 2009 From: foobarmus at gmail.com (Mark Donald) Date: Sun, 29 Mar 2009 13:06:29 +0800 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: <49CEBFA1.2050109@gmail.com> References: <49CEBFA1.2050109@gmail.com> Message-ID: > Guido already said this, but yes, something did change in 3.0: unlike > the 2.x series, the raise statement in 3.x only accepts instances of > BaseException, so having both an "except BaseException:" clause and a > bare "except:" clause becomes redundant. Ah, apologies... I need to update myself. 'uncatch' to subsequently execute the else clause is still going to be impossible, but I don't have a real-world need for that as yet. Cheers From ncoghlan at gmail.com Sun Mar 29 08:33:31 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 16:33:31 +1000 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: <49CEBFA1.2050109@gmail.com> Message-ID: <49CF163B.90409@gmail.com> Mark Donald wrote: >> Guido already said this, but yes, something did change in 3.0: unlike >> the 2.x series, the raise statement in 3.x only accepts instances of >> BaseException, so having both an "except BaseException:" clause and a >> bare "except:" clause becomes redundant. > > Ah, apologies... I need to update myself. > > 'uncatch' to subsequently execute the else clause is still going to be > impossible, but I don't have a real-world need for that as yet. If you really find yourself doing this kind of exception interrogation a lot, you may find it easier to do it in the __exit__ method of a context manager. Those are *always* invoked regardless of how the with statement ends and you can then do whatever flow control you like based on the type of the first argument (and whether or not it is None). Cheers, Nick. 
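Nick's __exit__ pattern can be sketched like this (Python 3; `Cushion` and the use of ZeroDivisionError are invented for the example, standing in for whatever exception family is being interrogated):

```python
class Cushion:
    """Context manager whose __exit__ inspects the exception, if any."""
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            return False                  # block finished normally
        if issubclass(exc_type, ZeroDivisionError):
            print("handled:", exc)
            return True                   # suppress; resume after the block
        return False                      # anything else propagates

with Cushion():
    1 / 0                                 # intercepted in __exit__
print("execution continues here")
```

Because __exit__ is always called, the flow-control decision is made in exactly one place, whether the block raised, returned, or fell off the end.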
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From Scott.Daniels at Acm.Org Sun Mar 29 09:16:54 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 29 Mar 2009 00:16:54 -0700 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: Scott David Daniels wrote: > Péter Szabó wrote: >> If Python had method decorators @final (meaning: it is an error to >> override this method in any subclass) and @override (meaning: it is an >> error not having this method in a superclass), I would use them in my >> projects (some of them approaching 20 000 lines of Python code) and >> I'll feel more confident writing object-oriented Python code.... > > I have no idea why you want these, and severe trepidation about > dealing with code that uses them "just to be safe." It smacks of > the over-use I see of doubled underscores. For @override, just > because you've built a base class for one kind of object does not > mean I have not thought of an interesting way to use 40% of your > code to accomplish my own end. Why make me cut and paste? You > are not responsible for the correctness of my flea-brained idea > whether I inherit from your class or not. For @final, "how dare > you" for similar reasons. Java at least has an excuse (compilation > can proceed differently). I was asked off-group to give an example where use of @override prevents reusing some code. First, the above is an overstatement of my case, probably an attempt to "bully" you off that position. For that bullying, I apologize. Second, what follows below is one example of what @overrides prevents me from doing.
Say you've built a class named "MostlyAbstract" with comparisons:

class MostlyAbstract(object):
    @override
    def __hash__(self):
        pass
    @override
    def __lt__(self, other):
        pass
    @override
    def __eq__(self, other):
        pass
    def __le__(self, other):
        return self.__lt__(other) or self.__eq__(other)
    def __gt__(self, other):
        return other.__lt__(self)
    def __ge__(self, other):
        return self.__gt__(other) or self.__eq__(other)

and I decide the comparison should work a bit differently:

class MostAbstract(MostlyAbstract):
    def __gt__(self, other):
        return not self.__le__(other)

This choice of mine won't work, even when I'm trying to just do a slight change to your abstraction. Similarly, if I want to monkey-patch in a debugging print or two, I cannot do it without having to create a bunch of vacuous implementations. Also, a @final will prevent me from sneaking in an extra print when I'm bug-chasing. That being said, a mechanism like the following could be used as a facility to implement your two desires, by providing a nice simple place called as each class definition is completed:

class Type(type):
    '''A MetaClass to call __initclass__ for freshly defined classes.'''
    def __new__(class_, name, supers, methods):
        if '__initclass__' in methods and not isinstance(
                methods['__initclass__'], classmethod):
            method = methods['__initclass__']
            methods['__initclass__'] = classmethod(method)
        return type.__new__(class_, name, supers, methods)
    def __init__(self, name, supers, methods):
        type.__init__(self, name, supers, methods)
        if hasattr(self, '__initclass__'):
            self.__initclass__()

In 2.5, for example, you'd use it like:

class Foo(SomeParent):
    __metaclass__ = Type
    def __initclass__(self):
        ...

--Scott David Daniels Scott.Daniels at Acm.Org From jh at improva.dk Sun Mar 29 14:33:51 2009 From: jh at improva.dk (Jacob Holm) Date: Sun, 29 Mar 2009 14:33:51 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CEBCD5.7020107@canterbury.ac.nz> References:
<49AB1F90.7070201@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> Message-ID: <49CF6AAF.70109@improva.dk> Greg Ewing wrote: > Nick Coghlan wrote: > >> Generators that catch and do anything with GeneratorExit other than turn >> it into StopIteration are almost always going to be broken - the new >> expression needs to avoid making it easy to do that accidentally. > > However, as this example shows, the suggested solution > of reraising GeneratorExit is not viable because it > violates the inlining principle. > > The basic problem is that there's no way of telling the > difference between a StopIteration that means "it's okay, > I've finalized myself" and "I really mean to return > normally here". > Would it be possible to attach the current exception (if any) to the StopIteration/GeneratorReturn raised by a return statement in a finally clause? (Using the __traceback__ and __cause__ attributes from PEP-3134) Then the PEP expansion could check for and reraise the attached exception. Now that I think about it, this is almost required by the inlining/refactoring principle. 
Consider this example:

def inner():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        return 'VALUE'

def outer():
    val = yield from inner()
    print val

Which I think should be equivalent to:

def outer():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        val = 'VALUE'
    print val

The problem is that any exception thrown into inner is converted to a GeneratorReturn, which is then swallowed by the yield-from instead of being reraised.

- Jacob

From castironpi-ng at comcast.net Sun Mar 29 15:42:22 2009
From: castironpi-ng at comcast.net (castironpi-ng at comcast.net)
Date: Sun, 29 Mar 2009 13:42:22 +0000 (UTC)
Subject: [Python-ideas] python-like garbage collector & workaround
In-Reply-To: <49CEA57B.5070004@canterbury.ac.nz>
Message-ID: <318523839.906611238334142946.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net>

----- Original Message -----
From: "Greg Ewing" < greg.ewing at canterbury.ac.nz >
To: castironpi-ng at comcast.net
Cc: Python-ideas at python.org
Sent: Saturday, March 28, 2009 5:32:27 PM GMT -06:00 US/Canada Central
Subject: Re: [Python-ideas] python-like garbage collector & workaround

castironpi-ng at comcast.net wrote:
> I'm considering a workaround that performs GC in two steps. First, it
> requests the objects to drop their references that participate in the
> cycle. Then, it enqueues the decref'ed object for an unnested
> destruction.

I don't see how that solves anything. The problem is that the destructors might depend on other objects in the cycle that have already been deallocated. Deferring the calling of the destructors doesn't help with that.

The only thing that will help is decoupling the destructor from the object being destroyed. You can do that now by storing a weak reference to the object with the destructor as a callback. But the destructor needs to be designed so that it can work without holding any reference to the object being destroyed, since it will no longer exist by the time the destructor is called.
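For example, a minimal sketch of this pattern — the Node class, the log list and the callback here are invented for illustration, not anything from the collector itself — looks like:

```python
import gc
import weakref

class Node(object):
    """Two Nodes referencing each other form a reference cycle."""
    def __init__(self, name):
        self.name = name
        self.partner = None

log = []

def make_cycle():
    a, b = Node("A"), Node("B")
    a.partner, b.partner = b, a
    # The callback closes over only the data it needs (the 'log' list
    # and a string), never over 'a' itself, so it can run safely after
    # 'a' has already been deallocated.
    return weakref.ref(a, lambda wr: log.append("A finalized"))

ref = make_cycle()
gc.collect()            # the cycle is garbage; collecting it fires the callback
assert ref() is None
assert log == ["A finalized"]
```

Note the weakref object itself lives outside the cycle, which is what lets the collector invoke its callback when the cyclic garbage is reclaimed.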
-- Greg

====================================

Nice response time.

> Deferring the calling of the destructors doesn't help with that.

I beg to differ. There is a complex example in the test code at the address. Here is a simple one. 'A' has a reference to 'B' and 'B' has a reference to 'A'. They both need to call each other's methods during their respective finalizations.

1. Ref counts: A-1, B-1
2. Request A to drop ref. to B.
3. Ref counts: A-1, B-0.
4. Finalize & deallocate B.
5. ... B drops ref. to A
6. Ref counts: A-0
7. Finalize & deallocate A.

'A' performs its final call to 'B' in step 2, still having a reference to it. It empties the attribute of its own that refers to B. 'B's reference count goes to 0.

'B' performs its final call to 'A' in step 5, still having a reference to it. 'A' still has control of its fields, and can make remaining subordinate calls if necessary. 'B' releases its reference to 'A', and 'A's reference count goes to zero. 'B' is deallocated.

'A' performs its finalization, and should check its field to see if it still has the reference to B. If it did, it would perform the call in step 2. In this case, it doesn't, and it can keep a record of the fact that it already made that final call. 'A's finalizer exits without any calls to 'B', because the field that held its reference to 'B' is clear. 'A' is deallocated.

> But the destructor needs to be designed so
> that it can work without holding any reference to the
> object being destroyed

I want to give 'A' control of that. To accomplish this, I bring to A's attention the fact that it has left reachability, /and/ is in a cycle with B. It can perform its normal finalization at this time and maintain its consistency of state. I believe it solves the problem of failing to call the destructor at all, but I may have just shirked it. Will it work?

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jh at improva.dk Sun Mar 29 17:47:23 2009
From: jh at improva.dk (Jacob Holm)
Date: Sun, 29 Mar 2009 17:47:23 +0200
Subject: [Python-ideas] Yield-From: Finalization guarantees
In-Reply-To: <49CF6AAF.70109@improva.dk>
References: <49AB1F90.7070201@canterbury.ac.nz> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk>
Message-ID: <49CF980B.6030400@improva.dk>

Jacob Holm wrote:
>
> Would it be possible to attach the current exception (if any) to the
> StopIteration/GeneratorReturn raised by a return statement in a
> finally clause? (Using the __traceback__ and __cause__ attributes from
> PEP-3134) Then the PEP expansion could check for and reraise the
> attached exception.

Based on that idea, here is the 3.0-based expansion I propose:

_i = iter(EXPR)
try:
    _t = None
    _y = next(_i)
    while 1:
        try:
            _s = yield _y
        except BaseException as _e:
            _t = _e
            _m = getattr(_i, 'throw', None)
            if _m is None:
                raise
            _y = _m(_t)
        else:
            _t = None
            if _s is None:
                _y = next(_i)
            else:
                _y = _i.send(_s)
except StopIteration as _e:
    if _e is _t:
        # If _e is the exception that we have just thrown to the
        # subiterator, reraise it.
        if _m is None:
            # If there was no "throw" method, explicitly close the
            # iterator before reraising.
            _m = getattr(_i, 'close', None)
            if _m is not None:
                _m()
        raise
    if _e.__cause__ is not None:
        # If the return was from inside a finally clause with an active
        # exception, reraise that exception.
        raise _e.__cause__
    # Normal return
    RESULT = _e.value

I have moved the code around a bit to use fewer try blocks while preserving semantics, then removed the check for GeneratorExit and added a different check for __cause__. Even if the __cause__ idea is shot down, I think I prefer the way this expansion reads. It makes it easier to see at a glance what is part of the loop and what is part of the cleanup.

What do you think?

- Jacob

From rhamph at gmail.com Sun Mar 29 20:04:53 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Sun, 29 Mar 2009 12:04:53 -0600
Subject: [Python-ideas] About adding a new iterator method called "shuffled"
In-Reply-To: <20090328154008.GB7421@panix.com>
References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <87eiwjy7qc.fsf@xemacs.org> <20090328154008.GB7421@panix.com>
Message-ID:

On Sat, Mar 28, 2009 at 9:40 AM, Aahz wrote:
> On Fri, Mar 27, 2009, Adam Olsen wrote:
>>
>> The irony is that we only seed with 128 bits, so rather than 2**19937
>> combinations, there's just 2**128. That drops our "safe" list size
>> down to 34. Weee!
>
> That's probably worth a bug report or RFE if one doesn't already exist.

It seems sufficient to me. We don't want to needlessly drain the system's entropy pool.

How about a counter proposal? We add an orange or red box in the random docs that explains a few things together:

* What a cryptographically secure RNG is, that ours isn't one, and that ours is unacceptable any time money or security is involved.
* Specifically, 624 "iterates" allows you to predict the full state, and thus all future (and past?) output * The limitations of our default seed, and how it isn't a practical problem, overshadowed by the above two things * The limitations on shuffling a large list, how equidistance means it's not a practical problem, and is overshadowed by all of the above Some of that already exists, but is inline. IMO, security issues deserve a few flashing lights. The context of other problems also gives the proper light to shuffling's limitations. -- Adam Olsen, aka Rhamphoryncus From ptspts at gmail.com Sun Mar 29 20:37:38 2009 From: ptspts at gmail.com (=?ISO-8859-1?Q?P=E9ter_Szab=F3?=) Date: Sun, 29 Mar 2009 20:37:38 +0200 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com> > class MostlyAbstract(object): > @override > def __hash__(self, other): > pass > @override > def __lt__(self, other): > pass > @override > def __eq__(self, other): > pass > def __le__(self, other): > return self.__lt__(other) or self.__eq__(other) > def __gt__(self, other): > return other.__lt__(self) > def __ge__(self, other): > return self.__gt__(other) or self.__eq__(other) > > and I decide then comparison should works a bit differently: > > class MostAbstract(MostlyAbstract): > def __gt__(self, other): > return not self.__le__(self) > > > This choice of mine won't work, even when I'm trying to just do a > slight change to your abstraction. I think we have a different understanding what @override means. I define @override like this: ``class B(A): @override def F(self): pass'' is OK only if A.F is defined, i.e. there is a method F to override. What I understand about your mails is that your definition is: if there is @override on A.F, then any subclass of A must override A.F. 
Do I get the situation of the different understanding right? If so, do you find anything in my definition which prevents code reuse? (I don't.)

> def __init__(self, name, supers, methods):
>     type.__init__(self, name, supers, methods)
>     if hasattr(self, '__initclass__'):
>         self.__initclass__()

Thanks for the idea, this sounds generic enough for various uses, and it gives power to the author of the subclass. I'll see if my decorators can be implemented using __initclass__.

From Scott.Daniels at Acm.Org Sun Mar 29 21:26:16 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sun, 29 Mar 2009 12:26:16 -0700
Subject: [Python-ideas] method decorators @final and @override in Python 2.4
In-Reply-To: <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com>
References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com>
Message-ID:

Péter Szabó wrote:
...
> I think we have a different understanding what @override means. I
> define @override like this: ``class B(A): @override def F(self):
> pass'' is OK only if A.F is defined, i.e. there is a method F to
> override. What I understand about your mails is that your definition
> is: if there is @override on A.F, then any subclass of A must override
> A.F. Do I get the situation of the different understanding right? If
> so, do you find anything in my definition which prevents code reuse?
> (I don't.)

Nor do I. I completely misunderstood what you meant by override, and I agree that what you are specifying there _is_ a help to those writing code (I'd document it as a way of marking an intentional override).

As to @final, I'd prefer a warning to an error when I override a final method. Overriding is a rich way of debugging, and if the point is to catch coding "misteaks", ignoring warnings is easier than changing package code when debugging.
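To make that concrete, here is a rough sketch of both checks — every name in it is invented for illustration (a real version would presumably hook into a metaclass like the Type class posted earlier), and @final is deliberately a warning rather than an error:

```python
import warnings

def override(method):
    """Mark a method as intentionally overriding a base-class method."""
    method._is_override = True
    return method

def final(method):
    """Mark a method that subclasses should not replace."""
    method._is_final = True
    return method

def check_class(cls):
    """Verify @override/@final markers; usable as a class decorator."""
    bases = cls.__mro__[1:]
    for name, attr in vars(cls).items():
        if getattr(attr, '_is_override', False):
            # Some base class must already define the name.
            if not any(name in vars(base) for base in bases):
                raise TypeError('%s.%s overrides nothing'
                                % (cls.__name__, name))
        for base in bases:
            if getattr(vars(base).get(name), '_is_final', False):
                # A warning, not an error, so debugging overrides stay possible.
                warnings.warn('%s.%s overrides a @final method'
                              % (cls.__name__, name))
    return cls

class A(object):
    @final
    def f(self):
        return 1
    def g(self):
        return 2

@check_class
class B(A):               # fine: A.g exists
    @override
    def g(self):
        return 3

try:
    @check_class
    class C(A):           # error: no base class defines h
        @override
        def h(self):
            return 4
except TypeError as exc:
    caught = exc

with warnings.catch_warnings(record=True) as caught_warnings:
    warnings.simplefilter('always')
    @check_class
    class D(A):           # warns: A.f is marked @final
        def f(self):
            return 5
```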
--Scott David Daniels
Scott.Daniels at Acm.Org

From janssen at parc.com Sun Mar 29 22:45:57 2009
From: janssen at parc.com (Bill Janssen)
Date: Sun, 29 Mar 2009 13:45:57 PDT
Subject: [Python-ideas] method decorators @final and @override in Python 2.4
In-Reply-To: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com>
References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com>
Message-ID: <47913.1238359557@parc.com>

Péter Szabó wrote:
> If Python had method decorators @final (meaning: it is an error to
> override this method in any subclass) and @override (meaning: it is an
> error not having this method in a superclass), I would use them in my
> projects (some of them approaching 20 000 lines of Python code) and
> I'll feel more confident writing object-oriented Python code. Java
> already has similar decorators or specifiers. Do you think it is a
> good idea to have these in Python?

No on @final (I've had more trouble with ill-considered Java "final" classes than I can believe), but @override sounds interesting. I can see the point of that. Should do the check at compile time, right?

Bill

From steve at pearwood.info Sun Mar 29 23:30:43 2009
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 30 Mar 2009 08:30:43 +1100
Subject: [Python-ideas] method decorators @final and @override in Python 2.4
In-Reply-To:
References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com>
Message-ID: <200903300830.44820.steve@pearwood.info>

On Mon, 30 Mar 2009 06:26:16 am Scott David Daniels wrote:
> Péter Szabó wrote:
> ...
>
> > I think we have a different understanding what @override means. I
> > define @override like this: ``class B(A): @override def F(self):
> > pass'' is OK only if A.F is defined, i.e. there is a method F to
> > override. What I understand about your mails is that your
> > definition is: if there is @override on A.F, then any subclass of A
> > must override A.F.
Do I get the situation of the different > > understanding right? If so, do you find anything in my definition > > which prevents code reuse? (I don't.) > > Nor do I. I completely misunderstood what you meant by override, and > I agree that what you are specifying there _is_ a help to those > writing code (I'd document it as a way of marking an intentional > override). Perhaps I just haven't worked on enough 20,000 line projects, but I don't get the point of @override. It doesn't prevent somebody from writing (deliberately or accidentally) B.F in the absence of A.F, since the coder can simply leave off the @override. If @override is just a way of catching spelling mistakes, perhaps it would be better in pylint or pychecker. What have I missed? -- Steven D'Aprano From ncoghlan at gmail.com Sun Mar 29 23:46:15 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Mar 2009 07:46:15 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CF6AAF.70109@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> Message-ID: 
<49CFEC27.1050005@gmail.com>

Jacob Holm wrote:
> The problem is that any exception thrown into inner is converted to a
> GeneratorReturn, which is then swallowed by the yield-from instead of
> being reraised.

That actually only happens if inner *catches and suppresses* the thrown in exception. Otherwise throw() will reraise the original exception automatically:

>>> def gen():
...     try:
...         yield
...     except:
...         print "Suppressed"
...
>>> g = gen()
>>> g.next()
>>> g.throw(AssertionError)
Suppressed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

>>> def gen():
...     try:
...         yield
...     finally:
...         print "Not suppressed"
...
>>> g = gen()
>>> g.next()
>>> g.throw(AssertionError)
Not suppressed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in gen
AssertionError

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia ---------------------------------------------------------------

From Scott.Daniels at Acm.Org Mon Mar 30 00:03:03 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sun, 29 Mar 2009 15:03:03 -0700
Subject: [Python-ideas] method decorators @final and @override in Python 2.4
In-Reply-To: <200903300830.44820.steve@pearwood.info>
References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com> <200903300830.44820.steve@pearwood.info>
Message-ID:

Steven D'Aprano wrote:
> ... Perhaps I just haven't worked on enough 20,000 line projects, but I
> don't get the point of @override. It doesn't prevent somebody from
> writing (deliberately or accidentally) B.F in the absence of A.F, since
> the coder can simply leave off the @override.

? B.F vs. A.F? Could you expand this a trifle?

> If @override is just a way of catching spelling mistakes, perhaps it
> would be better in pylint or pychecker. What have I missed?
If, for example, you have a huge testing framework, and some developers are given the task of developing elements from the framework by (say) overriding the test_sources and test_outcome methods, they can be handed an example module with @override demonstrating where to make the changes.

class TestMondoDrive(DriveTestBase):

    @override
    def test_sources(self):
        return os.listdir('/standard/mondo/tests')

    @override
    def test_outcome(self, testname, outcome):
        if outcome != 'success':
            self.failures('At %s %s failed: %s' % (
                time.strftime('%Y.%m.%d %H:%M:%S'),
                testname, outcome))
        else:
            assert False, "I've no idea how to deal with success"

The resulting tests will be a bit easier to read, because you can easily distinguish between support methods and framework methods. Further, the entire warp drive test is not started if we stupidly spell the second "test_result" (as it was on the Enterprise tests).

--Scott David Daniels
Scott.Daniels at Acm.Org

From jh at improva.dk Mon Mar 30 00:21:19 2009
From: jh at improva.dk (Jacob Holm)
Date: Mon, 30 Mar 2009 00:21:19 +0200
Subject: [Python-ideas] Yield-From: Finalization guarantees
In-Reply-To: <49CFEC27.1050005@gmail.com>
References: <49AB1F90.7070201@canterbury.ac.nz> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk>
<49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49CFEC27.1050005@gmail.com> Message-ID: <49CFF45F.8090304@improva.dk> Nick Coghlan wrote: > Jacob Holm wrote: > >> The problem is that any exception thrown into inner is converted to a >> GeneratorReturn, which is then swallowed by the yield-from instead of >> being reraised. >> > > That actually only happens if inner *catches and suppresses* the thrown > in exception. Having a return in the finally clause like in my example is sufficient to suppress the exception. > Otherwise throw() will reraise the original exception > automatically: > I am not sure what your point is. Yes, this is a corner case. I am trying to make sure we have the corner cases working as well. In the example I gave I think it was pretty clear what should happen according to the inlining principle. The suppression of the initial exception is an accidental side effect of the refactoring. It looks to me like using the __cause__ attribute on the GeneratorReturn will allow us to reraise the exception. This seems like exactly the kind of thing that the __cause__ and __context__ attributes from PEP 3134 was designed for. - Jacob From steve at pearwood.info Mon Mar 30 00:29:28 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 30 Mar 2009 09:29:28 +1100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <200903300830.44820.steve@pearwood.info> Message-ID: <200903300929.28589.steve@pearwood.info> On Mon, 30 Mar 2009 09:03:03 am Scott David Daniels wrote: > Steven D'Aprano wrote: > > ... Perhaps I just haven't worked on enough 20,000 line projects, > > but I don't get the point of @override. 
It doesn't prevent somebody
> > from writing (deliberately or accidentally) B.F in the absence of
> > A.F, since the coder can simply leave off the @override.
>
> ? B.F vs. A.F? Could you expand this a trifle?

Classes B and A, method F. In case it is still unclear, I'm referencing Péter Szabó's post, which we both quoted:

> > I think we have a different understanding what @override means. I
> > define @override like this: ``class B(A): @override def F(self):
> > pass'' is OK only if A.F is defined, i.e. there is a method F to
> > override.

The intention is that this will fail:

class A:
    pass

class B(A):
    @override
    def F(self):
        pass

but this will be okay:

class A:
    def F(self):
        pass

class B(A):
    @override
    def F(self):
        pass

But if I leave out the @override then I can define B.F regardless of whether or not A.F exists, so it doesn't prevent the creation of B.F.

> > If @override is just a way of catching spelling mistakes, perhaps
> > it would be better in pylint or pychecker. What have I missed?
>
> If, for example, you have a huge testing framework, and some
> developers are given the task of developing elements from the
> framework by (say) overriding the test_sources and test_outcome
> methods, They can be handed an example module with @override
> demonstrating where to make the changes.

"# OVERRIDE" or "# TODO" will do that just as well.

> class TestMondoDrive(DriveTestBase):
>
>     @override
>     def test_sources(self):
>         return os.listdir('/standard/mondo/tests')
>
>     @override
>     def test_outcome(self, testname, outcome):
>         if outcome != 'success':
>             self.failures('At %s %s failed: %s' % (
>                 time.strftime('%Y.%m.%d %H:%M:%S'),
>                 testname, outcome))
>         else:
>             assert False, "I've no idea how to deal with success"
>
> The resulting tests will be a bit easier to read, because you can
> easily distinguish between support methods and framework methods.

Maybe so, but comments and/or naming conventions do that too.
> Further, the entire warp drive test is not started if we stupidly > spell the second "test_result" (as it was on the Enterprise tests). That's a reasonable benefit, but it still sounds to me like something that should go in pylint. I don't really have any objection to this, and Guido has already said it should go into a third party module first. Thank you for explaining the use-case. -- Steven D'Aprano From mrs at mythic-beasts.com Sun Mar 29 23:57:03 2009 From: mrs at mythic-beasts.com (Mark Seaborn) Date: Sun, 29 Mar 2009 22:57:03 +0100 (BST) Subject: [Python-ideas] CapPython's use of unbound methods In-Reply-To: References: <20090319.231249.343185657.mrs@localhost.localdomain> Message-ID: <20090329.225703.432823651.mrs@localhost.localdomain> Guido van Rossum wrote: > On Thu, Mar 19, 2009 at 4:12 PM, Mark Seaborn wrote: > > Guido van Rossum wrote: > >> > Guido said, "I don't understand where the function object f gets its > >> > magic powers". > >> > > >> > The answer is that function definitions directly inside class > >> > statements are treated specially by the verifier. > >> > >> Hm, this sounds like a major change in language semantics, and if I > >> were Sun I'd sue you for using the name "Python" in your product. :-) > > > > Damn, the makers of Typed Lambda Calculus had better watch out for > > legal action from the makers of Lambda Calculus(tm) too... :-) ?Is it > > really a major change in semantics if it's just a subset? ;-) > > Well yes. The empty subset is also a subset. :-) As a side note, it is interesting to compare CapPython to ECMAScript 3.1's strict mode, which, as I understand it, changes the semantics of ECMAScript's attribute access such that doing X.A when X does not have an attribute A raises an exception rather than returning undefined. Since existing Javascript implementations lack this feature, Cajita (a fail-stop subset of Javascript, part of the Caja project) has to go to some lengths to emulate it. 
This seems to be the main reason that Cajita rewrites Javascript code, to add attribute existence checks. Fortunately CapPython does not have to make this kind of semantic change. Interestingly, in Javascript is is easier to add this kind of change on a per-module basis than in Python, because dynamic attribute access in Javascript is done via a builtin syntax (x[a]) rather than via a function (getattr in Python). However, CPython's restricted execution mode (which Tav is proposing to resurrect) does change the semantics of attribute access. It's not yet clear to me how this works, and how it applies to the getattr function. I suspect it involves looking up the stack. > More seriously, IIUC you are disallowing all use of attribute names > starting with underscores, which not only invalidates most Python > code in practical use (though you might not care about that) but > also disallows the use of many features that are considered part of > the language, such as access to __dict__ and many other > introspective attributes. This is true. I'm not claiming that a lot of Python code will pass the verifier. It might not accept all idiomatic code; I'm just claiming that code using encapsulated objects under CapPython can still be idiomatic. We could probably allow reading self.__dict__ safely in CapPython. The term "introspection" covers a lot of language features. Some are OK in an object-capability language and some are not. For example, some might consider dir() to be an introspective feature, and this function is fine if suitably wrapped. x.__class__.__name__ is a common idiom. Although we can't allow x.__class__ on its own, we could provide a get_class_name function and rewrite "x.__class__.__name__" to "get_class_name(x)". "type(x) is C" is another common idiom. Again, CapPython doesn't provide type() but it can provide a type_is() function: def type_is(x, t): return type(x) is t The "locals" builtin is not something CapPython can allow in general. 
Any function that can look up the stack in this way is potentially dangerous. But it might be OK to allow "locals()", i.e. the case where "locals" is called as a function and not used as a first class value. I would prefer not to have to do that though.

> > To some extent the verifier's check of only accessing private
> > attributes through self is just checking a coding style that I already
> > follow when writing Python code (except sometimes for writing test
> > cases).
>
> You might wish this to be true, but for most Python programmers, it
> isn't. Introspection is a commonly-used part of the language (probably
> more so than in Java). So is the use of attribute names starting with
> a single underscore outside the class tree, e.g. by "friend"
> functions.

The friend function pattern is an example of something that CapPython could support, with some extra notation in order to make it explicit. It is a case of what is known as rights amplification in capability systems.

Here's an example of how I envisage it would work in CapPython:

class C(object):
    def _get_foo(self):
        return self._foo
_get_foo = C._get_foo

Although C._get_foo would normally be rejected, the verifier would allow reading C._get_foo immediately after the class definition as a special case. The resulting _get_foo function would only be able to operate on instances of C (assuming the presence of unbound methods in the language).

> > Of course some of the verifier's checks, such as only allowing
> > attribute assignments through self, are a lot more draconian than
> > coding style checks.
>
> That also sounds like a rather serious hindrance to writing Python as
> most people think of it.

Attribute assignment is something that we could handle by rewriting. For example, x.y = z could be rewritten to x.set_attribute("y", z) and x's class definition would have to declare that attribute y is assignable. The problem with attribute assignment in Python as it stands is that it is opt-out.
Attributes can be made read-only (by using "property" or defining __setattr__), but this is not the default.

> > Whether these function definitions are accepted by the verifier
> > depends on their context.
>
> But this isn't.
>
> Are you saying that the verifier accepts the use of self._foo in a
> method?

Yes.

> That would make the scenario of potentially passing a class
> defined by Alice into Bob's code much harder to verify -- now suddenly
> Alice has to know about a lot of things before she can be sure that
> she doesn't leave open a backdoor for Bob.

In most cases Alice would not want Bob to extend classes that she has defined, so she would not give Bob access to the unwrapped class objects. She would just give Bob the constructor. If Alice wants to be sure that she does that, she can add a decorator to all her class definitions:

def constructor_only(klass):
    def wrapper(*args, **kwargs):
        return klass(*args, **kwargs)
    return wrapper

@constructor_only
class C(object):
    ...

(However, this assumes that class decorators are available, and CapPython does not support Python 2.6 yet.)

> > The default environment doesn't provide the real getattr() function.
> > It provides a wrapped version that rejects private attribute names.
>
> Do you have a web page describing the precise list of limitations you
> apply in your "subset" of Python?

I started some wiki pages to explain the verifier rules and which builtins are allowed, blocked or wrapped:
http://plash.beasts.org/wiki/CapPython/VerifierRules
http://plash.beasts.org/wiki/CapPython/Builtins
I hope that will make things clearer.

> Does it support import of some form?

Yes, it supports import:
http://lackingrhoticity.blogspot.com/2008/09/dealing-with-modules-and-builtins-in.html
The safeeval module allows callers to provide their own __import__ function when evalling code.
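A toy version of that wrapped getattr — just a sketch of the idea, with an invented example class, not CapPython's actual code — might look like:

```python
_SENTINEL = object()

def safe_getattr(obj, name, default=_SENTINEL):
    # Reject private and special names, so sandboxed code cannot reach
    # internals such as __class__, __dict__ or _secret attributes.
    if not isinstance(name, str) or name.startswith('_'):
        raise AttributeError('attribute name %r is rejected' % (name,))
    if default is _SENTINEL:
        return getattr(obj, name)
    return getattr(obj, name, default)

class Account(object):          # invented example class
    def __init__(self, owner, secret):
        self.owner = owner
        self._secret = secret

acct = Account('alice', 'hunter2')
assert safe_getattr(acct, 'owner') == 'alice'
blocked = False
try:
    safe_getattr(acct, '_secret')
except AttributeError:
    blocked = True
assert blocked
```

The sandboxed environment would expose safe_getattr under the name getattr, so dynamic attribute access stays available without opening up the underscore namespace.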
Mark From ncoghlan at gmail.com Mon Mar 30 00:43:12 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Mar 2009 08:43:12 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CFF45F.8090304@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49CFEC27.1050005@gmail.com> <49CFF45F.8090304@improva.dk> Message-ID: <49CFF980.90606@gmail.com> Jacob Holm wrote: > Nick Coghlan wrote: >> Jacob Holm wrote: >> >>> The problem is that any exception thrown into inner is converted to a >>> GeneratorReturn, which is then swallowed by the yield-from instead of >>> being reraised. >>> >> >> That actually only happens if inner *catches and suppresses* the thrown >> in exception. > Having a return in the finally clause like in my example is sufficient > to suppress the exception. Ah, I did miss that - I think it just means the code has been refactored incorrectly though. >> Otherwise throw() will reraise the original exception >> automatically: >> > I am not sure what your point is. Yes, this is a corner case. I am > trying to make sure we have the corner cases working as well. 
I think the refactoring is buggy, because it has changed the code from
leaving exceptions alone to suppressing them. Consider what it would
mean to do the same refactoring with normal functions:

def inner():
    try:
        perform_operation()
    finally:
        return 'VALUE'

def outer():
    val = inner()
    print val

That code does NOT do the same thing as:

def outer():
    try:
        perform_operation()
    finally:
        val = 'VALUE'
    print val

A better refactoring would keep the return outside the finally clause
in the inner generator:

Either:

def inner():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        val = 'VALUE'
    return val

Or else:

def inner():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        pass
    return 'VALUE'

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From jh at improva.dk Mon Mar 30 00:59:48 2009 From: jh at improva.dk (Jacob Holm) Date: Mon, 30 Mar 2009 00:59:48 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CFF980.90606@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49CFEC27.1050005@gmail.com> <49CFF45F.8090304@improva.dk> <49CFF980.90606@gmail.com>
Message-ID: <49CFFD64.8080902@improva.dk> Nick Coghlan wrote: > Jacob Holm wrote: > >> Having a return in the finally clause like in my example is sufficient >> to suppress the exception. >> > > Ah, I did miss that - I think it just means the code has been refactored > incorrectly though. > > Ok > I think the refactoring is buggy, because it has changed the code from > leaving exceptions alone to suppressing them. Consider what it would > mean to do the same refactoring with normal functions: > > def inner(): > try: > perform_operation() > finally: > return 'VALUE' > > def outer(): > val = inner() > print val > > That code does NOT do the same thing as: > > def outer(): > try: > perform_operation() > finally: > val = 'VALUE' > print val > Good point. Based on this observation, I withdraw the proposal about storing the active exception on the GeneratorReturn and reraising it in yield-from. I still think we should get rid of the check for GeneratorExit, because of the other example I gave. - Jacob From Scott.Daniels at Acm.Org Mon Mar 30 01:34:53 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 29 Mar 2009 16:34:53 -0700 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <200903300929.28589.steve@pearwood.info> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <200903300830.44820.steve@pearwood.info> <200903300929.28589.steve@pearwood.info> Message-ID: Steven D'Aprano wrote: > On Mon, 30 Mar 2009 09:03:03 am Scott David Daniels wrote: >> Steven D'Aprano wrote: >>> ... Perhaps I just haven't worked on enough 20,000 line projects, >>> but I don't get the point of @override. It doesn't prevent somebody >>> from writing (deliberately or accidentally) B.F in the absence of >>> A.F, since the coder can simply leave off the @override. >> ? B.F vs. A.F? Could you expand this a trifle? > > Classes B and A, method F. ... > The intention is that this will fail: ... > but this will be okay: ... 
> But if I leave out the @override then I can define B.F regardless of
> whether or not A.F exists, so it doesn't prevent the creation of B.F.

Thanks for the expansion. The check is not so much to prevent creating
B.F, as it is asserting we are plugging into a framework here.

>> Further, the entire warp drive test is not started if we stupidly
>> spell the second "test_result" (as it was on the Enterprise tests).
>
> That's a reasonable benefit, but it still sounds to me like something
> that should go in pylint.

Yes, after all we did lose all of sector 4.66.73 on that unfortunate
accident :-). I agree that it does feel a bit pylint-ish, but I have
worked on large unwieldy frameworks where large machines get powered on
by the framework as part of running a test, and it is nice to see the
whole test not even start in such circumstances.

This is why I wrote that (easily ponied in) possible addition to type
named "__initclass__"; it seemed a more useful technique that could be
used by the OP to implement his desires, while providing a simple place
to put class initialization code that allows people to get a bit
fancier with their classes without having to do the metaclass dance
themselves. I'll try putting up an ActiveState recipe for this in the
coming week.
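For the curious, the class-creation-time check being discussed can be sketched with an ordinary metaclass (Python 3 syntax; the names below are invented for illustration and this is not Scott's actual recipe):

```python
def override(method):
    # Mark the method; the metaclass inspects the mark at class creation.
    method._is_override = True
    return method

class CheckOverrides(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        for attr, value in namespace.items():
            if getattr(value, "_is_override", False):
                # Fail the whole class definition if no base defines attr.
                if not any(hasattr(base, attr) for base in bases):
                    raise TypeError("%s.%s is marked @override but no "
                                    "base class defines it" % (name, attr))

class WarpDrive(metaclass=CheckOverrides):
    def test_result(self):
        return "ok"

class Enterprise(WarpDrive):
    @override
    def test_result(self):  # fine: WarpDrive defines test_result
        return "engaged"
```

A misspelling such as `tets_result` then raises TypeError at class-definition time, before any framework machinery is powered on.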
--Scott David Daniels Scott.Daniels at Acm.Org From greg.ewing at canterbury.ac.nz Mon Mar 30 06:37:45 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Mar 2009 16:37:45 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CFF980.90606@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49CFEC27.1050005@gmail.com> <49CFF45F.8090304@improva.dk> <49CFF980.90606@gmail.com> Message-ID: <49D04C99.7000407@canterbury.ac.nz> Nick Coghlan wrote: > I think the refactoring is buggy, because it has changed the code from > leaving exceptions alone to suppressing them. That looks like the right assessment to me. 
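For the record, the trap is easy to demonstrate with plain functions: a return inside a finally clause silently replaces whatever exception is in flight.

```python
def swallowing():
    try:
        raise ValueError("boom")
    finally:
        return "VALUE"   # the return wins; the ValueError is dropped

def propagating():
    try:
        raise ValueError("boom")
    finally:
        val = "VALUE"    # no return inside finally: the exception escapes
    return val

assert swallowing() == "VALUE"  # no exception reaches the caller
```

Calling propagating(), by contrast, raises the ValueError as expected.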
-- Greg From greg.ewing at canterbury.ac.nz Mon Mar 30 07:45:51 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Mar 2009 17:45:51 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CF6AAF.70109@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> Message-ID: <49D05C8F.3040800@canterbury.ac.nz> The problem of how to handle GeneratorExit doesn't seem to have any entirely satisfactory solution. On the one hand, the inlining principle requires that we never re-raise it if the subgenerator turns it into a StopIteration (or GeneratorReturn). On the other hand, not re-raising it means that a broken generator can easily result from innocuously combining two things that are individually legitimate. I think we just have to accept this, and state that refactoring only preserves semantics as long as the code block being factored out does not catch GeneratorExit without re-raising it. Then we're free to always re-raise GeneratorExit and prevent broken generators from occurring. 
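For reference, ordinary generators already enforce part of this at close() time: converting GeneratorExit into a plain return is accepted, while answering it with another yield raises RuntimeError. A minimal sketch of that existing behaviour:

```python
def polite():
    try:
        yield 1
    except GeneratorExit:
        pass            # falls off the end; close() succeeds quietly

def stubborn():
    try:
        yield 1
    except GeneratorExit:
        yield 2         # keeps yielding; close() raises RuntimeError

g = polite()
next(g)
g.close()               # no complaint

g = stubborn()
next(g)
try:
    g.close()
    closed_quietly = True
except RuntimeError:    # "generator ignored GeneratorExit"
    closed_quietly = False
```

The open question in this thread is only what the yield-from expansion should do when a *subiterator* takes the polite path.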
I'm inclined to think this situation is a symptom that the idea of being able to catch GeneratorExit at all is flawed. If generator finalization were implemented by means of a forced return, or something equally uncatchable, instead of an exception, we wouldn't have so much of a problem. Earlier I said that I thought GeneratorExit was best regarded as an implementation detail of generators. I'd like to strengthen that statement and say that it should be considered a detail of the *present* implementation of generators, subject to change in future or alternate Pythons. Related to that, I'm starting to come back to my original instinct that GeneratorExit should not be thrown into the subiterator at all. Rather, it should be taken as an indication that the delegating generator is being finalized, and the subiterator's close() method called if it has one. Then there's never any question about whether to re-raise it -- we should always do so. -- Greg From mrts.pydev at gmail.com Mon Mar 30 12:04:26 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 30 Mar 2009 13:04:26 +0300 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: On Sat, Mar 28, 2009 at 5:26 AM, Guido van Rossum wrote: > There's way too much bikeshedding in this thread (not picking on you > specifically). I think the originally proposed API is fine, except it > should *not* reject duplicates. To add duplicates you'd just call it > multiple times, e.g. add_query_params(add_query_params(url, a='x'), > a='y'). It's a pretty minor use case anyways. So be it. I'll open a ticket and provide a patch, tests and documentation. 
For people concerned about ordering -- you can always use an odict for passing the kwargs: add_query_params('http://foo.com', **odict('a' = 1, 'b' = 2)) For people concerned about syntactically more restrictive rules than application/x-www-form-urlencoded allows -- pass in the kwargs via ordinary dict: add_query_params('http://foo.com', **{'|"-/': 1, '???': 2}) # note that py2k allows UTF-8 in argument names anyway The latter is bad practice anyway. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrts.pydev at gmail.com Mon Mar 30 12:06:17 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 30 Mar 2009 13:06:17 +0300 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: On Mon, Mar 30, 2009 at 1:04 PM, Mart S?mermaa wrote: > > add_query_params('http://foo.com', **{'|"-/': 1, '???': 2}) # note that > py2k allows UTF-8 in argument names anyway > > s/py2k/py3k/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eric at trueblade.com Mon Mar 30 12:28:31 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 30 Mar 2009 05:28:31 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: <49D09ECF.5090407@trueblade.com>

> For people concerned about ordering -- you can always use an odict for
> passing the kwargs:
>
> add_query_params('http://foo.com ', **odict('a' = 1,
> 'b' = 2))

Not that I want to continue the discussion about this particular issue,
but I'd like to correct this statement, since it is wrong (beyond the
syntax of creating the odict being incorrect).

"**" converts the parameters to an ordinary dict. The caller does not
receive the same object you call the function with. So any ordering of
the values in the odict will be lost.

$ ./python.exe
Python 2.7a0 (trunk:70598, Mar 25 2009, 17:30:54)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import OrderedDict as odict
>>> def foo(**kwargs):
...     print type(kwargs)
...     for k, v in kwargs.iteritems():
...         print k, v
...
>>> o1=odict(); o1['a']=1; o1['b']=2
>>> o1
OrderedDict([('a', 1), ('b', 2)])
>>> o2=odict(); o2['b']=2; o2['a']=1
>>> o2
OrderedDict([('b', 2), ('a', 1)])
>>> foo(**o1)
<type 'dict'>
a 1
b 2
>>> foo(**o2)
<type 'dict'>
a 1
b 2
>>>

Further, when an odict is created and arguments are supplied, the
ordering is also lost:

>>> odict(a=1, b=2)
OrderedDict([('a', 1), ('b', 2)])
>>> odict(b=2, a=1)
OrderedDict([('a', 1), ('b', 2)])
>>>

3.1 works the same way (once you change the print statement and use
.items instead of .iteritems: I need to run 2to3 on my example!).

I just want to make sure everyone realized the limitations. odict won't
solve problems like this.
I think these are both "gotchas" waiting to happen. Eric. From ncoghlan at gmail.com Mon Mar 30 12:47:00 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Mar 2009 20:47:00 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D05C8F.3040800@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> Message-ID: <49D0A324.1030701@gmail.com> Greg Ewing wrote: > I'm inclined to think this situation is a symptom that > the idea of being able to catch GeneratorExit at all > is flawed. If generator finalization were implemented > by means of a forced return, or something equally > uncatchable, instead of an exception, we wouldn't have > so much of a problem. Well, in theory people are meant to be writing "except Exception:" rather than using a bare except or catching BaseException - that's a big part of the reason SystemExit, KeyboardInterrupt and GeneratorExit *aren't* Exception subclasses. 
> Related to that, I'm starting to come back to my
> original instinct that GeneratorExit should not be
> thrown into the subiterator at all. Rather, it should
> be taken as an indication that the delegating generator
> is being finalized, and the subiterator's close()
> method called if it has one. Then there's never any
> question about whether to re-raise it -- we should
> always do so.

I think that's a simpler finalisation rule to remember, so I'd be fine
with that approach. I don't think we're going to be able to completely
eliminate the tricky subtleties from this expression, but we can at
least try to keep them as simple as possible.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From mrts.pydev at gmail.com Mon Mar 30 12:55:28 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 30 Mar 2009 13:55:28 +0300 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49D09ECF.5090407@trueblade.com> References: <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> <49D09ECF.5090407@trueblade.com> Message-ID:

On Mon, Mar 30, 2009 at 1:28 PM, Eric Smith wrote:
> "**" converts the parameters to an ordinary dict. The caller does not
> receive the same object you call the function with. So any ordering of the
> values in the odict will be lost.

Right you are, sorry for the mental blunder. So what if the signature
is as follows to support passing query parameters via an ordered dict:

add_query_params(url, params_dict=None, **kwargs)

with the following behaviour:

>>> pd = odict()
>>> pd['a'] = 1
>>> pd['b'] = 2
>>> add_query_params('http://foo.com/?a=0', pd, a=3)
'http://foo.com/?a=0&a=1&b=2&a=3'

-------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Mon Mar 30 13:28:21 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Mar 2009 21:28:21 +1000 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> <49D09ECF.5090407@trueblade.com> Message-ID: <49D0ACD5.5090209@gmail.com> Mart S?mermaa wrote: > Right you are, sorry for the mental blunder. So what if the signature is > as follows to support passing query parameters via an ordered dict: > > add_query_params(url, params_dict=None, **kwargs) > > with the following behaviour: > >>>> pd = odict() >>>> pd['a'] = 1 >>>> pd['b'] = 2 >>>> add_query_params('http://foo.com/?a=0', pd, a=3) > 'http://foo.com/?a=0&a=1&b=2&a=3 ' When setting up a dict.update style interface like that, it is often better to use *args for the two positional arguments - it avoids accidental name conflicts between the positional arguments and arbitrary keyword arguments. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From mrts.pydev at gmail.com Mon Mar 30 14:22:33 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 30 Mar 2009 15:22:33 +0300 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49D0ACD5.5090209@gmail.com> References: <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> <49D09ECF.5090407@trueblade.com> <49D0ACD5.5090209@gmail.com> Message-ID: On Mon, Mar 30, 2009 at 2:28 PM, Nick Coghlan wrote: > Mart S?mermaa wrote: > > Right you are, sorry for the mental blunder. 
So what if the signature is > > as follows to support passing query parameters via an ordered dict: > > > > add_query_params(url, params_dict=None, **kwargs) > > > > with the following behaviour: > > > >>>> pd = odict() > >>>> pd['a'] = 1 > >>>> pd['b'] = 2 > >>>> add_query_params('http://foo.com/?a=0', pd, a=3) > > 'http://foo.com/?a=0&a=1&b=2&a=3 ' > > When setting up a dict.update style interface like that, it is often > better to use *args for the two positional arguments - it avoids > accidental name conflicts between the positional arguments and arbitrary > keyword arguments. Thanks, another good point. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Mar 30 17:29:08 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Mar 2009 10:29:08 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: On Mon, Mar 30, 2009 at 5:04 AM, Mart S?mermaa wrote: > On Sat, Mar 28, 2009 at 5:26 AM, Guido van Rossum wrote: >> >> There's way too much bikeshedding in this thread (not picking on you >> specifically). I think the originally proposed API is fine, except it >> should *not* reject duplicates. To add duplicates you'd just call it >> multiple times, e.g. add_query_params(add_query_params(url, a='x'), >> a='y'). It's a pretty minor use case anyways. > > So be it. I'll open a ticket and provide a patch, tests and documentation. > > For people concerned about ordering -- you can always use an odict for > passing the kwargs: > > add_query_params('http://foo.com', **odict('a' = 1, 'b' = 2)) Alas, that doesn't work -- f(**X) copies X into a real dict. But web apps that care about the order are crazy IMO. 
> For people concerned about syntactically more restrictive rules than
> application/x-www-form-urlencoded allows -- pass in the kwargs via ordinary
> dict:
>
> add_query_params('http://foo.com', **{'|"-/': 1, '???': 2}) # note that py2k
> allows UTF-8 in argument names anyway
>
> The latter is bad practice anyway.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Mon Mar 30 23:19:51 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Mar 2009 16:19:51 -0500 Subject: [Python-ideas] CapPython's use of unbound methods In-Reply-To: <20090329.225703.432823651.mrs@localhost.localdomain> References: <20090319.231249.343185657.mrs@localhost.localdomain> <20090329.225703.432823651.mrs@localhost.localdomain> Message-ID:

On Sun, Mar 29, 2009 at 4:57 PM, Mark Seaborn wrote:
> As a side note, it is interesting to compare CapPython to ECMAScript
> 3.1's strict mode, which, as I understand it, changes the semantics of
> ECMAScript's attribute access such that doing X.A when X does not have
> an attribute A raises an exception rather than returning undefined.
>
> Since existing Javascript implementations lack this feature, Cajita (a
> fail-stop subset of Javascript, part of the Caja project) has to go to
> some lengths to emulate it. This seems to be the main reason that
> Cajita rewrites Javascript code, to add attribute existence checks.
>
> Fortunately CapPython does not have to make this kind of semantic
> change.

Well of course it makes a much more severe semantic change by declaring
illegal all use of attribute names starting with underscore.

> Interestingly, in Javascript it is easier to add this kind of change
> on a per-module basis than in Python, because dynamic attribute access
> in Javascript is done via a builtin syntax (x[a]) rather than via a
> function (getattr in Python).

I guess if you wanted to override getattr on a per-module basis you
could give each module a separate __builtins__.
> However, CPython's restricted execution mode (which Tav is proposing
> to resurrect) does change the semantics of attribute access.

It does not change the general semantics of attribute access -- it only
takes away a small set of *specific* attributes (e.g. __code__ and
func_code) from a small set of *specific* object types (e.g. function
objects). This is because every object has the ability to override
getting attributes (via __getattribute__ in Python, or tp_getattro in
C).

> It's not
> yet clear to me how this works, and how it applies to the getattr
> function. I suspect it involves looking up the stack.

No, it does not look at the stack. It looks at the globals, which
contain a special magic entry __builtins__ (with an 's') which is the
dict where built-in functions are looked up. When this dict is the same
object as the *default* built-in dict (which is __builtin__.__dict__
where __builtin__ -- without 's' -- is the module defining the built-in
functions), it gives you supervisor privileges; if it is any other
object, it disallows access to those specific attributes I referred to
above.

I really recommend that you study the CPython implementation. Without
understanding it you don't stand a chance of creating a secure subset.

The getattr() function and the x.y notation both invoke the same
implementation (PyObject_GetAttr()). This in turn defers to the
tp_getattro slot of the object x. And if the object is implemented in
Python, this in turn defers to the object's __getattribute__ method.
Then object.__getattribute__ defines the default lookup code, which
searches into the object's __dict__ if there is one, then in the
class's __dict__ and walking the MRO, and finally (just before raising
AttributeError) calls the __getattr__ hook if it exists (don't confuse
the latter with __getattribute__).
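That lookup order is easy to observe from pure Python (a toy example, nothing CapPython-specific):

```python
class Base:
    shared = "found on the class, via the MRO"

class Thing(Base):
    def __getattr__(self, name):
        # Reached only after the instance __dict__ and the MRO both fail.
        return "fallback for %r" % name

t = Thing()
t.own = "found in the instance __dict__"

assert t.own == "found in the instance __dict__"
assert t.shared == "found on the class, via the MRO"
assert t.missing == "fallback for 'missing'"
assert getattr(t, "missing") == t.missing  # same code path as t.missing
```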
> Guido van Rossum wrote:
>> More seriously, IIUC you are disallowing all use of attribute names
>> starting with underscores, which not only invalidates most Python
>> code in practical use (though you might not care about that) but
>> also disallows the use of many features that are considered part of
>> the language, such as access to __dict__ and many other
>> introspective attributes.
>
> This is true. I'm not claiming that a lot of Python code will pass
> the verifier. It might not accept all idiomatic code; I'm just
> claiming that code using encapsulated objects under CapPython can
> still be idiomatic.

For some definition of idiomatic. There are a lot of well-known Python
idioms involving attribute names starting with underscore. (I hate to
question your Python proficiency, but I do have to wonder -- how much
Python have you written in your life? Where did you learn Python?)

> We could probably allow reading self.__dict__ safely in CapPython.

Though that's not enough -- peeking in other.__dict__ is also somewhat
common.

> The term "introspection" covers a lot of language features. Some are
> OK in an object-capability language and some are not.

Agreed. And many introspection features aren't that important or
commonly used. But some others are, and this includes using __dict__
and __class__.

> For example, some might consider dir() to be an introspective feature,

It is.

> and this function is fine if suitably wrapped.

You'd have to look at the C implementation to see what it might do
though.

> x.__class__.__name__ is a common idiom. Although we can't allow
> x.__class__ on its own, we could provide a get_class_name function and
> rewrite "x.__class__.__name__" to "get_class_name(x)".
>
> "type(x) is C" is another common idiom.

Though in most cases isinstance(x, C) is preferred.

> Again, CapPython doesn't
> provide type() but it can provide a type_is() function:
> def type_is(x, t):
>     return type(x) is t

And slowly we slide down the path of writing less and less idiomatic
Python...

> The "locals" builtin is not something CapPython can allow in general.
> Any function that can look up the stack in this way is potentially
> dangerous. But it might be OK to allow "locals()", i.e. the case
> where "locals" is called as a function and not used as a first class
> value. I would prefer not to have to do that though.

Using locals() isn't that idiomatic anyway, so this is probably fine.
It's mostly used by beginners who are still exploring the extreme end
of the language's dynamism. :-)

>> > To some extent the verifier's check of only accessing private
>> > attributes through self is just checking a coding style that I already
>> > follow when writing Python code (except sometimes for writing test
>> > cases).
>>
>> You might wish this to be true, but for most Python programmers, it
>> isn't. Introspection is a commonly-used part of the language (probably
>> more so than in Java). So is the use of attribute names starting with
>> a single underscore outside the class tree, e.g. by "friend"
>> functions.
>
> The friend function pattern is an example of something that CapPython
> could support, with some extra notation in order to make it explicit.
> It is a case of what is known as rights amplification in capability
> systems.
>
> Here's an example of how I envisage it would work in CapPython:
>
> class C(object):
>     def _get_foo(self):
>         return self._foo
> _get_foo = C._get_foo
>
> Although C._get_foo would normally be rejected, the verifier would
> allow reading C._get_foo immediately after the class definition as a
> special case. The resulting _get_foo function would only be able to
> operate on instances of C (assuming the presence of unbound methods in
> the language).

I'm not sure how useful this is -- friends aren't necessarily in the
same module as the class, otherwise they might as well be declared as
static methods.
>> > Of course some of the verifier's checks, such as only allowing
>> > attribute assignments through self, are a lot more draconian than
>> > coding style checks.
>>
>> That also sounds like a rather serious hindrance to writing Python as
>> most people think of it.
>
> Attribute assignment is something that we could handle by rewriting.
> For example,
>
>     x.y = z
>
> could be rewritten to
>
>     x.set_attribute("y", z)

Why not x.set_y(z) ?

> x's class definition would have to declare that attribute y is
> assignable. The problem with attribute assignment in Python as it
> stands is that it is opt-out. Attributes can be made read-only (by
> using "property" or defining __setattr__), but this is not the
> default.

This will encourage people to write "Java in Python" which is an
unfortunately common anti-pattern.

>> > Whether these function definitions are accepted by the verifier
>> > depends on their context.
>>
>> But this isn't.
>>
>> Are you saying that the verifier accepts the use of self._foo in a
>> method?
>
> Yes.
>
>> That would make the scenario of potentially passing a class
>> defined by Alice into Bob's code much harder to verify -- now suddenly
>> Alice has to know about a lot of things before she can be sure that
>> she doesn't leave open a backdoor for Bob.
>
> In most cases Alice would not want Bob to extend classes that she has
> defined, so she would not give Bob access to the unwrapped class
> objects. She would just give Bob the constructor.

Or perhaps, better, a factory function, right?

> If Alice wants to
> be sure that she does that, she can add a decorator to all her class
> definitions:
>
> def constructor_only(klass):
>     def wrapper(*args, **kwargs):
>         return klass(*args, **kwargs)
>     return wrapper
>
> @constructor_only
> class C(object):
>     ...

Clever.
It does mean that even the class body of C cannot refer to C-the-class,
which prevents certain idioms (mostly involving updating class
variables -- perhaps not all that common).

> (However, this assumes that class decorators are available, and
> CapPython does not support Python 2.6 yet.)

Well you can always do this manually:

class C(object):
    ...
C = constructor_only(C)

>> > The default environment doesn't provide the real getattr() function.
>> > It provides a wrapped version that rejects private attribute names.
>>
>> Do you have a web page describing the precise list of limitations you
>> apply in your "subset" of Python?
>
> I started some wiki pages to explain the verifier rules and which
> builtins are allowed, blocked or wrapped:
> http://plash.beasts.org/wiki/CapPython/VerifierRules
> http://plash.beasts.org/wiki/CapPython/Builtins
> I hope that will make things clearer.

Ok, I'll try to remember to look there before responding next time.

>> Does it support import of some form?
>
> Yes, it supports import:
> http://lackingrhoticity.blogspot.com/2008/09/dealing-with-modules-and-builtins-in.html
>
> The safeeval module allows callers to provide their own __import__
> function when evalling code.

Ok. Have you done a security contest like Tav did yet? Implementing
import correctly *and* safely is fiendishly difficult.
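Guido's point about the decorated class is easy to check in plain Python: once the decorator has run, the name no longer refers to a class at all. A small sketch using a made-up Point class:

```python
def constructor_only(klass):
    # Hand out the ability to construct instances,
    # but not the class object itself.
    def wrapper(*args, **kwargs):
        return klass(*args, **kwargs)
    return wrapper

@constructor_only
class Point(object):
    def __init__(self, x):
        self.x = x

p = Point(3)  # construction still works through the wrapper
assert p.x == 3
assert not isinstance(Point, type)  # 'Point' is now a plain function, so
                                    # subclassing and class-attribute access
                                    # through this name are gone
```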
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Mar 31 00:12:01 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Mar 2009 10:12:01 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D0A324.1030701@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> Message-ID: <49D143B1.9040009@canterbury.ac.nz> Nick Coghlan wrote: > Well, in theory people are meant to be writing "except Exception:" > rather than using a bare except or catching BaseException - that's a big > part of the reason SystemExit, KeyboardInterrupt and GeneratorExit > *aren't* Exception subclasses. Yes, it probably isn't something people will do very often. But as long as GeneratorExit is documented as an official part of the language, we need to explain how we're dealing with it. BTW, how official *is* it meant to be? There seems to be very little said about it in either the Language or Library Reference. 
The Library Ref says it's the "exception raised when a generator's close() method is called". The Language Ref says that the close() method "allows finally clauses to run", but doesn't say how that is accomplished. And I can't find throw() mentioned anywhere! -- Greg From guido at python.org Tue Mar 31 00:22:07 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Mar 2009 17:22:07 -0500 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D143B1.9040009@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> Message-ID: On Mon, Mar 30, 2009 at 5:12 PM, Greg Ewing wrote: > Nick Coghlan wrote: > >> Well, in theory people are meant to be writing "except Exception:" >> rather than using a bare except or catching BaseException - that's a big >> part of the reason SystemExit, KeyboardInterrupt and GeneratorExit >> *aren't* Exception subclasses. > > Yes, it probably isn't something people will do very > often. But as long as GeneratorExit is documented as > an official part of the language, we need to explain > how we're dealing with it. > > BTW, how official *is* it meant to be? There seems to > be very little said about it in either the Language or > Library Reference. That's one of our many doc bugs. (Maybe someone at the PyCon sprints can fix these?) PEP 342 defines GeneratorExit, inheriting from Exception. However a later change to the code base made it inherit from BaseException. > The Library Ref says it's the "exception raised when a > generator's close() method is called". The Language Ref > says that the close() method "allows finally clauses to > run", but doesn't say how that is accomplished. 
> > And I can't find throw() mentioned anywhere! Also defined in PEP 342. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jh at improva.dk Tue Mar 31 01:22:22 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 31 Mar 2009 01:22:22 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D143B1.9040009@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> Message-ID: <49D1542E.7070503@improva.dk> Greg Ewing wrote: > Nick Coghlan wrote: > >> Well, in theory people are meant to be writing "except Exception:" >> rather than using a bare except or catching BaseException - that's a big >> part of the reason SystemExit, KeyboardInterrupt and GeneratorExit >> *aren't* Exception subclasses. > > Yes, it probably isn't something people will do very > often. But as long as GeneratorExit is documented as > an official part of the language, we need to explain > how we're dealing with it. 
As my last (flawed) example shows, it is easy to accidentally convert the GeneratorExit (along with any other uncaught exception) to a StopIteration if you are using a finally clause. You don't need to explicitly catch anything. Code that does this should be considered broken. Not so much because it is swallowing GeneratorExit, but because it swallows *any* exception. I don't think we should add special cases to the yield-from semantics to cater for broken code.

I even think it might have been a mistake in PEP 342 to let close swallow StopIteration. It might have been better if a throw to an already-closed generator just raised the thrown exception, and close only swallowed GeneratorExit. That way, you would quickly discover that the generator was swallowing exceptions because a call to close would cause a StopIteration. With that definition, we would consider any generator that did not (under normal conditions) raise GeneratorExit when thrown a GeneratorExit to be broken. Had that been the definition, I think we would long ago have agreed to let yield-from treat GeneratorExit like any other exception.

Unfortunately that is not how things work, and I am afraid that changing it would "break" too much code. I put "break" in quotes, because I think most such code is already broken in the sense that it can swallow exceptions that it shouldn't, such as KeyboardInterrupt and SystemExit.

Even without changing throw and close, I still think we should forward GeneratorExit like any other exception, and not do anything special to reraise it or call close on the subiterator. To me that sounds like the cleaner solution, and it is what the inlining principle suggests. It is unfortunate that you have to be a bit more careful about not swallowing GeneratorExit, but I think that care is needed anyway to avoid swallowing other exceptions as well.

>
> BTW, how official *is* it meant to be? There seems to
> be very little said about it in either the Language or
> Library Reference.
> > The Library Ref says it's the "exception raised when a > generator's close() method is called". The Language Ref > says that the close() method "allows finally clauses to > run", but doesn't say how that is accomplished. > > And I can't find throw() mentioned anywhere! > All the generator methods are described here: http://docs.python.org/reference/expressions.html#yield-expressions - Jacob From greg.ewing at canterbury.ac.nz Tue Mar 31 02:23:48 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Mar 2009 12:23:48 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D1542E.7070503@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> Message-ID: <49D16294.9030205@canterbury.ac.nz> Jacob Holm wrote: > Even without changing throw and close, I still think we should forward > GeneratorExit like any other exception, and not do anything special to > reraise it or call close on the subiterator. 
But that allows you to inadvertently create a broken generator by calling another generator that, according to the rules you've just acknowledged we can't change, is behaving correctly. Asking users not to call such generators would require them to have knowledge about the implementation of every generator they call, which I don't think is acceptable. -- Greg From jan.kanis at phil.uu.nl Tue Mar 31 02:36:06 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Tue, 31 Mar 2009 02:36:06 +0200 Subject: [Python-ideas] python-like garbage collector & workaround In-Reply-To: <318523839.906611238334142946.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> References: <49CEA57B.5070004@canterbury.ac.nz> <318523839.906611238334142946.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> Message-ID: <59a221a0903301736y40a348b5ib18b938643fad98b@mail.gmail.com> 2009/3/29 : > ----- Original Message ----- > From: "Greg Ewing" > To: castironpi-ng at comcast.net > Cc: Python-ideas at python.org > Sent: Saturday, March 28, 2009 5:32:27 PM GMT -06:00 US/Canada Central > Subject: Re: [Python-ideas] python-like garbage collector & workaround > > castironpi-ng at comcast.net wrote: > > ?> I'm considering a workaround that performs GC in two steps. ?First, it >> requests the objects to drop their references that participate in the >> cycle. ?Then, it enqueues the decref'ed object for an unnested >> destruction. > Castironpi, I don't think your solution solves the problem. In a single stage finalization design, it is allways possible to call the destructors of the objects in the cycle in random order. The problem is that now when A gets finalized, it cannot use its reference to B anymore because B may have already been finalized, and thus we cannot assume B can still be used for anything usefull. The problem, of course, is that one of A or B may still need the other during its finalization. 
In your solution, the real question is what the state of an object is supposed to be when it is in between the two stages of finalization. Is it still supposed to be a fully functional object, that handles all operations just as if it were still fully alive? In that case the object can only drop the references that it doesn't actually need to perform any of its operations (not just finalization). But if we assume that an object has all its references for a reason, there is nothing it can drop. (Except if it uses a reference for caching or similar things. But I think that is only a minority of all use cases.)

If you propose an object counts as 'finalized' (or at least, no longer fully functional) when it is in between stages of finalization, we have the same problem as in the single stage random order finalization: other objects that refer to it can no longer use it for anything useful.

The only option that is left is to have the object be in some in-between state. But that really complicates Python's object model, because every object now has two visible states: alive and about-to-die. So every object that wants to support this form of finalization has to specify what kind of operations are still available in its about-to-die state, and all destructors of all objects need to restrict themselves to only these kinds of operations.

And then, of course, there is still the question of what to do if there are still cycles left after the first stage.

If you still think your proposal is useful, you'll probably need to explain why these problems don't matter enough or whether there are important use cases that it solves.
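Jan's A/B example can be made concrete. A small sketch (note the behaviour shown is today's CPython: at the time of this thread such cycles were not collected at all and ended up in gc.garbage, whereas since PEP 442 / Python 3.4 they are collected, with the destructors run in an arbitrary order):

```python
import gc

log = []

class Node:
    # Two Nodes pointing at each other form the A<->B cycle discussed
    # above: neither __del__ can safely assume its peer is still usable.
    def __init__(self, name):
        self.name = name
        self.peer = None

    def __del__(self):
        # self.peer may already have been finalized when we get here;
        # the collector picks the order, not us.
        log.append(self.name)

a, b = Node("a"), Node("b")
a.peer, b.peer = b, a
del a, b       # only the cycle's internal references remain
gc.collect()   # collect the cycle; both destructors run, order unspecified

assert sorted(log) == ["a", "b"]
```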
From jh at improva.dk Tue Mar 31 03:30:47 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 31 Mar 2009 03:30:47 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D16294.9030205@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> Message-ID: <49D17247.20705@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: > >> Even without changing throw and close, I still think we should >> forward GeneratorExit like any other exception, and not do anything >> special to reraise it or call close on the subiterator. > > But that allows you to inadvertently create a broken > generator by calling another generator that, according to > the rules you've just acknowledged we can't change, is > behaving correctly. According to the rules for generator finalization it might behave correctly. However, in most cases this will be code that is breaking the rule about not catching KeyboardInterrupt and SystemExit. This is broken code IMNSHO, and I don't think we should complicate the yield-from expression to cater for it. 
Yes there might be existing code that is not broken even by that standard and that still converts GeneratorExit to StopIteration. I don't think that is common enough that we have to care. If you use such a generator in a yield-from expression, you will get a RuntimeError('generator ignored GeneratorExit') on close, telling you that something is wrong. > > Asking users not to call such generators would require > them to have knowledge about the implementation of every > generator they call, which I don't think is acceptable. > I think that getting a RuntimeError on close is sufficient indication that such a generator should not be used in yield-from. That said, I don't really care much either way. Both versions are acceptable to me, and it is your PEP. - Jacob From greg.ewing at canterbury.ac.nz Tue Mar 31 05:25:37 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Mar 2009 15:25:37 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D17247.20705@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> <49D17247.20705@improva.dk> 
Message-ID: <49D18D31.9000008@canterbury.ac.nz> Jacob Holm wrote: > in most cases this will be code that is breaking the rule about > not catching KeyboardInterrupt and SystemExit. Not necessarily, it could be doing except GeneratorExit: return > If you use such a generator in a yield-from > expression, you will get a RuntimeError('generator ignored > GeneratorExit') on close, telling you that something is wrong. But it won't be at all clear *what* is wrong or what to do about it. The caller is making a perfectly ordinary yield-from call, and he's calling what looks to all the world like a perfectly well-behaved iterator. Where's the mistake? Remember that the generator being called may have been written by someone else. The caller may not know anything about its internals or be in a position to fix them if he did. > I think that getting a RuntimeError on close is sufficient indication > that such a generator should not be used in yield-from. But it's a perfectly valid generator by current standards. I don't want to declare some existing class of generators as being second-class citizens with respect to yield-from, especially based on some internal implementation detail unknowable to its caller. 
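Both kinds of generator under discussion here can be written today under PEP 342 semantics. A minimal sketch contrasting them: converting GeneratorExit to StopIteration (e.g. via a return that exits a finally clause) is accepted by close(), while ignoring GeneratorExit outright is not:

```python
def converts():
    # A return that exits the finally clause replaces the in-flight
    # GeneratorExit with a normal StopIteration.
    try:
        yield 1
    finally:
        return

def ignores():
    try:
        yield 1
    except GeneratorExit:
        # Yielding again instead of stopping is what close() rejects.
        yield 2

g = converts()
next(g)
g.close()          # succeeds silently: valid by current standards

g = ignores()
next(g)
try:
    g.close()
    err = None
except RuntimeError as e:
    err = e        # "generator ignored GeneratorExit"
```

It is the first, silently-succeeding kind that makes the eventual RuntimeError hard to trace back to its source: on its own, close() reports nothing wrong with it.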
-- Greg From jh at improva.dk Tue Mar 31 11:44:06 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 31 Mar 2009 11:44:06 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D18D31.9000008@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> <49D17247.20705@improva.dk> <49D18D31.9000008@canterbury.ac.nz> Message-ID: <49D1E5E6.5000007@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> in most cases this will be code that is breaking the rule about > > not catching KeyboardInterrupt and SystemExit. > > Not necessarily, it could be doing > > except GeneratorExit: > return I said *most* cases, not all. I don't have any proof of this, just a gut feeling that the majority of generators that convert GeneratorExit to StopIteration do so because they are using a return in a finally clause. > >> If you use such a generator in a yield-from expression, you will get >> a RuntimeError('generator ignored GeneratorExit') on close, telling >> you that something is wrong. > > But it won't be at all clear *what* is wrong or what to > do about it. 
The caller is making a perfectly ordinary > yield-from call, and he's calling what looks to all the > world like a perfectly well-behaved iterator. Where's > the mistake? If this was documented in the PEP, I would say the mistake was in using such a generator in yield-from that wasn't the final yield. Note that it is perfectly ok to use such a generator in a yield-from as long as no outer generator yields afterwards. > > Remember that the generator being called may have been > written by someone else. The caller may not know anything > about its internals or be in a position to fix them if > he did. Right, that makes it harder to fix the source of the problem. > > > I think that getting a RuntimeError on close is sufficient indication > > that such a generator should not be used in yield-from. > > But it's a perfectly valid generator by current standards. > I don't want to declare some existing class of generators > as being second-class citizens with respect to yield-from, > especially based on some internal implementation detail > unknowable to its caller. > I get that. As I see it we have the following options, listed in my order of preference: 1. Don't throw GeneratorExit to the subiterator but raise it in the outer generator, and don't explicitly call close. This is the only version where sharing a subgenerator does not require special care. It has the problem that it behaves differently in refcounting and non-refcounting implementations due to the implicit close that would happen after the yield-from in refcounting implementations. It also breaks the inlining principle in the case of throw(GeneratorExit). 2. Do throw GeneratorExit and don't try to reraise it. This is the version that most closely follows the inlining principle. It has the problem that generators that convert GeneratorExit to StopIteration can only be used in a yield-from if none of the outer generators do a yield afterwards. 
Breaking this rule gives a RuntimeError('generator ignored GeneratorExit') on close. 3. Do throw GeneratorExit to the subiterator, and explicitly reraise it if it was converted to a StopIteration. It has the problem that it breaks the inlining principle for generators that convert GeneratorExit to StopIteration. 4. Don't throw GeneratorExit to the subiterator, instead explicitly call close before raising it in the outer generator. This is the behavior that #1 would have for non-shared generators in a refcounting implementation. Same problem as #3 and hides the GeneratorExit from non-generators. My guess is that your preference is more like 4, 3, 2, 1. #3 is closest to what is in the current PEP, and is probably what it meant to say. (The PEP checks if the thrown exception was GeneratorExit, then does a bare raise instead of raising the thrown exception). - Jacob From ncoghlan at gmail.com Tue Mar 31 14:08:30 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 31 Mar 2009 22:08:30 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D1E5E6.5000007@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> 
<49D17247.20705@improva.dk> <49D18D31.9000008@canterbury.ac.nz> <49D1E5E6.5000007@improva.dk> Message-ID: <49D207BE.8090909@gmail.com> Jacob Holm wrote: > My guess is that your preference is more like 4, 3, 2, 1. #3 is closest > to what is in the current PEP, and is probably what it meant to say. > (The PEP checks if the thrown exception was GeneratorExit, then does a > bare raise instead of raising the thrown exception). 4, 3, 2, 1 is the position I've come around to. Since using send(), throw() and close() on a shared subiterator doesn't make any sense, and the whole advantage of the new expression over a for loop is to make it easy to delegate send() throw() and close() correctly, I now believe that shared subiterators are best handled by actually *iterating* over them in a for loop rather than by delegating to them with "yield from". So the fact that a definition of yield from that provides prompt finalisation guarantees isn't friendly to using it with shared subiterators is actually now a *bonus* in my book - it should hopefully serve as a hint to developers that they're misusing the tool. By adopting position 4, I believe the guarantees for the exception handling in the new expression become as simple as possible: - if the subiterator does not provide a throw() method, or the exception thrown in is GeneratorExit, then the subiterator's close() method (if any) is called and the thrown in exception raised in the current frame - otherwise, the exception (including traceback) is passed down to the subiterator's throw() method With these semantics, subiterators will be finalised promptly when the outermost generator is finalised without any special effort on the developer's part and it won't be trivially easy to accidentally suppress GeneratorExit. To my mind, the practical benefits of such an approach are enough to justify the deviation from the general 'inline behaviour' guideline. Cheers, Nick. 
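The behaviour Nick argues for here is what `yield from` eventually shipped with in Python 3.3: closing the outermost generator promptly finalises the whole chain, innermost first. A small check (the event list is just for illustration):

```python
def inner(events):
    try:
        while True:
            yield
    finally:
        events.append("inner closed")

def outer(events):
    try:
        yield from inner(events)
    finally:
        events.append("outer closed")

events = []
g = outer(events)
next(g)    # suspend inside the subgenerator
g.close()  # GeneratorExit finalises inner first, then outer
assert events == ["inner closed", "outer closed"]
```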
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From dangyogi at gmail.com Tue Mar 31 16:09:50 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Tue, 31 Mar 2009 10:09:50 -0400 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D207BE.8090909@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> <49D17247.20705@improva.dk> <49D18D31.9000008@canterbury.ac.nz> <49D1E5E6.5000007@improva.dk> <49D207BE.8090909@gmail.com> Message-ID: <49D2242E.9040302@gmail.com> Nick Coghlan wrote: > 4, 3, 2, 1 is the position I've come around to. [...] What he said. I think that 4 also has the advantage of raising RuntimeError in the inner generator's close method (using the definition of close provided in PEP 342) when the inner generator doesn't obey the rules for GeneratorExit laid out in PEP 342. Throwing GeneratorExit to the inner generator causes the outer generator's close to report the RuntimeError, which pins the blame on the wrong generator (in the stack traceback, which won't even show the inner generator). 
-bruce frederiksen From jh at improva.dk Tue Mar 31 19:41:16 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 31 Mar 2009 19:41:16 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D207BE.8090909@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> <49D17247.20705@improva.dk> <49D18D31.9000008@canterbury.ac.nz> <49D1E5E6.5000007@improva.dk> <49D207BE.8090909@gmail.com> Message-ID: <49D255BC.6080503@improva.dk> Nick Coghlan wrote: > 4, 3, 2, 1 is the position I've come around to. > > [...snip...] > > By adopting position 4, I believe the guarantees for the exception > handling in the new expression become as simple as possible: > - if the subiterator does not provide a throw() method, or the > exception thrown in is GeneratorExit, then the subiterator's close() > method (if any) is called and the thrown in exception raised in the > current frame > - otherwise, the exception (including traceback) is passed down to the > subiterator's throw() method > Below I have attached a heavily annotated version of the expansion that I expect for #4. 
This version fixes an issue I had forgotten to mention, where the subiterator is not closed due to an AttributeError caused by a missing send method.

> With these semantics, subiterators will be finalised promptly when the
> outermost generator is finalised without any special effort on the
> developer's part and it won't be trivially easy to accidentally suppress
> GeneratorExit.

The way I see it, it will actually be hard to do even on purpose, unless you are willing to take a significant performance hit by using a non-generator wrapper for every generator.

> To my mind, the practical benefits of such an approach are enough to
> justify the deviation from the general 'inline behaviour' guideline.

I disagree, but it seems like I am the only one here that does. It will eliminate a potential pitfall, but will also remove some behavior that could have been useful, such as the ability to suppress the GeneratorExit if you know what you are doing.

- Jacob

------------------------------------------------------------------------

_i = iter(EXPR)  # Raises TypeError if not an iterable.
try:
    _x = None  # No current exception.
    _y = _i.__next__()  # Guaranteed to be there by iter().
    while 1:
        try:
            _s = yield _y
        except BaseException as _e:
            # An exception was thrown in, either by a call to throw()
            # on the generator or implicitly by a call to close().
            _x = _e  # Save the thrown-in exception as current.
            if isinstance(_x, GeneratorExit):
                _m = None  # Don't forward GeneratorExit.
            else:
                # Forward any other exception if there is a throw() method.
                _m = getattr(_i, 'throw', None)
            if _m is None:
                # Not forwarding. Exit loop and go to finally clause
                # (possibly via "except StopIteration"), which will
                # close _i before reraising _x.
                raise
            _y = _m(_x)
        else:
            if _s is None:
                # Either a send(None) or a __next__(), forward as __next__().
                _x = None  # No current exception.
                _y = _i.__next__()  # Guaranteed to be there by iter().
            else:
                # A send(non-None). We need to handle the case where the
                # subiterator has no send() method.
                try:
                    _m = _i.send
                except AttributeError as _e:
                    # No send method. Ensure that the subiterator is
                    # closed, then reraise the AttributeError.
                    _x = _e   # Save the AttributeError as the current exception.
                    _m = None  # Clear _m so we know _x has not been forwarded.
                    # Exit loop and go to finally clause, which will
                    # close _i before reraising _x.
                    raise
                else:
                    _x = None  # No current exception.
                    _y = _m(_s)
except StopIteration as _e:
    if _e is _x:
        # If _e was just thrown in, reraise it. If the exception has been
        # forwarded to the subiterator, the subiterator is assumed closed.
        # In that case _m will be non-None, so the subiterator will not be
        # closed again by the finally clause. Conversely, if the exception
        # was not forwarded, _m will be None and the finally clause takes
        # care of closing it before reraising the exception.
        raise
    # Normal return. If we get here, the StopIteration was raised by a
    # __next__(), send() or throw() on the subiterator, which will
    # therefore already be closed. In this case either _x is None or _m
    # is not None, so the subiterator will not be closed again by the
    # finally clause.
    RESULT = _e.value
finally:
    if _x is not None and _m is None:
        # An exception is active and was not raised by the subiterator.
        # Explicitly call close before the exception is automatically
        # reraised by the finally clause. If close raises an exception,
        # that will take over.
        _m = getattr(_i, 'close', None)
        if _m is not None:
            _m()
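The missing-send() corner case that Jacob's expansion explicitly guards against can be observed with the `yield from` that eventually shipped in Python 3.3 (which took a different route than this expansion): forwarding a non-None send() to a plain iterator raises AttributeError, which unwinds (and thereby finalises) the delegating generator. A quick sketch:

```python
def delegator(it, events):
    try:
        yield from it
    finally:
        events.append("delegator finalized")

events = []
g = delegator(iter([1, 2, 3]), events)
next(g)                # start delegating; a list iterator has no send() method
try:
    g.send("hello")    # forwarding a non-None value requires _i.send
except AttributeError:
    events.append("AttributeError raised")

assert events == ["delegator finalized", "AttributeError raised"]
```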