From tjreedy at udel.edu Wed Feb 1 01:29:36 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 31 Jan 2012 19:29:36 -0500 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> Message-ID: On 1/31/2012 1:06 PM, Eric Snow wrote: > +1 for reconsidering the d.[name] / d.(name) / d!name syntax. d.[name] is too much like d[name]. The . that modifies the meaning of 'name' is too far away. d.(name) is like d.name except to me the () means to use the value of name rather than 'name' itself. This is just what you are trying to say. I believe () is used elsewhere with that meaning. I could live with this. d!name has the advantage? of no brackets, but just looks crazy since ! meant 'not' in Python. -- Terry Jan Reedy From ethan at stoneleaf.us Wed Feb 1 01:42:24 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 31 Jan 2012 16:42:24 -0800 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> Message-ID: <4F288A70.5000705@stoneleaf.us> Terry Reedy wrote: > On 1/31/2012 1:06 PM, Eric Snow wrote: > >> +1 for reconsidering the d.[name] / d.(name) / d!name syntax. > > d.[name] is too much like d[name]. The . that modifies the meaning of > 'name' is too far away. > > d.(name) is like d.name except to me the () means to use the value of > name rather than 'name' itself. This is just what you are trying to say. > I believe () is used elsewhere with that meaning. I could live with this. > > d!name has the advantage? of no brackets, but just looks crazy since ! > meant 'not' in Python. I'm not a fan of any of the .[], .(), .{} patterns, nor of .! . What about the colon?
d:name #use the value of name ~Ethan~ From python at mrabarnett.plus.com Wed Feb 1 02:06:18 2012 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 01 Feb 2012 01:06:18 +0000 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: <4F288A70.5000705@stoneleaf.us> References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F288A70.5000705@stoneleaf.us> Message-ID: <4F28900A.3080505@mrabarnett.plus.com> On 01/02/2012 00:42, Ethan Furman wrote: > Terry Reedy wrote: >> On 1/31/2012 1:06 PM, Eric Snow wrote: >> >>> +1 for reconsidering the d.[name] / d.(name) / d!name syntax. >> >> d.[name] is too much like d[name]. The . that modifies the meaning of >> 'name' is too far away. >> >> d.(name) is like d.name except to me the () means to use the value of >> name rather than 'name' itself. This is just what you are trying to say. >> I believe () is used elsewhere with that meaning. I could live with this. >> >> d!name has the advantage? of no brackets, but just looks crazy since ! >> meant 'not' in Python. > > > I'm not a fan of any of the .[], .(), .{} patterns, nor of .! . > .() looks the most sensible to me. > What about the colon? > > d:name #use the value of name > Surely you jest? :-) From simon.sapin at kozea.fr Wed Feb 1 09:06:33 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Wed, 01 Feb 2012 09:06:33 +0100 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: <4F28900A.3080505@mrabarnett.plus.com> References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F288A70.5000705@stoneleaf.us> Message-ID: <4F28F289.3080407@kozea.fr> Le 01/02/2012 02:06, MRAB a écrit : >> > I'm not a fan of any of the .[], .(), .{} patterns, nor of .! .
>> > > .() looks the most sensible to me. > If .[] looks like indexing, .() looks like calling. (I'm not for or against either of these, just pointing out that they have the same problem.) Regards, -- Simon Sapin From jsbueno at python.org.br Wed Feb 1 12:33:46 2012 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Wed, 1 Feb 2012 09:33:46 -0200 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: <4F28F289.3080407@kozea.fr> References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F288A70.5000705@stoneleaf.us> <4F28900A.3080505@mrabarnett.plus.com> <4F28F289.3080407@kozea.fr> Message-ID: On Wed, Feb 1, 2012 at 6:06 AM, Simon Sapin wrote: > Le 01/02/2012 02:06, MRAB a écrit : > >>> > I'm not a fan of any of the .[], .(), .{} patterns, nor of .! . >>> > >> >> .() looks the most sensible to me. >> > > If .[] looks like indexing, .() looks like calling. (I'm not for or against > either of these, just pointing out that they have the same problem.) Still, there should be something with a closing token. Try to imagine three of these in a chain, if the syntax is a colon: name1:name2:name3:name4 -> which could mean either of: name1:(name2:(name3:name4)), (name1:name2):(name3:name4) and so on - (not to mention other expressions involving names, though these would be less ambiguous due to operator precedence.
js -><- > > Regards, > > -- > Simon Sapin > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From p.f.moore at gmail.com Wed Feb 1 14:05:44 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 1 Feb 2012 13:05:44 +0000 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F288A70.5000705@stoneleaf.us> <4F28900A.3080505@mrabarnett.plus.com> <4F28F289.3080407@kozea.fr> Message-ID: On 1 February 2012 11:33, Joao S. O. Bueno wrote: > Still, there should be something with a closing token. > Try to imagine three of these in a chain, if the syntax is a colon: > > name1:name2:name3:name4 -> which could mean either of: > > name1:(name2:(name3:name4)), (name1:name2):(name3:name4) > and so on - (not to mention other expressions involving names, though these > would be less ambiguous due to operator precedence. No more so than a.b.c.d.e. I would expect a:b to behave exactly the same as a.b, except that it uses getitem rather than getattr under the hood (was that the proposal? I'm completely confused by now as to what this new syntax is intended to achieve...) But I don't like the idea in any case, so I remain -1 on the whole proposal. Paul.
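For context, the behaviour this thread keeps circling around can already be prototyped with no new syntax at all; a minimal sketch (the class name AttrDict is purely illustrative, not something anyone in the thread is proposing):

```python
# Minimal sketch of the object under discussion: a dict whose keys can
# also be read and written as attributes. AttrDict is an illustrative
# name only; it is not proposed anywhere in this thread.
class AttrDict(dict):
    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so real
        # attributes and dict methods like keys() still win.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


d = AttrDict(spam=1)
d.egg = 2                 # item assignment via attribute syntax
assert d.spam == 1 and d["egg"] == 2
```

The limitation motivating the syntax proposals is still visible here: a key held in a variable must be reached with d[name] or getattr(d, name), which is exactly the getitem-versus-getattr distinction Paul raises.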
From massimo.dipierro at gmail.com Wed Feb 1 14:32:45 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Wed, 1 Feb 2012 07:32:45 -0600 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F288A70.5000705@stoneleaf.us> <4F28900A.3080505@mrabarnett.plus.com> <4F28F289.3080407@kozea.fr> Message-ID: Using x:[....] wouldn't it create ambiguities when parsing (lambda x:[....])? How about x. as a shortcut for x.__dict__ so we can do x.key -> x.__dict__['key'] x.[key] -> x.__dict__[key] x..keys() -> x.__dict__.keys() x..values() -> x.__dict__.values() for attribute in x.: print 'x.'+attribute and leave open the possibility of 3 dots for ranges 1...5 -> range(1,5) 1,2...10 -> range(1,10,2-1) On Feb 1, 2012, at 7:05 AM, Paul Moore wrote: > On 1 February 2012 11:33, Joao S. O. Bueno wrote: >> Still, there should be something with a closing token. >> Try to imagine three of these in a chain, if the syntax is a colon: >> >> name1:name2:name3:name4 -> which could mean either of: >> >> name1:(name2:(name3:name4)), (name1:name2):(name3:name4) >> and so on - (not to mention other expressions involving names, though these >> would be less ambiguous due to operator precedence. > > No more so than a.b.c.d.e > > I would expect a:b to behave exactly the same as a.b, except that it > uses getitem rather than getattr under the hood (was that the > proposal? I'm completely confused by now as to what this new syntax is > intended to achieve...) > > But I don't like the idea in any case, so I remain -1 on the whole proposal. > > Paul.
From masklinn at masklinn.net Wed Feb 1 14:44:09 2012 From: masklinn at masklinn.net (Masklinn) Date: Wed, 1 Feb 2012 14:44:09 +0100 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F288A70.5000705@stoneleaf.us> <4F28900A.3080505@mrabarnett.plus.com> <4F28F289.3080407@kozea.fr> Message-ID: <88A63247-5EBD-4213-B065-36A81B4610F6@masklinn.net> On 2012-02-01, at 14:32 , Massimo Di Pierro wrote: > Using x:[....] wouldn't it create ambiguities when parsing (lambda x:[....])? > > How about x. as a shortcut for x.__dict__ so we can do > > x.key -> x.__dict__['key'] > x.[key] -> x.__dict__[key] > x..keys() -> x.__dict__.keys() > x..values() -> x.__dict__.values() > > for attribute in x.: > print 'x.'+attribute > > and leave open the possibility of 3 dots for ranges > 1...5 -> range(1,5) > 1,2...10 -> range(1,10,2-1) Yeah, readability schmeadability. Also, >>> 1..__int__() 1 that's going to look good. From g.brandl at gmx.net Wed Feb 1 20:35:04 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 Feb 2012 20:35:04 +0100 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F288A70.5000705@stoneleaf.us> <4F28900A.3080505@mrabarnett.plus.com> <4F28F289.3080407@kozea.fr> Message-ID: Am 01.02.2012 14:32, schrieb Massimo Di Pierro: > Using x:[....] wouldn't it create ambiguities when parsing (lambda x:[....])? > > How about x.
as a shortcut for x.__dict__ so we can do > > x.key -> x.__dict__['key'] > x.[key] -> x.__dict__[key] > x..keys() -> x.__dict__.keys() > x..values() -> x.__dict__.values() > > for attribute in x.: > print 'x.'+attribute > > and leave open the possibility of 3 dots for ranges > 1...5 -> range(1,5) Actually no, because 1. is a float literal. So 1...keys() would already be valid, and you have to use four dots for ranges. I would suggest five to be on the safe side (plus it has as many dots as there are letters in "range", therefore easy to remember). SCNR, Georg From jeanpierreda at gmail.com Thu Feb 2 07:40:14 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Thu, 2 Feb 2012 01:40:14 -0500 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: <4F288A70.5000705@stoneleaf.us> References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F288A70.5000705@stoneleaf.us> Message-ID: On Tue, Jan 31, 2012 at 7:42 PM, Ethan Furman wrote: > What about the colon? The colon would be confusing in some circumstances. It's already used inside dict literals and slices. -- Devin From techtonik at gmail.com Thu Feb 2 09:41:39 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 2 Feb 2012 11:41:39 +0300 Subject: [Python-ideas] PEP x: Static module/package inspection In-Reply-To: References: <29228470.233.1324982829840.JavaMail.geo-discussion-forums@yqbl25> <20544069.58.1325067324546.JavaMail.geo-discussion-forums@yqiz15> Message-ID: A rather user-friendly proof of concept with the `ast` module is ready. http://pypi.python.org/pypi/astdump/ `astdump` contains a get_top_vars() method, which extracts sufficient information from a module's AST to generate setup.py for itself. This capability can already be reused for plugin version discovery mechanisms.
ISTM the working library should motivate authors better than a PEP convention. =) `astdump` doesn't provide complete module introspection capabilities. I've primarily focused on getting the output done, so for a proper API it would be nice to study use case examples first. `astdump` contains a tree walker with filtering capabilities by node type and level. What "python-object" should expose and how to make this convenient is not completely clear to me. -- anatoly t. From ncoghlan at gmail.com Thu Feb 2 14:35:10 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 2 Feb 2012 23:35:10 +1000 Subject: [Python-ideas] PEP x: Static module/package inspection In-Reply-To: References: <29228470.233.1324982829840.JavaMail.geo-discussion-forums@yqbl25> <20544069.58.1325067324546.JavaMail.geo-discussion-forums@yqiz15> Message-ID: On Thu, Dec 29, 2011 at 1:28 AM, Michael Foord wrote: > On a simple level, all of this is already "obtainable" by using the ast > module that can parse Python code. I would love to see a "python-object" > layer on top of this that will take an ast for a module (or other object) > and return something that represents the same object as the ast. > > So all module level objects will have corresponding objects - where they are > Python objects (builtin-literals) then they will be represented exactly. For > classes and functions you'll get an object back that has the same attributes > plus some metadata (e.g. for functions / methods what arguments they take > etc). > > That is certainly doable and would make introspecting-without-executing a > lot simpler. The existing 'pyclbr' (class browser) module in the stdlib also attempts to play in this same space. I wouldn't say it does it particularly *well* (since it's easy to confuse with valid Python constructs), but it tries. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From yselivanov.ml at gmail.com Fri Feb 3 16:09:49 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 3 Feb 2012 10:09:49 -0500 Subject: [Python-ideas] unpacking context managers in WITH statement Message-ID: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Hello, With the removal of "contextlib.nested" in python 3.2 nothing was introduced to replace it. However, I found it pretty useful, despite the fact that it had its own quirks. These quirks can (at least partially) be addressed by allowing unpacking syntax in the context manager. Consider the following snippet of code: ctxs = () if args.profile: ctxs += (ApplicationProfilerContext(),) if args.logging: ctxs += (ApplicationLoggingContext(),) with *ctxs: Application.run() As of now, without "nested" we have either the option of reimplementing it, or writing lots of ugly code with nested 'try..except's. So the feature was taken out, but nothing replaced it. What do you think, guys? Thanks, Yury From grosser.meister.morti at gmx.net Fri Feb 3 17:30:12 2012 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Fri, 03 Feb 2012 17:30:12 +0100 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Message-ID: <4F2C0B94.6090005@gmx.net> Of course there is something to replace nested: >>> with open("egg.txt","w") as egg, open("spam.txt","w") as spam: >>> egg.write("egg") >>> spam.write("spam") The nested function was removed because it is broken. E.g. take this: >>> with nested(open("egg.txt","w"), open("spam.txt","w")) as (egg, spam): >>> egg.write("egg") >>> spam.write("spam") What if opening of spam.txt produces an exception? Then egg.txt will never be closed! The new with syntax takes care of this.
It basically rewrites it as: >>> with open("egg.txt","w") as egg: >>> with open("spam.txt","w") as spam: >>> egg.write("egg") >>> spam.write("spam") On 02/03/2012 04:09 PM, Yury Selivanov wrote: > Hello, > > With the removal of "contextlib.nested" in python 3.2 nothing was introduced to replace it. However, I found it pretty useful, despite the fact that it had its own quirks. These quirks can (at least partially) be addressed by allowing unpacking syntax in the context manager. > > Consider the following snippet of code: > > ctxs = () > if args.profile: > ctxs += (ApplicationProfilerContext(),) > if args.logging: > ctxs += (ApplicationLoggingContext(),) > with *ctxs: > Application.run() > > As of now, without "nested" we have either the option of reimplementing it, or writing lots of ugly code with nested 'try..except's. So the feature was taken out, but nothing replaced it. > > What do you think, guys? > > Thanks, > Yury From grosser.meister.morti at gmx.net Fri Feb 3 17:35:10 2012 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Fri, 03 Feb 2012 17:35:10 +0100 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Message-ID: <4F2C0CBE.5010708@gmx.net> Oh, wait. You do something a bit different. Hm, yes, when you have a list of context managers it's something different. Still, I'm not sure if it is a good thing to do it like you've proposed. After all, usually the constructor of a nested context manager should only be called if the parent context could be entered. You would construct all context managers before you enter any.
Maybe it's ok for your case, but it might send the wrong signal to the developers and might be used like nested was (see my other mail). On 02/03/2012 04:09 PM, Yury Selivanov wrote: > Hello, > > With the removal of "contextlib.nested" in python 3.2 nothing was introduced to replace it. However, I found it pretty useful, despite the fact that it had its own quirks. These quirks can (at least partially) be addressed by allowing unpacking syntax in the context manager. > > Consider the following snippet of code: > > ctxs = () > if args.profile: > ctxs += (ApplicationProfilerContext(),) > if args.logging: > ctxs += (ApplicationLoggingContext(),) > with *ctxs: > Application.run() > > As of now, without "nested" we have either the option of reimplementing it, or writing lots of ugly code with nested 'try..except's. So the feature was taken out, but nothing replaced it. > > What do you think, guys? > > Thanks, > Yury From yselivanov.ml at gmail.com Fri Feb 3 17:36:53 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 3 Feb 2012 11:36:53 -0500 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: <4F2C0B94.6090005@gmx.net> References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> <4F2C0B94.6090005@gmx.net> Message-ID: This is not about explicitly writing a comma-separated list of context managers in the statement, but rather about the ability to compose this list dynamically. With unpacking you won't have the problem of an uncaught exception in __new__/__init__, because such an exception would propagate at the stage of constructing the list of managers. Exceptions occurring during the with statement execution, i.e. in the __enter__ and __exit__ methods, will work just fine, or am I missing something?
On 2012-02-03, at 11:30 AM, Mathias Panzenböck wrote: > Of course there is something to replace nested: > > >>> with open("egg.txt","w") as egg, open("spam.txt","w") as spam: > >>> egg.write("egg") > >>> spam.write("spam") > > The nested function was removed because it is broken. E.g. take this: > > >>> with nested(open("egg.txt","w"), open("spam.txt","w")) as (egg, spam): > >>> egg.write("egg") > >>> spam.write("spam") > > What if opening of spam.txt produces an exception? Then egg.txt will never be closed! The new with syntax takes care of this. It basically rewrites it as: > > >>> with open("egg.txt","w") as egg: > >>> with open("spam.txt","w") as spam: > >>> egg.write("egg") > >>> spam.write("spam") > > On 02/03/2012 04:09 PM, Yury Selivanov wrote: >> Hello, >> >> With the removal of "contextlib.nested" in python 3.2 nothing was introduced to replace it. However, I found it pretty useful, despite the fact that it had its own quirks. These quirks can (at least partially) be addressed by allowing unpacking syntax in the context manager. >> >> Consider the following snippet of code: >> >> ctxs = () >> if args.profile: >> ctxs += (ApplicationProfilerContext(),) >> if args.logging: >> ctxs += (ApplicationLoggingContext(),) >> with *ctxs: >> Application.run() >> >> As of now, without "nested" we have either the option of reimplementing it, or writing lots of ugly code with nested 'try..except's. So the feature was taken out, but nothing replaced it. >> >> What do you think, guys?
>> >> Thanks, >> Yury From yselivanov.ml at gmail.com Fri Feb 3 17:47:29 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 3 Feb 2012 11:47:29 -0500 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: <4F2C0CBE.5010708@gmx.net> References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> <4F2C0CBE.5010708@gmx.net> Message-ID: Well, I bet most of the developers will continue using the explicit syntax as it is just more convenient. Unpacking is just a specific feature to address some specific needs, where the case about "not executing the constructor in case of a parent context fault" may not be applicable. The "with" statement is about far more than just opening files now, after all ;) On 2012-02-03, at 11:35 AM, Mathias Panzenböck wrote: > Oh, wait. You do something a bit different. Hm, yes, when you have a list of context managers it's something different. Still, I'm not sure if it is a good thing to do it like you've proposed. After all, usually the constructor of a nested context manager should only be called if the parent context could be entered. You would construct all context managers before you enter any. Maybe it's ok for your case, but it might send the wrong signal to the developers and might be used like nested was (see my other mail). > > On 02/03/2012 04:09 PM, Yury Selivanov wrote: >> Hello, >> >> With the removal of "contextlib.nested" in python 3.2 nothing was introduced to replace it. However, I found it pretty useful, despite the fact that it had its own quirks. These quirks can (at least partially) be addressed by allowing unpacking syntax in the context manager.
>> >> Consider the following snippet of code: >> >> ctxs = () >> if args.profile: >> ctxs += (ApplicationProfilerContext(),) >> if args.logging: >> ctxs += (ApplicationLoggingContext(),) >> with *ctxs: >> Application.run() >> >> As of now, without "nested" we have either the option of reimplementing it, or writing lots of ugly code with nested 'try..except's. So the feature was taken out, but nothing replaced it. >> >> What do you think, guys? >> >> Thanks, >> Yury From fuzzyman at gmail.com Fri Feb 3 18:50:06 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Fri, 3 Feb 2012 17:50:06 +0000 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Message-ID: On 3 February 2012 15:09, Yury Selivanov wrote: > Hello, > > With the removal of "contextlib.nested" in python 3.2 nothing was > introduced to replace it. However, I found it pretty useful, despite the > fact that it had its own quirks. These quirks can (at least partially) be > addressed by allowing unpacking syntax in the context manager. > > Consider the following snippet of code: > > ctxs = () > if args.profile: > ctxs += (ApplicationProfilerContext(),) > if args.logging: > ctxs += (ApplicationLoggingContext(),) > with *ctxs: > Application.run() > Well, I quite like this syntax and it does allow you to do something not currently easily possible: with *ctxs as tuple_of_results: ... The use case is reasonably obscure however, and should this be possible: with *ctx, other as tuple_of_results, another: ...
Michael > > As of now, without "nested" we have either the option of reimplementing it, or > writing lots of ugly code with nested 'try..except's. So the feature was > taken out, but nothing replaced it. > > What do you think, guys? > > Thanks, > Yury -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From yselivanov.ml at gmail.com Fri Feb 3 19:06:56 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 3 Feb 2012 13:06:56 -0500 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Message-ID: <9BC442E1-569C-43D5-8A70-07B72ECCE50B@gmail.com> On 2012-02-03, at 12:50 PM, Michael Foord wrote: > with *ctxs as tuple_of_results: This is not necessary, as 'ctxs' already holds all instances of all context managers; so the 'ctxs' would be equal to 'tuple_of_results' > with *ctx, other as tuple_of_results, another: > ... Looks useful to me.
- Yury From fuzzyman at gmail.com Fri Feb 3 19:09:56 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Fri, 3 Feb 2012 18:09:56 +0000 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: <9BC442E1-569C-43D5-8A70-07B72ECCE50B@gmail.com> References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> <9BC442E1-569C-43D5-8A70-07B72ECCE50B@gmail.com> Message-ID: On 3 February 2012 18:06, Yury Selivanov wrote: > On 2012-02-03, at 12:50 PM, Michael Foord wrote: > > > with *ctxs as tuple_of_results: > > This is not necessary, as 'ctxs' already holds all instances of > all context managers; so the 'ctxs' would be equal to 'tuple_of_results' > The results are whatever is returned by ctx.__enter__(), not the context manager itself. Michael > > > with *ctx, other as tuple_of_results, another: > > ... > > Looks useful to me. > > - > Yury -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From yselivanov.ml at gmail.com Fri Feb 3 19:11:03 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 3 Feb 2012 13:11:03 -0500 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> <9BC442E1-569C-43D5-8A70-07B72ECCE50B@gmail.com> Message-ID: Ah, yes, my bad.
Then I'm +1 on that one ;) On 2012-02-03, at 1:09 PM, Michael Foord wrote: > > > On 3 February 2012 18:06, Yury Selivanov wrote: > On 2012-02-03, at 12:50 PM, Michael Foord wrote: > > > with *ctxs as tuple_of_results: > > This is not necessary, as 'ctxs' already holds all instances of > all context managers; so the 'ctxs' would be equal to 'tuple_of_results' > > The results are whatever is returned by ctx.__enter__(), not the context manager itself. > > Michael > > > > with *ctx, other as tuple_of_results, another: > > ... > > Looks useful to me. > > - > Yury > > > > -- > http://www.voidspace.org.uk/ > > May you do good and not evil > May you find forgiveness for yourself and forgive others > > May you share freely, never taking more than you give. > > -- the sqlite blessing http://www.sqlite.org/different.html > From ncoghlan at gmail.com Sat Feb 4 07:22:48 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 4 Feb 2012 16:22:48 +1000 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Message-ID: On Sat, Feb 4, 2012 at 1:09 AM, Yury Selivanov wrote: > As of now, without "nested" we have either the option of reimplementing it, or writing lots of ugly code with nested 'try..except's. So the feature was taken out, but nothing replaced it. > > What do you think, guys? I think you should try contextlib2 :) Specifically, ContextStack: http://contextlib2.readthedocs.org/en/latest/index.html#contextlib2.ContextStack Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From storchaka at gmail.com Sat Feb 4 22:17:49 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 04 Feb 2012 23:17:49 +0200 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Message-ID: 03.02.12 17:09, Yury Selivanov wrote: > With the removal of "contextlib.nested" in python 3.2 nothing was introduced to replace it. However, I found it pretty useful, despite the fact that it had its own quirks. These quirks can (at least partially) be addressed by allowing unpacking syntax in the context manager. > > Consider the following snippet of code: > > ctxs = () > if args.profile: > ctxs += (ApplicationProfilerContext(),) > if args.logging: > ctxs += (ApplicationLoggingContext(),) > with *ctxs: > Application.run() > > As of now, without "nested" we have either the option of reimplementing it, or writing lots of ugly code with nested 'try..except's. So the feature was taken out, but nothing replaced it. class EmptyContext: def __enter__(self): return self def __exit__(self, exc_type, exc_value, traceback): pass with ApplicationProfilerContext() if args.profile else EmptyContext(): with ApplicationLoggingContext() if args.logging else EmptyContext(): Application.run() Of course, it would be better to use some special singleton value (None, False or ellipsis) instead of EmptyContext(). If any false value meant an empty context, we would be able to use the "with args.profile and ApplicationProfilerContext()" idiom.
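The ContextStack that Nick points to a few messages up is what actually covers the dynamic-list use case from Yury's example. A sketch of the same composition, written here against contextlib.ExitStack (the stdlib descendant of contextlib2's ContextStack, added in Python 3.3), with tracked() standing in for the hypothetical ApplicationProfilerContext/ApplicationLoggingContext classes:

```python
# Sketch: composing a dynamically built list of context managers with
# contextlib.ExitStack instead of new `with *ctxs` syntax. tracked() is
# a stand-in for the application contexts in Yury's example.
from contextlib import ExitStack, contextmanager

events = []

@contextmanager
def tracked(name):
    events.append(("enter", name))
    try:
        yield name
    finally:
        events.append(("exit", name))

def run(ctxs):
    # Contexts are entered left to right and unwound right to left,
    # exactly as a series of nested with statements would behave.
    with ExitStack() as stack:
        results = tuple(stack.enter_context(c) for c in ctxs)
        events.append(("run", results))

run([tracked("profile"), tracked("logging")])
assert events == [
    ("enter", "profile"), ("enter", "logging"),
    ("run", ("profile", "logging")),
    ("exit", "logging"), ("exit", "profile"),
]
```

Notably, if entering the second context raised, ExitStack would still unwind the first one, which is exactly the failure mode that got contextlib.nested removed.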
From python at 2sn.net Sat Feb 4 23:12:53 2012 From: python at 2sn.net (Alexander Heger) Date: Sat, 04 Feb 2012 16:12:53 -0600 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Message-ID: <4F2DAD65.1020708@2sn.net> > Well, I quite like this syntax and it does allow you to do something not > currently easily possible: > > with *ctxs as tuple_of_results: > ... > > The use case is reasonably obscure however, and should this be possible: > > with *ctx, other as tuple_of_results, another: wouldn't it be with *ctx, other as *tuple_of_results, another: to allow more general forms like with *ctx, other as first, *some_in_the_middle, last: -Alexander From python at 2sn.net Sat Feb 4 22:59:00 2012 From: python at 2sn.net (Alexander Heger) Date: Sat, 04 Feb 2012 15:59:00 -0600 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> Message-ID: <4F2DAA24.6000300@2sn.net> >> +1 for reconsidering the d.[name] / d.(name) / d!name syntax. > > d.[name] is too much like d[name] The . that modifies the meaning of > 'name' is too far away. I think this is the best choice. > d.(name) I think 'x' and ('x') should remain the same -Alexander From ncoghlan at gmail.com Sun Feb 5 01:20:30 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 5 Feb 2012 10:20:30 +1000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) Message-ID: Rather than throwing out random ideas in a popularity contest, it's important to carefully review what it is that people don't like about the current state of affairs. 
Currently, when dealing with attributes that are statically determined, your code looks like this: x.attr # Reference x.attr = y # Bind del x.attr # Unbind Now, suppose for some reason we want to determine the attributes *dynamically*. The reason for doing this could be as simple as wanting to avoid code duplication when performing the same operation on multiple attributes (i.e. "for attr in 'attr1 attr2 attr3'.split(): ..."). At this point, dedicated syntactic support disappears and we're now using builtin functions instead: getattr(x, attr) # Reference setattr(x, attr, y) # Bind delattr(x, attr) # Unbind hasattr(x, attr) # Existence query (essentially a shorthand for getattr() in a try/except block) So, that's the status quo any proposals are competing against. It's easy enough to write, easy to read and easy to look up if you don't already know what it does (an often underestimated advantage of builtin operations over syntax is that the former are generally *much* easier to look up in the documentation). However, it can start to look rather clumsy when multiple dynamic attribute operations are chained together. Compare this static code: x.attr1 = y.attr1 x.attr2 = y.attr2 x.attr3 = y.attr3 With the following dynamic code: for attr in "attr1 attr2 attr3".split(): setattr(x, attr, getattr(y, attr)) The inner assignment in that loop is *very* noisy for a simple assignment. Splitting out a temporary variable cleans things up a bit, but it's still fairly untidy: for attr in "attr1 attr2 attr3".split(): val = getattr(y, attr) setattr(x, attr, val) It would be a *lot* cleaner if we could just use a normal assignment statement instead of builtin functions to perform the name binding. 
As it turns out, for ordinary instances, we can already do exactly that: for attr in "attr1 attr2 attr3".split(): vars(x)[attr] = vars(y)[attr] In short, I think proposals for dedicated syntax for dynamic attribute access are misguided - instead, such efforts should go into enhancing vars() to return objects that support *full* dict-style access to the underlying object's attribute namespace (with descriptor protocol support and all). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Feb 5 01:38:40 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 5 Feb 2012 10:38:40 +1000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: On Sun, Feb 5, 2012 at 10:20 AM, Nick Coghlan wrote: > It would be a *lot* cleaner if we could just use a normal assignment > statement instead of builtin functions to perform the name binding. As > it turns out, for ordinary instances, we can already do exactly that: > > for attr in "attr1 attr2 attr3".split(): > vars(x)[attr] = vars(y)[attr] That can obviously also be written: xa, ya = vars(x), vars(y) for attr in "attr1 attr2 attr3".split(): xa[attr] = ya[attr] In other words, don't think about new syntax. Think about how to correctly implement a full object proxy that provides the MutableMapping interface, with get/set/delitem on the proxy corresponding with get/set/delattr on the underlying object. Then think about whether or not returning such an object from vars() would be backwards compatible, or whether a new API would be needed to create one (e.g. attrview(x)). Finally, such an object can be prototyped quite happily outside the standard library, so consider writing it and publishing it on PyPI as a standalone module. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From tjreedy at udel.edu Sun Feb 5 02:13:19 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 04 Feb 2012 20:13:19 -0500 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: On 2/4/2012 7:20 PM, Nick Coghlan wrote: > Rather than throwing out random ideas in a popularity contest, it's > important to carefully review what it is that people don't like about > the current state of affairs. > > Currently, when dealing with attributes that are statically > determined, your code looks like this: > > x.attr # Reference > x.attr = y # Bind > del x.attr # Unbind > > Now, suppose for some reason we want to determine the attributes > *dynamically*. The reason for doing this could be as simple as wanting > to avoid code duplication when performing the same operation on > multiple attributes (i.e. "for attr in 'attr1 attr2 attr3'.split(): > ..."). > > At this point, dedicated syntactic support disappears and we're now > using builtin functions instead: > > getattr(x, attr) # Reference > setattr(x, attr, y) # Bind > delattr(x, attr) # Unbind > hasattr(x, attr) # Existence query (essentially a shorthand for > getattr() in a try/except block) > > So, that's the status quo any proposals are competing against. It's > easy enough to write, easy to read and easy to look up if you don't > already know what it does (an often underestimated advantage of > builtin operations over syntax is that the former are generally *much* > easier to look up in the documentation). Also, functions can be passed as arguments, whereas syntax cannot, which is why we have the operator module. > However, it can start to look rather clumsy when multiple dynamic > attribute operations are chained together. 
> > Compare this static code: > > x.attr1 = y.attr1 > x.attr2 = y.attr2 > x.attr3 = y.attr3 > > With the following dynamic code: > > for attr in "attr1 attr2 attr3".split(): > setattr(x, attr, getattr(y, attr)) > > The inner assignment in that loop is *very* noisy for a simple > assignment. Splitting out a temporary variable cleans things up a bit, > but it's still fairly untidy: > > for attr in "attr1 attr2 attr3".split(): > val = getattr(y, attr) > setattr(x, attr, val) > > It would be a *lot* cleaner if we could just use a normal assignment > statement instead of builtin functions to perform the name binding. As > it turns out, for ordinary instances, we can already do exactly that: > > for attr in "attr1 attr2 attr3".split(): > vars(x)[attr] = vars(y)[attr] > > In short, I think proposals for dedicated syntax for dynamic attribute > access are misguided - instead, such efforts should go into enhancing > vars() to return objects that support *full* dict-style access to the > underlying object's attribute namespace (with descriptor protocol > support and all). > > Cheers, > Nick. > -- Terry Jan Reedy From nathan.alexander.rice at gmail.com Sun Feb 5 03:03:03 2012 From: nathan.alexander.rice at gmail.com (Nathan Rice) Date: Sat, 4 Feb 2012 21:03:03 -0500 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: <4F2DAA24.6000300@2sn.net> References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F2DAA24.6000300@2sn.net> Message-ID: I think .() is the nicest of the suggestions thus far. I don't mind the <- and -> syntax so much either, I could live with obj<-foo. 
Nathan From cs at zip.com.au Sun Feb 5 03:17:44 2012 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 5 Feb 2012 13:17:44 +1100 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: <20120205021744.GA8647@cskk.homeip.net> On 05Feb2012 10:20, Nick Coghlan wrote: [...] | In short, I think proposals for dedicated syntax for dynamic attribute | access are misguided - instead, such efforts should go into enhancing | vars() to return objects that support *full* dict-style access to the | underlying object's attribute namespace (with descriptor protocol | support and all). +10 _Where_ do you people find the time to write these well thought out posts? I'm very much for making vars() better supported (the docs have caveats about assigning to it). All the syntax suggestions I've seen look cumbersome or ugly and some are actively misleading to my eye (did I really see an "<-" in there?) I see my random sig quote picker has worked well again:-) Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ A strong conviction that something must be done is the parent of many bad measures. - Daniel Webster From ironfroggy at gmail.com Sun Feb 5 04:00:10 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Sat, 4 Feb 2012 22:00:10 -0500 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: On Sat, Feb 4, 2012 at 7:20 PM, Nick Coghlan wrote: > It would be a *lot* cleaner if we could just use a normal assignment > statement instead of builtin functions to perform the name binding. As > it turns out, for ordinary instances, we can already do exactly that: > > ? ?for attr in "attr1 attr2 attr3".split(): > ? ? ? 
?vars(x)[attr] = vars(y)[attr] > > In short, I think proposals for dedicated syntax for dynamic attribute > access are misguided - instead, such efforts should go into enhancing > vars() to return objects that support *full* dict-style access to the > underlying object's attribute namespace (with descriptor protocol > support and all). I love the idea, and I think such a solution is much more straight forward than any syntax change. While it would be great to extend the functionality of vars(), it would be easier to add a new builtin that returns some kind of proxy. If vars() was changed to return this proxy, it could potentially break a lot of existing code mutating the dict returned by vars. The question is: Is yet another builtin or the messy compatibility change the worse option? -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From ncoghlan at gmail.com Sun Feb 5 07:23:50 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 5 Feb 2012 16:23:50 +1000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: On Sun, Feb 5, 2012 at 1:00 PM, Calvin Spealman wrote: > The question is: Is yet another builtin or the messy compatibility change the > worse option? That's actually a question for (much) further down the road. The *current* question is whether anyone is interested enough in the concept to prototype it as a PyPI module. That's a lot more work than just posting suggestions here :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ericsnowcurrently at gmail.com Sun Feb 5 08:40:30 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sun, 5 Feb 2012 00:40:30 -0700 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: On Sat, Feb 4, 2012 at 5:20 PM, Nick Coghlan wrote: > Rather than throwing out random ideas in a popularity contest, it's > important to carefully review what it is that people don't like about > the current state of affairs. > [snipped] > > It would be a *lot* cleaner if we could just use a normal assignment > statement instead of builtin functions to perform the name binding. As > it turns out, for ordinary instances, we can already do exactly that: > > ? ?for attr in "attr1 attr2 attr3".split(): > ? ? ? ?vars(x)[attr] = vars(y)[attr] > > In short, I think proposals for dedicated syntax for dynamic attribute > access are misguided - instead, such efforts should go into enhancing > vars() to return objects that support *full* dict-style access to the > underlying object's attribute namespace (with descriptor protocol > support and all). Good call on this, Nick. :) -eric From storchaka at gmail.com Sun Feb 5 13:46:00 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 05 Feb 2012 14:46:00 +0200 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: 05.02.12 02:20, Nick Coghlan ???????(??): > It would be a *lot* cleaner if we could just use a normal assignment > statement instead of builtin functions to perform the name binding. 
As > it turns out, for ordinary instances, we can already do exactly that: > > for attr in "attr1 attr2 attr3".split(): > vars(x)[attr] = vars(y)[attr] > > In short, I think proposals for dedicated syntax for dynamic attribute > access are misguided - instead, such efforts should go into enhancing > vars() to return objects that support *full* dict-style access to the > underlying object's attribute namespace (with descriptor protocol > support and all). One-liner "def vars(v): return v.__dict__"? From yoavglazner at gmail.com Sun Feb 5 13:58:08 2012 From: yoavglazner at gmail.com (yoav glazner) Date: Sun, 5 Feb 2012 12:58:08 +0000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: On Sun, Feb 5, 2012 at 12:46 PM, Serhiy Storchaka wrote: > 05.02.12 02:20, Nick Coghlan wrote: > >> It would be a *lot* cleaner if we could just use a normal assignment >> statement instead of builtin functions to perform the name binding. As >> it turns out, for ordinary instances, we can already do exactly that: >> >> for attr in "attr1 attr2 attr3".split(): >> vars(x)[attr] = vars(y)[attr] >> >> In short, I think proposals for dedicated syntax for dynamic attribute >> access are misguided - instead, such efforts should go into enhancing >> vars() to return objects that support *full* dict-style access to the >> underlying object's attribute namespace (with descriptor protocol >> support and all). >> > > One-liner "def vars(v): return v.__dict__"? > This doesn't work for properties: >>> class p: @property def pop(self): return 'corn' >>> def vars(x): return x.__dict__ >>> p().pop 'corn' >>> vars(p())['pop'] Traceback (most recent call last): File "", line 1, in vars(p())['pop'] KeyError: 'pop' >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmjohnson.mailinglist at gmail.com Sun Feb 5 14:25:04 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. 
Johnson) Date: Sun, 5 Feb 2012 03:25:04 -1000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: On Feb 5, 2012, at 2:58 AM, yoav glazner wrote: > > > On Sun, Feb 5, 2012 at 12:46 PM, Serhiy Storchaka wrote: > 05.02.12 02:20, Nick Coghlan wrote: > It would be a *lot* cleaner if we could just use a normal assignment > statement instead of builtin functions to perform the name binding. As > it turns out, for ordinary instances, we can already do exactly that: > > for attr in "attr1 attr2 attr3".split(): > vars(x)[attr] = vars(y)[attr] > > In short, I think proposals for dedicated syntax for dynamic attribute > access are misguided - instead, such efforts should go into enhancing > vars() to return objects that support *full* dict-style access to the > underlying object's attribute namespace (with descriptor protocol > support and all). > > One-liner "def vars(v): return v.__dict__"? > > This doesn't work for properties: It's not that hard to make something that basically works with properties: >>> class vars2: ... def __init__(self, obj): ... self.obj = obj ... ... def __getitem__(self, key): ... return getattr(self.obj, key) ... ... def __setitem__(self, key, value): ... setattr(self.obj, key, value) ... ... def __delitem__(self, key): ... delattr(self.obj, key) ... >>> class P: ... def __init__(self): ... self.value = 1 ... ... @property ... def pop(self): return 'corn' ... ... @property ... def double(self): ... return self.value * 2 ... ... @double.setter ... def double(self, value): ... self.value = value/2 ... >>> p = P() >>> p.pop 'corn' >>> v = vars2(p) >>> v['pop'] 'corn' >>> v['value'] 1 >>> v['double'] 2 >>> v['double'] = 4 >>> v['value'] 2.0 >>> v['double'] 4.0 In a real module, you'd probably want to be more thorough about emulating a __dict__ dictionary though by adding items() and keys() etc. 
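Carl's vars2 proxy, rounded out with the mapping helpers he mentions, might look like the sketch below. This restates his class in one runnable piece, converts AttributeError to KeyError as suggested later in the thread, and uses dir() for enumeration - which is only a best-effort listing, since dir() cannot see names served dynamically by __getattr__:

```python
# Sketch: Carl's vars2 proxy plus keys()/items() helpers.
# Enumeration via dir() is approximate by construction; the underscore
# filter used here is just one possible policy, not part of the proposal.

class vars2:
    def __init__(self, obj):
        self.obj = obj

    def __getitem__(self, key):
        try:
            return getattr(self.obj, key)
        except AttributeError:
            # Mapping convention: missing key -> KeyError
            raise KeyError(key)

    def __setitem__(self, key, value):
        setattr(self.obj, key, value)

    def __delitem__(self, key):
        try:
            delattr(self.obj, key)
        except AttributeError:
            raise KeyError(key)

    def __contains__(self, key):
        return hasattr(self.obj, key)

    def keys(self):
        # Best-effort: dir() output, minus dunder/private names.
        return [name for name in dir(self.obj) if not name.startswith('_')]

    def items(self):
        return [(name, getattr(self.obj, name)) for name in self.keys()]

class P:
    """Toy class mixing an instance attribute with a property."""
    def __init__(self):
        self.value = 1
    @property
    def double(self):
        return self.value * 2

v = vars2(P())
print(v['double'])       # -> 2
print('double' in v)     # -> True
print(sorted(v.keys()))  # -> ['double', 'value']
```

Note that v.keys() here reports both the instance attribute and the property, which is exactly the vars()-versus-dir() tension the following posts dig into.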
From p.f.moore at gmail.com Sun Feb 5 15:06:10 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 Feb 2012 14:06:10 +0000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: 2012/2/5 Carl M. Johnson : > It's not that hard to make something that basically works with properties: Indeed... > In a real module, you'd probably want to be more thorough about emulating a __dict__ dictionary though by adding items() and keys() etc. ... and precisely! The discussions so far have concentrated on the "easy" side of things. Writing a working module would ensure that all the corner cases get covered. And as a benefit, it would provide an implementation that could be taken straight into the core/stdlib, hugely reducing the core developer effort that is otherwise needed to take even the best-thought-out proposal into reality. Paul. From storchaka at gmail.com Sun Feb 5 15:28:35 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 05 Feb 2012 16:28:35 +0200 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: 05.02.12 15:25, Carl M. Johnson wrote: > On Feb 5, 2012, at 2:58 AM, yoav glazner wrote: >> This doesn't work for properties: > It's not that hard to make something that basically works with properties: del v['pop'] AttributeError: P instance has no attribute 'pop' > In a real module, you'd probably want to be more thorough about emulating a __dict__ dictionary though by adding items() and keys() etc. It's impossible in general. 
class A: def __getattr__(self, name): return len(name) From p.f.moore at gmail.com Sun Feb 5 16:16:40 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 Feb 2012 15:16:40 +0000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: 2012/2/5 Serhiy Storchaka : > 05.02.12 15:25, Carl M. Johnson ???????(??): >> On Feb 5, 2012, at 2:58 AM, yoav glazner wrote: >>> This does't work for properties: >> It's not that hard to make something that basically works with properties: > > del v['pop'] > AttributeError: P instance has no attribute 'pop' > > >> In a real module, you'd probably want to be more thorough about emulating a __dict__ dictionary though by adding item() and keys() etc. > > It's impossible in general. > > class A: > ? ?def __getattr__(self, name): > ? ? ? ?return len(name) >>> class proxy: ... def __init__(self, orig): ... self._orig = orig ... def __getitem__(self, attr): ... return getattr(self._orig,attr) ... >>> class A: ... def __getattr__(self, name): ... return len(name) ... >>> a = A() >>> proxy(a)['hello'] 5 >>> Extending the proxy class to include setting, deleting, and various corner cases, is left as an exercise for the reader :-) Paul From yoavglazner at gmail.com Sun Feb 5 16:33:48 2012 From: yoavglazner at gmail.com (yoav glazner) Date: Sun, 5 Feb 2012 15:33:48 +0000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: > > >> In a real module, you'd probably want to be more thorough about > emulating a __dict__ dictionary though by adding item() and keys() etc. > > > > It's impossible in general. > > > > class A: > > def __getattr__(self, name): > > return len(name) > > > >>> class proxy: > ... def __init__(self, orig): > ... self._orig = orig > ... def __getitem__(self, attr): > ... return getattr(self._orig,attr) > ... > >>> class A: > ... 
def __getattr__(self, name): > ... return len(name) > ... > >>> a = A() > >>> proxy(a)['hello'] > 5 > >>> > > Extending the proxy class to include setting, deleting, and various > corner cases, is left as an exercise for the reader :-) >>> proxy(a).keys() ?!? -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Sun Feb 5 17:02:54 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 Feb 2012 16:02:54 +0000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: On 5 February 2012 15:33, yoav glazner wrote: >> Extending the proxy class to include setting, deleting, and various >> corner cases, is left as an exercise for the reader :-) > > >>>>?proxy(a).keys() > ?!? One of the exercises :-) Paul From tjreedy at udel.edu Sun Feb 5 19:05:42 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 05 Feb 2012 13:05:42 -0500 Subject: [Python-ideas] Dict-like object with property access In-Reply-To: References: <17E44F92-A978-4E93-9713-7D542E61ED10@masklinn.net> <3DFDD08E-D82B-4706-8DAB-9F06B9E1F403@gmail.com> <20120130200226.0a5ab1d8@pitrou.net> <4F26EE37.2040104@stoneleaf.us> <4F2DAA24.6000300@2sn.net> Message-ID: On 2/4/2012 9:03 PM, Nathan Rice wrote: > I think .() is the nicest of the suggestions thus far. I don't mind Except, as someone said, x.(n) looks like calling x. with arg n. So I am withdrawing my support for that and agree with what Nick wrote a day or two ago. -- Terry Jan Reedy From simon.sapin at kozea.fr Sun Feb 5 21:45:38 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Sun, 05 Feb 2012 21:45:38 +0100 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: <4F2EEA72.7080104@kozea.fr> Le 05/02/2012 16:33, yoav glazner a ?crit : > > > >>> class proxy: > ... def __init__(self, orig): > ... self._orig = orig > ... 
def __getitem__(self, attr): > ... return getattr(self._orig, attr) > ... > >>> class A: > ... def __getattr__(self, name): > ... return len(name) > ... > >>> a = A() > >>> proxy(a)['hello'] > 5 > >>> > > Extending the proxy class to include setting, deleting, and various > corner cases, is left as an exercise for the reader :-) > > > >>> proxy(a).keys() > ?!? Hi, +1 on extending vars(). I like this idea much more than adding syntax. In this case, proxy(a).keys() would be based on dir(a) (or something similar) and have the same (documented) limitations. I think this is acceptable, and the proxy object is still useful. Regards, -- Simon Sapin From p.f.moore at gmail.com Sun Feb 5 22:18:09 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 Feb 2012 21:18:09 +0000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: <4F2EEA72.7080104@kozea.fr> References: <4F2EEA72.7080104@kozea.fr> Message-ID: On 5 February 2012 20:45, Simon Sapin wrote: > +1 on extending vars(). I like this idea much more than adding syntax. > > In this case, proxy(a).keys() would be based on dir(a) (or something similar) > and have the same (documented) limitations. I think this is acceptable, and > the proxy object is still useful. vars() and dir() do very different things: >>> class A: ... pass ... >>> a = A() >>> a.a = 1 >>> dir(a) ['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'a'] >>> vars(a) {'a': 1} >>> In my view, following the spec of vars() is far more useful and matches better the original requirement, which was to simulate javascript's index/attribute duality. 
Methods (and even more so special methods) don't really fit in here. I'd argue for the definition: proxy(obj)['a'] <=> obj.a proxy(obj)['a'] = val <=> obj.a = val del proxy(obj)['a'] <=> del obj.a 'a' in proxy(obj) <=> hasattr(obj, 'a') proxy(obj).keys() <=> vars(obj).keys() len(proxy(obj)) <=> len(vars(obj)) In other words, indexing defers to getattr/setattr/delattr, containment uses hasattr, but anything else goes via vars. In terms of ABCs, Sized/Iterable behaviour comes from vars(), Container/Mapping/MutableMapping behaviour comes from {has,get,set,del}attr. It's mildly inconsistent for objects which implement their own attribute access, but those aren't the key use case, and the behaviour is well defined even for those. Paul From simon.sapin at kozea.fr Sun Feb 5 22:30:38 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Sun, 05 Feb 2012 22:30:38 +0100 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: <4F2EEA72.7080104@kozea.fr> Message-ID: <4F2EF4FE.8030202@kozea.fr> Le 05/02/2012 22:18, Paul Moore a écrit : > In my view, following the spec of vars() is far more useful and > matches better the original requirement, which was to simulate > javascript's index/attribute duality. Methods (and even more so > special methods) don't really fit in here. > > I'd argue for the definition: > > proxy(obj)['a'] <=> obj.a > proxy(obj)['a'] = val <=> obj.a = val > del proxy(obj)['a'] <=> del obj.a > 'a' in proxy(obj) <=> hasattr(obj, 'a') > proxy(obj).keys() <=> vars(obj).keys() > len(proxy(obj)) <=> len(vars(obj)) I'm fine with that too and I agree it is probably better. My point was that not all keys that can be used in proxy(a)[key] without a KeyError will be in proxy(a).keys(), but that's okay because the same already happens with getattr() and dir(). By the way, the proxy should also turn AttributeError into KeyError, for consistency with other Mapping types. 
Regards, -- Simon Sapin From p.f.moore at gmail.com Mon Feb 6 00:47:15 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 Feb 2012 23:47:15 +0000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: <4F2EF4FE.8030202@kozea.fr> References: <4F2EEA72.7080104@kozea.fr> <4F2EF4FE.8030202@kozea.fr> Message-ID: On 5 February 2012 21:30, Simon Sapin wrote: > By the way, the proxy should also turn AttributeError into KeyError, for > consistency with other Mapping types. Clearly. And arguably, this is a good case for the new "raise KeyError from None" form to suppress exception chaining... Paul. From grosser.meister.morti at gmx.net Mon Feb 6 02:07:55 2012 From: grosser.meister.morti at gmx.net (=?UTF-8?B?TWF0aGlhcyBQYW56ZW5iw7Zjaw==?=) Date: Mon, 06 Feb 2012 02:07:55 +0100 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: References: Message-ID: <4F2F27EB.2070807@gmx.net> On 02/05/2012 04:16 PM, Paul Moore wrote: > 2012/2/5 Serhiy Storchaka: >> 05.02.12 15:25, Carl M. Johnson ???????(??): >>> On Feb 5, 2012, at 2:58 AM, yoav glazner wrote: >>>> This does't work for properties: >>> It's not that hard to make something that basically works with properties: >> >> del v['pop'] >> AttributeError: P instance has no attribute 'pop' >> >> >>> In a real module, you'd probably want to be more thorough about emulating a __dict__ dictionary though by adding item() and keys() etc. >> >> It's impossible in general. >> >> class A: >> def __getattr__(self, name): >> return len(name) > > >>>> class proxy: > ... def __init__(self, orig): > ... self._orig = orig > ... def __getitem__(self, attr): > ... return getattr(self._orig,attr) > ... >>>> class A: > ... def __getattr__(self, name): > ... return len(name) > ... 
>>>> a = A() >>>> proxy(a)['hello'] > 5 >>>> > > Extending the proxy class to include setting, deleting, and various > corner cases, is left as an exercise for the reader :-) > class attrs(object): __slots__ = 'obj', def __init__(self,obj): self.obj = obj def __getitem__(self, key): try: return getattr(self.obj, key) except AttributeError: raise KeyError(key) def __setitem__(self, key, value): try: setattr(self.obj, key, value) except AttributeError: raise KeyError(key) def __delitem__(self, key): try: delattr(self.obj, key) except AttributeError: raise KeyError(key) def __contains__(self, key): return hasattr(self.obj, key) def get(self, key, default=None): try: return getattr(self.obj, key, default) except AttributeError: raise KeyError(key) def keys(self): return iter(dir(self.obj)) def values(self): for key in dir(self.obj): yield getattr(self.obj, key) def items(self): for key in dir(self.obj): yield key, getattr(self.obj, key) def __len__(self): return len(dir(self.obj)) def __iter__(self): return iter(dir(self.obj)) def __repr__(self): return repr(dict(self)) From ncoghlan at gmail.com Mon Feb 6 02:41:38 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 Feb 2012 11:41:38 +1000 Subject: [Python-ideas] Shorthand syntax for get/set/delattr (was Re: Dict-like object with property access) In-Reply-To: <4F2F27EB.2070807@gmx.net> References: <4F2F27EB.2070807@gmx.net> Message-ID: On Mon, Feb 6, 2012 at 11:07 AM, Mathias Panzenb?ck wrote: This is a good start, but still has a few issues. > ? ? ? ?def get(self, key, default=None): > ? ? ? ? ? ? ? ?try: > ? ? ? ? ? ? ? ? ? ? ? ?return getattr(self.obj, key, default) > ? ? ? ? ? ? ? ?except AttributeError: > ? ? ? ? ? ? ? ? ? ? ? ?raise KeyError(key) This will never raise KeyError. It needs to use a dedicated sentinel object so it can tell the difference between "default=None" and "default not supplied" and invoke getattr() accordingly. > ? ? ? ?def keys(self): > ? ? ? ? ? ? ? 
return iter(dir(self.obj)) > > def values(self): > for key in dir(self.obj): > yield getattr(self.obj, key) > > def items(self): > for key in dir(self.obj): > yield key, getattr(self.obj, key) These 3 methods should return views with the appropriate APIs rather than iterators. > def __repr__(self): > return repr(dict(self)) The appropriate output for str() and repr() is definitely open to question. Interaction with serialisation APIs such as pickle and json will also need investigation. These kinds of questions are why I think it is well worth exploring this concept on PyPI. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From julien at tayon.net Mon Feb 6 21:01:29 2012 From: julien at tayon.net (julien tayon) Date: Mon, 6 Feb 2012 21:01:29 +0100 Subject: [Python-ideas] matrix operations on dict :) Message-ID: Hello, Proposing vector operations on dict, and acknowledging there was an homeomorphism from rooted n-ary trees to dict, was inducing the possibility of making matrix of dict / trees. Since linear algebra on dict was coldly welcomed, I waited until I had some code to back me up before pushing my reasoning further, and it happily worked the way the books predicted. This was the reasoning: - dict <=> vector - Vectors + linear algebra <=> matrix - Most of Rooted Trees <=> dict( dict( ... ) ) ** - Matrix * Vector = Vector2 <=> Matrix * tree1 = Tree2 ** see here for a coded explanation http://readthedocs.org/docs/vectordict/en/latest/intro.html#homeomorphism-between-dict-and-k-ary-rooted-tree for a sample of API, code, and results, see here: http://readthedocs.org/docs/vectordict/en/latest/matrix.html#api dict of dict might not be the best way to make trees, but having matrix operations on dict means being able to transform trees into trees natively. 
The module is still quite a proof of concept, and it is not the implementation I advocate so much as the idea. Because: isn't transforming trees into trees quite a recurrent task in modern computer science, with key-value databases? Plus, matrix * tree being side-effect free, it is a good candidate for a canonical, parallelisable way to transform trees. And by the way, I implemented the matrix as a VectorDict, so ... we have matrix operations on matrices. ^_^ (Brace yourselves, InceptionMatrix is coming.) For the "not faint of heart" who are able to read un-Perlish, un-PEP8 code: http://pypi.python.org/pypi/VectorDict/0.3.0 my 2 euro cents (which of course are worth more than 2 US cents <:o) ), Cheers, PS: I am not sure that using defaultdict as a backend was the best idea of the century, but keys appearing in a dict after an addition were not very much in my idea of how a normal Python dict should behave. PPS: I will - if I still have time - code set operations on dict: issubset, diff, union, intersection. These are quite easy, but so unfun. Since I am not very gifted at explaining, I prefer to code and show the result later. -- Julien Tayon From yselivanov.ml at gmail.com Mon Feb 6 21:08:50 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Mon, 6 Feb 2012 15:08:50 -0500 Subject: [Python-ideas] unpacking context managers in WITH statement In-Reply-To: References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com> Message-ID: Well, native syntax would be more useful, but ContextStack seems like a decent workaround. Will it be included in the stdlib (py3.3)? On 2012-02-04, at 1:22 AM, Nick Coghlan wrote: > On Sat, Feb 4, 2012 at 1:09 AM, Yury Selivanov wrote: >> As of now, without "nested" we have either the option of reimplementing it, or writing lots of ugly code with nested 'try..except's. So the feature was taken out, but nothing replaced it. >> >> What do you think guys? 
>
> I think you should try contextlib2 :)
>
> Specifically, ContextStack:
> http://contextlib2.readthedocs.org/en/latest/index.html#contextlib2.ContextStack
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Feb  6 21:54:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 7 Feb 2012 06:54:38 +1000
Subject: [Python-ideas] unpacking context managers in WITH statement
In-Reply-To:
References: <6BCA6FFD-7B32-4AA1-949C-B41CE932471F@gmail.com>
Message-ID:

On Tue, Feb 7, 2012 at 6:08 AM, Yury Selivanov wrote:
> Well, native syntax would be more useful, but ContextStack seems like
> a decent workaround. Will it be included in the stdlib (py3.3)?

Most likely (I'm the primary maintainer of contextlib, so it's
basically my call). Feedback on what it's like to use in practice would
definitely help with that - I put it up on PyPI as contextlib2 so
people could try it out and help me avoid repeating the mistakes we
made with nested() (which was an error-prone bug trap).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From simon.sapin at kozea.fr  Mon Feb  6 23:18:30 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Mon, 06 Feb 2012 23:18:30 +0100
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To:
References:
Message-ID: <4F3051B6.5060801@kozea.fr>

Le 06/02/2012 21:01, julien tayon a écrit :
> Proposing vector operations on dict, and acknowledging that there is
> a homeomorphism from rooted n-ary trees to dicts, opens the
> possibility of making matrices of dicts / trees.

Hi,

I studied linear algebra and I think I understand it fairly well.
However, after reading your email and the linked documentation, I'm
just confused. I really don't know what this is about.

I *think* that you are defining something like a mathematical group[1]
or ring[2], but:

* Over what elements? (Any dict? dicts with some property?)
* How exactly are your "addition" and "multiplication" (if any)
defined?
* Why? I'm sure I could come up with a well-defined but absurd (and
useless) "group", but why is yours interesting?

[1] http://en.wikipedia.org/wiki/Group_%28mathematics%29
[2] http://en.wikipedia.org/wiki/Ring_%28mathematics%29

Regards,
--
Simon Sapin

From simon.sapin at kozea.fr  Mon Feb  6 23:21:13 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Mon, 06 Feb 2012 23:21:13 +0100
Subject: [Python-ideas] A key parameter for heapq.merge
In-Reply-To: <4F0AC6B6.2040403@trueblade.com>
References: <4F0A116D.3030202@kozea.fr>
        <73792DF0-F128-437E-8AC8-A9F34D042FF4@gmail.com>
        <4F0AC596.60906@kozea.fr> <4F0AC6B6.2040403@trueblade.com>
Message-ID: <4F305259.3060705@kozea.fr>

Le 09/01/2012 11:51, Eric V. Smith a écrit :
> On 1/9/2012 5:46 AM, Simon Sapin wrote:
>> I just opened http://bugs.python.org/issue13742 , but I can't assign
>> it. (New account on the tracker.)
> I assigned it to Raymond.

Hi,

I think my latest patch on #13742 looks good. Is something else
missing?

Thanks,
--
Simon Sapin

From julien at tayon.net  Tue Feb  7 02:00:24 2012
From: julien at tayon.net (julien tayon)
Date: Tue, 7 Feb 2012 02:00:24 +0100
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To: <4F3051B6.5060801@kozea.fr>
References: <4F3051B6.5060801@kozea.fr>
Message-ID:

2012/2/6 Simon Sapin :
> Le 06/02/2012 21:01, julien tayon a écrit :
>
>> Proposing vector operations on dict, and acknowledging that there is
>> a homeomorphism from rooted n-ary trees to dicts, opens the
>> possibility of making matrices of dicts / trees.
>
> Hi,
>
> I studied linear algebra and I think I understand it fairly well.
> However, after reading your email and the linked documentation, I'm
> just confused. I really don't know what this is about.

Because there is some dust under the carpet. Let's define the notion of
dict(dict()) (rooted k-ary trees) as a vector.
Imagine : tree A
{ a:
    { b : 1,
      c : 2
    },
  e : 3.0
}

This is the same as vector B
dict(
    tuple([ 'a', 'b' ]) = 1,
    tuple([ 'a', 'c' ]) = 2,
    tuple([ e ]) = 3.0
)

By collapsing the path through the nested dicts to a single key (made
of the ordered list of keys leading to the value), you always fall back
on a dict of depth 1. I can construct A from B and B from A without any
loss of properties. Thus it is equivalent.

(Sparse) matrices are therefore built this way:
dict( tuple( source_tuple, destination_tuple ) = function )
(I could not resolve myself to code stupid matrices that have only a
magnitude instead of a function; therefore my matrices are not the ones
of linear algebra.)

So any dict of dicts is the same as a one-depth dict. A path to a value
defines a dimension, and paths are considered orthogonal. So further
reasoning will be made on vectors / depth-1 dicts.

There are two problems :

1) We can generate an infinite number of keys, so these are implicitly
incomplete vectors on an infinite base. It is as if, when you define
dict( x = 1 ), you also mean dict( x = 1, y = <null element for
multiplication and neutral for addition>, ..., dn = <null element for
multiplication and neutral for addition> ). And I hear the tao of
Python saying explicit is better than implicit. But a dict is
explicitly an incomplete vector on an infinite base.

2) The problem is in the algebra of the values/leaves, because the
implementation is made by delegating addition all the way up to the
value. Normally, we think of the values of a vector as
scalars/magnitudes (float/int for instance), and addition should have
no more consistency than the consistency of all the additions of all
the values in the dict. Unless you need something else. I intended to
be a little more consistent and to enforce the fact that no value
should have addition properties different from those of a ring.
However, I found no easy way to do this.
I would need all the classes to tell which algebra they support, by
means of a property that would tell whether Class A + Class B is
commutative/associative/distributive. Still, I can do interesting
things, so I don't really want to castrate the beast. (Plus, it would
mean writing a PEP proposal, which is beyond my abilities.)

> I *think* that you are defining something like a mathematical
> group[1] or ring[2], but:

As far as I am concerned, I am pretty confident in having all the
properties of a ring with + & * on dict. And I think I do. Tell me if I
need other tests here :
https://github.com/jul/ADictAdd_iction/blob/master/vector_dict/ConsistentAlgebrae.py#L169
or if I misunderstood any properties. I may have a good intuition, I
may recognize things when I see them, but I am clumsy with words.

> * Over what elements? (Any dict? dicts with some property?)

Well, you achieve better algebraic properties if the leaves belong at
least to a ring (float, numpy.array). But I have funny results with
records too. Since dimensions can be considered independent, as long as
the values have the same type all goes well, as long as you do
operations supported by the leaves. And if you know what you are doing
when mixing up dimensions, all goes well too. For instance, if with a
matrix you multiply a numpy.array by a weight (float), it makes sense.
If you multiply a string by a string, well, you get into trouble, and I
cowardly let the exception raise.

For distances between vectors I also run into trouble when dealing with
record-like algebras for + and * (list and string). I see an elegant
workaround, which would be to know what distance(record1 - record2) is
even though I ignore what record1 - record2 is. For instance, with two
strings or ordered lists, the distance can be defined as the edit
distance even though string1 - string2 is nonsense. But I don't know
yet how to make it fit into the puzzle. Because sometimes
norm( A - B ) makes more sense than A - B, I may need to refactor the
norm method.
At this point I may also need cooperation from the other classes :)

I may admit I have not thought of everything, and there can be some
holes in the racket, but it is promising, and I have had much fun using
it so far.

> * How exactly are your "addition" and "multiplication" (if any)
> defined?

By the following rules :

Addition : given two vectors (as defined before), we make the
assumption that these are vectors on an infinite base (made of all the
possible paths), and that when two vectors are added there are two
possible cases :
* if a key exists in both dicts : add the leaves
* if a key is in one dict only : create a leaf in the resulting dict
  with the value (therefore assuming that undefined keys are neutral
  for addition)

For multiplication I just accept either dict multiplication (a
non-existent path's default value being the null element) or
scalar/magnitude multiplication :
* if you do 2 * dict( x = val, y = val2 ) it will do
  dict( x = 2 * val, y = 2 * val2 ); if you do
  dict( x = val, y = val2 ) * 2 it will do
  dict( x = val * 2, y = val2 * 2 ) (as with vectors)
* if you do dict1 * dict2, any non-common keys being implicitly the
  zero element of multiplication, then for each common key the
  resulting value is the product of the leaves for this key
* if a key is present in only one dict, the resulting dict gets pruned
  of the key (unexpressed paths are set to the zero of multiplication)

(It is achieved with a silly overloading of + - / * on defaultdict; no
magic is made here.)

> * Why? I'm sure I could come up with a well-defined but absurd (and
> useless) "group", but why is yours interesting?

* it is fun (not an argument, I do agree)

* Ruby and Perl don't have it :) and I am close to coming up with a
jQuery-ish grammar of manipulation on trees made of nested dicts
(well, a real tree implemented with a parent property for each node,
plus attributes and a value, might be better suited)

* it gives results in the key/value database and JSON-ish context, à la
MongoDB, if you consider what map/reduce in a key/value database is. It
boils down to retrieving a dict of dicts, emitting a document (a dict
of dicts) by means of a projection or a matrix, and aggregating results
(addition, for instance, is widely used) in a reduce operation. For
this it may be quite useful: it factorises code into matrices that can
be stored, combined, added ...

- in web/text indexing you can split a text into a series of invariant
forms of words with their associated frequencies, and measure how close
they are to a keyword by using either Jaccard or cosine similarity. You
can already do it, out of the box. I am now fighting my way to see if I
can easily build correlation matrices from two dicts.

- a matrix being a vector, and a matrix changing trees into trees, you
can set a matrix as a matrix value, making a transformation in a
subtree; this is equivalent to, and easier than, composing functions.

- if you have a database of graphs that have no loops and a root, you
can query for similar paths, or find paths in the path (as long as they
can be expressed in the form of a dict of dicts ..) (I had too much
time; I coded a method to do it)

* it has applications :

- with the find method + projection + match_subtree, we have static
code analysis. Imagine an AST. If I am not mistaken, it can fit in a
tree (as a dict of dicts). You can find any exactly matching tree (for
instance a bad design pattern) or a close enough tree (using Jaccard or
cos) and say there might be a problem in the resulting path of the
code. You can also transform ASTs into ASTs, thus doing funny things
such as on-the-fly transformation of ASTs.

* would you like a rotation matrix for dict( x = , y = , z = ) ? or a
polar transformation ?

* would you like a map/reduce of a tree of products that gives the
total, the average, and the deviance in one pass ?
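The addition and multiplication rules stated above boil down to a few lines of plain Python. This is a minimal sketch of the described semantics (absent keys neutral for +, absorbing for *), not the VectorDict code; the names `add`, `mul`, and `scale` are invented here:

```python
def add(d1, d2):
    """Pointwise addition: a key missing on one side is treated as neutral."""
    out = dict(d1)
    for key, value in d2.items():
        out[key] = out[key] + value if key in out else value
    return out

def mul(d1, d2):
    """Pointwise product: keys present on only one side are pruned (zero)."""
    return {key: d1[key] * d2[key] for key in d1.keys() & d2.keys()}

def scale(scalar, d):
    """Scalar multiplication, as with ordinary vectors."""
    return {key: scalar * value for key, value in d.items()}

u = {'x': 1, 'y': 2}
v = {'y': 3, 'z': 4}
print(add(u, v))    # {'x': 1, 'y': 5, 'z': 4}
print(mul(u, v))    # {'y': 6}
print(scale(2, u))  # {'x': 2, 'y': 4}
```

With leaves drawn from a ring (here, ints), `add` is commutative and associative and `mul` distributes over it on the common support, which is what the ring-property tests linked earlier check.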
I can see some quirks though (since dict are unordered, in a matrix if a destination is included in an already existing destination, it should be forbidden since we cannot ensure the order of the operations, and this dooms the concept of matrix in matrix). I may lack a little precision in the wording still. A langage is as strong as its base types. Giving testoterone to a type (or creating a new powerfull one) is de facto strengthening a langage. ;) Regards, -- Julien From steve at pearwood.info Tue Feb 7 02:12:39 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 7 Feb 2012 12:12:39 +1100 Subject: [Python-ideas] matrix operations on dict :) In-Reply-To: References: Message-ID: <20120207011239.GC28570@ando> On Mon, Feb 06, 2012 at 09:01:29PM +0100, julien tayon wrote: > Hello, > > Proposing vector operations on dict, and acknowledging there was an > homeomorphism from rooted n-ary trees to dict, was inducing the > possibility of making matrix of dict / trees. This seems interesting to me, but I don't see that they are important enough to be built-in to dicts. At most, this could be a module in the standard library, but before that happens, you would have to prove the usefulness of the module. I suggest polishing it to a fit state to use in production, including tests, and putting it on PyPI. Once you can demonstrate some interest for it, then you can propose it gets added to the std lib. Otherwise, this looks rather like a library of functions looking for a use. It might help if you demonstrate what concrete problems this helps you solve. 
--
Steven

From raymond.hettinger at gmail.com  Tue Feb  7 03:17:09 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 6 Feb 2012 18:17:09 -0800
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To:
References:
Message-ID: <84655D8D-B4D6-40AE-AA10-07781006E8C9@gmail.com>

On Feb 6, 2012, at 12:01 PM, julien tayon wrote:
> Proposing vector operations on dict, and acknowledging that there is
> a homeomorphism from rooted n-ary trees to dicts, opens the
> possibility of making matrices of dicts / trees.

And if you add tensor operations, the implementation can remain
independent of the system of reference :-)

Contravariantly yours,

Raymond

From massimo.dipierro at gmail.com  Tue Feb  7 03:43:44 2012
From: massimo.dipierro at gmail.com (Massimo Di Pierro)
Date: Mon, 6 Feb 2012 20:43:44 -0600
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To: <84655D8D-B4D6-40AE-AA10-07781006E8C9@gmail.com>
References: <84655D8D-B4D6-40AE-AA10-07781006E8C9@gmail.com>
Message-ID: <42BF1377-E42F-416A-BFEA-6048F3632A81@gmail.com>

On Feb 6, 2012, at 8:17 PM, Raymond Hettinger wrote:
> On Feb 6, 2012, at 12:01 PM, julien tayon wrote:
>> Proposing vector operations on dict, and acknowledging that there is
>> a homeomorphism from rooted n-ary trees to dicts, opens the
>> possibility of making matrices of dicts / trees.
>
> And if you add tensor operations, the implementation can remain
> independent of the system of reference :-)
>
> Contravariantly yours,
>
> Raymond

+1
From tjreedy at udel.edu  Tue Feb  7 04:20:14 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 06 Feb 2012 22:20:14 -0500
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To:
References: <4F3051B6.5060801@kozea.fr>
Message-ID:

On 2/6/2012 8:00 PM, julien tayon wrote:
> Let's define the notion of dict(dict()) (rooted k-ary trees) as a
> vector. Imagine :
> tree A
> { a:
>     { b : 1,
>       c : 2
>     },
>   e : 3.0
> }
>
> This is the same as
> vector B
> dict(
>     tuple([ 'a', 'b' ]) = 1,
>     tuple([ 'a', 'c' ]) = 2,
>     tuple([ e ]) = 3.0
> )

Adding quotes makes it different from tree A. The difference is
important because a dict simulates a (sparse) array only if the keys
are ordered (or are ordered sequences of ordered objects). Strings are
ordered, but not general objects.

There is no need to write tuples as tuple(somelist):

dict(
  ('a', 'b') = 1
  ('a', 'c') = 2
  ('e',) = 3.0
)

Since your definitions of + and * on dicts do not use order, using the
terms 'vector' and 'matrix' just seems distracting to me. The only
thing you are extracting from them is the idea of component-wise
operations on collections. What is important is whether the operations
apply to the *values*.

Whenever one has a dict whose values are lists, it is common to start
with empty lists and add items to the list for each key with
d[key].append(val). You could imagine this operation as performing your
dict addition d + {key:[val]} in place and then performing standard
list addition in place (.extend). But thinking this way has limited
use. In actual application, the code is likely to be something like:

  for key, val in source:
      d.setdefault(key, []).append(val)

There are three points here.
1. These patterns are specific to subcategories of dicts.
2. For dicts (and sets and lists), in-place modification is more common
than creating new dicts (or sets or lists). Python is not mathematics,
and it is not a functional, side-effect-free language.
3.
The source of modifiers is usually an iterator -- a category rather than a class. The iterator does not have to be based on a dict and typically is not. The same points apply to lists. list1 + list2 is rare compared to list1.append(item) and list1.extend(iterable_of_items). And of course, both apply to all lists and objects and iterables, rather than specialized subcategories. -- Terry Jan Reedy From julien at tayon.net Tue Feb 7 11:24:17 2012 From: julien at tayon.net (julien tayon) Date: Tue, 7 Feb 2012 11:24:17 +0100 Subject: [Python-ideas] matrix operations on dict :) In-Reply-To: <20120207011239.GC28570@ando> References: <20120207011239.GC28570@ando> Message-ID: 2012/2/7 Steven D'Aprano : > This seems interesting to me, but I don't see that they are important > enough to be built-in to dicts. > > At most, this could be a module in the standard library, but before that > happens, you would have to prove the usefulness of the module. I suggest > polishing it to a fit state to use in production, including tests, and > putting it on PyPI. Once you can demonstrate some interest for it, then > you can propose it gets added to the std lib. > Of course, it's already on pypi, the unittest are being buit up, I just coded way too much stuff, so code coverage is slowly increasing. Since it's 90% syntaxic sugar, it is just a commodity for syntax of tree manipulation. I can improve the readability though. But, > Otherwise, this looks rather like a library of functions looking for a > use. It might help if you demonstrate what concrete problems this helps > you solve. Since 95% of the functions are method of a dict, I guess, we may call it an object. 
Cheers,

From julien at tayon.net  Tue Feb  7 11:45:04 2012
From: julien at tayon.net (julien tayon)
Date: Tue, 7 Feb 2012 11:45:04 +0100
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To:
References: <4F3051B6.5060801@kozea.fr>
Message-ID:

2012/2/7 Terry Reedy :
>
> Since your definitions of + and * on dicts do not use order, using
> the terms 'vector' and 'matrix' just seems distracting to me. The
> only thing you are extracting from them is the idea of component-wise
> operations on collections. What is important is whether the
> operations apply to the *values*.

I have checked that I can go back and forth without problems, but okay,
I forgot the quotes in A. I know it may have been disturbing, and I
apologize.

Well, order in the usual notation is a commodity for not repeating the
dimensions' names: it's easier to write [ 1, 2, 3 ] than to always
repeat [ x = 1, y = 2, z = 3 ]. I am just going back to the basis. Our
disagreement mainly comes from me forgetting the quotes in A.

> In actual application, the code is likely to be something like:
>   for key, val in source:
>       d.setdefault(key, []).append(val)

It does not propagate recursively, though. So adding
d1 = { 'a' : { 'b' : { 'c' : 1 }, 'd' : 2 } }
with
d2 = { 'a' : { 'b' : { 'c' : 2 }, 'd' : 1 } }
won't work with your example, but will work with my definition of
d1 + d2.

> There are three points here.
> 1. These patterns are specific to subcategories of dicts.

It makes sense for non-scalar values too; it was just already tough
trying to explain with scalars, so I limited myself to the simple case.

> 2. For dicts (and sets and lists), in-place modification is more
> common than creating new dicts (or sets or lists). Python is not
> mathematics, and it is not a functional, side-effect-free language.

I just have to switch a flag in the matrix operator to make it operate
in place. I was not sure which option was best.

> 3. The source of modifiers is usually an iterator -- a category
> rather than a class.
The iterator does not have to be based on a dict and typically is not.

Well, you are definitely right on this point.

> The same points apply to lists. list1 + list2 is rare compared to
> list1.append(item) and list1.extend(iterable_of_items). And of
> course, both apply to all lists and objects and iterables, rather
> than to specialized subcategories.

I also provide an iterator in the form
[ ( (path_to_value), value ), ... ] ^_^
Since it makes recursive calls, I don't like it, and I try not to make
it too obvious.

Cheers,
Julien

From sturla at molden.no  Tue Feb  7 19:19:36 2012
From: sturla at molden.no (Sturla Molden)
Date: Tue, 07 Feb 2012 19:19:36 +0100
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To:
References:
Message-ID: <4F316B38.7020608@molden.no>

On 06.02.2012 21:01, julien tayon wrote:
> Hello,
>
> Proposing vector operations on dict, and acknowledging that there is
> a homeomorphism from rooted n-ary trees to dicts, opens the
> possibility of making matrices of dicts / trees.
>
> Since linear algebra on dicts was coldly welcomed, I waited until I
> had some code to back me up before pushing my reasoning further, and
> it happily worked the way the books predicted.

Why would you want to use a hash table (Python dict) for linear
algebra? Not sure I can think of a worse data structure for the
purpose.

There is NumPy... And in the standard library there is an array
module... For matrix multiplication you can use DGEMM from any LAPACK
library if you don't like NumPy (e.g. by means of ctypes).

What really should be discussed is inclusion of NumPy in the standard
library (that is NumPy, not SciPy).
Sturla

From ubershmekel at gmail.com  Wed Feb  8 12:27:48 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Wed, 8 Feb 2012 13:27:48 +0200
Subject: [Python-ideas] Add a recursive function to the glob package
Message-ID:

Many times I've wanted glob to give me all the "*.zip" or "*.py" or
"*.h" files in a directory *and subdirectories*, ever since I started
using Python 7 years ago.

I don't know if I'm the only one or not, but here's a patch:
http://bugs.python.org/issue13968

I'd love to hear feedback on the notion and implementation,

Yuval Greenfield

From ncoghlan at gmail.com  Wed Feb  8 13:08:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 8 Feb 2012 22:08:38 +1000
Subject: [Python-ideas] Add a recursive function to the glob package
In-Reply-To:
References:
Message-ID:

On Wed, Feb 8, 2012 at 9:27 PM, Yuval Greenfield wrote:
> Many times I've wanted glob to give me all the "*.zip" or "*.py" or
> "*.h" files in a directory and subdirectories, ever since I started
> using Python 7 years ago.
>
> I don't know if I'm the only one or not, but here's a patch:
> http://bugs.python.org/issue13968
>
> I'd love to hear feedback on the notion and implementation,

walkdir [1] is designed to handle that use case and more.
>>> from walkdir import file_paths, filtered_walk
>>> paths = file_paths(filtered_walk('.', included_files=['*.py']))
>>> print('\n'.join(sorted(paths)))
./dist/walkdir-0.2.1/build/lib.linux-x86_64-2.7/walkdir.py
./dist/walkdir-0.2.1/docs/conf.py
./dist/walkdir-0.2.1/setup.py
./dist/walkdir-0.2.1/test_walkdir.py
./dist/walkdir-0.2.1/walkdir.py
./docs/conf.py
./setup.py
./test_walkdir.py
./walkdir.py

It's not completely certain yet, but there's a fair chance I'll be
adding at least a subset of the walkdir API to shutil in 3.3 (the idea
actually started as just adding os.filtered_walk() to shutil, but I
moved it to PyPI to give people an opportunity to try out the API).

[1] http://walkdir.readthedocs.org

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From julien at tayon.net  Wed Feb  8 13:12:32 2012
From: julien at tayon.net (julien tayon)
Date: Wed, 8 Feb 2012 13:12:32 +0100
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To: <4F316B38.7020608@molden.no>
References: <4F316B38.7020608@molden.no>
Message-ID:

> Why would you want to use a hash table (Python dict) for linear
> algebra?

* Because it naturally provides matrices. And matrices are an easy way
to formalize and standardize tree manipulations, which are a growing
concern in real-life computer craft.

* Because actual CS is precise but not exact, and metrics on objects
enable more exact comparison: == is the actual way to compare, and it
is precise, but metrics (cos, norm, dot) enable
is_close_to( value, modulo error ). For instance, no actual language
can tell whether two floats are equal, because there are error margins:
pi != 3.14159, but pi is close to 3.1 [+-.05]. Exactitude and precision
are not the same.

> Not sure I can think of a worse data structure for the purpose.

Well, it is the other way round.
The least-surprise principle is that + - * / behave the way they
usually do 90% of the time. In my very personal opinion, mathematical
signs should have been reserved in all languages for operations
analogous to mathematics. And linear algebra is one of the most
accepted behaviours for these symbols.

Since there is more than one way to add / mul / div / sub, in my very
own opinion each and every class defining these signs *should* tell
which arithmetic it supports, so that we can predict the behaviour of
the composition of these operations, and the conflicts. We have the
same symbols with different meanings; it is a degeneracy that should be
disambiguated, for instance in order to raise inconsistency exceptions.

> There is NumPy...
> And in the standard library there is an array module...

Which, unlike list(), support + - * / in the algebraic sense.

> For matrix multiplication you can use DGEMM from any LAPACK library
> if you don't like NumPy (e.g. by means of ctypes).

It would be stupid to code matrices with a hash. I just say that, since
there is a strong analogy between dicts and vectors, matrices that
operate on dicts exist as a result, and I can give them the meaning of
transforming rooted trees into rooted trees.

> What really should be discussed is inclusion of NumPy in the standard
> library (that is NumPy, not SciPy).

+1 for the inclusion of NumPy in the stdlib :) Even though I think it
would need a little syntactic sugar to make it more pythonic.

Cheers,
Jul

From ncoghlan at gmail.com  Wed Feb  8 13:36:55 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 8 Feb 2012 22:36:55 +1000
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To:
References: <4F316B38.7020608@molden.no>
Message-ID:

On Wed, Feb 8, 2012 at 10:12 PM, julien tayon wrote:
>> What really should be discussed is inclusion of NumPy in the
>> standard library (that is NumPy, not SciPy).
>>
> +1 for the inclusion of NumPy in the stdlib :)

There's more to stdlib inclusion than "hey, wouldn't it be nice if <X>
was part of the stdlib?". It needs to make sense to do so, usually by
providing a tangible benefit to the overall Python ecosystem.

For smaller projects (especially predominantly single-person projects),
stdlib adoption comes with a guarantee of some level of long-term
support (in particular, making sure the module continues to work with
newer versions of Python and on newer operating system releases).

That isn't really the case with NumPy - it has a sizable developer base
of its own, along with solid backing from Enthought. Incorporation into
the standard library would be a *lot* of pain for minimal gain.
If it helps, just consider SciPy Python's "stdlib++" if you're doing
any kind of heavy number crunching with Python. There's a reason the
PyPy folks were able to raise money to sponsor their NumPyPy
compatibility effort - it's because the SciPy ecosystem is centred
around NumPy, and NumPyPy promises to let developers enjoy the benefits
of PyPy without losing access to SciPy.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From sturla at molden.no  Wed Feb  8 17:08:29 2012
From: sturla at molden.no (Sturla Molden)
Date: Wed, 08 Feb 2012 17:08:29 +0100
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To:
References: <4F316B38.7020608@molden.no>
Message-ID: <4F329DFD.5000109@molden.no>

On 08.02.2012 13:36, Nick Coghlan wrote:
> That isn't really the case with NumPy - it has a sizable developer
> base of its own, along with solid backing from Enthought.

I think you got this the wrong way around. Inclusion in the stdlib
requires long-term support; it is not a way to ensure long-term support
for projects that don't have it. (And NumPy is likely to be supported
for a very long time.)

> Incorporation into the standard library would be a *lot* of pain for
> minimal gain. If it helps, just consider SciPy Python's "stdlib++" if
> you're doing any kind of heavy number crunching with Python. There's
> a reason the PyPy folks were able to raise money to sponsor their
> NumPyPy compatibility effort - it's because the SciPy ecosystem is
> centred around NumPy, and NumPyPy promises to let developers enjoy
> the benefits of PyPy without losing access to SciPy.

NumPy is not just for number-crunching. It is also a general memory
abstraction, a mutable container for any kind of binary data, for any
kind of bit and byte fiddling, reading and parsing binary data, memory
mapping binary files, etc. Other use-cases are computer graphics and
image processing.

Sturla

From sturla at molden.no  Wed Feb  8 17:21:12 2012
From: sturla at molden.no (Sturla Molden)
Date: Wed, 08 Feb 2012 17:21:12 +0100
Subject: [Python-ideas] matrix operations on dict :)
In-Reply-To:
References: <4F316B38.7020608@molden.no>
Message-ID: <4F32A0F8.5070408@molden.no>

On 08.02.2012 13:12, julien tayon wrote:
> * Because it naturally provides matrices. And matrices are an easy
> way to formalize and standardize tree manipulations, which are a
> growing concern in real-life computer craft.

No, it naturally provides a hash table, which is a simple in-memory
database, not a matrix.

> In my very personal opinion, mathematical signs should have been
> reserved in all languages for operations analogous to mathematics.
> And linear algebra is one of the most accepted behaviours for these
> symbols.

There is a world beyond linear algebra. Sometimes we need to do things
that cannot easily be fitted into the semantics of matrix operations.
And for those who can only think in terms of matrices there are
languages called Matlab, Scilab, and Octave.

> It would be stupid to code matrices with a hash. I just say that
> there is a strong analogy between dicts and vectors,

No, there is not. A vector is ordered; a hash table (dict) is
unordered.

- In a vectorlike structure, e.g.
a Python list, element i+1 is stored subsequently to element i.
- In a hash table, e.g. a Python dict, element hash(i+1) is not stored
subsequently to element hash(i).

Sturla

From masklinn at masklinn.net  Wed Feb  8 23:00:39 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 8 Feb 2012 23:00:39 +0100
Subject: [Python-ideas] Optional key to `bisect`'s functions?
Message-ID: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net>

The ``bisect`` stuff is pretty neat, although probably underused
(especially the insorts), but their usefulness is limited by the
requirement that the lists directly contain sortable items, as opposed
to ``sorted`` or ``list.sort``.

It's possible to "use" them by copy/pasting the (Python) functions into
the project/library code and adding either a custom key directly or a
key function, but while this can still yield an order-of-magnitude
speed gain over post-sorting sequences, it's cumbersome and it loses
the advantage of _bisect's accelerators.

Therefore, I believe it would be pretty neat to add an optional
``key=`` keyword (only?) argument, with the same semantics as in
``sorted``. It would make ``bisect`` much easier to use, especially
instead of append + sorted combinations. The key should work for both
the insertion functions and the bisection search ones.

Thoughts?

From amauryfa at gmail.com  Wed Feb  8 23:18:54 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 8 Feb 2012 23:18:54 +0100
Subject: [Python-ideas] Optional key to `bisect`'s functions?
In-Reply-To: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net>
References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net>
Message-ID:

Hi,

2012/2/8 Masklinn
> The ``bisect`` stuff is pretty neat, although probably underused
> (especially the insorts), but their usefulness is limited by the
> requirement that the lists directly contain sortable items, as
> opposed to ``sorted`` or ``list.sort``.
> > It's possible to "use" them by copy/pasting the (Python) functions > into the project/library code and adding either a custom key directly > or a key function, but while this can still yield an > order-of-magnitude speed gain over post-sorting sequences, it's > cumbersome and it loses the advantage of _bisect's accelerators. > > Therefore, I believe it would be pretty neat to add an optional > ``key=`` keyword (only?) argument, with the same semantics as in > ``sorted``. It would make ``bisect`` much easier to use especially > in stead of append + sorted combinations. The key should work for > both insertion functions and bisection search ones. > bisect key This was proposed several times on the issue tracker (search for "bisect key"), and these proposals have always been rejected: http://bugs.python.org/issue4356 http://bugs.python.org/issue1451588 http://bugs.python.org/issue3374 The last one summarizes the reasons of the rejection. The documentation (http://docs.python.org/library/bisect.html, "see also") contains a link to a "SortedCollection" recipe. I haven't looked at the SortedCollection class in detail, but you could try to have it included in the stdlib... -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... URL: From dreamingforward at gmail.com Thu Feb 9 01:03:57 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Wed, 8 Feb 2012 17:03:57 -0700 Subject: [Python-ideas] [Python-Dev] matrix operations on dict :) In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 9:54 AM, julien tayon wrote: > 2012/2/7 Mark Janssen : > > On Mon, Feb 6, 2012 at 6:12 PM, Steven D'Aprano > wrote: > > > > I have the problem looking for this solution! > > > { "a" : 1 } + { "a" : { "b" : 1 } } == KABOOM. This a counter example > proving it does not handle all structures. > > Ah, but I already anticipated this. One just has to decide the relationship between the *group* and the *atomic*. 
(These are key words that you can find out about at pangaia.sf.net "grouping model"). Admittedly, this might be arbitrary, but once decided you get the full power of the recursive data structure. It's kind of like defining the base case of factorial. The math (in my world) simply decided that factorial(0)=1 as the convention of "an empty product" (Wikipedia::Factorial). But, in theory, it should work and provide considerable power. Since it's all arbitrary one shouldn't get hung up too much on which convention is adopted, even though it will have to be followed thereafter. But "practice beats purity", as they say... :) mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From dreamingforward at gmail.com Thu Feb 9 01:08:42 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Wed, 8 Feb 2012 17:08:42 -0700 Subject: [Python-ideas] [Python-Dev] matrix operations on dict :) In-Reply-To: References: Message-ID: I wrote: > But, in theory, it should work and provide considerable power. Since it's > all arbitrary one shouldn't get hung up too much on which convention is > adopted, even though it will have to be followed thereafter. But "practice > beats purity", as they say... :) > > Oh, I should give my suggestion: That when a "non-named" atomic constant is added to a grouping (i.e. dict), a special key called "anon" (or perhaps the built-in None as the special key, which would actually work without ambiguity to other parts of Python) is created that holds the constant. Cheers! mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Feb 9 01:25:27 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 8 Feb 2012 16:25:27 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: Hmm... I disagree with Raymond's rejection of the proposed feature.
I have come across use cases for this functionality multiple times in real life. Basically Raymond says "bisect can call the key() function many times, which leads to bad design". His alternative, to use a list of (key, value) tuples, is often a bit clumsy when passing the sorted list to another function (e.g. for printing); having to transform the list using e.g. [v for (k, v) in a] feels clumsy and suboptimal. So I'm not sure that refusing the key= option always leads to the best design (in the sense of the most readable code). Adding key= is particularly attractive since the current invariant is something like "if a == sorted(a) before the operation, then a == sorted(a) after the operation". Adding a key= option would simply change that to sorted(a, key=key) on both counts. Also note that "many times" is actually O(log N) per insertion, which isn't so bad. The main use case for bisect() is to manage a list that sees updates *and* iterations -- otherwise building the list unsorted and sorting it at the end would make more sense. The key= option provides a balance between the cost/elegance for updates and for iterations. --Guido On Wed, Feb 8, 2012 at 2:18 PM, Amaury Forgeot d'Arc wrote: > Hi, > > 2012/2/8 Masklinn > >> The ``bisect`` stuff is pretty neat, although probably underused >> (especially the insorts), but their usefulness is limited by the >> requirement that the lists directly contain sortable items, as opposed >> to ``sorted`` or ``list.sort``. >> >> It's possible to "use" them by copy/pasting the (Python) functions >> into the project/library code and adding either a custom key directly >> or a key function, but while this can still yield an >> order-of-magnitude speed gain over post-sorting sequences, it's >> cumbersome and it loses the advantage of _bisect's accelerators. >> >> Therefore, I believe it would be pretty neat to add an optional >> ``key=`` keyword (only?) argument, with the same semantics as in >> ``sorted``.
It would make ``bisect`` much easier to use especially >> in stead of append + sorted combinations. The key should work for >> both insertion functions and bisection search ones. >> bisect key > > > This was proposed several times on the issue tracker (search for "bisect > key"), > and these proposals have always been rejected: > http://bugs.python.org/issue4356 > http://bugs.python.org/issue1451588 > http://bugs.python.org/issue3374 > The last one summarizes the reasons of the rejection. > The documentation (http://docs.python.org/library/bisect.html, "see also") > contains a link to a "SortedCollection" recipe. > > I haven't looked at the SortedCollection class in detail, but you could > try to have it included in the stdlib... > > -- > Amaury Forgeot d'Arc > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Feb 9 01:51:17 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 09 Feb 2012 11:51:17 +1100 Subject: [Python-ideas] matrix operations on dict :) In-Reply-To: <4F32A0F8.5070408@molden.no> References: <4F316B38.7020608@molden.no> <4F32A0F8.5070408@molden.no> Message-ID: <4F331885.3000305@pearwood.info> Sturla Molden wrote: > On 08.02.2012 13:12, julien tayon wrote: >> It is stupid to code matrix with an hash, I just say as there is a >> strong analogy between dict and vectors, > > No there is not. A vector is ordered, a hash-table (dict) is unordered. > > - In a vectorlike structure, e.g. a Python list, element i+1 is stored > subsequently to element i. Not necessarily. There is nothing in the API for Python lists that *requires* that elements are stored in one continuous array. That's a side-effect of the implementation. > - In a hash-table, e.g. 
a Python dict, element hash(i+1) is not stored > subsequently to element hash(i). You are focusing too much on accidental implementation details and not enough on the fundamental concept of "vector" or "hash table". Fundamentally, a "dict" is a data structure that associates arbitrary keys to values, such that each key is unique but values may not be. Note that the use of a hash table for dicts (mappings) is just one possible implementation. Fundamentally a list is a mapping from sequential (and therefore unique) integer keys (the indexes) to values. Note that a linear array with the key (index) being implicit rather than explicit is just one possible implementation. I have no opinion on whether Julien's proposal is useful or not, but vectors (lists) can be implemented using mappings (dicts). Lua is proof of this: their table type operates as both list and dict. http://lua-users.org/wiki/TablesTutorial -- Steven From stephen at xemacs.org Thu Feb 9 03:13:40 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 09 Feb 2012 11:13:40 +0900 Subject: [Python-ideas] [Python-Dev] matrix operations on dict :) In-Reply-To: References: Message-ID: <87d39oycrv.fsf@uwakimon.sk.tsukuba.ac.jp> Mark Janssen writes: > The math (in my world) simply decided that factorial(0)=1 as the > convention of "an empty product" (Wikipedia::Factorial). In modern math (ie, post-Eilenberg-Mac Lane), it's not really a convention (unlike, say, Euclid's Parallel Postulate); it's the only way to go if you want the idea of product to generalize. If you don't understand that, I have serious doubts that you know what you're talking about. If you do understand that, please take care to be more precise. From tjreedy at udel.edu Thu Feb 9 03:39:49 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 08 Feb 2012 21:39:49 -0500 Subject: [Python-ideas] Optional key to `bisect`'s functions? 
In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: On 2/8/2012 7:25 PM, Guido van Rossum wrote: > Hmm... I disagree with Raymond's rejection of the proposed feature. I > have come across use cases for this functionality multiple time in real > life. > > Basically Raymond says "bisect can call the key() function many times, > which leads to bad design". His alternative, to use a list of (key, > value) tuples, is often a bit clumsy when passing the sorted list to > another function (e.g. for printing); having to transform the list using > e.g. [v for (k, v) in a] feels clumsy and suboptimal. So I'm not sure > that refusing the key= option always leads to the best design (in the > sense of the most readable code). An alternative to the n x 2 array is two n-arrays or a 2 x n array. Then there is no problem using either the keys or the values array. To use insort_right or insort_left for this, they would have to return the insertion position instead of None. Right now one must use bisect_right or _left and then .insert into both arrays instead of just the vals array. > Adding key= is particularly attractive since the current invariant is > something like "if a == sorted(a) before the operation, then a == > sorted(a) after the operation". Adding a key= option would simply change > that to sorted(a, key=key) on both counts. > > Also note that "many times" is actually O(log N) per insertion, which > isn't so bad. The main use case for bisect() is to manage a list that > sees updates *and* iterations -- otherwise building the list unsorted > and sorting it at the end would make more sense. The key= option > provides a balance between the cost/elegance for updates and for iterations. For *large enough* lists, the O(n*n) cost of insertions will dominate the O(n*logN) key() calls, so reducing the latter to O(n) key calls will not matter. 
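[Editor's note: the two-parallel-lists bookkeeping Terry describes above can be sketched as follows. This is an illustration only -- the class name and API are invented for the example -- showing why each insertion needs a bisect call plus two separate list inserts when insort cannot take a key and does not return the insertion position.]

```python
import bisect

class SortedByKey:
    """Keep `values` sorted by key(value), using two parallel lists."""

    def __init__(self, key):
        self._key = key
        self._keys = []   # sorted keys, used only for bisection
        self.values = []  # values, kept in the same order as _keys

    def insert(self, value):
        k = self._key(value)
        # bisect_right gives the insertion index; we must then insert
        # into *both* lists -- the three-statement clumsiness noted above.
        i = bisect.bisect_right(self._keys, k)
        self._keys.insert(i, k)
        self.values.insert(i, value)

s = SortedByKey(key=len)
for word in ["ccc", "a", "bb"]:
    s.insert(word)
print(s.values)  # ['a', 'bb', 'ccc']
```

The helper `_keys` list is exactly the "undecoration" burden discussed in this thread: it must be maintained alongside the real data and discarded before the values are handed to other code.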
-- Terry Jan Reedy From tjreedy at udel.edu Thu Feb 9 03:42:53 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 08 Feb 2012 21:42:53 -0500 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: On 2/8/2012 5:18 PM, Amaury Forgeot d'Arc wrote: > This was proposed several times on the issue tracker (search for "bisect > key"), > and these proposals have always been rejected: > http://bugs.python.org/issue4356 > http://bugs.python.org/issue1451588 > http://bugs.python.org/issue3374 Do these all suggest a specific api and if so, do they agree? > The last one summarizes the reasons of the rejection. > The documentation (http://docs.python.org/library/bisect.html, "see also") > contains a link to a "SortedCollection" recipe. > > I haven't looked at the SortedCollection class in detail, but you could > try to have it included in the stdlib... > -- Terry Jan Reedy From tjreedy at udel.edu Thu Feb 9 03:44:56 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 08 Feb 2012 21:44:56 -0500 Subject: [Python-ideas] matrix operations on dict :) In-Reply-To: <4F331885.3000305@pearwood.info> References: <4F316B38.7020608@molden.no> <4F32A0F8.5070408@molden.no> <4F331885.3000305@pearwood.info> Message-ID: On 2/8/2012 7:51 PM, Steven D'Aprano wrote: > Not necessarily. There is nothing in the API for Python lists that > *requires* that elements are stored in one continuous array. That's a > side-effect of the implementation. I believe NumPy uses multiple blocks, as does deque. -- Terry Jan Reedy From masklinn at masklinn.net Thu Feb 9 09:45:24 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 09:45:24 +0100 Subject: [Python-ideas] Optional key to `bisect`'s functions? 
In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: <490602D8-AF8E-420D-ADAD-A3B7A8171E75@masklinn.net> On 2012-02-09, at 01:25 , Guido van Rossum wrote: > Hmm... I disagree with Raymond's rejection of the proposed feature. I have > come across use cases for this functionality multiple time in real life. > > Basically Raymond says "bisect can call the key() function many times, > which leads to bad design". His alternative, to use a list of (key, value) > tuples, is often a bit clumsy when passing the sorted list to another > function (e.g. for printing); having to transform the list using e.g. [v > for (k, v) in a] feels clumsy and suboptimal. Yes, this is the kind of thing which prompted my original email. It is even clumsier when there are many (smaller) lists to manipulate and insert into in turn, and requires two verbose and potentially expensive phases of decoration and undecoration. Using two separate lists has similar (though simpler) issues, especially when producing API-related structures, as the "helper" collection must be cleaned up during an undecoration phase. And as Terry notes, this gets very clumsy as each candidate insertion now requires three statements (a call to bisect_right to get the insertion index, followed by an insertion into the helper list and another one for the actual list). From masklinn at masklinn.net Thu Feb 9 09:53:48 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 09:53:48 +0100 Subject: [Python-ideas] Optional key to `bisect`'s functions?
In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: On 2012-02-09, at 03:42 , Terry Reedy wrote: > On 2/8/2012 5:18 PM, Amaury Forgeot d'Arc wrote: >> This was proposed several times on the issue tracker (search for "bisect >> key"), >> and these proposals have always been rejected: >> http://bugs.python.org/issue4356 >> http://bugs.python.org/issue1451588 >> http://bugs.python.org/issue3374 > > Do these all suggest a specific api and if so, do they agree? http://bugs.python.org/issue4356 Suggests a ``key=`` argument behaving as with ``sorted`` and ``list.sort``: collection values are decorated with the key before comparisons. This is exactly my original email. http://bugs.python.org/issue1451588 Suggests a ``cmp=`` argument (the proposal precedes Python 3 and ``key=`` taking over) to use instead of the built-in comparison operator. http://bugs.python.org/issue3374 Suggests all of ``cmp=`` (again moot, since ``cmp=`` was dropped in Python 3), ``key=`` and ``reverse=``. In summary, all three suggest following the existing API of ``list.sort`` and ``sorted``, and at least implementing its ``key=`` argument (I am taking issue1451588 as doing so, since it suggests the mechanism and argument which preceded ``key``). From robert.kern at gmail.com Thu Feb 9 12:20:38 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 09 Feb 2012 11:20:38 +0000 Subject: [Python-ideas] matrix operations on dict :) In-Reply-To: References: <4F316B38.7020608@molden.no> <4F32A0F8.5070408@molden.no> <4F331885.3000305@pearwood.info> Message-ID: On 2/9/12 2:44 AM, Terry Reedy wrote: > On 2/8/2012 7:51 PM, Steven D'Aprano wrote: > >> Not necessarily. There is nothing in the API for Python lists that >> *requires* that elements are stored in one continuous array. That's a >> side-effect of the implementation. > > I believe NumPy uses multiple blocks, as does deque.
numpy uses uniformly strided memory starting from a single memory location, which is not quite the same as using multiple blocks. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From arnodel at gmail.com Thu Feb 9 15:27:20 2012 From: arnodel at gmail.com (Arnaud Delobelle) Date: Thu, 9 Feb 2012 14:27:20 +0000 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: On 9 February 2012 00:25, Guido van Rossum wrote: > Basically Raymond says "bisect can call the key() function many times, which > leads to bad design". His alternative, to use a list of (key, value) tuples, > is often a bit clumsy when passing the sorted list to another function (e.g. > for printing); having to transform the list using e.g. [v for (k, v) in a] > feels clumsy and suboptimal. Also, in Python 3 one can't assume that values will be comparable so the (key, value) tuple trick won't work: comparing the tuples may well throw a TypeError. Here's a simple example below. The class 'Person' has no natural order, but we may want to keep a list of people sorted by iq: >>> class Person: ... def __init__(self, height, iq): ... self.height = height ... self.iq = iq ... 
>>> arno = Person(184, 101) >>> guido = Person(179, 185) >>> steve = Person(168, 101) >>> key = lambda p: p.iq >>> people = [] >>> bisect.insort(people, (key(arno), arno)) >>> bisect.insort(people, (key(guido), guido)) >>> bisect.insort(people, (key(steve), steve)) Traceback (most recent call last): File "", line 1, in TypeError: unorderable types: Person() < Person() >>> -- Arnaud From sturla at molden.no Thu Feb 9 15:27:29 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 15:27:29 +0100 Subject: [Python-ideas] matrix operations on dict :) In-Reply-To: References: <4F316B38.7020608@molden.no> <4F32A0F8.5070408@molden.no> <4F331885.3000305@pearwood.info> Message-ID: <4F33D7D1.5060800@molden.no> On 09.02.2012 03:44, Terry Reedy wrote: > I believe NumPy uses multiple blocks, as does deque. No it does not (see Robert Kern's reply). But a lot of numerical codes in C or Java do, using an array of pointers (C) or an array of arrays (Java) to emulate a two dimensional array, particularly numerical code written by amateurs. I've also seen this in Python, using (heaven forbid) lists of lists as a 2D array replacement. It is sad that Numerical Receipes encourages this coding style. (Actually the third edition does not, but it is not sufficient to remedy the damage.) Those who don't understand why "jagged arrays" can be a problem should stick to Matlab, Fortran or NumPy. Sturla From techtonik at gmail.com Thu Feb 9 15:36:40 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 9 Feb 2012 17:36:40 +0300 Subject: [Python-ideas] Python 3000 TIOBE -3% Message-ID: Hi, I didn't want to grow FUD on python-dev, but a FUD there seems to be a good topic for discussion here. http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html As you may see, Python is losing its positions. 
I blame Python 3 and that Python development is not concentrating on users enough [1], and that there is big resistance to getting things done (/moin/ prefix story) and the whole communication process is a bit discouraging. If it is not the cause, then the cause is the lack of visibility into the real problem, but what is the real problem? I guess the topic is for the upcoming language summit at PyCon, but it will be hard for me to get there this year from Belarus, so it would be nice to read some opinions here. 1. http://python-for-humans.heroku.com/ -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Thu Feb 9 16:05:09 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 16:05:09 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: <0E6116E4-A406-4C49-A8C1-C18D6228C141@masklinn.net> On 2012-02-09, at 15:36 , anatoly techtonik wrote: > Hi, > > I didn't want to grow FUD on python-dev, but a FUD there seems to be a good > topic for discussion here. > http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html 1.
Python-ideas is not the right place for this stuff (neither is > Python-dev, by the way) > 2. Why would anybody care exactly? > 1. Where would be the correct place to talk about a grand state of python affairs? 2. Like it or not, many use such ratings to decide which language to learn, which language to use for their next project and whether or not to be proud of their language of choice. I think it's important for python to be popular and good. One without the other isn't too useful. Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Feb 9 16:19:23 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 9 Feb 2012 16:19:23 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% References: Message-ID: <20120209161923.4417cbae@pitrou.net> On Thu, 9 Feb 2012 17:36:40 +0300 anatoly techtonik wrote: > Hi, > > I didn't want to grow FUD on python-dev, but a FUD there seems to be a good > topic for discussion here. It isn't. From stefan_ml at behnel.de Thu Feb 9 16:24:53 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 09 Feb 2012 16:24:53 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <0E6116E4-A406-4C49-A8C1-C18D6228C141@masklinn.net> Message-ID: Yuval Greenfield, 09.02.2012 16:13: > On Thu, Feb 9, 2012 at 5:05 PM, Masklinn wrote: >> On 2012-02-09, at 15:36 , anatoly techtonik wrote: >>> I didn't want to grow FUD on python-dev, but a FUD there seems to be a >>> good topic for discussion here. >>> http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html >> >> 1. Python-ideas is not the right place for this stuff (neither is >> Python-dev, by the way) >> 2. Why would anybody care exactly? > > 1. Where would be the correct place to talk about a grand state of > python affairs? The right place to discuss "most things Python" is python-list, aka. comp.lang.python. 
Stefan From nathan.alexander.rice at gmail.com Thu Feb 9 16:31:51 2012 From: nathan.alexander.rice at gmail.com (Nathan Rice) Date: Thu, 9 Feb 2012 10:31:51 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <0E6116E4-A406-4C49-A8C1-C18D6228C141@masklinn.net> Message-ID: > Where would be the correct place to talk about a grand state of python > affairs? > Like it or not, many use such ratings to decide which language to learn, > which language to use for their next project and whether or not to be proud > of their language of choice. > > I think it's important for python to be popular and good. One without the > other isn't too useful. The reason python is slipping in the index is the same reason that its popularity doesn't matter (much). Wrapper-generating tools, cross-language interfaces and whatnot are making "polyglot" programming a pretty simple affair these days... The TIOBE index for the most part has two distinct groups: languages that people use at work, where risk aversion is a large driving force (see Java, C/C++, PHP), and languages people use personally because they enjoy programming in them. Because the library issue for a new or less popular language is not as big a deal as it once was, people have more freedom in their choice, and that is reflected in the diversification of "fun" languages. JavaScript is an outlier here: you don't have a choice if you target the browser. Nathan From solipsis at pitrou.net Thu Feb 9 16:47:17 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 9 Feb 2012 16:47:17 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% References: <20120209161923.4417cbae@pitrou.net> Message-ID: <20120209164717.237d2c5a@pitrou.net> On Thu, 9 Feb 2012 16:19:23 +0100 Antoine Pitrou wrote: > On Thu, 9 Feb 2012 17:36:40 +0300 > anatoly techtonik > wrote: > > Hi, > > > > I didn't want to grow FUD on python-dev, but a FUD there seems to be a good > > topic for discussion here. > > It isn't.
And to elaborate a bit, here's the description of the python-ideas list: "This list is to contain discussion of speculative language ideas for Python for possible inclusion into the language. If an idea gains traction it can then be discussed and honed to the point of becoming a solid proposal to put to either python-dev or python-3000 as appropriate." (*) python-ideas is not a catchall for random opinions about Python. (*) someone should really remove that python-3000 reference From benjamin at python.org Thu Feb 9 16:50:10 2012 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 9 Feb 2012 15:50:10 +0000 (UTC) Subject: [Python-ideas] Python 3000 TIOBE -3% References: Message-ID: anatoly techtonik writes: > As you may see, Python is losing its positions. I blame Python 3 and that Python development is not concentrating on users enough [1], and that there is a big resistance in getting the things done (/moin/ prefix story) and the whole communication process is a bit discouraging. Indeed. What would you suggest to alleviate that? From ehlesmes at gmail.com Thu Feb 9 17:40:11 2012 From: ehlesmes at gmail.com (Edward Lesmes) Date: Thu, 9 Feb 2012 11:40:11 -0500 Subject: [Python-ideas] map iterator Message-ID: An iterator version of map should be available for large sets of data. -- Edward Lesmes -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Feb 9 17:41:44 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 08:41:44 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: Bingo. That clinches it. We need to add key=. On Thu, Feb 9, 2012 at 6:27 AM, Arnaud Delobelle wrote: > On 9 February 2012 00:25, Guido van Rossum wrote: > > Basically Raymond says "bisect can call the key() function many times, > which > > leads to bad design".
His alternative, to use a list of (key, value) > tuples, > > is often a bit clumsy when passing the sorted list to another function > (e.g. > > for printing); having to transform the list using e.g. [v for (k, v) in > a] > > feels clumsy and suboptimal. > > Also, in Python 3 one can't assume that values will be comparable so > the (key, value) tuple trick won't work: comparing the tuples may well > throw a TypeError. Here's a simple example below. The class 'Person' > has no natural order, but we may want to keep a list of people sorted > by iq: > > >>> class Person: > ... def __init__(self, height, iq): > ... self.height = height > ... self.iq = iq > ... > >>> arno = Person(184, 101) > >>> guido = Person(179, 185) > >>> steve = Person(168, 101) > >>> key = lambda p: p.iq > >>> people = [] > >>> bisect.insort(people, (key(arno), arno)) > >>> bisect.insort(people, (key(guido), guido)) > >>> bisect.insort(people, (key(steve), steve)) > Traceback (most recent call last): > File "", line 1, in > TypeError: unorderable types: Person() < Person() > >>> > > -- > Arnaud > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Thu Feb 9 17:47:03 2012 From: phd at phdru.name (Oleg Broytman) Date: Thu, 9 Feb 2012 20:47:03 +0400 Subject: [Python-ideas] map iterator In-Reply-To: References: Message-ID: <20120209164703.GB15324@iskra.aviel.ru> On Thu, Feb 09, 2012 at 11:40:11AM -0500, Edward Lesmes wrote: > An iterator version of map should be available for large sets of data. itertools.imap Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
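[Editor's note: a pure-Python sketch of the proposed ``key=`` semantics, using Arnaud's Person example. `insort_right_key` is an invented name for illustration -- at the time of this thread the stdlib `bisect` functions had no such argument. Only key() results are ever compared, so the stored items themselves need not be orderable, and key() is called O(log n) times per insertion.]

```python
def insort_right_key(a, x, key):
    """Insert x into a, which is assumed sorted by key(item).

    Mirrors bisect.insort_right, but compares key() results instead of
    the items themselves (the ``key=`` semantics proposed in this thread).
    """
    lo, hi = 0, len(a)
    kx = key(x)
    while lo < hi:
        mid = (lo + hi) // 2
        if kx < key(a[mid]):  # only keys are compared, never the items
            hi = mid
        else:
            lo = mid + 1
    a.insert(lo, x)

class Person:
    def __init__(self, height, iq):
        self.height = height
        self.iq = iq

# Unlike the (key, value) tuple trick, this never tries Person < Person.
people = []
for p in (Person(184, 101), Person(179, 185), Person(168, 101)):
    insort_right_key(people, p, key=lambda p: p.iq)
print([p.iq for p in people])  # [101, 101, 185]
```

Note that equal keys keep right-bisection behaviour: a new item is inserted after existing items with the same key, matching `bisect.insort_right`.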
From malaclypse2 at gmail.com Thu Feb 9 17:48:27 2012 From: malaclypse2 at gmail.com (Jerry Hill) Date: Thu, 9 Feb 2012 11:48:27 -0500 Subject: [Python-ideas] map iterator In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 11:40 AM, Edward Lesmes wrote: > An iterator version of map should be available for large sets of data. > The python time machine strikes again. In python 2, this is available as itertools.imap. In python 3, this is the default behavior of the map() function. -- Jerry -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.dipierro at gmail.com Thu Feb 9 17:49:29 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Thu, 9 Feb 2012 10:49:29 -0600 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: Here is another data point: http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/ Unfortunately the TIOBE index does matter. I can speak for python in education and trends I have seen. Python is and remains the easiest language to teach but it is no longer true that getting Python to run is easier than alternatives (not for the average undergrad student). It used to be you download python 2.5 and you were in business. Now you have to make a choice 2.x or 3.x. 20% of the students cannot tell one from the other (even after being told repeatedly which one to use). Three weeks into the class they complain that "the class code won't compile" (the same 20% cannot tell a compiler from an interpreter). 50+% of the students have a mac and an increasing number of packages depend on numpy. Installing numpy on mac is a lottery. Those who do not have a mac have windows and they expect an IDE like eclipse. I know you can use Python with eclipse but they do not. They download Python and complain that IDLE has no autocompletion, no line numbers, no collapsible functions/classes.
From the hard-core computer scientist's perspective there are usually three objections to using Python: - Most software engineers think we should only teach statically typed languages - Those who care about scalability complain about the GIL - The programming language purists complain about the use of reference counting instead of garbage collection The net result is that people cannot agree and it is getting increasingly difficult to make the case for the use of Python in intro CS courses. For some reason JavaScript seems to win these days. Massimo On Feb 9, 2012, at 8:36 AM, anatoly techtonik wrote: > Hi, > > I didn't want to grow FUD on python-dev, but a FUD there seems to be > a good topic for discussion here. > http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html > > As you may see, Python is losing its positions. I blame Python 3 and > that Python development is not concentrating on users enough [1], > and that there is a big resistance in getting the things done (/ > moin/ prefix story) and the whole communication process is a bit > discouraging. If it is not the cause, then the cause is the lack of > visibility into the real problem, but what the real problem is? > > I guess the topic is for upcoming language summit at PyCon, but it > will be hard for me to get there this year from Belarus, so it would > be nice to read some opinions here. > > > 1. http://python-for-humans.heroku.com/ > -- > anatoly t. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ehlesmes at gmail.com Thu Feb 9 18:12:01 2012 From: ehlesmes at gmail.com (Edward Lesmes) Date: Thu, 9 Feb 2012 12:12:01 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: Massimo Di Pierro writes: > 50+% of the students have a mac and an increasing number of packages depend on numpy. Installing numpy on mac is a lottery. About the numpy dependency, I think is a reason to integrate numpy in python. -------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Thu Feb 9 18:21:26 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 10 Feb 2012 01:21:26 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: This On Feb 10, 2012 12:49 AM, "Massimo Di Pierro" wrote: > Here is another data point: > http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/ > > Unfortunately the TIOBE index does matter. I can speak for python in > education and trends I seen. > > Python is and remains the easiest language to teach but it is no longer > true that getting Python to run is easer than alternatives (not for the > average undergrad student). It used to be you download python 2.5 and you > were in business. Now you have to make a choice 2.x or 3.x. 20% of the > students cannot tell one from the other (even after been told repeatedly > which one to use). Three weeks into the class they complain with "the class > code won't compile" (the same 20% cannot tell a compiler form an > interpreter). > > 50+% of the students have a mac and an increasing number of packages > depend on numpy. Installing numpy on mac is a lottery. > > Those who do not have a mac have windows and they expect an IDE like > eclipse. I know you can use Python with eclipse but they do not. They > download Python and complain that IDLE has no autocompletion, no line > numbers, no collapsible functions/classes. 
> > From the hard core computer scientists prospective there are usually three > objections to using Python: > - Most software engineers think we should only teach static type languages > - Those who care about scalability complain about the GIL > - The programming language purists complain about the use of reference > counting instead of garbage collection > > The net result is that people cannot agree and it is getting increasingly > difficult to make the case for the use of Python in intro CS courses. For > some reason javaScript seems to win these days. > > Massimo > > > On Feb 9, 2012, at 8:36 AM, anatoly techtonik wrote: > > Hi, > > I didn't want to grow FUD on python-dev, but a FUD there seems to be a > good topic for discussion here. > http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html > > As you may see, Python is losing its positions. I blame Python 3 and that > Python development is not concentrating on users enough [1], and that there > is a big resistance in getting the things done (/moin/ prefix story) and > the whole communication process is a bit discouraging. If it is not the > cause, then the cause is the lack of visibility into the real problem, but > what the real problem is? > > I guess the topic is for upcoming language summit at PyCon, but it will be > hard for me to get there this year from Belarus, so it would be nice to > read some opinions here. > > > 1. http://python-for-humans.heroku.com/ > -- > anatoly t. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From raymond.hettinger at gmail.com Thu Feb 9 18:27:25 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 9 Feb 2012 09:27:25 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: On Feb 8, 2012, at 4:25 PM, Guido van Rossum wrote: > Also note that "many times" is actually O(log N) per insertion, which isn't so bad. The main use case for bisect() is to manage a list that sees updates *and* iterations -- otherwise building the list unsorted and sorting it at the end would make more sense. The key= option provides a balance between the cost/elegance for updates and for iterations. Would you be open to introducing a SortedList class to encapsulate the data so that key functions get applied no more than once per record and the sort order is maintained as new items are inserted? ISTM, the whole problem with bisect is that the underlying list is naked, leaving no way to easily correlate the sort keys with the corresponding sorted records. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Thu Feb 9 18:35:17 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 10 Feb 2012 01:35:17 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: >From my own observations, the recent drop is sure to uncertainty with Python 3, and an increase of alternatives on server side, such as Node. The transition is only going to get more painful as system critical software lags on 2.x while users clamour for 3.x. I understand there are some fundamental problems in running both simultaneously which makes gradual integration not a possibility. Dynamic typing also doesn't help, making it very hard to automatically port, and update dependencies. 
Lesser reasons include an increasing gap in scalability to multicore compared with other languages (the GIL being the gorilla here; multiprocessing is unacceptable as long as native threading is the only supported concurrency mechanism), and a lack of enthusiasm from key technologies and vendors: GAE, gevent, matplotlib are a few encountered personally.

From tjreedy at udel.edu Thu Feb 9 18:43:49 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 12:43:49 -0500 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: On 2/9/2012 11:41 AM, Guido van Rossum wrote: > Bingo. That clinches it. We need to add key=.

I reopened http://bugs.python.org/issue4356 with the above quoted. It has a patch with tests ready for review. -- Terry Jan Reedy

From massimo.dipierro at gmail.com Thu Feb 9 18:46:45 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Thu, 9 Feb 2012 11:46:45 -0600 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> I think if easy_install, gevent, numpy (*), and win32 extensions were included in 3.x, together with a slightly better IDLE (still based on Tkinter, with multiple pages, autocompletion, collapsible functions/classes, line numbers, better printing with syntax highlighting), and if easy_install were accessible via IDLE, this would be a killer version.

Longer term, removing the GIL and using garbage collection should be a priority. I am not sure what is involved and how difficult it is, but perhaps this is what PyCon money can be used for. If this cannot be done without breaking backward compatibility again, then 3.x should be considered an experimental branch, people should be advised to stay with 2.7 (2.8?), and then skip to 4.x directly when these problems are resolved.
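On the garbage-collection point, it is worth noting that CPython already supplements reference counting with a cycle collector, and the split between the two is observable directly. A CPython-specific sketch (the Node class is just an illustration):

```python
import gc
import weakref

class Node:
    pass

a, b = Node(), Node()
a.other, b.other = b, a   # reference cycle: refcounts never drop to zero
probe = weakref.ref(a)    # lets us observe when the object is actually freed

del a, b
print(probe() is None)    # False: refcounting alone cannot free the cycle
gc.collect()              # the cycle collector finds and reclaims the pair
print(probe() is None)    # True
```

On a tracing-GC implementation such as Jython or PyPy the first print may already be True, which is precisely the semantic difference being debated.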
Python should not make a habit of breaking backward compatibility. It would be really nice if it were to include an async web server (based on gevent for example) and better parser for HTTP headers and a python based template language (like mako or the web2py one) not just for the web but for document generation in general. Massimo On Feb 9, 2012, at 11:12 AM, Edward Lesmes wrote: > Massimo Di Pierro writes: > > 50+% of the students have a mac and an increasing number of > packages depend on numpy. Installing numpy on mac is a lottery. > > About the numpy dependency, I think is a reason to integrate numpy > in python. > > From guido at python.org Thu Feb 9 18:48:18 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 09:48:18 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: On Thu, Feb 9, 2012 at 9:27 AM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > On Feb 8, 2012, at 4:25 PM, Guido van Rossum wrote: > > Also note that "many times" is actually O(log N) per insertion, which > isn't so bad. The main use case for bisect() is to manage a list that sees > updates *and* iterations -- otherwise building the list unsorted and > sorting it at the end would make more sense. The key= option provides a > balance between the cost/elegance for updates and for iterations. > > > Would you be open to introducing a SortedList class to encapsulate the > data so that key functions get applied no more than once per record and the > sort order is maintained as new items are inserted? > Hm. A good implementation of such a thing would probably require a B-tree implementation (or some other tree). That sounds like a good data type to have in the collections module, but doesn't really address the desire to use bisect on a list. 
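While bisect itself has no key= option today, a common workaround keeps a parallel list of precomputed keys, so each record's key function is applied exactly once and bisect searches the key list. A sketch (the insert helper and the len key are illustrative, not part of any proposal here):

```python
import bisect

records = []   # kept sorted by key(record)
keys = []      # keys[i] == key(records[i]), computed once per record
key = len      # example key function: order strings by length

def insert(record):
    k = key(record)
    i = bisect.bisect_right(keys, k)  # O(log n) search...
    keys.insert(i, k)                 # ...but list insertion is still O(n)
    records.insert(i, record)

for word in ["banana", "fig", "apple"]:
    insert(word)
print(records)  # ['fig', 'apple', 'banana']
```

The O(n) insertion is the cost of the "naked" list; a SortedList built on a tree structure would address that part as well.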
Also it further erodes my desire not to bother the programmer with subtle decisions about the choice of data type: the most basic types (string, number, list, dict) are easy enough to distinguish, and further choices between subclasses or alternatives are usually a distraction. > ISTM, the whole problem with bisect is that the underlying list is naked, > leaving no way to easily correlate the sort keys with the corresponding > sorted records. > That same "problem" would exist for sorted() and list.sort(), and the solution there (parameterize it with a key= option) easily generalizes to bisect (and to heapq, for that matter). The more fundamental "conflict" here seems to be between algorithms and classes. list.sort(), bisect and heapq focus on the algorithm. In some sense they reflect the state of the world before object-oriented programming was invented. Sometimes it is useful to encapsulate these in classes. Other times, the encapsulation doesn't add to the clarity of the program. One more thing: bisect.py doesn't only apply to insertions. It is also useful to find a "nearest" elements in a pre-sorted list. Probably that list was sorted using list.sort(), possibly using key=... -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From stutzbach at google.com Thu Feb 9 18:50:19 2012 From: stutzbach at google.com (Daniel Stutzbach) Date: Thu, 9 Feb 2012 09:50:19 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: On Wed, Feb 8, 2012 at 4:25 PM, Guido van Rossum wrote: > Also note that "many times" is actually O(log N) per insertion, which > isn't so bad. The main use case for bisect() is to manage a list that sees > updates *and* iterations -- otherwise building the list unsorted and > sorting it at the end would make more sense. 
The key= option provides a > balance between the cost/elegance for updates and for iterations. > Maintaining a sorted list using Python's list type is a trap. The bisect is O(log n), but insertion and deletion are still O(n). A SortedList class that provides O(log n) insertions is useful from time to time. There are several existing implementations available (I wrote one of them, on top of my blist type), each with their pros and cons. -- Daniel Stutzbach -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Thu Feb 9 19:02:09 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 13:02:09 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On 2/9/2012 12:12 PM, Edward Lesmes wrote: > Massimo Di Pierro writes: > > 50+% of the students have a mac and an increasing number of packages > depend on numpy. Installing numpy on mac is a lottery. > > About the numpy dependency, I think is a reason to integrate numpy in > python. And make installing Python on the Mac a lottery? -- Terry Jan Reedy From steve at pearwood.info Thu Feb 9 19:03:45 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 10 Feb 2012 05:03:45 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: <4F340A81.60300@pearwood.info> Massimo Di Pierro wrote: > Here is another data point: > http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/ > > Unfortunately the TIOBE index does matter. I can speak for python in > education and trends I seen. > > Python is and remains the easiest language to teach but it is no longer > true that getting Python to run is easer than alternatives (not for the > average undergrad student). Is that a commentary on Python, or the average undergrad student? > It used to be you download python 2.5 and > you were in business. Now you have to make a choice 2.x or 3.x. 
20% of > the students cannot tell one from the other (even after been told > repeatedly which one to use). Three weeks into the class they complain > with "the class code won't compile" (the same 20% cannot tell a compiler > form an interpreter). Python has a compiler. The "c" in .pyc files stands for "compiled" and Python has a built-in function called "compile". It just happens to compile to byte code that runs on a virtual machine, not machine code running on physical hardware. PyPy takes it even further, with a JIT compiler that operates on the byte code. > 50+% of the students have a mac and an increasing number of packages > depend on numpy. Installing numpy on mac is a lottery. > > Those who do not have a mac have windows and they expect an IDE like > eclipse. I know you can use Python with eclipse but they do not. They > download Python and complain that IDLE has no autocompletion, no line > numbers, no collapsible functions/classes. > > From the hard core computer scientists prospective there are usually > three objections to using Python: > - Most software engineers think we should only teach static type languages > - Those who care about scalability complain about the GIL How is that relevant to a language being taught to undergrads? Sounds more like an excuse to justify dislike of teaching Python rather than an actual reason to dislike Python. > - The programming language purists complain about the use of reference > counting instead of garbage collection The programming language purists should know better than that. The choice of which garbage collection implementation (ref counting is garbage collection) is a quality of implementation detail, not a language feature. 
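The compiler is even exposed directly, via the compile() builtin and the dis module, so the pipeline is easy to show:

```python
import dis

# Python compiles source to a code object; the VM then executes its bytecode.
code = compile("x * 2 + 1", "<example>", "eval")
print(type(code).__name__)    # code
dis.dis(code)                 # dumps the bytecode instructions
print(eval(code, {"x": 20}))  # 41
```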
-- Steven From masklinn at masklinn.net Thu Feb 9 19:14:39 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 19:14:39 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F340A81.60300@pearwood.info> References: <4F340A81.60300@pearwood.info> Message-ID: <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> On 2012-02-09, at 19:03 , Steven D'Aprano wrote: > The choice of which garbage collection implementation (ref counting is garbage collection) is a quality of implementation detail, not a language feature. That's debatable, it's an implementation detail with very different semantics which tends to leak out into usage patterns of the language (as it did with CPython, which basically did not get fixed in the community until Pypy started ascending), especially when the language does not provide "better" ways to handle things (as Python finally did by adding context managers in 2.5). So theoretically, automatic refcounting is a detail, but practically it influences language usage differently than most other GC techniques (when it'd the only GC strategy in the language anyway) From guido at python.org Thu Feb 9 19:22:47 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 10:22:47 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F340A81.60300@pearwood.info> References: <4F340A81.60300@pearwood.info> Message-ID: On Thu, Feb 9, 2012 at 10:03 AM, Steven D'Aprano wrote: > Massimo Di Pierro wrote: > >> Here is another data point: >> http://redmonk.com/sogrady/**2012/02/08/language-rankings-**2-2012/ >> >> Unfortunately the TIOBE index does matter. I can speak for python in >> education and trends I seen. >> >> Python is and remains the easiest language to teach but it is no longer >> true that getting Python to run is easer than alternatives (not for the >> average undergrad student). >> > > Is that a commentary on Python, or the average undergrad student? Well either way it's depressing... 
> It used to be you download python 2.5 and you were in business. Now you >> have to make a choice 2.x or 3.x. 20% of the students cannot tell one from >> the other (even after been told repeatedly which one to use). Three weeks >> into the class they complain with "the class code won't compile" (the same >> 20% cannot tell a compiler form an interpreter). >> > > Python has a compiler. The "c" in .pyc files stands for "compiled" and > Python has a built-in function called "compile". It just happens to compile > to byte code that runs on a virtual machine, not machine code running on > physical hardware. PyPy takes it even further, with a JIT compiler that > operates on the byte code. Not sure how that's relevant. Massimo used "won't compile" as a shorthand for "has a syntax error". 50+% of the students have a mac and an increasing number of packages > depend on numpy. Installing numpy on mac is a lottery. > But that was the same in the 2.5 days. The problem is worse now because (a) numpy is going mainstream, and (b) Macs don't come with a C compiler any more. I think the answer will have to be in making an effort to produce robust and frequently updated downloads of numpy to match various popular Python versions and platforms. This is a major pain (packaging always is) so maybe some incentive is necessary (just like ActiveState has its Python distros). > Those who do not have a mac have windows and they expect an IDE like > eclipse. I know you can use Python with eclipse but they do not. They > download Python and complain that IDLE has no autocompletion, no line > numbers, no collapsible functions/classes. > Hm. I know a fair number of people who use Eclipse to edit Python (there's some plugin). This seems easy enough to address by just pointing people to the plugin, I don't think Python itself is to blame here. 
From the hard core computer scientists prospective there are usually > three objections to using Python: > - Most software engineers think we should only teach static type languages > - Those who care about scalability complain about the GIL > How is that relevant to a language being taught to undergrads? Sounds more > like an excuse to justify dislike of teaching Python rather than an actual > reason to dislike Python. I can see the discomfort if the other professors keep bringing this up. It is, sadly, a very effective troll. (Before it was widely know, the most common troll was the whitespace. People would declare it to be ridiculous without ever having tried it. Same with the GIL.) - The programming language purists complain about the use of reference > counting instead of garbage collection > The programming language purists should know better than that. The choice > of which garbage collection implementation (ref counting is garbage > collection) is a quality of implementation detail, not a language feature. > Yeah, trolls are a pain. We need to start spreading more effective counter-memes. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.dipierro at gmail.com Thu Feb 9 19:25:18 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Thu, 9 Feb 2012 12:25:18 -0600 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F340A81.60300@pearwood.info> References: <4F340A81.60300@pearwood.info> Message-ID: <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> On Feb 9, 2012, at 12:03 PM, Steven D'Aprano wrote: > Massimo Di Pierro wrote: >> Here is another data point: >> http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/ >> Unfortunately the TIOBE index does matter. I can speak for python >> in education and trends I seen. 
>> Python is and remains the easiest language to teach but it is no >> longer true that getting Python to run is easer than alternatives >> (not for the average undergrad student). > > Is that a commentary on Python, or the average undergrad student? I teach so the average student is my benchmark. Please do not misunderstand. While some may be lazy, but the average CS undergrad is not stupid but quite intelligent. They just do not like wasting time with setups and I sympathize with that. Batteries included is the Python motto. >> It used to be you download python 2.5 and you were in business. Now >> you have to make a choice 2.x or 3.x. 20% of the students cannot >> tell one from the other (even after been told repeatedly which one >> to use). Three weeks into the class they complain with "the class >> code won't compile" (the same 20% cannot tell a compiler form an >> interpreter). > > Python has a compiler. The "c" in .pyc files stands for "compiled" > and Python has a built-in function called "compile". It just happens > to compile to byte code that runs on a virtual machine, not machine > code running on physical hardware. PyPy takes it even further, with > a JIT compiler that operates on the byte code. > > >> 50+% of the students have a mac and an increasing number of >> packages depend on numpy. Installing numpy on mac is a lottery. >> Those who do not have a mac have windows and they expect an IDE >> like eclipse. I know you can use Python with eclipse but they do >> not. They download Python and complain that IDLE has no >> autocompletion, no line numbers, no collapsible functions/classes. >> From the hard core computer scientists prospective there are >> usually three objections to using Python: >> - Most software engineers think we should only teach static type >> languages >> - Those who care about scalability complain about the GIL > > How is that relevant to a language being taught to undergrads? 
> Sounds more like an excuse to justify dislike of teaching Python > rather than an actual reason to dislike Python. > > >> - The programming language purists complain about the use of >> reference counting instead of garbage collection > > The programming language purists should know better than that. The > choice of which garbage collection implementation (ref counting is > garbage collection) is a quality of implementation detail, not a > language feature. Don't shoot the messenger please. You can dismiss or address the problem. Anyway... undergrads do care because they will take 4 years to grade and they do not want to come out with obsolete skills. Our undergrads learn Python, Ruby, Java, Javascript and C++. Many know other languages which they learn on their own (Scala and Clojure are popular). They all agree multi-core is the future and whichever language can deal with them better is the future too. As masklinn says, the difference between garbage collection and reference counting is more than an implementation issue. > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From guido at python.org Thu Feb 9 19:26:07 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 10:26:07 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: On Thu, Feb 9, 2012 at 10:14 AM, Masklinn wrote: > On 2012-02-09, at 19:03 , Steven D'Aprano wrote: > > The choice of which garbage collection implementation (ref counting is > garbage collection) is a quality of implementation detail, not a language > feature. 
> > That's debatable, it's an implementation detail with very different > semantics which tends to leak out into usage patterns of the language (as > it did with CPython, which basically did not get fixed in the community > until Pypy started ascending), I think it was actually Jython that first sensitized the community to this issue. > especially when the language does not provide "better" ways to handle > things (as Python finally did by adding context managers in 2.5). > > So theoretically, automatic refcounting is a detail, but practically it > influences language usage differently than most other GC techniques (when > it'd the only GC strategy in the language anyway) > Are there still Python idioms/patterns/recipes around that depend on refcounting? (There also used to be some well-known anti-patterns that were only bad because of the refcounting, mostly around saving exceptions. But those should all have melted away -- CPython has had auxiliary GC for over a decade.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Thu Feb 9 19:30:55 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 19:30:55 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: <4F3410DF.30602@molden.no> On 09.02.2012 19:02, Terry Reedy wrote: > And make installing Python on the Mac a lottery? Or a subset of NumPy? The main offender is numpy.linalg, with needs a BLAS library that should be tuned to the hardware. (There is a reason NumPy and SciPy binary installers on Windows are bloated.) And from what I have seen on complaints building NumPy on Mav it tends to be the BLAS/LAPACK stuff that drives people crazy, particularly those who want to use ATLAS (Which is a bit stupid, as OpenBLAS/GotoBLAS2 is easier to build and much faster.) 
If Python comes with NumPy built against Netlib reference BLAS, there will be lots of complaints that "Matlab is so much faster then Python" when it is actually the BLAS libraries that are different. But I am not sure we want 50-100 MB of bloat in the Python binary installer just to cover all possible cases of CPU-tuned OpenBLAS/GotoBLAS2 or ATLAS libraries. Sturla From guido at python.org Thu Feb 9 19:34:50 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 10:34:50 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> Message-ID: On Thu, Feb 9, 2012 at 10:25 AM, Massimo Di Pierro < massimo.dipierro at gmail.com> wrote: > > On Feb 9, 2012, at 12:03 PM, Steven D'Aprano wrote: > > Massimo Di Pierro wrote: >> >>> Here is another data point: >>> http://redmonk.com/sogrady/**2012/02/08/language-rankings-**2-2012/ >>> Unfortunately the TIOBE index does matter. I can speak for python in >>> education and trends I seen. >>> Python is and remains the easiest language to teach but it is no longer >>> true that getting Python to run is easer than alternatives (not for the >>> average undergrad student). >>> >> >> Is that a commentary on Python, or the average undergrad student? >> > > I teach so the average student is my benchmark. Please do not > misunderstand. While some may be lazy, but the average CS undergrad is not > stupid but quite intelligent. They just do not like wasting time with > setups and I sympathize with that. Batteries included is the Python motto. > > > It used to be you download python 2.5 and you were in business. Now you >>> have to make a choice 2.x or 3.x. 20% of the students cannot tell one from >>> the other (even after been told repeatedly which one to use). 
Three weeks >>> into the class they complain with "the class code won't compile" (the same >>> 20% cannot tell a compiler form an interpreter). >>> >> >> Python has a compiler. The "c" in .pyc files stands for "compiled" and >> Python has a built-in function called "compile". It just happens to compile >> to byte code that runs on a virtual machine, not machine code running on >> physical hardware. PyPy takes it even further, with a JIT compiler that >> operates on the byte code. >> >> >> 50+% of the students have a mac and an increasing number of packages >>> depend on numpy. Installing numpy on mac is a lottery. >>> Those who do not have a mac have windows and they expect an IDE like >>> eclipse. I know you can use Python with eclipse but they do not. They >>> download Python and complain that IDLE has no autocompletion, no line >>> numbers, no collapsible functions/classes. >>> From the hard core computer scientists prospective there are usually >>> three objections to using Python: >>> - Most software engineers think we should only teach static type >>> languages >>> - Those who care about scalability complain about the GIL >>> >> >> How is that relevant to a language being taught to undergrads? Sounds >> more like an excuse to justify dislike of teaching Python rather than an >> actual reason to dislike Python. >> >> >> - The programming language purists complain about the use of reference >>> counting instead of garbage collection >>> >> >> The programming language purists should know better than that. The choice >> of which garbage collection implementation (ref counting is garbage >> collection) is a quality of implementation detail, not a language feature. >> > > Don't shoot the messenger please. > > You can dismiss or address the problem. Anyway... undergrads do care > because they will take 4 years to grade and they do not want to come out > with obsolete skills. Our undergrads learn Python, Ruby, Java, Javascript > and C++. 
Many know other languages which they learn on their own (Scala and > Clojure are popular). I'd give those students a bonus for being in touch with what's popular in academia. Point them to Haskell next. They may amount to something. > They all agree multi-core is the future and whichever language can deal > with them better is the future too. > Surely not JavaScript (which is single-threaded and AFAIK also uses refcounting :-). Also, AFAIK Ruby has a GIL much like Python. I think it's time to start a PR offensive explaining why these are not the problem the trolls make them out to be, and how you simply have to use different patterns for scaling in some languages than in others. And note that a single-threaded event-driven process can serve 100,000 open sockets -- while no JVM can create 100,000 threads. As masklinn says, the difference between garbage collection and reference > counting is more than an implementation issue. > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Feb 9 19:36:41 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 10:36:41 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3410DF.30602@molden.no> References: <4F3410DF.30602@molden.no> Message-ID: On Thu, Feb 9, 2012 at 10:30 AM, Sturla Molden wrote: > On 09.02.2012 19:02, Terry Reedy wrote: > > And make installing Python on the Mac a lottery? >> > > Or a subset of NumPy? > > The main offender is numpy.linalg, with needs a BLAS library that should > be tuned to the hardware. (There is a reason NumPy and SciPy binary > installers on Windows are bloated.) And from what I have seen on complaints > building NumPy on Mav it tends to be the BLAS/LAPACK stuff that drives > people crazy, particularly those who want to use ATLAS (Which is a bit > stupid, as OpenBLAS/GotoBLAS2 is easier to build and much faster.) 
If > Python comes with NumPy built against Netlib reference BLAS, there will be > lots of complaints that "Matlab is so much faster than Python" when it is > actually the BLAS libraries that are different. But I am not sure we want > 50-100 MB of bloat in the Python binary installer just to cover all > possible cases of CPU-tuned OpenBLAS/GotoBLAS2 or ATLAS libraries. > I don't know much of this area, but maybe this is something where a dynamic installer (along the lines of easy_install) might actually be handy? The funny thing is that most Java software is even more bloated and you rarely hear about that (at least not from Java users ;-). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Thu Feb 9 19:37:18 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 19:37:18 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: On 2012-02-09, at 19:26 , Guido van Rossum wrote: > On Thu, Feb 9, 2012 at 10:14 AM, Masklinn wrote: >> On 2012-02-09, at 19:03 , Steven D'Aprano wrote: >>> The choice of which garbage collection implementation (ref counting is >> garbage collection) is a quality of implementation detail, not a language >> feature. >> >> That's debatable, it's an implementation detail with very different >> semantics which tends to leak out into usage patterns of the language (as >> it did with CPython, which basically did not get fixed in the community >> until Pypy started ascending), > > I think it was actually Jython that first sensitized the community to this > issue. > The first one was Jython yes, of course, but I did not see the "movement" gain much prominence before Pypy started looking like a serious CPython alternative; before that there were a few voices lost in the desert.
>> especially when the language does not provide "better" ways to handle >> things (as Python finally did by adding context managers in 2.5). >> >> So theoretically, automatic refcounting is a detail, but practically it >> influences language usage differently than most other GC techniques (when >> it's the only GC strategy in the language anyway) > > Are there still Python idioms/patterns/recipes around that depend on > refcounting? There shouldn't be, but I'm not going to rule out reliance on automatic resource cleanup just yet; I'm sure there are still significant pieces of code using those in the wild. From mwm at mired.org Thu Feb 9 19:42:37 2012 From: mwm at mired.org (Mike Meyer) Date: Thu, 9 Feb 2012 10:42:37 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: <20120209104237.154be949@bhuda.mired.org> On Fri, 10 Feb 2012 01:35:17 +0800 Matt Joiner wrote: > the GIL being the gorilla here, multiprocessing is unacceptable as > long as native threading is the only supported concurrency mechanism If threading is the only acceptable concurrency mechanism, then Python is the wrong language to use. But you're also not building scalable systems, which is most of where it really matters. If you're willing to consider things other than threading - and you have to if you want to build scalable systems - then Python makes a good choice. Personally, I'd like to see a modern threading model in Python, especially if its tools can be extended to work with other concurrency mechanisms. But that's a *long* way into the future. As for "popular vs. good" - "good" is a subjective measure. So the two statements "anything popular is good" and "nothing popular was ever good unless it had no competition" can both be true. Personally, I lean toward the latter. I tend to find things that are popular to not be very good, which makes me distrust the taste of the populace.
The python core developers, on the other hand, have an excellent record when it comes to keeping the language good - and the failures tend to be concessions to popularity! So I'd rather the current system for adding features stay in place and *not* see the language add features just to gain popularity. We already have Perl if you want that kind of language. That said, it's perfectly reasonable to suggest changes you think will improve the popularity of the language. But be prepared to show that they're actually good, as opposed to merely possibly popular. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From sturla at molden.no Thu Feb 9 19:44:41 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 19:44:41 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> Message-ID: <4F341419.6030808@molden.no> On 09.02.2012 19:25, Massimo Di Pierro wrote: > As masklinn says, the difference between garbage collection and > reference counting is more than an implementation issue. Actually it is not. The GIL is a problem for those who want to use threading.Thread and plain Python code for parallel processing. Those who think in those terms have typically prior experience with Java or .NET. Processes are excellent for concurrency, cf. multiprocessing, os.fork and MPI. They actually are more efficient than threads (due to avoidance of false sharing cache lines) and safer (deadlock and livelocks are more difficult to produce). And I assume students who learn to use such tools from the start are not annoyed by the GIL. The GIL annoys those who have learned to expect threading.Thread for CPU bound concurrency in advance -- which typically means prior experience with Java. 
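The process-based pattern Sturla describes can be sketched with the stdlib alone; the workload function below is purely illustrative:

```python
# Sketch: CPU-bound work farmed out to processes rather than threads.
# Each worker runs in its own interpreter with its own GIL, so the jobs
# below genuinely run in parallel on a multicore machine.
from multiprocessing import Pool

def sum_of_squares(n):
    # A deliberately CPU-bound job: this would serialize on the GIL
    # under threading.Thread, but not under multiprocessing.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    pool = Pool(processes=3)
    try:
        print(pool.map(sum_of_squares, [10**5, 2 * 10**5, 3 * 10**5]))
    finally:
        pool.close()
        pool.join()
```

The explicit close/join (rather than a `with` block) keeps the sketch valid on the 2.x and early 3.x interpreters under discussion here.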
Python threads are fine for their intended use -- e.g. I/O and background tasks in a GUI. Sturla From guido at python.org Thu Feb 9 19:44:42 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 10:44:42 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: On Thu, Feb 9, 2012 at 10:37 AM, Masklinn wrote: > On 2012-02-09, at 19:26 , Guido van Rossum wrote: > > On Thu, Feb 9, 2012 at 10:14 AM, Masklinn wrote: > >> On 2012-02-09, at 19:03 , Steven D'Aprano wrote: > >>> The choice of which garbage collection implementation (ref counting is > >> garbage collection) is a quality of implementation detail, not a > language > >> feature. > >> > >> That's debatable, it's an implementation detail with very different > >> semantics which tends to leak out into usage patterns of the language > (as > >> it did with CPython, which basically did not get fixed in the community > >> until Pypy started ascending), > > > > I think it was actually Jython that first sensitized the community to > this > > issue. > > > The first one was Jython yes, of course, but I did not see the "movement" > gain much prominence before Pypy started looking like a serious CPython > alternative, before that there were a few voices lost in the desert. > I guess everyone has a different perspective. >> especially when the language does not provide "better" ways to handle >> things (as Python finally did by adding context managers in 2.5). >> >> So theoretically, automatic refcounting is a detail, but practically it >> influences language usage differently than most other GC techniques (when >> it'd the only GC strategy in the language anyway) > > Are there still Python idioms/patterns/recipes around that depend on > refcounting? 
There shouldn't be, but I'm not going to rule out reliance on automatic > resource cleanup just yet, I'm sure there are still significant pieces > of code using those in the wild. > I am guessing in part that's a function of resistance to change, and in part it means PyPy hasn't gotten enough mindshare yet. (Raise your hand if you have PyPy installed on one of your systems. Raise your hand if you use it. Raise your hand if you are a PyPy contributor. :-) Anyway, the refcounting objection seems the least important one. The more important trolls to fight are "static typing is always better" and "the GIL makes Python multicore-unfriendly". TBH, I see some movement in the static typing discussion, evidence that the static typing zealots are considering a hybrid approach (e.g. C# dynamic, and the optional static type checks in Dart). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Thu Feb 9 19:46:55 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 19:46:55 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F3410DF.30602@molden.no> Message-ID: <4F34149F.5020909@molden.no> On 09.02.2012 19:36, Guido van Rossum wrote: > I don't know much of this area, but maybe this is something where a > dynamic installer (along the lines of easy_install) might actually be handy? That is what NumPy and SciPy does on Windows. But it also means the "superpack" installer is a very big download. 
Sturla From steve at pearwood.info Thu Feb 9 19:50:09 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 10 Feb 2012 05:50:09 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> References: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> Message-ID: <4F341561.3050409@pearwood.info> Massimo Di Pierro wrote: > I think if easy_install, gevent, numpy (*), and win32 extensions where > included in 3.x, together with a slightly better Idle (still based on > Tkinter, with multiple pages, autocompletion, collapsible, line numbers, > better printing with syntax highlitghing), and if easy_install were > accessible via Idle, this would be a killer version. IDLE does look a little long in the tooth. > Longer term removing the GIL and using garbage collection should be a > priority. I am not sure what is involved and how difficult it is but > perhaps this is what PyCon money can be used for. It isn't difficult to find out about previous attempts to remove the GIL. Googling for "python removing the gil" brings up plenty of links, including: http://www.artima.com/weblogs/viewpost.jsp?thread=214235 http://dabeaz.blogspot.com.au/2011/08/inside-look-at-gil-removal-patch-of.html Or just use Jython or IronPython, neither of which have a GIL. And since neither of them support Python 3 yet, you have no confusing choice of version to make. I'm not sure if IronPython is suitable for teaching, if you have to support Macs as well as Windows, but as a counter-argument against GIL trolls, there are two successful implementations of Python without the GIL. (And neither is as popular as CPython, which I guess says something about where people's priorities lie. If the GIL was as serious a problem in practice as people claim, there would be far more interest in Jython and IronPython.) 
> If this cannot be done > without breaking backward compatibility again, then 3.x should be > considered an experimental branch, people should be advised to stay with > 2.7 (2.8?) and then skip to 4.x directly when these problems are > resolved. Python should not make a habit of breaking backward > compatibility. Python 4.x (Python 4000) is pure vapourware. It is irresponsible to tell people to stick to Python 2.7 (there will be no 2.8) in favour of something which may never exist. http://www.python.org/dev/peps/pep-0404/ -- Steven From masklinn at masklinn.net Thu Feb 9 19:50:20 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 19:50:20 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> Message-ID: On 2012-02-09, at 19:34 , Guido van Rossum wrote: >> They all agree multi-core is the future and whichever language can deal >> with them better is the future too. >> > > Surely not JavaScript (which is single-threaded and AFAIK also uses > refcounting :-). I don't think I've seen a serious refcounted JS implementation in the last decade, although it is possible that JS runtimes have localized usage of references and reference-counted resources. AFAIK all modern JS runtimes are JITed, which probably does not mesh well with refcounting. In any case, V8 (Chrome's runtime) uses a stop-the-world generational GC for sure[0], Mozilla's SpiderMonkey uses a GC as well[1] although I'm not sure which type (the reference to JS_MarkGCThing indicates it could be or at least use a mark-and-sweep amongst its strategies), Webkit/Safari's JavaScriptCore uses a GC as well[2] and MSIE's JScript used a mark-and-sweep GC back in 2003[3] (although the DOM itself was in COM, and reference-counted). > And note that a > single-threaded event-driven process can serve 100,000 open sockets -- > while no JVM can create 100,000 threads.
Only because it's OS threads of course, Erlang is not evented and has no problem spawning half a million (preempted) processes if there's RAM enough to store them. [0] http://code.google.com/apis/v8/design.html#garb_coll [1] https://developer.mozilla.org/en/SpiderMonkey/1.8.5#Garbage_collection [2] Since ~2009 http://www.masonchang.com/blog/2009/3/26/nitros-garbage-collector.html [3] http://blogs.msdn.com/b/ericlippert/archive/2003/09/17/53038.aspx From guido at python.org Thu Feb 9 19:54:27 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 10:54:27 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> Message-ID: On Thu, Feb 9, 2012 at 10:50 AM, Masklinn wrote: > On 2012-02-09, at 19:34 , Guido van Rossum wrote: > >> They all agree multi-core is the future and whichever language can deal > >> with them better is the future too. > >> > > > > Surely not JavaScript (which is single-threaded and AFAIK also uses > > refcounting :-). > > I don't think I've seen a serious refcounted JS implementation in the last > decade. , although it is possible that JS runtimes have localized usage > of references and reference-counted resources. AFAIK all modern JS > runtimes are JITed which probably does not mesh well with refcounting. > > In any case, V8 (Chrome's runtime) uses a stop-the-world generational > GC for sure[0], Mozilla's SpiderMonkey uses a GC as well[1] although > I'm not sure which type (the reference to JS_MarkGCThing indicates it > could be or at least use a mark-and-sweep amongst its strategies), > Webkit/Safari's JavaScriptCore uses a GC as well[2] and MSIE's JScript > used a mark-and-sweep GC back in 2003[3] (although the DOM itself was > in COM, and reference-counted). > I stand corrected (but I am right about the single-threadedness :-). 
> And note that a > single-threaded event-driven process can serve 100,000 open sockets -- > while no JVM can create 100,000 threads. > Only because it's OS threads of course, Erlang is not evented and has no > problem spawning half a million (preempted) processes if there's RAM > enough to store them. > Sure. But the people complaining about the GIL come from Java, not from Erlang. (Erlang users typically envy Python because of its superior standard library. :-) > > [0] http://code.google.com/apis/v8/design.html#garb_coll > [1] https://developer.mozilla.org/en/SpiderMonkey/1.8.5#Garbage_collection > [2] Since ~2009 > http://www.masonchang.com/blog/2009/3/26/nitros-garbage-collector.html > [3] http://blogs.msdn.com/b/ericlippert/archive/2003/09/17/53038.aspx -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Thu Feb 9 19:53:47 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 19:53:47 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: <974D4AD7-6F17-4CEE-BDAD-C0D5B83D0F34@masklinn.net> On 2012-02-09, at 19:44 , Guido van Rossum wrote: > TBH, I see some movement in the static typing discussion, evidence that the > static typing zealots are considering a hybrid approach (e.g. C# dynamic, > and the optional static type checks in Dart). These seem to be efforts of people trying for both sides (for various reasons) more than people firmly rooted in one camp or another. Dart was widely panned for its wonky approach to "static typing", which is generally considered a joke amongst people looking for actual static typing (in that Dart's type checks are about as useful as Python 3's type annotations).
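Masklinn's parenthetical rests on a real property of the language: Python 3 records function annotations but never checks them at runtime. A minimal illustration:

```python
# Python 3 function annotations are stored on the function object but
# are never enforced by the interpreter -- as a static-typing mechanism
# they are inert on their own.
def add(a: int, b: int) -> int:
    return a + b

print(add(1, 2))            # 3
print(add("py", "thon"))    # prints "python": no type error despite the annotations
print(add.__annotations__)  # the annotations are just a dict on the function object
```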
From sturla at molden.no Thu Feb 9 19:57:20 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 19:57:20 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120209104237.154be949@bhuda.mired.org> References: <20120209104237.154be949@bhuda.mired.org> Message-ID: <4F341710.9030806@molden.no> On 09.02.2012 19:42, Mike Meyer wrote: > If threading is the only acceptable concurrency mechanism, then Python > is the wrong language to use. But you're also not building scalable > systems, which is most of where it really matters. If you're willing > to consider things other than threading - and you have to if you want > to build scalable systems - then Python makes a good choice. Yes or no... Python is used for parallel computing on the biggest supercomputers, monsters like Cray and IBM Blue Genes with tens of thousands of CPUs. But what really fails to scale is the Python module loader! For example it can take hours to "import numpy" for 30,000 Python processes on a Blue Gene. And yes, nobody would consider using Java for such systems, even though Java does not have a GIL (well, threads do not matter that much on a cluster with distributed memory anyway). It is Python, C and Fortran that are popular. But that really disproves the claim that Python sucks for big concurrency, except perhaps for the module loader. Sturla From amcnabb at mcnabbs.org Thu Feb 9 19:58:10 2012 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Thu, 9 Feb 2012 11:58:10 -0700 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: <20120209185810.GC20556@mcnabbs.org> On Thu, Feb 09, 2012 at 10:44:42AM -0800, Guido van Rossum wrote: > I am guessing in part that's a function of resistance to change, and in > part it means PyPy hasn't gotten enough mindshare yet. (Raise your hand if > you have PyPy installed on one of your systems. Raise your hand if you use > it.
Raise your hand if you are a PyPy contributor. :-) I don't know if you actually want replies, but I'll bite. I have pypy installed (from the standard Fedora pypy package), and for a particular project it provided a 20x speedup. I'm not a PyPy contributor, but I'm a believer. I would use PyPy everywhere if it worked with Python 3 and scipy. My apologies if this was just a rhetorical question. :) -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From masklinn at masklinn.net Thu Feb 9 20:03:28 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 20:03:28 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> Message-ID: <701473A6-1EF8-405D-80AB-6546774E03FE@masklinn.net> On 2012-02-09, at 19:54 , Guido van Rossum wrote: > > I stand corrected (but I am right about the single-threadedness :-). Absolutely (until WebWorkers anyway) >> And note that a >> single-threaded event-driven process can serve 100,000 open sockets -- >> while no JVM can create 100,000 threads. > > Only because it's OS threads of course, Erlang is not evented and has no >> problem spawning half a million (preempted) processes if there's RAM >> enough to store them. >> > > Sure. But the people complaining about the GIL come from Java, not from > Erlang. (Erlang users typically envy Python because of its superior > standard library. :-) True. 
Then they remember how good Python is with concurrency, distribution and distributed resilience :D (don't forget syntax, one of Erlang's biggest failures) (although it pleased cfbolz since he could get syntax coloration for his prolog) From guido at python.org Thu Feb 9 20:03:26 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 11:03:26 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> References: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> Message-ID: On Thu, Feb 9, 2012 at 9:46 AM, Massimo Di Pierro <massimo.dipierro at gmail.com> wrote: > I think if easy_install, gevent, numpy (*), and win32 extensions were > included in 3.x, together with a slightly better Idle (still based on > Tkinter, with multiple pages, autocompletion, collapsible, line numbers, > better printing with syntax highlighting), and if easy_install were > accessible via Idle, this would be a killer version. > IIRC gevent still needs to be ported to 3.x (maybe someone with the necessary skills should apply to the PSF for funding). But the rest sounds like the domain of a superinstaller, not inclusion in the stdlib. IDLE will never be able to compete with Eclipse -- you can love one or the other but not both. Longer term removing the GIL and using garbage collection should be a > priority. I am not sure what is involved and how difficult it is but > perhaps this is what PyCon money can be used for. I think the best way to accomplish both is to focus on PyPy. It needs porting to 3.x; Google has already given them some money towards this goal. > If this cannot be done without breaking backward compatibility again, then > 3.x should be considered an experimental branch, people should be advised > to stay with 2.7 (2.8?) and then skip to 4.x directly when these problems > are resolved. That's really bad advice. 4.x will not be here for another decade.
> Python should not make a habit of breaking backward compatibility. > Agreed. 4.x should be fully backwards compatible -- with 3.x, not with 2.x. It would be really nice if it were to include an async web server (based on > gevent for example) and better parser for HTTP headers and a python based > template language (like mako or the web2py one) not just for the web but > for document generation in general. > Again, that's a bundling issue. With the infrequency of Python releases, anything still under development is much better off being distributed separately. Bundling into core Python requires a package to be essentially stable, i.e., dead. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Feb 9 20:05:15 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 11:05:15 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F341710.9030806@molden.no> References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> Message-ID: On Thu, Feb 9, 2012 at 10:57 AM, Sturla Molden wrote: > On 09.02.2012 19:42, Mike Meyer wrote: > > If threading is the only acceptable concurrency mechanism, then Python >> is the wrong language to use. But you're also not building scaleable >> systems, which is most of where it really matters. If you're willing >> to consider things other than threading - and you have to if you want >> to build scaleable systems - then Python makes a good choice. >> > > Yes or no... Python is used for parallel computing on the biggest > supercomputers, monsters like Cray and IBM blue genes with tens of > thousands of CPUs. But what really fails to scale is the Python module > loader! For example it can take hours to "import numpy" for 30,000 Python > processes on a blue gene. 
And yes, nobody would consider to use Java for > such systems, even though Java does not have a GIL (well, theads do no > matter that much on a cluster with distributed memory anyway). It is > Python, C and Fortran that are popular. But that really disproves that > Python sucks for big concurrency, except perhaps for the module loader. > I'm curious about the module loader problem. Did someone ever analyze the cause and come up with a fix? Is it the import lock? Maybe it's something for the bug tracker. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Feb 9 20:06:35 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 11:06:35 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120209185810.GC20556@mcnabbs.org> References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> Message-ID: On Thu, Feb 9, 2012 at 10:58 AM, Andrew McNabb wrote: > On Thu, Feb 09, 2012 at 10:44:42AM -0800, Guido van Rossum wrote: > > I am guessing in part that's a function of resistance to change, and in > > part it means PyPy hasn't gotten enough mindshare yet. (Raise your hand > if > > you have PyPy installed on one of your systems. Raise your hand if you > use > > it. Raise your hand if you are a PyPy contributor. :-) > > I don't know if you actually want replies, but I'll bite. I have pypy > installed (from the standard Fedora pypy package), and for a particular > project it provided a 20x speedup. I'm not a PyPy contributor, but I'm > a believer. > > I would use PyPy everywhere if it worked with Python 3 and scipy. My > apologies if this was just a rhetorical question. :) Thanks for replying, it was not a rhetorical question. It's something I'm considering asking during my keynote at PyCon next month. 
-- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Thu Feb 9 20:08:36 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 20:08:36 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> Message-ID: <4F3419B4.6010802@molden.no> On 09.02.2012 19:50, Masklinn wrote: > I don't think I've seen a serious refcounted JS implementation in the last > decade. , although it is possible that JS runtimes have localized usage > of references and reference-counted resources. AFAIK all modern JS > runtimes are JITed which probably does not mesh well with refcounting. > > In any case, V8 (Chrome's runtime) uses a stop-the-world generational > GC for sure[0], And Chrome uses one *process* for each tab, right? Is there a reason Chrome does not use one thread for each tab, such as security? > Only because it's OS threads of course, Erlang is not evented and has no > problem spawning half a million (preempted) processes if there's RAM > enough to store them. Actually, spawning half a million OS threads will burn the computer. *POFF* ... and it goes up in a ball of smoke. Spawning half a million threads is the Windows equivalent of a fork bomb. I think you confuse threads and fibers/coroutines. Sturla From ericsnowcurrently at gmail.com Thu Feb 9 20:12:46 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 9 Feb 2012 12:12:46 -0700 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> Message-ID: On Thu, Feb 9, 2012 at 12:05 PM, Guido van Rossum wrote: > On Thu, Feb 9, 2012 at 10:57 AM, Sturla Molden wrote: >> Yes or no... Python is used for parallel computing on the biggest >> supercomputers, monsters like Cray and IBM blue genes with tens of thousands >> of CPUs. 
But what really fails to scale is the Python module loader! For >> example it can take hours to "import numpy" for 30,000 Python processes on a >> blue gene. And yes, nobody would consider using Java for such systems, even >> though Java does not have a GIL (well, threads do not matter that much on a >> cluster with distributed memory anyway). It is Python, C and Fortran that >> are popular. But that really disproves that Python sucks for big >> concurrency, except perhaps for the module loader. > > > I'm curious about the module loader problem. Did someone ever analyze the > cause and come up with a fix? Is it the import lock? Maybe it's something > for the bug tracker. +1 -eric From anacrolix at gmail.com Thu Feb 9 20:16:00 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 10 Feb 2012 03:16:00 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120209104237.154be949@bhuda.mired.org> References: <20120209104237.154be949@bhuda.mired.org> Message-ID: > If threading is the only acceptable concurrency mechanism, then Python > is the wrong language to use. But you're also not building scalable > systems, which is most of where it really matters. If you're willing > to consider things other than threading - and you have to if you want > to build scalable systems - then Python makes a good choice. Yes, but core Python doesn't have any true concurrency mechanism other than native threading, and threads are too heavyweight for this purpose alone. On top of this they're useless for Python-only parallelism. > Personally, I'd like to see a modern threading model in Python, > especially if its tools can be extended to work with other > concurrency mechanisms. But that's a *long* way into the future. Too far. It needs to be now. The downward spiral is already beginning. Mobile phones are going multicore. My next desktop will probably have 8 cores or more.
All the heavyweight languages are firing up thread/STM standardizations and implementations to make this stuff more performant and easier than it already is. > As for "popular vs. good" - "good" is a subjective measure. So the two > statements "anything popular is good" and "nothing popular was ever > good unless it had no competition" can both be true. > > Personally, I lean toward the latter. I tend to find things that are > popular to not be very good, which makes me distrust the taste of the > populace. The python core developers, on the other hand, have an > excellent record when it comes to keeping the language good - and the > failures tend to be concessions to popularity! So I'd rather the > current system for adding features stay in place and *not* see the > language add features just to gain popularity. We already have Perl if > you want that kind of language. > > That said, it's perfectly reasonable to suggest changes you think will > improve the popularity of the language. But be prepared to show that > they're actually good, as opposed to merely possibly popular. This doesn't apply to "enabling" features. Features that make it possible for popular stuff to happen. Concurrency isn't popular, but parallelism is. At least where the GIL is concerned, a good alternative concurrency mechanism doesn't exist. (The popular one is native threading). From g.rodola at gmail.com Thu Feb 9 20:16:00 2012 From: g.rodola at gmail.com (Giampaolo Rodolà) Date: Thu, 9 Feb 2012 20:16:00 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On 9 February 2012 18:35, Matt Joiner wrote: > From my own observations, the recent drop is due to uncertainty with Python > 3, and an increase of alternatives on the server side, such as Node. > > The transition is only going to get more painful as system critical software > lags on 2.x while users clamour for 3.x.
I think it's not only a matter of third-party modules not being ported quickly enough or the amount of work involved when facing the 2->3 conversion. I bet a lot of people don't want to upgrade for another reason: unicode. The impression I got is that Python 3 forces the user to use and *understand* unicode, and a lot of people simply don't want to deal with that. In Python 2 there was no such strong imposition. The Python 2 string type acting both as bytes and as text was certainly ambiguous and "impure" on different levels, and changing that was definitely a win in terms of purity and correctness. I bet most advanced users are happy with this change. On the other hand, the average Python 2 user was free to ignore that distinction even if that meant having subtle bugs hidden somewhere in his/her code. I think this aspect shouldn't be underestimated. --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ http://code.google.com/p/pysendfile/ From mwm at mired.org Thu Feb 9 20:18:10 2012 From: mwm at mired.org (Mike Meyer) Date: Thu, 9 Feb 2012 11:18:10 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F341710.9030806@molden.no> References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> Message-ID: <20120209111810.58e0cf42@bhuda.mired.org> On Thu, 09 Feb 2012 19:57:20 +0100 Sturla Molden wrote: > On 09.02.2012 19:42, Mike Meyer wrote: > > If threading is the only acceptable concurrency mechanism, then Python > > is the wrong language to use. But you're also not building scalable > > systems, which is most of where it really matters. If you're willing > > to consider things other than threading - and you have to if you want > > to build scalable systems - then Python makes a good choice. > Yes or no... Python is used for parallel computing on the biggest > supercomputers, monsters like Cray and IBM blue genes with tens of > thousands of CPUs. But what really fails to scale is the Python module > loader!
For example it can take hours to "import numpy" for 30,000 > Python processes on a blue gene. Whether or not hours of time to import is an issue depends on what you're doing. I typically build systems running on hundreds of CPUs for weeks on end, meaning you get years of CPU time per run. So if it took a few hours of CPU time to get started, it wouldn't be much of a problem. If it took a few hours of wall clock time - well, that would be more of a problem, mostly because that long of an outage would be unacceptable. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From anacrolix at gmail.com Thu Feb 9 20:19:36 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 10 Feb 2012 03:19:36 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F341419.6030808@molden.no> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> Message-ID: > The GIL annoys those who have learned to expect threading.Thread for CPU > bound concurrency in advance -- which typically means prior experience with > Java. Python threads are fine for their intended use -- e.g. I/O and > background tasks in a GUI. Even for that purpose they're too heavy. The GIL conflicts, and boilerplate overhead spawning threads is obscene for more than trivial cases. From sturla at molden.no Thu Feb 9 20:23:14 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 20:23:14 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> Message-ID: <4F341D22.4020706@molden.no> On 09.02.2012 20:05, Guido van Rossum wrote: > I'm curious about the module loader problem. Did someone ever analyze > the cause and come up with a fix? Is it the import lock? Maybe it's > something for the bug tracker. 
See this: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html The offender is actually imp.find_module, which results in a huge number of failed open() calls when used concurrently from many processes. So a solution is to have one process locate the modules and then broadcast their location to the other processes. There is even a paper on the issue. Here they suggest that importing from a ramdisk might work on IBM blue gene, but not on Cray. http://www.cs.uoregon.edu/Research/paracomp/papers/iccs11/iccs_paper_final.pdf Another solution might be to use sys.meta_path to bypass imp.find_module: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059813.html The best solution would of course be to fix imp.find_module so it scales properly. Sturla From phd at phdru.name Thu Feb 9 20:23:44 2012 From: phd at phdru.name (Oleg Broytman) Date: Thu, 9 Feb 2012 23:23:44 +0400 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3419B4.6010802@molden.no> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F3419B4.6010802@molden.no> Message-ID: <20120209192344.GA22166@iskra.aviel.ru> On Thu, Feb 09, 2012 at 08:08:36PM +0100, Sturla Molden wrote: > And Chrome uses one *process* for each tab, right? Is there a reason > Chrome does not use one thread for each tab, such as security? Safety, I dare say. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN.
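[Editor's note] Sturla's sys.meta_path workaround can be sketched in a few lines. This is a minimal illustration using today's importlib API, not the code from the linked numpy-discussion thread; the plain `cache` dict standing in for an MPI broadcast of module locations is an assumption for the example:

```python
import importlib.abc
import importlib.util
import sys

class CachedLocationFinder(importlib.abc.MetaPathFinder):
    """Serve module locations from a pre-computed cache.

    The idea from the thread: one process locates the modules once,
    then shares the name -> path mapping, so the other processes never
    issue the thousands of failed open()/stat() calls themselves.
    Here the mapping is a local dict; in the MPI setting it would be
    filled by rank 0 and broadcast to the other ranks (assumption).
    """

    def __init__(self, cache):
        self.cache = cache  # fully-qualified module name -> file path

    def find_spec(self, fullname, path=None, target=None):
        location = self.cache.get(fullname)
        if location is None:
            return None  # unknown module: fall through to the default finders
        return importlib.util.spec_from_file_location(fullname, location)

# Installed in front of the normal finders, e.g.:
# sys.meta_path.insert(0, CachedLocationFinder(shared_cache))
```

Because the finder answers from the cache before the path-based machinery runs, a known module is resolved without touching the filesystem search path at all.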
From pyideas at rebertia.com Thu Feb 9 20:23:56 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 9 Feb 2012 11:23:56 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3419B4.6010802@molden.no> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F3419B4.6010802@molden.no> Message-ID: On Thu, Feb 9, 2012 at 11:08 AM, Sturla Molden wrote: > On 09.02.2012 19:50, Masklinn wrote: > >> I don't think I've seen a serious refcounted JS implementation in the last >> decade, although it is possible that JS runtimes have localized usage >> of references and reference-counted resources. AFAIK all modern JS >> runtimes are JITed which probably does not mesh well with refcounting. >> >> In any case, V8 (Chrome's runtime) uses a stop-the-world generational >> GC for sure[0], > > And Chrome uses one *process* for each tab, right? Is there a reason Chrome > does not use one thread for each tab, such as security? Stability and security. If something goes wrong/rogue, the effects are reasonably isolated to the individual tab in question. And they can use OS resource/privilege limiting APIs to lock down these processes as much as possible. Cheers, Chris From ericsnowcurrently at gmail.com Thu Feb 9 20:25:45 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 9 Feb 2012 12:25:45 -0700 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 12:16 PM, Giampaolo Rodolà wrote: > I bet a lot of people don't want to upgrade for another reason: unicode. > The impression I got is that python 3 forces the user to use and > *understand* unicode and a lot of people simply don't want to deal > with that. > In python 2 there was no such strong imposition. > Python 2's string type acting both as bytes and as text was certainly > ambiguous and "impure" on different levels, and changing that was > definitely a win in terms of purity and correctness.
> I bet most advanced users are happy with this change. > On the other hand, the average Python 2 user was free to ignore that > distinction even if that meant having subtle bugs hidden somewhere in > his/her code. > I think this aspect shouldn't be underestimated. Isn't that more accurate for framework writers, rather than for "average" users? How often do average users have to address encoding/decoding in Python 3? -eric From sturla at molden.no Thu Feb 9 20:25:48 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 20:25:48 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> Message-ID: <4F341DBC.5010609@molden.no> On 09.02.2012 20:05, Guido van Rossum wrote: > I'm curious about the module loader problem. Did someone ever analyze > the cause and come up with a fix? Is it the import lock? Maybe it's > something for the bug tracker. See this: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html The offender is actually imp.find_module, which results in a huge number of failed open() calls when used concurrently from many processes. So a solution is to have one process locate the modules and then broadcast their location to the other processes. There is even a paper on the issue. Here they suggest that importing from a ramdisk might work on IBM blue gene, but not on Cray. http://www.cs.uoregon.edu/Research/paracomp/papers/iccs11/iccs_paper_final.pdf Another solution might be to use sys.meta_path to bypass imp.find_module: http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059813.html The best solution would of course be to fix imp.find_module so it scales properly.
Sturla From masklinn at masklinn.net Thu Feb 9 20:27:22 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 20:27:22 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3419B4.6010802@molden.no> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F3419B4.6010802@molden.no> Message-ID: On 2012-02-09, at 20:08 , Sturla Molden wrote: > On 09.02.2012 19:50, Masklinn wrote: >> I don't think I've seen a serious refcounted JS implementation in the last >> decade, although it is possible that JS runtimes have localized usage >> of references and reference-counted resources. AFAIK all modern JS >> runtimes are JITed which probably does not mesh well with refcounting. >> >> In any case, V8 (Chrome's runtime) uses a stop-the-world generational >> GC for sure[0], > > And Chrome uses one *process* for each tab, right? Is there a reason Chrome does not use one thread for each tab, such as security? I do not know the precise reasons, no, but it probably has to do with security and ensuring isolation, yes (webpage semantics mandate that each page gets its very own isolated javascript execution context) >> Only because it's OS threads of course, Erlang is not evented and has no >> problem spawning half a million (preempted) processes if there's RAM >> enough to store them. > > Actually, spawning half a million OS threads will burn the computer. > > *POFF* > > ... and it goes up in a ball of smoke. > > Spawning half a million threads is the Windows equivalent of a fork bomb. > > I think you confuse threads and fibers/coroutines. No. You probably misread my comment somehow.
From sturla at molden.no Thu Feb 9 20:30:09 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 20:30:09 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> Message-ID: <4F341EC1.6060004@molden.no> On 09.02.2012 20:19, Matt Joiner wrote: >> The GIL annoys those who have learned to expect threading.Thread for CPU >> bound concurrency in advance -- which typically means prior experience with >> Java. Python threads are fine for their intended use -- e.g. I/O and >> background tasks in a GUI. > > Even for that purpose they're too heavy. The GIL conflicts, and > boilerplate overhead spawning threads is obscene for more than trivial > cases. In which case you want to use I/O completion ports on Windows. (And they scale equally well from Python.) Sturla From anacrolix at gmail.com Thu Feb 9 20:31:58 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 10 Feb 2012 03:31:58 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: > Isn't that more accurate for framework writers, rather than for > "average" users? How often do average users have to address > encoding/decoding in Python 3? Constantly. As a Python noob trying Python 3, it was the first wall I encountered. I had to learn Unicode right then and there. Fortunately, the Python docs HOWTO on Unicode is excellent.
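[Editor's note] The bytes/text split this sub-thread keeps circling can be shown in a few lines of Python 3; the Python 2 behaviour it replaced is only paraphrased in the comments:

```python
# Python 3 draws a hard line between text (str) and raw bytes.
text = "café"                  # str: a sequence of Unicode code points
raw = text.encode("utf-8")     # bytes: b'caf\xc3\xa9'
assert raw.decode("utf-8") == text

# Mixing the two without an explicit encode/decode is an immediate
# TypeError. Python 2 would instead attempt a silent ASCII conversion
# and only raise later, on the first non-ASCII input that slipped in.
try:
    _ = text + raw
except TypeError as exc:
    print("refused:", exc)
```

That early, explicit failure is the trade-off Guido describes below: more encode/decode calls up front, but errors that can actually be fixed where they occur.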
From stefan_ml at behnel.de Thu Feb 9 20:32:03 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 09 Feb 2012 20:32:03 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F34149F.5020909@molden.no> References: <4F3410DF.30602@molden.no> <4F34149F.5020909@molden.no> Message-ID: Sturla Molden, 09.02.2012 19:46: > On 09.02.2012 19:36, Guido van Rossum wrote: > >> I don't know much of this area, but maybe this is something where a >> dynamic installer (along the lines of easy_install) might actually be handy? > > That is what NumPy and SciPy do on Windows. But it also means the > "superpack" installer is a very big download. I think this is an area where distributors can best play their role. If you want Python to include SciPy, go and ask Enthought. If you also want an operating system with it, go and ask Debian or Canonical. Or macports, if you prefer paying for your apples instead. Stefan From pydanny at gmail.com Thu Feb 9 20:34:57 2012 From: pydanny at gmail.com (Daniel Greenfeld) Date: Thu, 9 Feb 2012 11:34:57 -0800 Subject: [Python-ideas] Python-ideas Digest, Vol 63, Issue 23 In-Reply-To: References: Message-ID: > 1. Re: Python 3000 TIOBE -3% (Massimo Di Pierro) > 2. Re: Python 3000 TIOBE -3% (Guido van Rossum) > 3. Re: Python 3000 TIOBE -3% (Sturla Molden) > 4. Re: Python 3000 TIOBE -3% (Guido van Rossum) > Date: Thu, 9 Feb 2012 12:25:18 -0600 > From: Massimo Di Pierro > To: Steven D'Aprano > Cc: python-ideas >> Massimo Di Pierro wrote: >>> Here is another data point: >>> http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/ >>> Unfortunately the TIOBE index does matter. I can speak for python >>> in education and trends I've seen. >>> Python is and remains the easiest language to teach but it is no >>> longer true that getting Python to run is easier than alternatives >>> (not for the average undergrad student). >> >> Is that a commentary on Python, or the average undergrad student?
> I teach so the average student is my benchmark. Please do not > misunderstand. While some may be lazy, the average CS undergrad is > not stupid but quite intelligent. They just do not like wasting time > with setups and I sympathize with that. Batteries included is the > Python motto. I'm going to delurk from this list and really back up Massimo here. It's not precisely his issue, but it's close enough to count. While we love our Linux and BSD variants, and OS X usage is growing, the truth of the matter is that the clear majority of people learning Python at the entry level do so on Windows. And I can assure you, having attended many of the tutorials given by PyLadies and other groups, the part that took the most time was ensuring a correct installation on Windows. It's not just a matter of getting the installation onto the machine, it's a matter of making sure the paths are set correctly so they can follow code examples trivially. In fact, at PyLadies tutorial events they would literally give special party hats to teachers who could get Python running under ideal conditions under Windows. And still it ate a lot of time and caused frustration. Frustration that gets shared with management and other people. I'm well aware that this matter of installation has been 'addressed'. There is a complex PEP to handle different version installs on Windows. I can go and click on a small link on the home page of python.org and download a one-click installer that DOESN'T set up Windows paths. I can follow instructions 'somewhere' and get the paths set up, but I shouldn't have to. Students should be able to one-click Python and have it just work. Yet, for all the times I've been told it's fixed or we've complained about it and been told "It's getting fixed!", it is still an ongoing problem. If I were a Windows developer I would fix it today.
So perhaps this should become a GSOC project of high priority: a one-click install of a version of Python on Windows, hosted by python.org. Note: People suggest virtual machines or Vagrant. This works on new machines, but you try getting any of that working on an old Windows machine in a room of 40-100 students waiting on installation. Providing laptops is also completely out of budget for most of these events. In order to make this issue as clear as possible, I'm going to quote Audrey Roy: "The number one thing that Python educators struggle with on entry level tutorials is Windows installations of Python. Ask me, ask Zed Shaw, ask any of the PyLadies." -- 'Knowledge is Power' Daniel Greenfeld http://pydanny.blogspot.com From ctb at msu.edu Thu Feb 9 20:36:35 2012 From: ctb at msu.edu (C. Titus Brown) Date: Thu, 9 Feb 2012 11:36:35 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> Message-ID: <20120209193635.GE9836@idyll.org> On Fri, Feb 10, 2012 at 03:19:36AM +0800, Matt Joiner wrote: > > The GIL annoys those who have learned to expect threading.Thread for CPU > > bound concurrency in advance -- which typically means prior experience with > > Java. Python threads are fine for their intended use -- e.g. I/O and > > background tasks in a GUI. > > Even for that purpose they're too heavy. The GIL conflicts, and > boilerplate overhead spawning threads is obscene for more than trivial > cases. The GIL is almost entirely a PR issue. In actual practice, it is so great (simple, straightforward, functional) I believe that it is a sign of Guido's time machine-enabled foresight. --titus -- C. Titus Brown, ctb at msu.edu From ctb at msu.edu Thu Feb 9 20:37:43 2012 From: ctb at msu.edu (C.
Titus Brown) Date: Thu, 9 Feb 2012 11:37:43 -0800 Subject: [Python-ideas] Python-ideas Digest, Vol 63, Issue 23 In-Reply-To: References: Message-ID: <20120209193743.GB28383@idyll.org> On Thu, Feb 09, 2012 at 11:34:57AM -0800, Daniel Greenfeld wrote: > > 1. Re: Python 3000 TIOBE -3% (Massimo Di Pierro) > > 2. Re: Python 3000 TIOBE -3% (Guido van Rossum) > > 3. Re: Python 3000 TIOBE -3% (Sturla Molden) > > 4. Re: Python 3000 TIOBE -3% (Guido van Rossum) > > Date: Thu, 9 Feb 2012 12:25:18 -0600 > > From: Massimo Di Pierro > > To: Steven D'Aprano > > Cc: python-ideas > > >> Massimo Di Pierro wrote: > >>> Here is another data point: > >>> http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/ > >>> Unfortunately the TIOBE index does matter. I can speak for python > >>> in education and trends I've seen. > >>> Python is and remains the easiest language to teach but it is no > >>> longer true that getting Python to run is easier than alternatives > >>> (not for the average undergrad student). > >> > >> Is that a commentary on Python, or the average undergrad student? > > > > I teach so the average student is my benchmark. Please do not > > misunderstand. While some may be lazy, the average CS undergrad is > > not stupid but quite intelligent. They just do not like wasting time > > with setups and I sympathize with that. Batteries included is the > > Python motto. > > I'm going to delurk from this list and really back up Massimo here. > It's not precisely his issue, but it's close enough to count. > > While we love our Linux and BSD variants, and OS X usage is growing, > the truth of the matter is that the clear majority of people learning > Python at the entry level do so on Windows. And I can assure you, > having attended many of the tutorials given by PyLadies and other > groups, the part that took the most time was ensuring a > correct installation on Windows.
It's not just a matter of getting the > installation onto the machine, it's a matter of making sure the paths > are set correctly so they can follow code examples trivially. +inf. --titus From jimjjewett at gmail.com Thu Feb 9 20:39:10 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 9 Feb 2012 14:39:10 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3419B4.6010802@molden.no> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F3419B4.6010802@molden.no> Message-ID: On Thu, Feb 9, 2012 at 2:08 PM, Sturla Molden wrote: > And Chrome uses one *process* for each tab, right? Supposedly. If you click the wrench, then select Tools/Task Manager, it looks like there are actually several tabs/process (at least if you have enough tabs), but there can easily be several processes controlling separate tabs within the same window. > Is there a reason Chrome > does not use one thread for each tab, such as security? That too, but the reason they documented when introducing Chrome was for stability. I can say that Chrome often warns me that a selection of tabs[1] appears to be stopped, and asks if I want to kill them; it more often appears to freeze -- but switching to a different tab is usually effective in getting some response, while I wait the issue out. [1] Not sure if the selection is exactly equal to those handled by a single process, but it seems so. -jJ From guido at python.org Thu Feb 9 20:39:58 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 11:39:58 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 11:31 AM, Matt Joiner wrote: > > Isn't that more accurate for framework writers, rather than for > > "average" users? How often do average users have to address > > encoding/decoding in Python 3? > > Constantly. As a Python noob I tried Python 3 it was the first wall I > encountered. 
I had to learn Unicode right then and there. Fortunately, > the Python docs HOWTO on Unicode is excellent. > The difference is that *if* you hit a Unicode error in 2.x, you're done for. Even understanding Unicode doesn't help. In 3.x, you will hit Unicode problems less frequently than in 2.x, and when you do, the problem can actually be overcome, and then your code is better. In 2.x, the typical solution, when there *is* a solution, involves making your code messier and sending up frequent prayers to the gods of Unicode. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Feb 9 20:40:40 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 11:40:40 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F341D22.4020706@molden.no> References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> <4F341D22.4020706@molden.no> Message-ID: Please do file an upstream bug for this. On Thu, Feb 9, 2012 at 11:23 AM, Sturla Molden wrote: > On 09.02.2012 20:05, Guido van Rossum wrote: > > I'm curious about the module loader problem. Did someone ever analyze >> the cause and come up with a fix? Is it the import lock? Maybe it's >> something for the bug tracker. >> > > See this: > > http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html > > The offender is actually imp.find_module, which results in a huge number of > failed open() calls when used concurrently from many processes. > > So a solution is to have one process locate the modules and then broadcast > their location to the other processes. > > There is even a paper on the issue. Here they suggest importing from > ramdisk might work on IBM blue gene, but not on Cray.
> > http://www.cs.uoregon.edu/Research/paracomp/papers/iccs11/iccs_paper_final.pdf > > Another solution might be to use sys.meta_path to bypass imp.find_module: > > http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059813.html > > The best solution would of course be to fix imp.find_module so it scales > properly. > > > Sturla > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Thu Feb 9 20:42:40 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 9 Feb 2012 20:42:40 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120209193635.GE9836@idyll.org> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> <20120209193635.GE9836@idyll.org> Message-ID: On 2012-02-09, at 20:36 , C. Titus Brown wrote: > On Fri, Feb 10, 2012 at 03:19:36AM +0800, Matt Joiner wrote: >>> The GIL annoys those who have learned to expect threading.Thread for CPU >>> bound concurrency in advance -- which typically means prior experience with >>> Java. Python threads are fine for their intended use -- e.g. I/O and >>> background tasks in a GUI. >> >> Even for that purpose they're too heavy. The GIL conflicts, and >> boilerplate overhead spawning threads is obscene for more than trivial >> cases. > > The GIL is almost entirely a PR issue. In actual practice, it is so great > (simple, straightforward, functional) I believe that it is a sign of Guido's > time machine-enabled foresight. I'm not sure dabeaz would agree with you if he intervened in the discussion.
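[Editor's note] The dabeaz-style observation behind this exchange is easy to reproduce: two CPU-bound pure-Python threads take about as long as, and often longer than, doing the same work serially, because the GIL lets only one thread execute bytecode at a time. A rough self-contained timing sketch (the exact numbers vary by machine, and on a GIL-less free-threaded build the comparison would look different):

```python
import threading
import time

def countdown(n):
    # Pure-Python CPU work; under the GIL, two threads running this
    # cannot execute bytecode simultaneously.
    while n:
        n -= 1

N = 5_000_000

t0 = time.perf_counter()
countdown(N)
countdown(N)
serial = time.perf_counter() - t0

t0 = time.perf_counter()
threads = [threading.Thread(target=countdown, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - t0

print(f"serial: {serial:.2f}s  two threads: {threaded:.2f}s")
```

Replace `countdown` with a blocking I/O call and the threaded version wins handily, which is the "threads are fine for their intended use" point made above.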
From guido at python.org Thu Feb 9 20:42:57 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 11:42:57 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> Message-ID: On Thu, Feb 9, 2012 at 11:19 AM, Matt Joiner wrote: > > The GIL annoys those who have learned to expect threading.Thread for CPU > > bound concurrency in advance -- which typically means prior experience > with > > Java. Python threads are fine for their intended use -- e.g. I/O and > > background tasks in a GUI. > > Even for that purpose they're too heavy. The GIL conflicts, and > boilerplate overhead spawning threads is obscene for more than trivial > cases. I'd actually say that using OS threads is too heavy *specifically* for trivial cases. If you spawn a thread to add two numbers you'll have a huge overhead. If you spawn a thread to do something significant, the overhead doesn't matter much. Note that even in Java, everyone uses thread pools to reduce thread creation overhead. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwm at mired.org Thu Feb 9 20:43:27 2012 From: mwm at mired.org (Mike Meyer) Date: Thu, 9 Feb 2012 11:43:27 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <20120209104237.154be949@bhuda.mired.org> Message-ID: <20120209114327.0d262bd1@bhuda.mired.org> On Fri, 10 Feb 2012 03:16:00 +0800 Matt Joiner wrote: > > If threading is the only acceptable concurrency mechanism, then Python > > is the wrong language to use. But you're also not building scaleable > > systems, which is most of where it really matters. If you're willing > > to consider things other than threading - and you have to if you want > > to build scaleable systems - then Python makes a good choice. 
> Yes but core Python doesn't have any other true concurrency mechanisms > other than native threading, and they're too heavyweight for this > purpose alone. On top of this they're useless for Python-only > parallelism. Huh? Core Python has concurrency mechanisms other than native threading. I don't know what your purpose is, but for mine (building horizontally scaleable systems of various types), they work fine. They're much easier to design with and maintain than using threads as well. They also work well in Python-only systems. If you're using "true" to exclude anything but threading, then you're just playing word games. The reality is that most problems don't need threading. The only thing it buys you over the alternatives is easy shared memory. Very few problems actually require that. > > Personally, I'd like to see a modern threading model in Python, > > especially if its tools can be extended to work with other > > concurrency mechanisms. But that's a *long* way into the future. > Too far. It needs to be now. The downward spiral is already beginning. > Mobile phones are going multicore. My next desktop will probably have > 8 cores or more. All the heavyweight languages are firing up > thread/STM standardizations and implementations to make this stuff > more performant and easier than it already is. Yes, Python needs something like that. You can't have it without breaking backwards compatibility. It's not clear you can have it without serious performance hits in Python's primary use area, which is single-threaded scripts. Which means it's probably a Python 4K feature. There have been a number of discussions on python-ideas about this. I submitted a proto-pep that covers most of that to python-dev for further discussion and approval. I'd suggest you chase those things down. > > That said, it's perfectly reasonable to suggest changes you think will > > improve the popularity of the language.
But be prepared to show that > they're actually good, as opposed to merely possibly popular. > This doesn't apply to "enabling" features. Features that make it > possible for popular stuff to happen. Concurrency isn't popular, but > parallelism is. At least where the GIL is concerned, a good > alternative concurrency mechanism doesn't exist. (The popular one is > native threading). No, the process needs to apply to *all* changes. Even changes to implementation details - like removing the GIL. If your implementation that removes the GIL causes a 50% slowdown in single-threaded Python code, it ain't gonna happen. But until you actually propose a change, it won't matter. Nothing's going to happen until someone actually does something more than talk about it. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From guido at python.org Thu Feb 9 20:43:41 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 11:43:41 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F3419B4.6010802@molden.no> Message-ID: On Thu, Feb 9, 2012 at 11:39 AM, Jim Jewett wrote: > On Thu, Feb 9, 2012 at 2:08 PM, Sturla Molden wrote: > > > And Chrome uses one *process* for each tab, right? > Can we stop discussing Chrome here? It doesn't really matter. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sturla at molden.no Thu Feb 9 20:44:23 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 20:44:23 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> <4F341EC1.6060004@molden.no> Message-ID: <4F342217.4080109@molden.no> On 09.02.2012 20:34, Matt Joiner wrote: > Linux user here. I'm not sure that IOCP solves the I/O concurrency > issue anyway, it's just as convoluted as polling, from memory. On Linux, processes are so lightweight that you can fork (os.fork) instead of spawning threads. Threads are typically needed for Java, Solaris and Windows, where forking is either slow or not possible. But if you need really scalable I/O on Linux, consider select/poll or epoll. And on FreeBSD and Mac there is kqueue. Sturla From mwm at mired.org Thu Feb 9 20:48:08 2012 From: mwm at mired.org (Mike Meyer) Date: Thu, 9 Feb 2012 11:48:08 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F341DBC.5010609@molden.no> References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> <4F341DBC.5010609@molden.no> Message-ID: <20120209114808.07220f5a@bhuda.mired.org> On Thu, 09 Feb 2012 20:25:48 +0100 Sturla Molden wrote: > The offender is actually imp.find_module, which results in a huge number > of failed open() calls when used concurrently from many processes. Ah, I see why I never ran into it. I build systems that start by loading all the modules they need, then fork()ing many processes from that parent. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From stefan_ml at behnel.de Thu Feb 9 20:53:55 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 09 Feb 2012 20:53:55 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120209185810.GC20556@mcnabbs.org> References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> Message-ID: Andrew McNabb, 09.02.2012 19:58: > On Thu, Feb 09, 2012 at 10:44:42AM -0800, Guido van Rossum wrote: >> I am guessing in part that's a function of resistance to change, and in >> part it means PyPy hasn't gotten enough mindshare yet. (Raise your hand if >> you have PyPy installed on one of your systems. Raise your hand if you use >> it. Raise your hand if you are a PyPy contributor. :-) > > I don't know if you actually want replies, but I'll bite. I have pypy > installed (from the standard Fedora pypy package), and for a particular > project it provided a 20x speedup. I'm not a PyPy contributor, but I'm > a believer. > > I would use PyPy everywhere if it worked with Python 3 and scipy. AFAIK, there is no concrete roadmap towards supporting SciPy on top of PyPy. Currently, PyPy is getting its own implementation of NumPy-like arrays, but there is no interaction with anything in the SciPy world outside of those. Given the sheer size of SciPy, reimplementing it on top of numpypy is unrealistic. That being said, it's quite possible to fire up CPython from PyPy (or vice versa) and interact with that, if you really need both PyPy and SciPy. It even seems to be supported through multiprocessing. I find that pretty cool. http://thread.gmane.org/gmane.comp.python.pypy/9159/focus=9161 Stefan From ctb at msu.edu Thu Feb 9 20:57:57 2012 From: ctb at msu.edu (C.
Titus Brown) Date: Thu, 9 Feb 2012 11:57:57 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> <20120209193635.GE9836@idyll.org> Message-ID: <20120209195757.GA30150@idyll.org> On Thu, Feb 09, 2012 at 08:42:40PM +0100, Masklinn wrote: > On 2012-02-09, at 20:36 , C. Titus Brown wrote: > > On Fri, Feb 10, 2012 at 03:19:36AM +0800, Matt Joiner wrote: > >>> The GIL annoys those who have learned to expect threading.Thread for CPU > >>> bound concurrency in advance -- which typically means prior experience with > >>> Java. Python threads are fine for their intended use -- e.g. I/O and > >>> background tasks in a GUI. > >> > >> Even for that purpose they're too heavy. The GIL conflicts, and > >> boilerplate overhead spawning threads is obscene for more than trivial > >> cases. > > > > The GIL is almost entirely a PR issue. In actual practice, it is so great > > (simple, straightforward, functional) I believe that it is a sign of Guido's > > time machine-enabled foresight. > > I'm not sure dabeaz would agree with you if he intervened in the discussion. Are we scheduling interventions for me now? 'cause there's a lot of people who want to jump in that queue :) dabeaz understands this stuff at a deeper level than me, which is often a handicap in these kinds of discussions, IMO. (He's also said that he prefers message passing to threading.) The point is that in terms of actually making my own libraries and parallelizing code, the GIL has been very straightforward, cross platform, and quite simple for understanding the consequences of a fairly wide range of multithreading models. Most people want to go do inappropriately complex things ("ooh! threads! shiny!") with threads and then fail to write robust code or understand the scaling of their code; I think the GIL does a fine job of blocking the simplest stupidities. 
Anyway, I love the GIL myself, although I think there is a great opportunity for a richer & more usable mid-level C API for both thread states and interpreters. cheers, --titus -- C. Titus Brown, ctb at msu.edu From sturla at molden.no Thu Feb 9 21:03:03 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 09 Feb 2012 21:03:03 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120209114808.07220f5a@bhuda.mired.org> References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> <4F341DBC.5010609@molden.no> <20120209114808.07220f5a@bhuda.mired.org> Message-ID: <4F342677.2070005@molden.no> On 09.02.2012 20:48, Mike Meyer wrote: > Ah, I see why I never ran into it. I build systems that start by > loading all the modules they need, then fork()ing many processes from > that parent. Yes, but that would not work with MPI (e.g. mpi4py) where the MPI runtime (e.g. MPICH2) is starting the Python processes. Theoretically the issue should be present on Windows when using multiprocessing, but not on Linux as multiprocessing is using os.fork. Sturla From stefan_ml at behnel.de Thu Feb 9 21:05:14 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 09 Feb 2012 21:05:14 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> <20120209193635.GE9836@idyll.org> Message-ID: Masklinn, 09.02.2012 20:42: > On 2012-02-09, at 20:36 , C. Titus Brown wrote: >> On Fri, Feb 10, 2012 at 03:19:36AM +0800, Matt Joiner wrote: >>>> The GIL annoys those who have learned to expect threading.Thread for CPU >>>> bound concurrency in advance -- which typically means prior experience with >>>> Java. Python threads are fine for their intended use -- e.g. I/O and >>>> background tasks in a GUI. >>> >>> Even for that purpose they're too heavy.
The GIL conflicts, and >>> boilerplate overhead spawning threads is obscene for more than trivial >>> cases. >> >> The GIL is almost entirely a PR issue. In actual practice, it is so great >> (simple, straightforward, functional) I believe that it is a sign of Guido's >> time machine-enabled foresight. > > I'm not sure dabeaz would agree with you if he intervened in the discussion. That's an implementation detail, though. Stefan From stephen at xemacs.org Thu Feb 9 21:14:54 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 10 Feb 2012 05:14:54 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> Message-ID: <874nuzyda9.fsf@uwakimon.sk.tsukuba.ac.jp> Massimo Di Pierro writes: > > On Feb 9, 2012, at 12:03 PM, Steven D'Aprano wrote: > > > Massimo Di Pierro wrote: > >> Here is another data point: > >> http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/ > >> Unfortunately the TIOBE index does matter. I can speak for python > >> in education and trends I seen. Well, maybe you should teach your students the rudiments of lying, erm, "statistics". That -3% on the TIOBE index is a steaming heap of FUD, as Anatoly himself admitted. Feb 2011 is clearly above trend, Feb 2012 below it. Variables vary, OK? So at the moment it is absolutely unclear whether Python's trend line has turned down or even decreased slope. And the RedMonk ranking shows Python at the very top. > Don't shoot the messenger please. > > You can dismiss or address the problem. Anyway... undergrads do care > because they will take 4 years to grade and they do not want to come > out with obsolete skills. Our undergrads learn Python, Ruby, Java, > Javascript and C++. Maybe they should learn something about reality of the IT industry, too. 
According to the TIOBE survey, COBOL and PL/1 are in the same class (rank 51-100, basically indistinguishable) with POSIX shell. Old programming languages never die ... and experts in them only become more valuable with time. Python skills will hardly become "obsolete" in the next decade, certainly not in the next 4 years. You say "dismiss or address the problem." Is there a problem? I dunno. Popularity is nice, but I really don't know if I would want to use a Python that spent the next five years (because that's what it will take) fixing what ain't broke to conform to undergraduate misconceptions. Sure, it would be nice to have more robust support for installing non-stdlib modules such as numpy. But guess what? That's a hard nut to crack, and more, people have been working quite hard on the issue for a while. The distutils folks seem to be about to release at this point -- I guess the Time Machine has struck again! And by the way, which of Ruby, Java, Javascript, and C++ provides something like numpy that's easier to install? Preferably part of their stdlib? In my experience on Linux and Mac, at least, numerical code has always been an issue, whether it's numpy (once that I can remember, and that was because of some dependency which wouldn't build, not numpy itself), Steel Bank Common Lisp, ATLAS, R, .... The one thing that bothers me about the picture at TIOBE is the Objective-C line. I assume that's being driven by iPhone and iPad apps, and I suppose Java is being driven in part by Android. It's too bad Python can't get a piece of that action! From raymond.hettinger at gmail.com Thu Feb 9 21:16:45 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 9 Feb 2012 12:16:45 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions?
In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: <0862C9C6-A2AD-4779-BFEC-69304ECFD5C1@rcn.com> On Feb 9, 2012, at 9:50 AM, Daniel Stutzbach wrote: > Maintaining a sorted list using Python's list type is a trap. The bisect is O(log n), but insertion and deletion are still O(n). > > A SortedList class that provides O(log n) insertions is useful from time to time. There are several existing implementations available (I wrote one of them, on top of my blist type), each with their pros and cons. I concur. People who want to maintain sorted collections (periodically adding and deleting items) are far better-off using your blist, a binary tree, or an in-memory sqlite database. Otherwise, we will have baited them into a non-scalable O(n) solution. Unfortunately, when people see the word "bisect", they will presume they've got O(log n) code. We've documented the issue with bisect.insort(), but the power of suggestion is very strong. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From ehlesmes at gmail.com Thu Feb 9 21:20:18 2012 From: ehlesmes at gmail.com (Edward Lesmes) Date: Thu, 9 Feb 2012 15:20:18 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <874nuzyda9.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <874nuzyda9.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: It's too bad Python can't get a piece of that action! Indeed. On Thu, Feb 9, 2012 at 3:14 PM, Stephen J. Turnbull wrote: > Massimo Di Pierro writes: > > > > On Feb 9, 2012, at 12:03 PM, Steven D'Aprano wrote: > > > > > Massimo Di Pierro wrote: > > >> Here is another data point: > > >> http://redmonk.com/sogrady/2012/02/08/language-rankings-2-2012/ > > >> Unfortunately the TIOBE index does matter. I can speak for python > > >> in education and trends I seen. 
> > Well, maybe you should teach your students the rudiments of lying, > erm, "statistics". That -3% on the TIOBE index is a steaming heap of > FUD, as Anatoly himself admitted. Feb 2011 is clearly above trend, > Feb 2012 below it. Variables vary, OK? So at the moment it is > absolutely unclear whether Python's trend line has turned down or even > decreased slope. > > And the RedMonk ranking shows Python at the very top. > > > Don't shoot the messenger please. > > > > You can dismiss or address the problem. Anyway... undergrads do care > > because they will take 4 years to grade and they do not want to come > > out with obsolete skills. Our undergrads learn Python, Ruby, Java, > > Javascript and C++. > > Maybe they should learn something about reality of the IT industry, > too. According to the TIOBE survey, COBOL and PL/1 are in the same > class (rank 51-100, basically indistinguishable) with POSIX shell. > Old programming languages never die ... and experts in them only > become more valuable with time. Python skills will hardly become > "obsolete" in the next decade, certainly not in the next 4 years. > > You say "dismiss or address the problem." Is there a problem? I > dunno. Popularity is nice, but I really don't know if I would want to > use a Python that spent the next five years (because that's what it > will take) fixing what ain't broke to conform to undergraduate > misconceptions. > > Sure, it would be nice have more robust support for installing > non-stdlib modules such as numpy. But guess what? That's a hard nut > to crack, and more, people have been working quite hard on the issue > for a while. The distutils folks seem to be about to release at this > point -- I guess the Time Machine has struck again! > > And by the way, which of Ruby, Java, Javascript, and C++ provides > something like numpy that's easier to install? Preferably part of > their stdlib? 
In my experience on Linux and Mac, at least, numerical > code has always been an issue, whether it's numpy (once that I can > remember, and that was because of some dependency which wouldn't > build, not numpy itself), Steel Bank Common Lisp, ATLAS, R, .... > > The one thing that bothers me about the picture at TIOBE is the > Objective-C line. I assume that's being driven by iPhone and iPad > apps, and I suppose Java is being driven in part by Android. It's too > bad Python can't get a piece of that action! -- Edward Lesmes From raymond.hettinger at gmail.com Thu Feb 9 21:34:47 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 9 Feb 2012 12:34:47 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> Message-ID: <2E5BF6F5-6A23-45D9-BF65-36D936925543@gmail.com> On Feb 9, 2012, at 9:48 AM, Guido van Rossum wrote: > The more fundamental "conflict" here seems to be between algorithms and classes. list.sort(), bisect and heapq focus on the algorithm. Bisect in particular had way too much focus on the algorithm. The API is awkward and error-prone for many common use cases. I've tried to remedy that through documenting how to implement the common use cases: http://docs.python.org/py3k/library/bisect.html#searching-sorted-lists The issue is that the current API focuses on "insertion points" rather than on finding values. Unfortunately, this API is very old, so the only way to fix it is to introduce a new class. If we introduced a class around a sorted sequence, then we could make a reasonable API that corresponds to what people usually want to do with sorted sequences.
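The recipes documented on that page wrap the raw insertion-point functions into value lookups; a small sketch along those lines:

```python
from bisect import bisect_left

def index(a, x):
    "Locate the leftmost value exactly equal to x."
    i = bisect_left(a, x)
    if i != len(a) and a[i] == x:
        return i
    raise ValueError

def find_lt(a, x):
    "Find the rightmost value less than x."
    i = bisect_left(a, x)
    if i:
        return a[i - 1]
    raise ValueError

def find_ge(a, x):
    "Find the leftmost value greater than or equal to x."
    i = bisect_left(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

grades = [33, 60, 75, 75, 90]
index(grades, 75)    # → 2 (leftmost position of 75)
find_lt(grades, 60)  # → 33
find_ge(grades, 76)  # → 90
```

Each lookup is O(log n); it is only insertion via insort that stays O(n), which is the separate scalability problem discussed in this thread.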
Of course, that still leaves the issue with an O(n) insort. As Daniel pointed-out, a list is not the correct underlying data structure if you want to do periodic insertions and deletions. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Thu Feb 9 21:42:54 2012 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 10 Feb 2012 07:42:54 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> Message-ID: On 10 February 2012 06:06, Guido van Rossum wrote: > On Thu, Feb 9, 2012 at 10:58 AM, Andrew McNabb wrote: > >> On Thu, Feb 09, 2012 at 10:44:42AM -0800, Guido van Rossum wrote: >> > I am guessing in part that's a function of resistance to change, and in >> > part it means PyPy hasn't gotten enough mindshare yet. (Raise your hand >> if >> > you have PyPy installed on one of your systems. Raise your hand if you >> use >> > it. Raise your hand if you are a PyPy contributor. :-) >> >> I don't know if you actually want replies, but I'll bite. I have pypy >> installed (from the standard Fedora pypy package), and for a particular >> project it provided a 20x speedup. I'm not a PyPy contributor, but I'm >> a believer. >> >> I would use PyPy everywhere if it worked with Python 3 and scipy. My >> apologies if this was just a rhetorical question. :) > > > Thanks for replying, it was not a rhetorical question. It's something I'm > considering asking during my keynote at PyCon next month. > In that case ... - I have various versions of PyPy installed (regularly pull the latest working Windows build); - I use it occasionally, but most of my Python work ATM is Google App Engine-based, and the GAE SDK doesn't work with PyPy; - I'm not a PyPy contributor, but am also a believer - I definitely think that PyPy is the future and should be the base for Python4K. 
- I won't be at PyCon. Cheers, Tim Delaney From ncoghlan at gmail.com Thu Feb 9 22:05:32 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2012 07:05:32 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> Message-ID: On Fri, Feb 10, 2012 at 5:19 AM, Matt Joiner wrote: >> The GIL annoys those who have learned to expect threading.Thread for CPU >> bound concurrency in advance -- which typically means prior experience with >> Java. Python threads are fine for their intended use -- e.g. I/O and >> background tasks in a GUI. > > Even for that purpose they're too heavy. The GIL conflicts, and > boilerplate overhead spawning threads is obscene for more than trivial > cases. Have you even *tried* concurrent.futures (http://docs.python.org/py3k/library/concurrent.futures)? Or the 2.x backport on PyPI (http://pypi.python.org/pypi/futures)? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Thu Feb 9 22:20:48 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 13:20:48 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: <2E5BF6F5-6A23-45D9-BF65-36D936925543@gmail.com> References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> <2E5BF6F5-6A23-45D9-BF65-36D936925543@gmail.com> Message-ID: On Thu, Feb 9, 2012 at 12:34 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > Bisect in particular had way too much focus on the algorithm. The API is > awkward and error-prone for many common use cases.
> > I've tried to remedy that through documenting how to implement the common > use cases: > http://docs.python.org/py3k/library/bisect.html#searching-sorted-lists > > The issue is that the current API focuses on "insertion points" rather > than on finding values. Unfortunately, this API is very old, so the only > way to fix it is to introduce a new class. > > If we introduced a class around a sorted sequence, then we could make a > reasonable API that corresponds to what people usually want to do with > sorted sequences. > > Of course, that still leaves the issue with an O(n) insort. As Daniel > pointed-out, a list is not the correct underlying data structure if you > want to do periodic insertions and deletions. > Maybe you're overanalyzing the problem? It seems what you want would require a PEP and/or a reference implementation that is thoroughly tested as a 3rd party package before it warrants inclusion into the stdlib. In the meantime adding a key= option that echoes the API offered by list.sort() and sorted() is a no-brainer. -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Thu Feb 9 22:34:25 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2012 07:34:25 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 5:25 AM, Eric Snow wrote: > On Thu, Feb 9, 2012 at 12:16 PM, Giampaolo Rodolà wrote: >> I bet a lot of people don't want to upgrade for another reason: unicode. >> The impression I got is that python 3 forces the user to use and >> *understand* unicode and a lot of people simply don't want to deal >> with that. >> In python 2 there was no such a strong imposition. >> Python 2 string type acting both as bytes and as text was certainly >> ambiguous and "impure" on different levels and changing that was >> definitively a win in terms of purity and correctness.
>> I bet most advanced users are happy with this change. >> On the other hand, Python 2 average user was free to ignore that >> distinction even if that meant having subtle bugs hidden somewhere in >> his/her code. >> I think this aspect shouldn't be underestimated. > > Isn't that more accurate for framework writers, rather than for > "average" users? How often do average users have to address > encoding/decoding in Python 3? The problem for average users *right now* is that many of the Unicode handling tools that were written for the blurry "is-it-bytes-or-is-it-text?" 2.x 8-bit str type haven't been ported to 3.x yet. That's currently happening, and the folks doing it are the ones who really have to make the adjustment, and figure out what they can deal with on behalf of their users and what they need to expose (if anything). The idea with Python 3 unicode is to have errors happen at (or at least close to) the point where the code is doing something wrong, unlike the Python 2 implicit conversion model, where either data gets silently corrupted, or you get a Unicode error far from the location that introduced the problem. I actually find it somewhat amusing when people say that python-dev isn't focusing on users enough because of the Python 3 transition or the Windows installer problems. What they *actually* seem to be complaining about is that python-dev isn't focused entirely on users that are native English speakers using an expensive proprietary OS. And that's a valid observation - most of us are here because we like Python and primarily want to make it better for the environments where *we* use it, which is mostly a combination of Linux and Mac users, a few other POSIX based platforms and a small minority of Windows developers.
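Nick's point about errors surfacing at the offending call is easy to demonstrate: Python 3 rejects the implicit str/bytes mixing that Python 2 silently allowed, so the traceback points at the exact boundary that needs an explicit encode or decode.

```python
caught = False
try:
    "café" + b" au lait"        # implicit str/bytes mixing: TypeError in Python 3
except TypeError:
    caught = True               # the error appears right here, not far downstream

data = "café".encode("utf-8")   # encode explicitly on the way out
text = data.decode("utf-8")     # decode explicitly on the way back in
```

In Python 2 the equivalent concatenation would have attempted an implicit ascii decode, either corrupting the data or raising a UnicodeDecodeError somewhere far from this line.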
Given the contrariness of Windows as a target platform, the time of those developers is mostly spent on making it keep working, and bringing it up to feature parity with the POSIX version, so cleaning up the installation process falls to the wayside. (And, for all the cries of, "Python should be better supported on Windows!", we just don't see many Windows devs signing up to help - since I consider developing for Windows its own special kind of hell that I'm happy to never have to do again, it doesn't actually surprise me there's a shortage of people willing to do it as a hobby) In terms of actually *fixing it*, the PSF doesn't generally solicit grant proposals, it reviews (and potentially accepts) them. If anyone is serious about getting something done for 3.3, then *write and submit a grant proposal* to the PSF board with the goal of either finalising the Python launcher for Windows, or else just closing out various improvements to the current installer that are already on the issue tracker (e.g. version numbers in the shortcut names, an option to modify the system PATH). Even without going all the way to a grant proposal, go find those tracker items I mentioned and see if there's anything you can do to help folks like Martin von Loewis, Brian Curtin and Terry Reedy close them out. In the meantime, if the python.org packages for Windows aren't up to scratch (and they aren't in many ways), *use the commercially backed ones* (or one of the other sumo distributions that are out there). Don't tell your students to grab the raw installers directly from python.org, redirect them to the free rebuilds from ActiveState or Enthought, or go all out and get them to install something like Python(X, Y). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From massimo.dipierro at gmail.com Thu Feb 9 22:41:22 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Thu, 9 Feb 2012 15:41:22 -0600 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: <90C8316C-1AB0-4759-B3DF-0FB07477FF08@gmail.com> First of all, all the Python developers are doing an amazing job, and none of the comments should be taken as a critique but only as a suggestion. On Feb 9, 2012, at 3:34 PM, Nick Coghlan wrote: [...] > In the meantime, if the python.org packages for Windows aren't up to > scratch (and they aren't in many ways), *use the commercially backed > ones* (or one of the other sumo distributions that are out there). > Don't tell your students to grab the raw installers directly from > python.org, redirect them to the free rebuilds from ActiveState or > Enthought, or go all out and get them to install something like > Python(X, Y). This is what I do now. I tell my students to go to Enthought if they have trouble. Yet there are issues with the license and with 32-bit (free) vs 64-bit (not free) builds. Long term I do not think this is what we should encourage. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From tjreedy at udel.edu Thu Feb 9 22:51:53 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 16:51:53 -0500 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: <0862C9C6-A2AD-4779-BFEC-69304ECFD5C1@rcn.com> References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> <0862C9C6-A2AD-4779-BFEC-69304ECFD5C1@rcn.com> Message-ID: On 2/9/2012 3:16 PM, Raymond Hettinger wrote: > > On Feb 9, 2012, at 9:50 AM, Daniel Stutzbach wrote: > >> Maintaining a sorted list using Python's list type is a trap. The >> bisect is O(log n), but insertion and deletion are still O(n).
The omitted constants are such that the log n term dominates for 'small' n. list.sort internally uses binary insert sort for n up to 64. It only switches to mergesort for runs of at least 64. >> A SortedList class that provides O(log n) insertions is useful from >> time to time. There are several existing implementations available (I >> wrote one of them, on top of my blist type), each with their pros and >> cons. Are your blist leaves lists (or arrays) of some maximum size? > I concur. People who want to maintain sorted collections (periodically > adding and deleting items) are far better-off using your blist, a binary > tree, or an in-memory sqlite database. Otherwise, we will have baited > them into a non-scalable O(n) solution. Unfortunately, when people see > the word "bisect", they will presume they've got O(log n) code. We've > documented the issue with bisect.insort(), but the power of suggestion > is very strong. Using insort on a list of a million items is definitely not a good idea, but I can see how someone not so aware of scaling issues might be tempted, especially with no stdlib alternative. One could almost be tempted to issue a warning if 'hi' is 'too large'. -- Terry Jan Reedy From stutzbach at google.com Thu Feb 9 23:01:27 2012 From: stutzbach at google.com (Daniel Stutzbach) Date: Thu, 9 Feb 2012 14:01:27 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> <0862C9C6-A2AD-4779-BFEC-69304ECFD5C1@rcn.com> Message-ID: On Thu, Feb 9, 2012 at 1:51 PM, Terry Reedy wrote: > On Feb 9, 2012, at 9:50 AM, Daniel Stutzbach wrote: >> A SortedList class that provides O(log n) insertions is useful from > > time to time. There are several existing implementations available (I >>> wrote one of them, on top of my blist type), each with their pros and >>> cons. >>> >> > Are your blist leaves lists (or arrays) of some maximum size? Yes. Each leaf has at most 128 elements.
It's a compile-time constant. -- Daniel Stutzbach -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Feb 9 23:02:00 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 14:02:00 -0800 Subject: [Python-ideas] Optional key to `bisect`'s functions? In-Reply-To: References: <05E0F324-690E-45A4-8567-BB9BCD226B42@masklinn.net> <0862C9C6-A2AD-4779-BFEC-69304ECFD5C1@rcn.com> Message-ID: On Thu, Feb 9, 2012 at 1:51 PM, Terry Reedy wrote: > Using insort on a list of a millions items is definitely not a good idea, > but I can see how someone not so aware of scaling issues might be tempted, > especially with no stdlib alternative. One could almost be tempted to issue > a warning if 'hi' is 'too large'. > Put it in the docs. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ubershmekel at gmail.com Thu Feb 9 23:24:18 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Fri, 10 Feb 2012 00:24:18 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <90C8316C-1AB0-4759-B3DF-0FB07477FF08@gmail.com> References: <90C8316C-1AB0-4759-B3DF-0FB07477FF08@gmail.com> Message-ID: On Thu, Feb 9, 2012 at 11:41 PM, Massimo Di Pierro < massimo.dipierro at gmail.com> wrote: > First of all all the Python developers are doing an amazing job, and none > of the comments should be taken as a critique but only as a suggestion. > > On Feb 9, 2012, at 3:34 PM, Nick Coghlan wrote: > [...] > > In the meantime, if the python.org packages for Windows aren't up to >> scratch (and they aren't in many ways), *use the commercially backed >> ones* (or one of the other sumo distributions that are out there). >> Don't tell your students to grab the raw installers directly from >> python.org, redirect them to the free rebuilds from ActiveState or >> Enthought, or go all out and get them to install something like >> Python(X, Y). 
>> > > This is what I do now. I tell my students if they have trouble to > Enthought. Yet there are issues with license and 32 (free) vs 64 bits (not > free). Long term I do not think this what we should encourage. > > Concerning the eclipse and the plugin thing - "Aptana" is a nice bundle of pydev with eclipse so it's just one download and you get a nice python IDE with autocompletion etc. Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From amcnabb at mcnabbs.org Thu Feb 9 23:25:40 2012 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Thu, 9 Feb 2012 15:25:40 -0700 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> Message-ID: <20120209222540.GD20556@mcnabbs.org> On Thu, Feb 09, 2012 at 08:53:55PM +0100, Stefan Behnel wrote: > > AFAIK, there is no concrete roadmap towards supporting SciPy on top of > PyPy. Currently, PyPy is getting its own implementation of NumPy-like > arrays, but there is currently no interaction with anything in the SciPy > world outside of those. Given the shear size of SciPy, reimplementing it on > top of numpypy is unrealistic. I understand that there is some hope in getting cython to support pure python and ctypes as a backend, and then to migrate scipy to use cython. This is definitely a long-term solution. Most people don't depend on all of scipy, and for some use cases, it's not too hard to find alternatives. Today I migrated a project from scipy to the GNU Scientific Library (with ctypes). It now works great with PyPy, and I saw a total speedup of 10.6. Dropping from 27 seconds to 2.55 seconds is huge. It's funny, but for a new project I would go to great lengths to try to use the GSL instead of scipy (though I'm sure for some use cases it wouldn't be possible). 
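The ctypes route Andrew describes works the same way against any shared C library. Here is a minimal sketch against libm (the standard C math library) rather than GSL, since GSL's actual symbol names and signatures would need to be checked against its headers:

```python
import ctypes
import ctypes.util

# Locate and load the C math library; fall back to the usual glibc
# soname if find_library comes up empty on this system.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# ctypes assumes int results by default, so declare the real
# C signature before calling: double cos(double).
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

result = libm.cos(0.0)   # 1.0
```

Because ctypes works on both CPython and PyPy, wrappers written this way are exactly the kind of code that benefits from PyPy's JIT, which is what made Andrew's GSL port worthwhile.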
> That being said, it's quite possible to fire up CPython from PyPy (or vice > versa) and interact with that, if you really need both PyPy and SciPy. It > even seems to be supported through multiprocessing. I find that pretty cool. > > http://thread.gmane.org/gmane.comp.python.pypy/9159/focus=9161 That's a fascinating idea that I had never considered. Thanks for sharing. -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From tjreedy at udel.edu Thu Feb 9 23:46:33 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 17:46:33 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> References: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> Message-ID: On 2/9/2012 12:46 PM, Massimo Di Pierro wrote: > I think if easy_install, gevent, numpy (*), and win32 extensions where > included in 3.x, together with a slightly better Idle (still based on I am working on the patches already on the tracker, starting with bug fixes. > Tkinter, with multiple pages, If you mean multiple tabbed pages in one window, I believe there is a patch. autocompletion, IDLE already has 'auto-completion'. If you mean something else, please explain. > collapsible [blocks], line numbers, I have thought about those. > better printing with syntax highlighting), Better basic printing support is really needed. #1528593 Color printing if not possible now would be nice, as color printers are common now. I have no idea if tkinter print support makes either easier now. > and if easy_install were accessible via Idle, this would be a killer version. That should be possible with an extension. > Longer term removing the GIL and using garbage collection should be a > priority. I am not sure what is involved and how difficult it is but As has been discussed here and on pydev, the problems include things like making Python slower and disabling C extensions. 
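The PyPy-to-CPython bridge in the thread Stefan linked works because multiprocessing's connection machinery is essentially pickle over a socket, so the two endpoints need not be the same interpreter. A toy sketch of that plumbing, with both ends running in one interpreter (one in a thread) for brevity:

```python
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"demo"                       # both ends must share this secret

# Port 0 lets the OS pick a free port; listener.address reports it.
listener = Listener(("localhost", 0), authkey=AUTHKEY)

def serve():
    # In the PyPy/CPython scenario this end would run in the other
    # interpreter; anything picklable can cross the connection.
    conn = listener.accept()
    conn.send(sum(conn.recv()))
    conn.close()

server = threading.Thread(target=serve)
server.start()

client = Client(listener.address, authkey=AUTHKEY)
client.send([1, 2, 3])
result = client.recv()                  # the server's reply

client.close()
server.join()
listener.close()
```

In the cross-interpreter case, the serve() side would simply be a small script launched under the other interpreter with the address and authkey passed to it.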
> perhaps this is what PyCon money can be used for. If this cannot be done > without breaking backward compatibility again, then 3.x should be > considered an experimental branch, people should be advised to stay with > 2.7 (2.8?) and then skip to 4.x directly when these problems are For non-Euro-Americans, a major problem with Python 1/2 was the use of ascii for identifiers. This was *fixed* by Python 3. When I went to Japan a couple of years ago and stopped in a general bookstore (like Borders), its computer language section had about 10 books on Python, most in Japanese as I remember. So it is apparently in use there. > resolved. Python should not make a habit of breaking backward > compatibility. I believe the main problem has been the unicode switch, which is critical to Python being a world language. Removal of old-style classes was mostly a non-issue, except for the very few who intentionally continued to use them. -- Terry Jan Reedy From pydanny at gmail.com Thu Feb 9 23:50:58 2012 From: pydanny at gmail.com (Daniel Greenfeld) Date: Thu, 9 Feb 2012 14:50:58 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% (Massimo Di Pierro) Message-ID: > Message: 1 > Date: Thu, 9 Feb 2012 15:41:22 -0600 > From: Massimo Di Pierro > To: Nick Coghlan > Cc: python-ideas > Subject: Re: [Python-ideas] Python 3000 TIOBE -3% > Message-ID: <90C8316C-1AB0-4759-B3DF-0FB07477FF08 at gmail.com> > Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes > > First of all all the Python developers are doing an amazing job, and > none of the comments should be taken as a critique but only as a > suggestion. I completely agree with Massimo again. :-) > > On Feb 9, 2012, at 3:34 PM, Nick Coghlan wrote: > [...] >> In the meantime, if the python.org packages for Windows aren't up to >> scratch (and they aren't in many ways), *use the commercially backed >> ones* (or one of the other sumo distributions that are out there). 
>> Don't tell your students to grab the raw installers directly from >> python.org, redirect them to the free rebuilds from ActiveState or >> Enthought, or go all out and get them to install something like >> Python(X, Y). > > This is what I do now. I tell my students if they have trouble to > Enthought. Yet there are issues with license and 32 (free) vs 64 bits > (not free). Long term I do not think this what we should encourage. I think it is odd to encourage users to go to use open source distros, but if they have installation problems (which is really common - Massimo/Titus/Audrey/Zed/etc seem to back me up here) to recommend 'somewhere' to go to commercial-but-free distros. If we should be pointing new users to ActiveState or Enthought, maybe we should just change the python.org default installers to what they provide. Tell you what, I'll take this matter off-list and bring it up with Jesse Noller and the rest of the board working on the python.org RFP. -- 'Knowledge is Power' Daniel Greenfeld http://pydanny.blogspot.com From guido at python.org Thu Feb 9 23:56:37 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 14:56:37 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120209222540.GD20556@mcnabbs.org> References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> Message-ID: On Thu, Feb 9, 2012 at 2:25 PM, Andrew McNabb wrote: > On Thu, Feb 09, 2012 at 08:53:55PM +0100, Stefan Behnel wrote: > > > > AFAIK, there is no concrete roadmap towards supporting SciPy on top of > > PyPy. Currently, PyPy is getting its own implementation of NumPy-like > > arrays, but there is currently no interaction with anything in the SciPy > > world outside of those. Given the shear size of SciPy, reimplementing it > on > > top of numpypy is unrealistic. 
> > I understand that there is some hope in getting cython to support pure > python and ctypes as a backend, and then to migrate scipy to use cython. > This is definitely a long-term solution. > > Most people don't depend on all of scipy, and for some use cases, it's > not too hard to find alternatives. Today I migrated a project from > scipy to the GNU Scientific Library (with ctypes). It now works great > with PyPy, and I saw a total speedup of 10.6. Dropping from 27 seconds > to 2.55 seconds is huge. It's funny, but for a new project I would go > to great lengths to try to use the GSL instead of scipy (though I'm sure > for some use cases it wouldn't be possible). > Hm... is there a reason GSL and SciPy need to compete? Can't SciPy incorporate GSL? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Feb 9 23:59:35 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 14:59:35 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% (Massimo Di Pierro) In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 2:50 PM, Daniel Greenfeld wrote: > > Message: 1 > > Date: Thu, 9 Feb 2012 15:41:22 -0600 > > From: Massimo Di Pierro > > To: Nick Coghlan > > Cc: python-ideas > > Subject: Re: [Python-ideas] Python 3000 TIOBE -3% > > Message-ID: <90C8316C-1AB0-4759-B3DF-0FB07477FF08 at gmail.com> > > Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes > > > > First of all all the Python developers are doing an amazing job, and > > none of the comments should be taken as a critique but only as a > > suggestion. > > I completely agree with Massimo again. :-) > > > > > On Feb 9, 2012, at 3:34 PM, Nick Coghlan wrote: > > [...] > >> In the meantime, if the python.org packages for Windows aren't up to > >> scratch (and they aren't in many ways), *use the commercially backed > >> ones* (or one of the other sumo distributions that are out there). 
> >> Don't tell your students to grab the raw installers directly from > >> python.org, redirect them to the free rebuilds from ActiveState or > >> Enthought, or go all out and get them to install something like > >> Python(X, Y). > > > > This is what I do now. I tell my students if they have trouble to > > Enthought. Yet there are issues with license and 32 (free) vs 64 bits > > (not free). Long term I do not think this what we should encourage. > > I think it is odd to encourage users to go to use open source distros, > but if they have installation problems (which is really common - > Massimo/Titus/Audrey/Zed/etc seem to back me up here) to recommend > 'somewhere' to go to commercial-but-free distros. > Why is that odd? Those distros are an integral part of the ecosystem that is enabled by open source. I see no philosophical problems (unless you are of the GNU religion of course -- but then you should have said FOSS instead of open source :-). > If we should be pointing new users to ActiveState or Enthought, maybe > we should just change the python.org default installers to what they > provide. > Again, why? The commercial distributors often lag way behind what python.org offers -- and for very good reasons. > Tell you what, I'll take this matter off-list and bring it up with > Jesse Noller and the rest of the board working on the python.org RFP. > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Fri Feb 10 00:06:09 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 09 Feb 2012 23:06:09 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> Message-ID: On 2/9/12 10:56 PM, Guido van Rossum wrote: > On Thu, Feb 9, 2012 at 2:25 PM, Andrew McNabb > wrote: > > On Thu, Feb 09, 2012 at 08:53:55PM +0100, Stefan Behnel wrote: > > > > AFAIK, there is no concrete roadmap towards supporting SciPy on top of > > PyPy. Currently, PyPy is getting its own implementation of NumPy-like > > arrays, but there is currently no interaction with anything in the SciPy > > world outside of those. Given the shear size of SciPy, reimplementing it on > > top of numpypy is unrealistic. > > I understand that there is some hope in getting cython to support pure > python and ctypes as a backend, and then to migrate scipy to use cython. > This is definitely a long-term solution. > > Most people don't depend on all of scipy, and for some use cases, it's > not too hard to find alternatives. Today I migrated a project from > scipy to the GNU Scientific Library (with ctypes). It now works great > with PyPy, and I saw a total speedup of 10.6. Dropping from 27 seconds > to 2.55 seconds is huge. It's funny, but for a new project I would go > to great lengths to try to use the GSL instead of scipy (though I'm sure > for some use cases it wouldn't be possible). > > > Hm... is there a reason GSL and SciPy need to compete? Can't SciPy incorporate GSL? GSL is GPLed. scipy is BSD. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From tjreedy at udel.edu Fri Feb 10 00:07:18 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 18:07:18 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: On 2/9/2012 1:26 PM, Guido van Rossum wrote: > On Thu, Feb 9, 2012 at 10:14 AM, Masklinn > wrote: > > On 2012-02-09, at 19:03 , Steven D'Aprano wrote: > > The choice of which garbage collection implementation (ref > counting is garbage collection) is a quality of implementation > detail, not a language feature. > > That's debatable, it's an implementation detail with very different > semantics which tends to leak out into usage patterns of the > language (as it did with CPython, which basically did not get fixed > in the community until Pypy started ascending), > > > I think it was actually Jython that first sensitized the community to > this issue. Yes, it was. The first PyPy status blog in Oct 2007 http://morepypy.blogspot.com/2007/10/first-post.html long before any practical release, was a year after the 2.5 release. -- Terry Jan Reedy From pydanny at gmail.com Fri Feb 10 00:09:55 2012 From: pydanny at gmail.com (Daniel Greenfeld) Date: Thu, 9 Feb 2012 15:09:55 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% (Massimo Di Pierro) In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 2:59 PM, Guido van Rossum wrote: >> > On Feb 9, 2012, at 3:34 PM, Nick Coghlan wrote: >> > [...] >> >> In the meantime, if the python.org packages for Windows aren't up to >> >> scratch (and they aren't in many ways), *use the commercially backed >> >> ones* (or one of the other sumo distributions that are out there). >> >> Don't tell your students to grab the raw installers directly from >> >> python.org, redirect them to the free rebuilds from ActiveState or >> >> Enthought, or go all out and get them to install something like >> >> Python(X, Y). 
>> > >> > This is what I do now. I tell my students if they have trouble to >> > Enthought. Yet there are issues with license and 32 (free) vs 64 bits >> > (not free). Long term I do not think this what we should encourage. >> >> I think it is odd to encourage users to go to use open source distros, >> but if they have installation problems (which is really common - >> Massimo/Titus/Audrey/Zed/etc seem to back me up here) to recommend >> 'somewhere' to go to commercial-but-free distros. > > Why is that odd? > > Those distros are an integral part of the ecosystem that is enabled by open > source. I see no philosophical problems (unless you are of the GNU religion > of course -- but then you should have said FOSS instead of open source :-). I may have not said this as well as I thought. :P I don't follow the GNU religion but I do make Python a very good friend. :-) I think it's wonderful that ActiveState and Enthought are providing distributions for free. I got kickstarted on ActiveState back in 2005. However, for people coming into the language, they should be able to expect an easy installation from the core site regardless of their operating system. I'm wondering that rather than pointing all the new Windows users at ActiveState/Enthought sites, if their distros are easier to install, maybe the links should be to those distros. There are probably all sorts of really good reasons why this is not possible, but if you ever have to see 10 instructors at once waste a couple hours of installation on 75 students you won't care about those reasons anymore. >> >> If we should be pointing new users to ActiveState or Enthought, maybe >> we should just change the python.org default installers to what they >> provide. > > Again, why? The commercial distributors often lag way behind what python.org > offers -- and for very good reasons. Just trying to find an easier path for instructors get students kickstarted in our favorite programming language. 
-- 'Knowledge is Power' Daniel Greenfeld http://pydanny.blogspot.com From sturla at molden.no Fri Feb 10 00:52:32 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 10 Feb 2012 00:52:32 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> Message-ID: On 9 Feb 2012, at 23:56, Guido van Rossum wrote: > > Hm... is there a reason GSL and SciPy need to compete? Can't SciPy incorporate GSL? > > GPL vs BSD issue. Sturla -------------- next part -------------- An HTML attachment was scrubbed... URL: From senthil at uthcode.com Fri Feb 10 01:00:32 2012 From: senthil at uthcode.com (Senthil Kumaran) Date: Fri, 10 Feb 2012 08:00:32 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> References: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> Message-ID: <20120210000032.GA1855@mathmagic> On Thu, Feb 09, 2012 at 11:46:45AM -0600, Massimo Di Pierro wrote: > I think if easy_install, gevent, numpy (*), and win32 extensions > where included in 3.x, together with a slightly better Idle (still > based on Tkinter, with multiple pages, autocompletion, collapsible, > line numbers, better printing with syntax highlitghing), and if > easy_install were accessible via Idle, this would be a killer > version. > > Longer term removing the GIL and using garbage collection should be > a priority. I am not sure what is involved and how difficult it is I am not sure if popularity contests are just based on technical merits/demerits alone. I guess people here care less about popularity and more about good tools in Python land. So if there are things lacking in Python world, then those are good project opportunities. 
What I personally feel is, the various plug-and-play libraries are giving JavaScript a thumbs up and more is going on web world front-end than back-end. So, if there is a requirement for a Python programmer, there is an assumption that he should know web techs too. There are also PHP/Ruby/Java folks who also know web technologies. So, a web tech like JavaScript gets counted 4x. -- Senthil From guido at python.org Fri Feb 10 01:02:34 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 16:02:34 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% (Massimo Di Pierro) In-Reply-To: References: Message-ID: Sadly, it's quite frequent that what works really well in an educational setting shouldn't be recommended in a professional programming environment, and vice versa. I'm not sure how to answer this except by creating, maintaining and promoting some wiki pages aimed specifically at instructors. On Thu, Feb 9, 2012 at 3:09 PM, Daniel Greenfeld wrote: > On Thu, Feb 9, 2012 at 2:59 PM, Guido van Rossum wrote: > > >> > On Feb 9, 2012, at 3:34 PM, Nick Coghlan wrote: > >> > [...] > >> >> In the meantime, if the python.org packages for Windows aren't up to > >> >> scratch (and they aren't in many ways), *use the commercially backed > >> >> ones* (or one of the other sumo distributions that are out there). > >> >> Don't tell your students to grab the raw installers directly from > >> >> python.org, redirect them to the free rebuilds from ActiveState or > >> >> Enthought, or go all out and get them to install something like > >> >> Python(X, Y). > >> > > >> > This is what I do now. I tell my students if they have trouble to > >> > Enthought. Yet there are issues with license and 32 (free) vs 64 bits > >> > (not free). Long term I do not think this what we should encourage. 
> >> > >> I think it is odd to encourage users to go to use open source distros, > >> but if they have installation problems (which is really common - > >> Massimo/Titus/Audrey/Zed/etc seem to back me up here) to recommend > >> 'somewhere' to go to commercial-but-free distros. > > > > Why is that odd? > > > > Those distros are an integral part of the ecosystem that is enabled by > open > > source. I see no philosophical problems (unless you are of the GNU > religion > > of course -- but then you should have said FOSS instead of open source > :-). > > I may have not said this as well as I thought. :P > > I don't follow the GNU religion but I do make Python a very good friend. > :-) > > I think it's wonderful that ActiveState and Enthought are providing > distributions for free. I got kickstarted on ActiveState back in 2005. > However, for people coming into the language, they should be able to > expect an easy installation from the core site regardless of their > operating system. > > I'm wondering that rather than pointing all the new Windows users at > ActiveState/Enthought sites, if their distros are easier to install, > maybe the links should be to those distros. There are probably all > sorts of really good reasons why this is not possible, but if you ever > have to see 10 instructors at once waste a couple hours of > installation on 75 students you won't care about those reasons > anymore. > > >> > >> If we should be pointing new users to ActiveState or Enthought, maybe > >> we should just change the python.org default installers to what they > >> provide. > > > > Again, why? The commercial distributors often lag way behind what > python.org > > offers -- and for very good reasons. > > Just trying to find an easier path for instructors get students > kickstarted in our favorite programming language. 
> > -- > 'Knowledge is Power' > Daniel Greenfeld > http://pydanny.blogspot.com > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Feb 10 01:03:20 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 16:03:20 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> Message-ID: On Thu, Feb 9, 2012 at 3:52 PM, Sturla Molden wrote: > > > Den 9. feb. 2012 kl. 23:56 skrev Guido van Rossum : > ). > > > Hm... is there a reason GSL and SciPy need to compete? Can't SciPy > incorporate GSL? > > > > GPL vs BSD issue. > That's a bummer. Someone should open negotiations. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dreamingforward at gmail.com Fri Feb 10 01:11:54 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Thu, 9 Feb 2012 17:11:54 -0700 Subject: [Python-ideas] [Python-Dev] matrix operations on dict :) In-Reply-To: References: Message-ID: On Wed, Feb 8, 2012 at 9:54 AM, julien tayon wrote: > 2012/2/7 Mark Janssen : > > On Mon, Feb 6, 2012 at 6:12 PM, Steven D'Aprano > wrote: > > > I have the problem looking for this solution! > > > > The application for this functionality is in coding a fractal graph (or > > "multigraph" in the literature). This is the most powerful structure > that > > Computer Science has ever conceived. If you look at the evolution of > data > > structures in compsci, the fractal graph is the ultimate. From lists to > > trees to graphs to multigraphs. The latter elements can always encompass > > the former with only O(1) extra cost. > > { "a" : 1 } + { "a" : { "b" : 1 } } == KABOOM. This a counter example > proving it does not handle all structures. 
> > Okay, I guess I did not make myself very clear. What I'm proposing probably will (eventually) require changes to the "object" model of Python and may require (or want) the addition of the "compound" data-type (as in python's predecessor ABC). The symbol that denotes a compound would be the colon (":") and associates a left hand side with right-hand side value, a NAME with a VALUE. A dictionary would (then) be a SET of these. (Voila! things have already gotten simplified.) Eventually, I also think this will segue and integrate nicely into Mark Shannon's "shared-key dict" proposal (PEP 412). The compound data-type would act as the articulation point, around which the recursive, fractal data structure would revolve: much like the decimal point forms in a (non-integer) number. (In theory, you could even do a reciprocal or __INVERT__ operation on this type.) OR perhaps a closer comparison is whatever separates the imaginary from the real part in a complex number on the complex plane -- by virtue of such, creates two orthogonal and independent spaces. The same is what we want to do with this new fractal dictionary type. While in the abstract one might think to allow any arbitrary data-type for right-hand-side values, in PRACTICE, integers are sufficient. The reason is thus. In the fractal data type, you simply need to define the bottom-most and top-most layers of the fractalset abstraction, which is the same as saying the relationship between the *atomic* and the *group* -- everything in-between will be taken care of by the power of the type itself. It makes sense to use a maximally atomic, INTEGER data type (starting with the UNIT 1) for the bottom-most level, and a maximally abstract top-most level -- this is simply an abstract grouping type (i.e. a collection). I'm going to suggest a SET is the most abstract (i.e. sufficient) because it does not impose an order and for reasons regarding the CONFLATION rule (RULE1). 
The CONFLATION rule is thus: items of the same name are combined ({'a':1, 'a':3} ==> {'a':4}), and non-named (atomic) items are summed. To simplify representation, values should be conflated as much as possible, the idea is maximizing reduction. This rule separates a set from a list, because non-unique items will be conflated into one. Such a set or grouping should be looked at as an arbitrary n-dimensional space. An interesting thing to think about is how this space can be mapped into a unique 1-dimensional, ordered list and vice versa. Reflectively, a list can be converted uniquely into this fractal set thusly: All non-integer, non-collection items will be considered NAMES and counted. If an item is another list, it will recurse and create another set. If a set it will simply add it, as is. These rules could be important in object serialization (we'll call this EXPANSION). In any case, for the sake of your example: In the above KABOOM example, unnamed, atomic elements can just be considered ANONYMOUS (using None as the key). In this case, the new dict becomes: { "a" : 1 } + { "a" : { "b" : 1 } } ==> { "a" : {None: 1, "b" : 1 } }, OR, if we have a compound data-type, we can remove the redundant pseudo-name: { "a" : { 1, "b" : 1 } }. Furthermore we can assume a default value of 1 for non-valued "names", so we could express this more simply: { 'a' } + { 'a' : { 'b' } } ==> { 'a': { 1, 'b' } } No ambiguity, as long as we determine a convention. As noted, one element is named, and the other is not. Consider unnamed values within a grouping like a GAS and *named* values as a SOLID. You're adding them into the same room where they can co-exist just fine. No confusion! To clarify the properties of this fractal data type more clearly: there is only 1 key in the second, inner set ('b'). We can remove the values() method as they will always be the atomic INTEGER type and conflate to a single number. 
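The merge behavior described above is concrete enough to prototype. A rough sketch, assuming int leaves and plain dicts as the groupings (an illustration of the stated rules, not a worked-out implementation):

```python
def conflate(x, y):
    """Merge two 'fractal dict' values: ints are atomic leaves,
    dicts are groupings, and None is the anonymous key."""
    if isinstance(x, int) and isinstance(y, int):
        return x + y                     # atomic + atomic: plain sum
    if isinstance(x, int):
        x, y = y, x                      # normalize: grouping on the left
    if isinstance(y, int):
        merged = dict(x)                 # an atomic joining a grouping
        merged[None] = merged.get(None, 0) + y   # lands under the None key
        return merged
    merged = dict(x)                     # grouping + grouping: conflate
    for key, value in y.items():         # like-named entries recursively
        merged[key] = conflate(merged[key], value) if key in merged else value
    return merged

print(conflate({"a": 1}, {"a": 3}))         # {'a': 4}
print(conflate({"a": 1}, {"a": {"b": 1}}))  # {'a': {'b': 1, None: 1}}
```

The second call reproduces the KABOOM resolution: the int 1 is folded into the inner grouping as an anonymous value rather than colliding with it.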
The use of physical analog is helpful and will inform the definition. (Could one represent a python CLASS heirarchy more simply with this fractalset object somehow....?) Further definitions: RULE2: When an atomic is added to a compound, a grouping must be created: 1 + "b" : 1 = { None : 1, "b" : 1 } RULE3: Preserve groupings where present: 'b' : 7 + { 'b' : 1 } = { 'b' : 8 } I think this might be sufficient. Darn, I hope it makes some sense.... mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Feb 10 01:18:22 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 16:18:22 -0800 Subject: [Python-ideas] [Python-Dev] matrix operations on dict :) In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 4:11 PM, Mark Janssen wrote: > On Wed, Feb 8, 2012 at 9:54 AM, julien tayon wrote: > >> 2012/2/7 Mark Janssen : >> >> > On Mon, Feb 6, 2012 at 6:12 PM, Steven D'Aprano > > wrote: >> > >> > > I have the problem looking for this solution! >> > >> > The application for this functionality is in coding a fractal graph (or >> > "multigraph" in the literature). This is the most powerful structure >> that >> > Computer Science has ever conceived. If you look at the evolution of >> data >> > structures in compsci, the fractal graph is the ultimate. From lists to >> > trees to graphs to multigraphs. The latter elements can always >> encompass >> > the former with only O(1) extra cost. >> >> { "a" : 1 } + { "a" : { "b" : 1 } } == KABOOM. This a counter example >> proving it does not handle all structures. >> >> Okay, I guess I did not make myself very clear. What I'm proposing > probably will (eventually) require changes to the "object" model of Python > and may require (or want) the addition of the "compound" data-type (as in > python's predecessor ABC). The symbol that denotes a compound would be the > colon (":") and associates a left hand side with right-hand side value, a > NAME with a VALUE. 
> That was not a user-visible data type in ABC. ABC had dictionaries (with somewhat different semantics due to the polymorphic static typing) and the ':' was part of the dictionary syntax, not of the type system. > A dictionary would (then) be a SET of these. (Voila! things have already > gotten simplified.) > Really? So {a:1, a:2} would be a dict of length 2? > Eventually, I also think this will seque and integrate nicely into Mark > Shannon's "shared-key dict" proposal (PEP 410). > > The compound data-type would act as the articulation point, around which > the recursive, fractal data structure would revolve: much like the decimal > point forms in (non-integer) number. (In theory, you could even do a > reciprocal or __INVERT__ operation on this type.) OR perhaps a closer > comparison is whatever separates the imaginary from the real part in a > complex number on the complex plane -- by virtue of such, creates two > orthogonal and independent spaces. The same we want to do with this new > fractal dictionary type. > > While in the abstract one might think to allow any arbitrary data-type for > right-hand-side values, in PRACTICE, integers are sufficient. The reason > is thus. In the fractal data type, you simply need to define the > bottom-most and top-most layers of the fractalset abstraction, which is the > same as saying the relationship between the *atomic* and the *group* -- > everything in-between will be taken care of by the power of the type > itself. It makes sense to use a maximally atomic, INTEGER data type > (starting with the UNIT 1) for the bottom most level., and a maximally > abstract top-most level -- this is simply an abstract grouping type (i.e. a > collection). I'm going to suggest a SET is the most abstract (i.e. > sufficient) because it does not impose an order and for reasons regarding > the CONFLATION rule (RULE1). 
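For context on Guido's length-2 question: in Python as it stands, a duplicate key in a dict display is silently collapsed rather than kept, with the last binding winning.

```python
# Current Python behavior behind Guido's question: duplicate keys in
# a dict display do not produce a two-element dict; the last one wins.
d = {"a": 1, "a": 2}
print(len(d))  # 1
print(d)       # {'a': 2}
```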
> > The CONFLATION rule is thus: items of the same name are combined ({'a':1, > 'a':3} ==> {'a':4}, and non-named (atomic) items are summed. To simplify > representation, values should be conflated as much as possible, the idea is > maximizing reduction. This rule separates a set from a list, because > non-unique items will be conflated into one. Such a set or grouping should > be looked at as an arbitrary n-dimensional space. An interesting thing to > think about is how this space can be mapped into a unique 1-dimensional, > ordered list and vice versa. Reflectively, a list can be converted > uniquely into this fractal set thusly: All non-integer, non-collection > items will be considered NAMES and counted. If an item is another list, it > will recurse and create another set. If a set it will simply add it, as > is. These rules could be important in object serialization (we'll call > this EXPANSION). > > In any case, for sake of your example. In the above KABOOM example, > unnamed, atomic elements can just be considered ANONYMOUS (using None as > the key). In this case, the new dict becomes: > > { "a" : 1 } + { "a" : { "b" : 1 } } ==> { "a" : {None: 1, "b" : 1 } } , > OR if have a compound data-type, we can remove the redundant pseudo-name: > { "a" : { 1, "b" : 1 } }. > Furthermore we can assume a default value of 1 for non-valued "names", so > we could express this more simply: > { 'a' } + { 'a" : { 'b' } } ==> { ''a': { 1, 'b' } } No ambiguity! as > long as we determine a convention. > > As noted, one element is named, and the other is not. Consider unnamed > values within a grouping like a GAS and *named* values as a SOLID. You're > adding them into the same room where they can co-exist just fine. No > confusion! > > To clarify the properties of this fractal data type more clearly: there > is only 1 key in the the second, inner set ('b'). We can remove the > values() method as they will always be the atomic INTEGER type and conflate > to a single number. 
We'll call this other thing, this property "mass"; in > this case = 2.) The use of physical analog is helpful and will inform the > definition. > > (Could one represent a python CLASS heirarchy more simply with this > fractalset object somehow....?) > > Further definitions: > > RULE2: When an atomic is added to a compound, a grouping must be created: > > 1 + "b" : 1 = { None : 1, "b" : 1 } > > RULE3: Preserve groupings where present: > > 'b' : 7 + { 'b' : 1 } = { 'b' : 8 } > > I think this might be sufficient. Darn, I hope it makes some sense.... > Maybe you should reduce your coffee intake. There's too much SHOUTING in your post... :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jnoller at gmail.com Fri Feb 10 01:26:59 2012 From: jnoller at gmail.com (Jesse Noller) Date: Thu, 9 Feb 2012 19:26:59 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> References: <0B687CDC-6C26-4032-BFBB-CF562AF29767@gmail.com> Message-ID: <70FCC0D7-1A06-4686-ACB9-31E591A4FDAD@gmail.com> On Feb 9, 2012, at 12:46 PM, Massimo Di Pierro wrote: > I think if easy_install, gevent, numpy (*), and win32 extensions where included in 3.x, together with a slightly better Idle (still based on Tkinter, with multiple pages, autocompletion, collapsible, line numbers, better printing with syntax highlitghing), and if easy_install were accessible via Idle, this would be a killer version. > > Longer term removing the GIL and using garbage collection should be a priority. I am not sure what is involved and how difficult it is but perhaps this is what PyCon money can be used for. Please do not volunteer revenue that does not exist, or PSF funds for things without a grant proposal or working group. Especially PyCon revenue - which does not exist. 
Jesse From ncoghlan at gmail.com Fri Feb 10 02:16:46 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 Feb 2012 11:16:46 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% (Massimo Di Pierro) In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 8:50 AM, Daniel Greenfeld wrote: > I think it is odd to encourage users to go to use open source distros, > but if they have installation problems (which is really common - > Massimo/Titus/Audrey/Zed/etc seem to back me up here) to recommend > 'somewhere' to go to commercial-but-free distros. The reason I encourage budding developers to switch to an open source base as soon as they can is because it comes with an entire open source ecosystem around it. With open source code that needs to interact with OS level APIs, the development flow is often Linux first (it's completely open source, POSIX compatible and very popular), then OpenSolaris and the *BSD variants (also open source, POSIX compatible, but significantly less popular) then Mac OS X (at least it offers a decent POSIX layer), then Windows (from my perspective, the win32 API and NTFS stand tall as a couple of the worst cases of NIH syndrome in the history of computing). In other words, it's almost the exact reverse of the situation in the proprietary desktop software world (which usually goes Windows->Mac OS X->Linux based on desktop market share). With POSIX compatible code covering pretty much every platform other than Windows, and with win32 API programming being such an alien (and verbose) experience to anyone used to the file descriptor based POSIX world, volunteers that are willing to develop and maintain such code on their own time are pretty thin on the ground. As a result, it's frequently necessary to turn to proprietary vendors to get a smooth, Windows-appropriate user experience. 
Given that Windows itself is a proprietary OS, suggesting that people use a free-as-in-beer-but-not-as-in-speech package that lets them skip the boring bits and get straight to coding sounds quite reasonable to me. Sure it's not perfect, but unless you can wave your hand and create a larger pool of volunteer developers that decide to stick with Windows for their hobbyist development instead of embracing a completely open platform like Linux or a POSIX-compatible open core one like Mac OS X, Windows support is always going to lag (including in the installation-and-deployment space). My experience on Linux is that most things, up to and including pip installation of C extension modules, *just works* (the exception being that some C extensions have broken build processes and require a bit of cajoling - it would be nice if someone actually sat down and wrote a bdist_simple PEP instead of just talking about it on this list). Automating the setup of these platforms is fairly straightforward because they come with tools like Python and wget preinstalled, so you can just use them without needing to worry about giving the user instructions on obtaining them. In contrast, on Windows, you have to do a lot of work up front to be able to compile C extensions at all, and installing pip is a far cry from being able to just do "yum install python-pip". You don't even have access to "wget" to fetch a script that handles the setup for you. Getting set up to do software development on Windows is hard because Windows is built on the assumption that the world can be cleanly divided into "Developers" that build their own copies of software from source code (basically, people that are willing to pay for a copy of Visual Studio, or at least download and install one of the Express editions) and "Users" that only run software that someone else built (everyone else). 
The Linux distros (and other open source platforms), on the other hand, make the tools to *build* the software just as readily available as the software itself (although, these days, they also do their best to make sure you don't *need* to build stuff from source). Cross platform tools like Python can make an attempt to paper over those fundamental philosophical differences between the platforms, but really, there's only so much any given third party can do about it (and, for most people, trying to do so doesn't qualify as a fun hobby). Suppose Python core gets our packaging story on Windows fixed. What then? Well, NumPy still runs into problems due to BLAS. What's installing Postgres, MySQL or MongoDB on Windows like? (I genuinely don't know, I've never tried). Are there Windows installers for PyPy? These kinds of road blocks are endemic in Windows open source development, and they'll likely stay that way unless MS release a native Windows POSIX compatibility layer that isn't horrible (I personally expect that to happen somewhere around the time the Earth gets swallowed by the Sun). > If we should be pointing new users to ActiveState or Enthought, maybe > we should just change the python.org default installers to what they > provide. No, because the python.org installers are what the redistributor's use to create their own sumo packages. It may be reasonable for us to point new users that aren't already experienced Windows software developers directly to the sumo distributions, though. For example, as far I know, Python(X, Y) does a nice job of dumping a comprehensive Python environment on a Windows system without relying on a proprietary vendor. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From tjreedy at udel.edu Fri Feb 10 04:48:49 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 22:48:49 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F341710.9030806@molden.no> References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> Message-ID: On 2/9/2012 1:57 PM, Sturla Molden wrote: > On 09.02.2012 19:42, Mike Meyer wrote: > >> If threading is the only acceptable concurrency mechanism, then Python >> is the wrong language to use. But you're also not building scaleable >> systems, which is most of where it really matters. If you're willing >> to consider things other than threading - and you have to if you want >> to build scaleable systems - then Python makes a good choice. > > Yes or no... Python is used for parallel computing on the biggest > supercomputers, monsters like Cray and IBM blue genes with tens of > thousands of CPUs. But what really fails to scale is the Python module > loader! For example it can take hours to "import numpy" for 30,000 > Python processes on a blue gene. Mike Meyer posted that on pydev today http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html They determined that the time was gobbled by *finding* modules in each process, so they cut hours by finding them in 1 process and sending the locations to the other 29,999. We are already discussing how to use this lesson in core Python. The sub-thread is today's posts in "requirements for moving __import__ over to importlib?" -- Terry Jan Reedy From tjreedy at udel.edu Fri Feb 10 05:21:20 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 23:21:20 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On 2/9/2012 2:16 PM, Giampaolo Rodol? wrote: > I bet a lot of people don't want to upgrade for another reason: unicode. 
> The impression I got is that python 3 forces the user to use and > *understand* unicode and a lot of people simply don't want to deal > with that. Do *you* think that? Or are you reporting what others think? In either case, we have another communication problem. If one only uses the ascii subset, the usage of 3.x strings is transparent. As far as I can think, one does not need to know *anything* about unicode to use 3.x. In 3.3, there will not even be a memory hit. We should be saying that. Thanks for the head's up. 
It is hard to know what misconceptions people > have until someone reports them ;-). > > In python 2 there was no such a strong imposition. >> > > Nor is there in 3.x. We need to communicate that. I may give it a try on > python-list. If and when one does want to use more characters, it should be > *easier* in 3.x than in 2.x, especially for non-Latin1 Western European > chars . > > -- > Terry Jan Reedy > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Feb 10 05:36:36 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Feb 2012 23:36:36 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On 2/9/2012 2:31 PM, Matt Joiner wrote: >> Isn't that more accurate for framework writers, rather than for >> "average" users? How often do average users have to address >> encoding/decoding in Python 3? > > Constantly. As a Python noob I tried Python 3 it was the first wall I > encountered. I am really puzzled what you mean. I have used Python 3 since 3.0 alpha and as long as I have used strictly ascii, I have encountered no such issues. >>> f = open('f:/python/mypy/test.txt', 'w') >>> f.write('test line 1\n') 12 >>> f.write('test line 2 and more\n') 21 >>> f.close() Now I can open in any other program, or open in Python. I have learned about unicode, but just so I could play around with other characters. > I had to learn Unicode right then and there. Fortunately, > the Python docs HOWTO on Unicode is excellent. Were you doing some non-ascii or non-average framework-like things? Would you really not have had to learn the same about unicode if you were using 2.x? -- Terry Jan Reedy From ctb at msu.edu Fri Feb 10 06:00:44 2012 From: ctb at msu.edu (C. 
Titus Brown) Date: Thu, 9 Feb 2012 21:00:44 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% (Massimo Di Pierro) In-Reply-To: References: Message-ID: <20120210050044.GP18049@idyll.org> On Thu, Feb 09, 2012 at 04:02:34PM -0800, Guido van Rossum wrote: > Sadly, it's quite frequent that works really well in an educational setting > shouldn't be recommended in a professional programming environment, and > vice versa. I'm not sure how to answer this except by creating, maintaining > and promoting some wiki pages aimed specifically at instructors. Perhaps I am brainfried ATM, but I cannot imagine what you are talking about here. Do you have any examples you can share that illustrate what you mean? thanks much, --titus From guido at python.org Fri Feb 10 06:18:59 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Feb 2012 21:18:59 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% (Massimo Di Pierro) In-Reply-To: <20120210050044.GP18049@idyll.org> References: <20120210050044.GP18049@idyll.org> Message-ID: On Thu, Feb 9, 2012 at 9:00 PM, C. Titus Brown wrote: > On Thu, Feb 09, 2012 at 04:02:34PM -0800, Guido van Rossum wrote: > > Sadly, it's quite frequent that works really well in an educational > setting > > shouldn't be recommended in a professional programming environment, and > > vice versa. I'm not sure how to answer this except by creating, > maintaining > > and promoting some wiki pages aimed specifically at instructors. > > Perhaps I am brainfried ATM, but I cannot imagine what you are talking > about > here. Do you have any examples you can share that illustrate what you > mean? > Simplest example: many educators seem delighted with Python 3 because it solves a bunch of beginner's pitfalls, and their students learn in a greenfield situation. (Though this is not the case for Massimo.) Professionals OTOH don't seem to like Python 3 because it means they have to change a pile of software that took them a decade (and an army of programmers) to create. 
Educators also often give their students a simple library of convenience functions and tell them to put the magic line "from blah import *" at the top of their module (or session). Again something that most professionals loathe, but it works well for the first steps in programming -- certainly better than the Java approach "copy these ten lines of gobbledygook [the minimal "hello world" in Java] into your file, don't ask what they mean, and above all be careful not to accidentally edit any of them". OTOH when educators want their students to install some 3rd party package it is often something hideously complex like pygame, rather than something simple and elegant like WebOb or flask. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Feb 10 06:47:29 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 10 Feb 2012 00:47:29 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On 2/9/2012 11:30 PM, Matt Joiner wrote: > Not true, it's necessary to understand that encodings translate to and > from bytes, Only if you use byte encodings for ascii text. I never have, and I would not know why you do unless you are using internet modules that do not sufficiently hide such details. Anyway... >>> b = b'abc' >>> u = str(b, 'ascii') >>> b = bytes(u, 'ascii') So one only needs to know one encoding name, which most should know anyway, and that it *is* an encoding name. 
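[The same round trip can be written with the encode/decode methods — a minimal sketch; 'ascii' is the only encoding name the example needs, and UTF-8 encodes pure ASCII text to identical bytes, which is the point made just below about utf-8 being a superset of ascii.]

```python
u = 'abc'
b = u.encode('ascii')          # text -> bytes
assert b == b'abc'
assert b.decode('ascii') == u  # bytes -> text, same single encoding name

# ASCII text is also valid UTF-8, byte for byte:
assert u.encode('utf-8') == b
```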
Since one will see 'utf-8' here and there, it is probably to know that the utf-8 encoding is a superset of the ascii encoding, so that ascii text *is* utf-8 text. -- Terry Jan Reedy From ericsnowcurrently at gmail.com Fri Feb 10 08:04:54 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 10 Feb 2012 00:04:54 -0700 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <874nuzyda9.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <874nuzyda9.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Thu, Feb 9, 2012 at 1:14 PM, Stephen J. Turnbull wrote: > It's too > bad Python can't get a piece of that action! Getting closer: http://morepypy.blogspot.com/2012/02/almost-there-pypys-arm-backend_01.html -eric From stephen at xemacs.org Fri Feb 10 09:41:20 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 10 Feb 2012 17:41:20 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Terry Reedy writes: > > In python 2 there was no such a strong imposition [of Unicode > > awareness on users]. > > Nor is there in 3.x. Sorry, Terry, but you're basically wrong here. True, if one sticks to pure ASCII, there's no difference to notice, but that's just not possible for people who live outside of the U.S., or who share text with people outside of the U.S. They need currency symbols, they have friends whose names have little dots on them. Every single one of those is a backtrace waiting to happen. A backtrace on f = open('text-file.txt') for line in f: pass is an imposition. That doesn't happen in 2.x (for the wrong reasons, but it's very convenient 95% of the time). This is what Victor's "locale" codec is all about. I think that's the wrong spelling for the feature, but there does need to be a way to express "don't bother me about Unicode" in most scripts for most people. 
We don't have a decent boilerplate for that yet. From masklinn at masklinn.net Fri Feb 10 09:49:40 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 10 Feb 2012 09:49:40 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> Message-ID: <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> On 2012-02-10, at 01:03 , Guido van Rossum wrote: > On Thu, Feb 9, 2012 at 3:52 PM, Sturla Molden wrote: >> Den 9. feb. 2012 kl. 23:56 skrev Guido van Rossum : >> ). >> >> >> Hm... is there a reason GSL and SciPy need to compete? Can't SciPy >> incorporate GSL? >> >> GPL vs BSD issue. >> > > That's a bummer. Someone should open negotiations. I'm not sure what could be open to negotiate, being part of the GNU constellation I don't see GSL budging from the GPL, and SciPy is backed by industry members and used in "nonfree" products (notably the Enthought Python Distribution) so there's little room for it to use the GPL. Best thing that could happen (and I'm not even sure it's allowed by the GSL's license (which is under the GPL not the LGPL) would be for SciPy to grow some sort of GSL backend to delegate its operations to, when the GSL is installed. From dirkjan at ochtman.nl Fri Feb 10 10:16:28 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Fri, 10 Feb 2012 10:16:28 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: On Thu, Feb 9, 2012 at 19:26, Guido van Rossum wrote: > Are there still Python idioms/patterns/recipes around that depend on > refcounting? (There also used to be some well-known anti-patterns that were > only bad because of the refcounting, mostly around saving exceptions. 
But > those should all have melted away -- CPython has had auxiliary GC for over a > decade.) There are some simple patterns that are great with refcounting and not so great with garbage collection. We encountered some of these with Mercurial. IIRC, the basic example is just open('foo').read() With refcounting, the file will be closed soon. With garbage collection, it won't. Being able to rely on cleanup per frame/function call is pretty useful. Cheers, Dirkjan From jeanpierreda at gmail.com Fri Feb 10 10:20:59 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Fri, 10 Feb 2012 04:20:59 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 11:49 AM, Massimo Di Pierro wrote: > 50+% of the students have a mac and an increasing number of packages depend > on numpy. Installing numpy on mac is a lottery. > > Those who do not have a mac have windows and they expect an IDE like > eclipse. I know you can use Python with eclipse but they do not. They > download Python and complain that IDLE has no autocompletion, no line > numbers, no collapsible functions/classes. At the University of Toronto we tell students to use the Wing IDE (Wing 101 was developed specifically for our use in the classroom, in fact). All classroom examples are done either in the interactive interpreter, or in a session of Wing 101. All computer lab sessions are done using Wing 101, and the first lab is dedicated specifically for introducing how to edit files with it and use its debugging features. If students don't like IDLE, tell them to use a different editor instead, and pretend that Python doesn't include one with itself. (By default IDLE only shows an interactive session, so if they get curious and click-y they'll still be in the dark.) 
-- Devin From mark at hotpy.org Fri Feb 10 10:29:55 2012 From: mark at hotpy.org (Mark Shannon) Date: Fri, 10 Feb 2012 09:29:55 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: Message-ID: <4F34E393.9020105@hotpy.org> There are a lot of things covered in this thread. I want to address 2 of them. 1. Garbage Collection. Python has garbage collection. There is no free() function in Python, anyone who says that Python does not have GC is talking nonsense. CPython uses reference counting as its means of implementing GC. Ref counting has different performance characteristics from tracing GC, but it only makes sense to consider this in the context of overall Python performance. One key disadvantage of ref-counting is that it does not play well with threads, which leads on to... 2. Global Interpreter Lock and Threads. The GIL is so deeply embedded into CPython that I think it cannot be removed. There are too many subtle assumptions pervading both the VM and 3rd party code, to make truly concurrent threads possible. But are threads the way to go? Javascript does not have threads. Lua does not have threads. Erlang does not have threads; Erlang processes are implemented (in the BEAM engine) as coroutines. One of the Lua authors said this about threads: (I can't remember the quote so I will paraphrase) "How can you program in a language where 'a = a + 1' is not deterministic?" Indeed. What Python needs are better libraries for concurrent programming based on processes and coroutines. Cheers, Mark. 
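[The deterministic-cleanup question raised earlier in the thread — open('foo').read() relying on refcounting to close the file — already has a portable spelling. A minimal sketch using the with statement of PEP 343; the temporary-file plumbing is only there to keep the example self-contained:]

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)  # only the path is needed; close the raw descriptor

with open(path, 'w') as f:
    f.write('hello\n')
assert f.closed  # closed at block exit, no garbage collector involved

with open(path) as f:
    data = f.read()
assert data == 'hello\n'
os.remove(path)
```

This behaves identically on CPython's refcounting and on tracing-GC implementations such as PyPy or Jython, which is exactly the portability problem the open('foo').read() idiom runs into.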
From timothy.c.delaney at gmail.com Fri Feb 10 10:32:14 2012 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 10 Feb 2012 20:32:14 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: On 10 February 2012 20:16, Dirkjan Ochtman wrote: > There are some simple patterns that are great with refcounting and not > so great with garbage collection. We encountered some of these with > Mercurial. IIRC, the basic example is just > > open('foo').read() > > With refcounting, the file will be closed soon. With garbage > collection, it won't. Being able to rely on cleanup per frame/function > call is pretty useful. This is the #1 anti-pattern that shouldn't be encouraged. Using this idiom is just going to cause problems (mysterious exceptions while trying to open files due to running out of file handles for the process) for anyone trying to port your code to other implementations of Python. If you read PEP 343 (and the various discussions around that time) it's clear that the above anti-pattern is one of the major driving forces for the introduction of the 'with' statement. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Fri Feb 10 10:51:05 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 10 Feb 2012 11:51:05 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F341710.9030806@molden.no> References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> Message-ID: 09.02.12 20:57, Sturla Molden ???????(??): > Yes or no... Python is used for parallel computing on the biggest > supercomputers, monsters like Cray and IBM blue genes with tens of > thousands of CPUs. But what really fails to scale is the Python module > loader! For example it can take hours to "import numpy" for 30,000 > Python processes on a blue gene. 
And yes, nobody would consider to use > Java for such systems, even though Java does not have a GIL (well, > theads do no matter that much on a cluster with distributed memory > anyway). It is Python, C and Fortran that are popular. But that really > disproves that Python sucks for big concurrency, except perhaps for the > module loader. What about os.fork()? From jeanpierreda at gmail.com Fri Feb 10 10:54:13 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Fri, 10 Feb 2012 04:54:13 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> Message-ID: On Fri, Feb 10, 2012 at 4:32 AM, Tim Delaney wrote: > On 10 February 2012 20:16, Dirkjan Ochtman wrote: >> open('foo').read() >> >> With refcounting, the file will be closed soon. With garbage >> collection, it won't. Being able to rely on cleanup per frame/function >> call is pretty useful. > > > This is the #1 anti-pattern that shouldn't be encouraged. Using this idiom > is just going to cause problems (mysterious exceptions while trying to open > files due to running out of file handles for the process) for anyone trying > to port your code to other implementations of Python. It's not that open('foo').read() is "good". Clearly with the presence of nondeterministic garbage collection, it's bad. But it is convenient and compact. Refcounting GCs in general give very nice, predictable behavior, which lets us ignore a lot of the details of destroying things. Without something like this, we have to do some forms of resource management by hand that we could otherwise push to the garbage collector, and while sometimes this is as easy as a with statement, sometimes it isn't. For example, what do you do if multiple objects are meant to hold onto a file and take turns reading it? How do we close the file at the end when all the objects are done? Is the answer "manual refcounting"? 
Or is the answer "I don't care, let the GC handle it"? -- Devin From robert.kern at gmail.com Fri Feb 10 11:09:11 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 10 Feb 2012 10:09:11 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: On 2/10/12 8:49 AM, Masklinn wrote: > On 2012-02-10, at 01:03 , Guido van Rossum wrote: >> On Thu, Feb 9, 2012 at 3:52 PM, Sturla Molden wrote: >>> Den 9. feb. 2012 kl. 23:56 skrev Guido van Rossum: >>> ). >>> >>> >>> Hm... is there a reason GSL and SciPy need to compete? Can't SciPy >>> incorporate GSL? >>> >>> GPL vs BSD issue. >>> >> >> That's a bummer. Someone should open negotiations. > > I'm not sure what could be open to negotiate, being part of the GNU > constellation I don't see GSL budging from the GPL, and SciPy is backed > by industry members and used in "nonfree" products (notably the Enthought > Python Distribution) so there's little room for it to use the GPL. While I am an Enthought employee and really do want to keep scipy BSD so I can continue to use it in the proprietary software that I write for clients, I must also add that the most vociferous BSD advocates in our community are the academics. They have to wade through more weird licensing arrangements than I do, and the flexibility of the BSD license is quite important to let them get their jobs done. > Best thing that could happen (and I'm not even sure it's allowed by the > GSL's license (which is under the GPL not the LGPL) would be for SciPy to > grow some sort of GSL backend to delegate its operations to, when the GSL > is installed. 
We've done that kind of thing in the past for FFTW and other libraries but have since removed them for all of the installation and maintenance headaches it causes. In my mind (and others disagree), having scipy-the-package subsume every relevant library is not a worthwhile pursuit. The important thing is that these packages are available to the scientific Python community. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From masklinn at masklinn.net Fri Feb 10 11:41:58 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 10 Feb 2012 11:41:58 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: On 2012-02-10, at 11:09 , Robert Kern wrote: > On 2/10/12 8:49 AM, Masklinn wrote: >> On 2012-02-10, at 01:03 , Guido van Rossum wrote: >>> On Thu, Feb 9, 2012 at 3:52 PM, Sturla Molden wrote: >>>> Den 9. feb. 2012 kl. 23:56 skrev Guido van Rossum: >>>> ). >>>> >>>> >>>> Hm... is there a reason GSL and SciPy need to compete? Can't SciPy >>>> incorporate GSL? >>>> >>>> GPL vs BSD issue. >>>> >>> >>> That's a bummer. Someone should open negotiations. >> >> I'm not sure what could be open to negotiate, being part of the GNU >> constellation I don't see GSL budging from the GPL, and SciPy is backed >> by industry members and used in "nonfree" products (notably the Enthought >> Python Distribution) so there's little room for it to use the GPL. 
> > While I am an Enthought employee and really do want to keep scipy BSD so I can continue to use it in the proprietary software that I write for clients, I must also add that the most vociferous BSD advocates in our community are the academics. They have to wade through more weird licensing arrangements than I do, and the flexibility of the BSD license is quite important to let them get their jobs done. Completely true, I'd thought about this case but completely forgot about it when I started actually writing my message, I'm very sorry. From cmjohnson.mailinglist at gmail.com Fri Feb 10 11:43:14 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Fri, 10 Feb 2012 00:43:14 -1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: <28E8080F-75DF-455E-9C6A-BE3D4D603604@gmail.com> Can we please break this thread out into multiple subject headers? It's very difficult to follow the flow of conversation with some many different discussions all lumped under one name. Some proposed subjects: - Refcounting vs. 
Other GC - Numpy - Windows Installers - Unicode - Python in Education - Python's Popularity From robert.kern at gmail.com Fri Feb 10 11:49:29 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 10 Feb 2012 10:49:29 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: On 2/10/12 10:41 AM, Masklinn wrote: > On 2012-02-10, at 11:09 , Robert Kern wrote: >> On 2/10/12 8:49 AM, Masklinn wrote: >>> On 2012-02-10, at 01:03 , Guido van Rossum wrote: >>>> On Thu, Feb 9, 2012 at 3:52 PM, Sturla Molden wrote: >>>>> Den 9. feb. 2012 kl. 23:56 skrev Guido van Rossum: >>>>> ). >>>>> >>>>> >>>>> Hm... is there a reason GSL and SciPy need to compete? Can't SciPy >>>>> incorporate GSL? >>>>> >>>>> GPL vs BSD issue. >>>>> >>>> >>>> That's a bummer. Someone should open negotiations. >>> >>> I'm not sure what could be open to negotiate, being part of the GNU >>> constellation I don't see GSL budging from the GPL, and SciPy is backed >>> by industry members and used in "nonfree" products (notably the Enthought >>> Python Distribution) so there's little room for it to use the GPL. >> >> While I am an Enthought employee and really do want to keep scipy BSD so I can continue to use it in the proprietary software that I write for clients, I must also add that the most vociferous BSD advocates in our community are the academics. They have to wade through more weird licensing arrangements than I do, and the flexibility of the BSD license is quite important to let them get their jobs done. > > Completely true, I'd thought about this case but completely forgot about it > when I started actually writing my message, I'm very sorry. No apologies necessary. I just wanted to be thorough. 
:-) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From yoavglazner at gmail.com Fri Feb 10 11:51:03 2012 From: yoavglazner at gmail.com (yoav glazner) Date: Fri, 10 Feb 2012 12:51:03 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <28E8080F-75DF-455E-9C6A-BE3D4D603604@gmail.com> References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> <28E8080F-75DF-455E-9C6A-BE3D4D603604@gmail.com> Message-ID: On Fri, Feb 10, 2012 at 12:43 PM, Carl M. Johnson < cmjohnson.mailinglist at gmail.com> wrote: > Can we please break this thread out into multiple subject headers? It's > very difficult to follow the flow of conversation with some many different > discussions all lumped under one name. > > Some proposed subjects: > > - Refcounting vs. Other GC > - Numpy > - Windows Installers > - Unicode > - Python in Education > - Python's Popularity > No, The subject is correct, we have a -3% problem in the index. so the solution is to keep this thread long with many keywords like python pypy jython etc... and than the % will grow! (at least @TIOBE since it relies on google search ;) ) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ubershmekel at gmail.com Fri Feb 10 11:52:09 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Fri, 10 Feb 2012 12:52:09 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: > > Also, AFAIK Ruby has a GIL much like Python. 
I think it's time to start a > PR offensive explaining why these are not the problem the trolls make them > out to be, and how you simply have to use different patterns for scaling in > some languages than in others. GIL + Threads = Fibers CPython doesn't have threads but it calls its fibers "threads" which causes confusion and disappointment. The underlying implementation is not important eg when you implement a "lock" using "events" does the lock become an event? No. This is a PR disaster. 100% agree we need a PR offensive but first we need a strategy. Erlang champions the actor/message paradigm so they dodge the threading bullet completely. What's the python championed parallelism paradigm? It should be on the front page of python.org and in the first paragraph of wikipedia on python. One of the Lua authors said this about threads: > (I can't remember the quote so I will paraphrase) > "How can you program in a language where 'a = a + 1' is not deterministic?" > Indeed. Anyone who cares enough about performance doesn't mind that 'a = a + 1' is only as deterministic as you design it to be with or without locks. Multiprocessing has this same problem btw. What Python needs are better libraries for concurrent programming based on > processes and coroutines. The killer feature for threads (vs multiprocessing) is access to shared state with nearly zero overhead. And note that a single-threaded event-driven process can serve 100,000 open > sockets -- while no JVM can create 100,000 threads. Graphics engines, simulations, games, etc don't want 100,000 threads, they just want true threads as many as there are CPU's. Yuval -------------- next part -------------- An HTML attachment was scrubbed... 
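The actor/message style Yuval points to can be sketched in pure Python today: an actor is just a thread (or process) that owns its state and is reachable only through a queue, so the state itself needs no locks. A minimal sketch (the counter actor and its message names are invented for illustration):

```python
import queue
import threading

def counter_actor(inbox, outbox):
    """Actor: owns its state; other threads reach it only via messages."""
    count = 0
    while True:
        msg = inbox.get()
        if msg == "incr":
            count += 1          # no lock needed: only this thread sees count
        elif msg == "get":
            outbox.put(count)
        elif msg == "stop":
            break

inbox, outbox = queue.Queue(), queue.Queue()
actor = threading.Thread(target=counter_actor, args=(inbox, outbox))
actor.start()

for _ in range(5):
    inbox.put("incr")
inbox.put("get")
result = outbox.get()           # messages are handled in FIFO order
inbox.put("stop")
actor.join()
print(result)                   # 5
```

This is of course only a convention, not a guarantee the way it is in Erlang; nothing in Python stops another thread from reaching into shared state directly.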
URL: From stefan_ml at behnel.de Fri Feb 10 12:23:04 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 10 Feb 2012 12:23:04 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: Yuval Greenfield, 10.02.2012 11:52: > GIL + Threads = Fibers No, that only applies (to a certain extent) to Python code being executed, not to the complete runtime which includes external libraries etc. Stefan From anacrolix at gmail.com Fri Feb 10 12:25:27 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 10 Feb 2012 19:25:27 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F34E393.9020105@hotpy.org> References: <4F34E393.9020105@hotpy.org> Message-ID: This. The process support is pretty good with multiprocessing but the coroutines are missing. On Feb 10, 2012 5:30 PM, "Mark Shannon" wrote: > > There are a lot of things covered in this thread. > I want to address 2 of them. > > 1. Garbage Collection. > > Python has garbage collection. There is no free() function in Python, > anyone who says that Python does not have GC is talking nonsense. > CPython using reference counting as its means of implementing GC. > > Ref counting has different performance characteristics from tracing GC, > but it only makes sense to consider this is the context of overall > Python performance. > One key disadvantage of ref-counting is that does not play well with > threads, which leads on to... > > 2. Global Interpreter Lock and Threads. > > The GIL is so deeply embedded into CPython that I think it cannot be > removed. There are too many subtle assumptions pervading both the VM and > 3rd party code, to make truly concurrent threads possible. > > But are threads the way to go? > Javascript does not have threads. 
Lua does not have threads. > Erlang does not have threads; Erlang processes are implemented (in the > BEAM engine) as coroutines. > > One of the Lua authors said this about threads: > (I can't remember the quote so I will paraphrase) > "How can you program in a language where 'a = a + 1' is not deterministic?" > Indeed. > > What Python needs are better libraries for concurrent programming based on > processes and coroutines. > > Cheers, > Mark. > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ubershmekel at gmail.com Fri Feb 10 14:57:56 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Fri, 10 Feb 2012 15:57:56 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: On Fri, Feb 10, 2012 at 1:23 PM, Stefan Behnel wrote: > Yuval Greenfield, 10.02.2012 11:52: > > GIL + Threads = Fibers > > No, that only applies (to a certain extent) to Python code being executed, > not to the complete runtime which includes external libraries etc. > Pure python code running in python "threads" on CPython behaves like fibers. I'd like to point out the word "external" in your statement. Yuval -------------- next part -------------- An HTML attachment was scrubbed... 
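The paraphrased Lua quote is easy to demonstrate in CPython: `a = a + 1` compiles to a separate load, add, and store, and the GIL can be handed to another thread between them. A small sketch; how many updates the unlocked counter actually loses depends on the interpreter's switch interval, so only the locked result is predicted here:

```python
import threading

N = 100_000
unsafe_total = 0
safe_total = 0
lock = threading.Lock()

def unsafe():
    global unsafe_total
    for _ in range(N):
        unsafe_total = unsafe_total + 1   # load/add/store: not atomic

def safe():
    global safe_total
    for _ in range(N):
        with lock:                        # serializes the read-modify-write
            safe_total = safe_total + 1

threads = [threading.Thread(target=f) for f in (unsafe, unsafe, safe, safe)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# unsafe_total can be anything up to 200000; safe_total is exactly 200000
print(unsafe_total, safe_total)
```

Lost updates can only make the unlocked counter smaller, never larger, which is what makes the bug so easy to miss on a lightly loaded machine.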
URL: From stefan_ml at behnel.de Fri Feb 10 15:01:54 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 10 Feb 2012 15:01:54 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: Yuval Greenfield, 10.02.2012 14:57: > On Fri, Feb 10, 2012 at 1:23 PM, Stefan Behnel wrote: >> Yuval Greenfield, 10.02.2012 11:52: >>> GIL + Threads = Fibers >> >> No, that only applies (to a certain extent) to Python code being executed, >> not to the complete runtime which includes external libraries etc. > > Pure python code running in python "threads" on CPython behaves like > fibers. I'd like to point out the word "external" in your statement. Yes, many people forget that existing/external/yourwordinghere code in non-Python languages is (and has always been) a substantial part of the Python platform. Stefan From anacrolix at gmail.com Fri Feb 10 15:48:07 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 10 Feb 2012 22:48:07 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <4606859F-1DCB-4B1C-8A6D-A875011B8128@masklinn.net> <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: > Pure python code running in python "threads" on CPython behaves like fibers. > I'd like to point out the word "external" in your statement. I don't believe this to be true. Fibers are not preempted. The GIL is released at regular intervals to allow the effect of preempted switching. Many other behaviours of Python threads are still native thread like, particularly in their interaction with other components and the OS. 
GIL + Threads = Simplified, non parallel interpreter From massimo.dipierro at gmail.com Fri Feb 10 15:52:16 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Fri, 10 Feb 2012 08:52:16 -0600 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F34E393.9020105@hotpy.org> References: <4F34E393.9020105@hotpy.org> Message-ID: <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> The way I see it, the issue is not whether Python has threads, fibers, coroutines, etc. The problem is that in 5 years we are going to have on the market CPUs with 100 cores (my phone has 2, my office computer has 8 not counting GPUs). The compiler/interpreters must be able to parallelize tasks using those cores without duplicating the memory space. Erlang may not have threads in the sense that it does not expose threads via an API but provides optional parallel schedulers where coroutines are distributed automatically over the available cores/CPUs (http://erlang.2086793.n4.nabble.com/Some-facts-about-Erlang-and-SMP-td2108770.html). Different languages have different mechanisms for taking advantage of multiple cores without forking. Python does not provide a mechanism and I do not know if anybody is working on one. In Python, currently, you can only do threading to parallelize your code without duplicating memory space, but performance decreases instead of increasing with the number of cores. This means threading is only good for concurrency, not for scalability. The GC vs reference counting (RC) question is the heart of the matter. With RC, every time a variable is allocated or deallocated you need to lock the counter because you do not know who else is accessing the same variable from another thread. This forces the interpreter to basically serialize the program even if you have threads, cores, coroutines, etc. Forking is a solution only for simple toy cases and in trivially parallel cases.
People use processes to parallelize web servers and task queues where the tasks do not need to talk to each other (except with the parent/master process). If you have 100 cores even with a small 50MB program, in order to parallelize it you go from 50MB to 5GB. Memory and memory access become a major bottleneck. Erlang Massimo On Feb 10, 2012, at 3:29 AM, Mark Shannon wrote: > > There are a lot of things covered in this thread. > I want to address 2 of them. > > 1. Garbage Collection. > > Python has garbage collection. There is no free() function in Python, > anyone who says that Python does not have GC is talking nonsense. > CPython uses reference counting as its means of implementing GC. > > Ref counting has different performance characteristics from tracing GC, > but it only makes sense to consider this in the context of overall > Python performance. > One key disadvantage of ref-counting is that it does not play well with threads, which leads on to... > > 2. Global Interpreter Lock and Threads. > > The GIL is so deeply embedded into CPython that I think it cannot be removed. There are too many subtle assumptions pervading both the VM and 3rd party code, to make truly concurrent threads possible. > > But are threads the way to go? > Javascript does not have threads. Lua does not have threads. > Erlang does not have threads; Erlang processes are implemented (in the BEAM engine) as coroutines. > > One of the Lua authors said this about threads: > (I can't remember the quote so I will paraphrase) > "How can you program in a language where 'a = a + 1' is not deterministic?" > Indeed. > > What Python needs are better libraries for concurrent programming based on processes and coroutines. > > Cheers, > Mark. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed...
URL: From stefan_ml at behnel.de Fri Feb 10 15:59:45 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 10 Feb 2012 15:59:45 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: Matt Joiner, 10.02.2012 15:48: >> Pure python code running in python "threads" on CPython behaves like fibers. >> I'd like to point out the word "external" in your statement. > > I don't believe this to be true. Fibers are not preempted. The GIL is > released at regular intervals to allow the effect of preempted > switching. Many other behaviours of Python threads are still native > thread like, particularly in their interaction with other components > and the OS. Absolutely. Even C extensions cannot always prevent a thread switch from happening when they need to call back into CPython's C-API. > GIL + Threads = Simplified, non parallel interpreter Note that this also applies to PyPy, so even "interpreter" isn't enough of a generalisation. I think it's best to speak of the GIL as what it is: a lock that protects internal state of the CPython runtime (and also some external code, when used that way). Rather convenient, if you ask me. Stefan From masklinn at masklinn.net Fri Feb 10 16:11:54 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 10 Feb 2012 16:11:54 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: On 2012-02-10, at 15:52 , Massimo Di Pierro wrote: > Erlang may not have threads in the sense that it does not expose threads via an API but provides optional parallel schedulers -smp has been enabled by default since R13 or R14, it's as optional as multithreading being optional because you can bind a process to a core. 
> In Python, currently, you can only do threading to parallelize your code without duplicating memory space, but performance decreases instead of increasing with number of cores. This means threading is only good for concurrency not for scalability. That's definitely not true, you can also fork and multiprocessing, while not ideal by a long shot, provides a number of tools for working building concurrent applications via multiple processes. From masklinn at masklinn.net Fri Feb 10 16:12:58 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 10 Feb 2012 16:12:58 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> Message-ID: <3BB3585B-EE07-4931-99CD-0FCF79D0557C@masklinn.net> On 2012-02-10, at 15:59 , Stefan Behnel wrote: >> GIL + Threads = Simplified, non parallel interpreter > > Note that this also applies to PyPy, so even "interpreter" isn't enough of > a generalisation. > > I think it's best to speak of the GIL as what it is: a lock that protects > internal state of the CPython runtime (and also some external code, when > used that way). Rather convenient, if you ask me. It is very convenient from the viewpoint of implementing the interpreter, but you must acknowledge that it comes with quite severe limitations on the ability of user code to take advantage of computing resources. From stefan_ml at behnel.de Fri Feb 10 16:28:11 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 10 Feb 2012 16:28:11 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: Massimo Di Pierro, 10.02.2012 15:52: > Different languages have different mechanisms for taking advantages of > multiple cores without forking. 
Python does not provide a mechanism and > I do not know if anybody is working on one. Seriously - what's wrong with forking? multiprocessing is so incredibly easy to use that it's hard for me to understand why anyone would fight for getting threading to do essentially the same thing, just less safely. Threading is a seriously hard problem, very tricky to get right and full of land mines. Basically, you start from a field that's covered with one big mine, and start cutting it down until you can get yourself convinced that the remaining mines (if any, right?) are small enough to not hurt anyone. They usually do anyway, but at least not right away. This is generally worth a read (not necessarily for the conclusion, but definitely for the problem description): http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf > In Python, currently, you can only do threading to parallelize your code > without duplicating memory space, but performance decreases instead of > increasing with number of cores. Well, nothing keeps you from putting your data into shared memory if you use multiple processes. It's not that hard either, but it has the major advantage over threading that you can choose exactly what data should be shared, so that you can more easily avoid race conditions and unintended interdependencies. Basically, you start from a safe split and then add explicit data sharing and messaging until you have enough shared data and synchronisation points to make it work, while still ending up with a safe and efficient concurrent system. Note how this is the opposite of threading, where you start off from the maximum possible unsafety where all state is shared, and then wade through it with a machete trying to cut down unsafe interaction points. And if you miss any one spot, you have a problem. > This means threading is only good for > concurrency not for scalability. Yes, concurrency, or more specifically, I/O concurrency is still a valid use case for threading.
> The GC vs reference counting (RC) is the hearth of the matter. With RC > every time a variable is allocated or deallocated you need to lock the > counter because you do know who else is accessing the same variable from > another thread. This forces the interpreter to basically serialize the > program even if you have threads, cores, coroutines, etc. > > Forking is a solution only for simple toy cases and in trivially > parallel cases. People use processes to parallelize web serves and task > queues where the tasks do not need to talk to each other (except with > the parent/master process). If you have 100 cores even with a small 50MB > program, in order to parallelize it you go from 50MB to 5GB. Memory and > memory access become a major bottle neck. I think you should read up a bit on the various mechanisms for parallel processing. Stefan From stefan_ml at behnel.de Fri Feb 10 16:30:12 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 10 Feb 2012 16:30:12 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <3BB3585B-EE07-4931-99CD-0FCF79D0557C@masklinn.net> References: <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> <3BB3585B-EE07-4931-99CD-0FCF79D0557C@masklinn.net> Message-ID: Masklinn, 10.02.2012 16:12: > On 2012-02-10, at 15:59 , Stefan Behnel wrote: >>> GIL + Threads = Simplified, non parallel interpreter >> >> Note that this also applies to PyPy, so even "interpreter" isn't enough of >> a generalisation. >> >> I think it's best to speak of the GIL as what it is: a lock that protects >> internal state of the CPython runtime (and also some external code, when >> used that way). Rather convenient, if you ask me. > > It is very convenient from the viewpoint of implementing the interpreter, > but you must acknowledge that it comes with quite severe limitations on > the ability of user code to take advantage of computing resources. I don't think it does. 
See my other post just now in response to Massimo. Stefan From jimjjewett at gmail.com Fri Feb 10 16:38:02 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 10 Feb 2012 10:38:02 -0500 Subject: [Python-ideas] [Python-Dev] matrix operations on dict :) In-Reply-To: References: Message-ID: On Thu, Feb 9, 2012 at 7:11 PM, Mark Janssen wrote: > On Wed, Feb 8, 2012 at 9:54 AM, julien tayon wrote: >> >> 2012/2/7 Mark Janssen : >> > On Mon, Feb 6, 2012 at 6:12 PM, Steven >> > D'Aprano wrote: >> > I have the problem looking for this solution! >> > The application for this functionality is in coding a fractal graph (or >> > "multigraph" in the literature). I think that would be better represented using an object of some sort, such as a MultiGraphNode and/or MultiGraphEdge, instead of re-purposing dict. > Okay, I guess I did not make myself very clear. What I'm proposing probably > will (eventually) require changes to the "object" model of Python That means you're talking about Python 4, at a minimum, and you would need to show how valuable it is by building a workaround version and getting people to use that extensively in Python 3. And frankly, you should probably do that anyhow; this feels to me like a bad plan for language defaults, but it is still a valid use case -- and I don't think this sort of math exploration should (or will) wait for Python 4; people will model it somehow in an existing language. > The symbol that denotes a compound would be the colon > (":") and associates a left hand side with right-hand side value, > a NAME with a VALUE. A dictionary would (then) be a SET > of these. (Voila! things have already gotten simplified.) That sounds like an association list. I think you're dealing with sufficiently abstract problems that you don't want to restrict your keys to hashable things, and it is worth suffering a bit slower performance in return.
> Eventually, I also think this will segue and integrate > nicely into Mark Shannon's "shared-key dict" proposal (PEP 412). I'm pretty sure he doesn't intend to change the semantics of dict at all. He does want to make the implementation more efficient, at least in terms of space; any semantic differences are considered either bugs or costs worth paying for that efficiency. > While in the abstract one might think to allow any arbitrary data-type for > right-hand-side values, in PRACTICE, integers are sufficient. By integers, do you really mean pointers or (possibly abstract) references to other structures? Because if you do, then ordinary arithmetic isn't the right solution, but if you don't, then I don't see them as sufficient. > ... It makes sense to use ... a maximally abstract top-most level -- > this is simply an abstract grouping type (i.e. a collection). I'm going to > suggest a SET is the most abstract (i.e. sufficient) because it does > not impose an order I agree that it is the most abstract (at least of the well-known) type, and that all the other types can be represented in terms of sets. The catch is that these representations may be massively inefficient. If you're doing mathematical exploration, that may be a reasonable tradeoff, but Python also caters to other use cases. > (Could one represent a python CLASS hierarchy more simply with this > fractalset object somehow....?) Depending on what you want to represent, probably. But if you want to represent the ancestors of a given class for efficient method and attribute access, then no; it is hard to beat an array for efficiency of sequential access.
-jJ From arnodel at gmail.com Fri Feb 10 16:43:54 2012 From: arnodel at gmail.com (Arnaud Delobelle) Date: Fri, 10 Feb 2012 15:43:54 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: On 10 February 2012 14:52, Massimo Di Pierro wrote: > Forking is a solution only for simple toy cases and in trivially parallel > cases. People use processes to parallelize web serves and task queues where > the tasks do not need to talk to each other (except with the parent/master > process).?If you have 100 cores even with a small 50MB program, in order to > parallelize it you go from 50MB to 5GB. Memory and memory access become a > major bottle neck. I don't know much about forking, but I'm pretty sure that forking a process doesn't mean you double the amount of physical memory used. With copy-on-write, a lot of physical memory can be shared. -- Arnaud From ncoghlan at gmail.com Fri Feb 10 16:57:12 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2012 01:57:12 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <20120209185810.GC20556@mcnabbs.org> <20120209222540.GD20556@mcnabbs.org> <623AA797-6F68-4158-82D1-A21872B34782@masklinn.net> <3BB3585B-EE07-4931-99CD-0FCF79D0557C@masklinn.net> Message-ID: On Sat, Feb 11, 2012 at 1:30 AM, Stefan Behnel wrote: > Masklinn, 10.02.2012 16:12: >> On 2012-02-10, at 15:59 , Stefan Behnel wrote: >>>> GIL + Threads = Simplified, non parallel interpreter >>> >>> Note that this also applies to PyPy, so even "interpreter" isn't enough of >>> a generalisation. >>> >>> I think it's best to speak of the GIL as what it is: a lock that protects >>> internal state of the CPython runtime (and also some external code, when >>> used that way). Rather convenient, if you ask me. 
>> >> It is very convenient from the viewpoint of implementing the interpreter, >> but you must acknowledge that it comes with quite severe limitations on >> the ability of user code to take advantage of computing resources. > > I don't think it does. See my other post just now in response to Massimo. Armin Rigo's series on Software Transactional Memory on the PyPy blog is also required reading for anyone seriously interested in practical shared memory concurrency that doesn't impose a horrendous maintenance burden on developers that try to use it: http://morepypy.blogspot.com.au/2011/06/global-interpreter-lock-or-how-to-kill.html http://morepypy.blogspot.com.au/2011/08/we-need-software-transactional-memory.html http://morepypy.blogspot.com.au/2012/01/transactional-memory-ii.html And for those that may be inclined to dismiss STM as pie-in-the-sky stuff that is never going to be practical in the "real world", the best I can offer is Intel's plans to bake an initial attempt at it into a consumer grade chip within the next couple of years: http://arstechnica.com/business/news/2012/02/transactional-memory-going-mainstream-with-intel-haswell.ars? I do like Armin's analogy that free threading is to concurrency as malloc() and free() are to memory management :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Fri Feb 10 16:58:33 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2012 01:58:33 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: On Sat, Feb 11, 2012 at 1:43 AM, Arnaud Delobelle wrote: > I don't know much about forking, but I'm pretty sure that forking a > process doesn't mean you double the amount of physical memory used. > With copy-on-write, a lot of physical memory can be shared. 
Unfortunately, CPython's use of refcounting plays merry hell with the effectiveness of copy-on-write memory saving techniques. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From massimo.dipierro at gmail.com Fri Feb 10 17:05:44 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Fri, 10 Feb 2012 10:05:44 -0600 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: <93FAD103-1A38-44FC-9A2B-91EC58FB7BE3@gmail.com> On Feb 10, 2012, at 9:28 AM, Stefan Behnel wrote: > Massimo Di Pierro, 10.02.2012 15:52: >> >> Forking is a solution only for simple toy cases and in trivially >> parallel cases. People use processes to parallelize web serves and task >> queues where the tasks do not need to talk to each other (except with >> the parent/master process). If you have 100 cores even with a small 50MB >> program, in order to parallelize it you go from 50MB to 5GB. Memory and >> memory access become a major bottle neck. > > I think you should read up a bit on the various mechanisms for parallel > processing. yes I should ;-) (Perhaps I should take this course http://www.cdm.depaul.edu/academics/pages/courseinfo.aspx?CrseId=001533) The fact is, in my experience, many modern applications where performance is important try to take advantage of all parallelization available. I have worked on many years in lattice QCD and I have written code that runs on various parallel machines. We used processes to parallelize across nodes, threads to parallelize on single node, and assembly vectorial instructions to parallelize within each core. This used to be a state of art way of programming but now I see these patters trickling down to many consumer applications, for example games. People do not like threads because of the need for locking but, as you increase the number of cores, the bottle neck becomes memory access. 
If you use processes, you don't just bloat ram usage killing cache performance but you need to use message passing for interprocess communication. Message passing requires a copy of the data, which is expensive (remember ram is the bottleneck). Even worse, sometimes message passing cannot be done using ram only and you need disk-buffered messages for interprocess communication. Some programs are parallelized ok with processes. Those I have experience with require both processes and threads. Again, this does not mean using threading APIs. The VM should use threads to parallelize tasks. How this is exposed to the developer is a different matter. Massimo -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.dipierro at gmail.com Fri Feb 10 17:07:09 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Fri, 10 Feb 2012 10:07:09 -0600 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: <0A002488-3C24-4C3B-B213-0F7CF4C3E65C@gmail.com> On Feb 10, 2012, at 9:43 AM, Arnaud Delobelle wrote: > On 10 February 2012 14:52, Massimo Di Pierro wrote: >> Forking is a solution only for simple toy cases and in trivially parallel >> cases. People use processes to parallelize web serves and task queues where >> the tasks do not need to talk to each other (except with the parent/master >> process). If you have 100 cores even with a small 50MB program, in order to >> parallelize it you go from 50MB to 5GB. Memory and memory access become a >> major bottle neck. > > I don't know much about forking, but I'm pretty sure that forking a > process doesn't mean you double the amount of physical memory used. > With copy-on-write, a lot of physical memory can be shared. Anyway, copy-on-write does not solve the problem.
The OS tries to save memory by not duplicating physical memory: it maps the different address spaces of the various forked processes onto the same physical memory. But as soon as one process writes into a shared page, that page is copied. It has to be: the processes' address spaces must remain independent. That is what fork does. Anyway, there are many applications that are parallelized well with processes (at least for a small number of cores/cpus). > > -- > Arnaud From stefan_ml at behnel.de Fri Feb 10 17:07:12 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 10 Feb 2012 17:07:12 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: Arnaud Delobelle, 10.02.2012 16:43: > On 10 February 2012 14:52, Massimo Di Pierro wrote: >> Forking is a solution only for simple toy cases and in trivially parallel >> cases. People use processes to parallelize web serves and task queues where >> the tasks do not need to talk to each other (except with the parent/master >> process). If you have 100 cores even with a small 50MB program, in order to >> parallelize it you go from 50MB to 5GB. Memory and memory access become a >> major bottle neck. > > I don't know much about forking, but I'm pretty sure that forking a > process doesn't mean you double the amount of physical memory used. > With copy-on-write, a lot of physical memory can be shared. That applies to systems that support both fork and copy-on-write. Not all systems are that lucky, although many major Unices have caught up in recent years. The Cygwin implementation of fork() is especially involved, for example, simply because Windows lacks this idiom completely (well, in its normal non-POSIX identity, that is, where basically all Windows programs run).
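The copy-on-write behavior being discussed can be seen with a short sketch (POSIX-only, since os.fork() does not exist on Windows; the data size is arbitrary):

```python
import os

# Parent builds a sizeable structure; after fork() the child sees it
# without any explicit copy -- the OS shares the pages copy-on-write.
data = list(range(1000000))

pid = os.fork()
if pid == 0:
    # Child process: it can read the parent's data directly.
    os._exit(0 if data[123456] == 123456 else 1)
else:
    _, status = os.waitpid(pid, 0)
    print("child saw parent data:", os.WEXITSTATUS(status) == 0)
```

Note that under CPython even this read-only access dirties pages, because reading an object touches its refcount, which is the refcounting/copy-on-write interaction Nick mentions at the top of this thread.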
http://seit.unsw.adfa.edu.au/staff/sites/hrp/webDesignHelp/cygwin-ug-net-nochunks.html#OV-HI-PROCESS Stefan From phd at phdru.name Thu Feb 9 16:28:42 2012 From: phd at phdru.name (Oleg Broytman) Date: Thu, 9 Feb 2012 19:28:42 +0400 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <0E6116E4-A406-4C49-A8C1-C18D6228C141@masklinn.net> Message-ID: <20120209152842.GA15149@iskra.aviel.ru> On Thu, Feb 09, 2012 at 05:13:03PM +0200, Yuval Greenfield wrote: > On Thu, Feb 9, 2012 at 5:05 PM, Masklinn wrote: > > On 2012-02-09, at 15:36 , anatoly techtonik wrote: > > > Hi, > > > > > > I didn't want to grow FUD on python-dev, but a FUD there seems to be a > > good > > > topic for discussion here. > > > http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html > > > > 1. Python-ideas is not the right place for this stuff (neither is > > Python-dev, by the way) > > 2. Why would anybody care exactly? > > 1. Where would be the correct place to talk about a grand state of > python affairs? Nowhere because: 1. Nobody cares. This is Free Software, and we are scratching our own itches. 2. Do you consider Python developers stupid? Do you think they don't have any idea how things are going on in the wild? > 2. Like it or not, many use such ratings to decide which language to > learn, which language to use for their next project and whether or not to > be proud of their language of choice. Java (or Perl, or whatever) has won, hands down. Congrats to them! Can we please return to our own development? We are not going to conquer the world, are we? Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
From breamoreboy at yahoo.co.uk Fri Feb 10 17:33:53 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 10 Feb 2012 16:33:53 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: On 10/02/2012 16:07, Stefan Behnel wrote: > Arnaud Delobelle, 10.02.2012 16:43: >> On 10 February 2012 14:52, Massimo Di Pierro wrote: >>> Forking is a solution only for simple toy cases and in trivially parallel >>> cases. People use processes to parallelize web serves and task queues where >>> the tasks do not need to talk to each other (except with the parent/master >>> process). If you have 100 cores even with a small 50MB program, in order to >>> parallelize it you go from 50MB to 5GB. Memory and memory access become a >>> major bottle neck. >> >> I don't know much about forking, but I'm pretty sure that forking a >> process doesn't mean you double the amount of physical memory used. >> With copy-on-write, a lot of physical memory can be shared. > > That applies to systems that support both fork and copy-on-write. Not all > systems are that lucky, although many major Unices have caught up in recent > years. > > The Cygwin implementation of fork() is especially involved for example, > simple because Windows lacks this idiom completely (well, in it's normal > non-POSIX identity, that is, where basically all Windows programs run). > > http://seit.unsw.adfa.edu.au/staff/sites/hrp/webDesignHelp/cygwin-ug-net-nochunks.html#OV-HI-PROCESS > > Stefan For those who don't follow c.l.p a thread subject "Fabric Engine + Python bechmarks" turned up 30 minutes ago. Problem solved? :) -- Cheers. Mark Lawrence. 
From dreamingforward at gmail.com Fri Feb 10 17:56:02 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Fri, 10 Feb 2012 09:56:02 -0700 Subject: [Python-ideas] [Python-Dev] matrix operations on dict :) In-Reply-To: References: Message-ID: On Fri, Feb 10, 2012 at 8:38 AM, Jim Jewett wrote: > On Thu, Feb 9, 2012 at 7:11 PM, Mark Janssen > wrote: > > On Wed, Feb 8, 2012 at 9:54 AM, julien tayon wrote: > >> > >> 2012/2/7 Mark Janssen : > >> > On Mon, Feb 6, 2012 at 6:12 PM, Steven > >> > D'Aprano wrote: > > >> > I have the problem looking for this solution! > > >> > The application for this functionality is in coding a fractal graph > (or > >> > "multigraph" in the literature). > > I think that would be better represented using an object of some sort, > such as a MultiGraphNode and/or MultiGraphEdge, instead of > re-purposing dict. > Those would be good strategies in general, but the issue is how things hook together in the object model. These things are very abstract; it's exactly what made metaclasses difficult to "grok" at times. I'll probably just have to try to implement them in PyPy or abandon the idea. > > > Okay, I guess I did not make myself very clear. What I'm proposing > probably > > will (eventually) require changes to the "object" model of Python > > That means you're talking about Python 4, at a minimum, and you would > need to show how valuable it is by building a workaround version and > getting people to use that extensively in Python 3. > > I understand what you're saying, but for me, as for many of us, Python 3000 never really happened. So this is really still for what was dreamed to happen in version 3. > > > While in the abstract one might think to allow any arbitrary data-type > for > > right-hand-side values, in PRACTICE, integers are sufficient. > > By integers, do you really mean pointers or (possibly abstract) > references to other structures?
Because if you do, then ordinary > arithmetic isn't the right solution, but if you don't, then I don't > see them as sufficient. > > No, actually Python integers. My point is that within this unified information model, everything can be represented by atomic units (where integers come in) and groups (or a collection type). Compare with how all the complexity of the physical world is a product of small-massed electrons and protons. I'm arguing that all the uses of data can be represented in a similar way. Thanks for the reply, but I think I'll shelve the discussion for now.... mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Feb 10 18:38:01 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 10 Feb 2012 18:38:01 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: <20120210183801.59921627@pitrou.net> On Fri, 10 Feb 2012 08:52:16 -0600 Massimo Di Pierro wrote: > The way I see it is not whether Python has threads, fibers, coroutines, etc. > The problem is that in 5 years we going to have on the market CPUs with > 100 cores This is definitely untrue. No CPU maker has plans for a general-purpose 100-core CPU. Regards Antoine. From mwm at mired.org Fri Feb 10 19:34:10 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 10 Feb 2012 10:34:10 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: <20120210103410.4a5d5841@bhuda.mired.org> On Fri, 10 Feb 2012 08:52:16 -0600 Massimo Di Pierro wrote: > Forking is a solution only for simple toy cases and in trivially parallel cases. But threading is only a solution for simple toy cases and trivial levels of scaling.
> People use processes to parallelize web serves and task queues where > the tasks do not need to talk to each other (except with the > parent/master process). Only if they haven't thought much about using processes to build parallel systems. They work quite well for data that can be handed off to the next process, and where the communications is a small enough part of the problem that serializing it for communications is reasonable, and for cases where the data that needs high-speed communications can be treated as a relocatable chunk of memory. And any combination of those three, of course. The real problem with using processes in python is that there's no way to share complex python objects between processes - you're restricted to ctypes values or arrays of those. For many applications, that's fine. If you need to share a large searchable structure, you're reduced to FORTRAN techniques. > If you have 100 cores even with a small 50MB program, in order to > parallelize it you go from 50MB to 5GB. Memory and memory access > become a major bottle neck. That should be fixed in the OS, not by making your problem 2**100 times as hard to analyze. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From mal at egenix.com Fri Feb 10 19:36:52 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 10 Feb 2012 19:36:52 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: <4F3563C4.2050703@egenix.com> Massimo Di Pierro wrote: > Forking is a solution only for simple toy cases and in trivially parallel cases. People use processes to parallelize web serves and task queues where the tasks do not need to talk to each other (except with the parent/master process). 
If you have 100 cores even with a small 50MB program, in order to parallelize it you go from 50MB to 5GB. Memory and memory access become a major bottle neck. By the time we have 100-core CPUs, we'll be measuring RAM in TB, so that shouldn't be a problem ;-) Many Python use cases are indeed easy to scale using multiple processes which then each run on a separate core, so that approach is a very workable way forward. If you need to share data across processes, you can use a shared memory mechanism. In many cases, the data to be shared will already be stored in a database and those can easily be accessed from all processes (again using shared memory). I often find these GIL discussions a bit theoretical. In practice I've so far never run into any issues with Python scalability. It's other components that cause a lot more trouble, like e.g. database query scalability, network congestion or disk access being too slow. In cases where the GIL does cause problems, it's usually better to consider changing the application design and use asynchronous processing with a single threaded design or a multi-process design where each of the processes only uses a low number of threads (20-50 per process). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 10 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From mwm at mired.org Fri Feb 10 19:52:08 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 10 Feb 2012 10:52:08 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3563C4.2050703@egenix.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> Message-ID: <20120210105208.6f133329@bhuda.mired.org> On Fri, 10 Feb 2012 19:36:52 +0100 "M.-A. Lemburg" wrote: > In cases where the GIL does cause problems, it's usually better to > consider changing the application design and use asynchronous processing > with a single threaded design or a multi-process design where each of > the processes only uses a low number of threads (20-50 per process). Just a warning: mixing threads and forks can be hazardous to your sanity. In particular, forking a process that has threads running has behaviors, problems and solutions that vary between Unix variants. Best to make sure you've done all your forks before you create a thread if you want your code to be portable. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From jnoller at gmail.com Fri Feb 10 19:55:54 2012 From: jnoller at gmail.com (Jesse Noller) Date: Fri, 10 Feb 2012 13:55:54 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3563C4.2050703@egenix.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> Message-ID: <2B72BFF25CAD476C9785EC6960B43344@gmail.com> > > > By the time we 100 core CPUs, we'll be measuring RAM in TB, so that > shouldn't be a problem ;-) > > Many Python use cases are indeed easy to scale using multiple processes > which then each run on a separate core, so that approach is a very > workable way forward. 
> > If you need to share data across processes, you can use a shared > memory mechanism. In many cases, the data to be shared will already > be stored in a database and those can easily be accessed from all > processes (again using shared memory). > > I often find these GIL discussion a bit theoretical. In practice > I've so far never run into any issues with Python scalability. It's > other components that cause a lot more trouble, like e.g. database > query scalability, network congestion or disk access being too slow. > > In cases where the GIL does cause problems, it's usually better to > consider changing the application design and use asynchronous processing > with a single threaded design or a multi-process design where each of > the processes only uses a low number of threads (20-50 per process). I think the much, much better response to the questions and comments around Python, the GIL and parallel computing in general is this: Yes, let's have more of that! It's like asking if people like pie, or babies. 99% of people polled are going to say "Yes, let's have more of that!" - so it goes with Python, the GIL, STM, Multiprocessing, Threads, etc. Where all of these discussions break down - and they always do - is that we lack: 1> Someone with a working patch for Pie 2> Someone with a fleshed out proposal/PEP on how to get more Pie 3> A group of people with time to bake more Pies that could help be paid to make Pie Banging on the table and asking for more Pie won't get us more Pie - what we need are actual proposals, in the form of well thought out PEPs, the people to implement and maintain the thing (see: unladen swallow), or working implementations. No one in this thread is arguing that having more Pie, or babies, would be bad. No one is arguing that more/better concurrency constructs would be good. Tools like concurrent.futures in Python 3 would be a good example of something recently added. The problem is people, plans and time. 
If we can solve the People and Time problems instead of looking to already overworked volunteers, then I'm sure we can come up with a good Pie plan. I really like pie. Jesse From mal at egenix.com Fri Feb 10 20:15:21 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 10 Feb 2012 20:15:21 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120210105208.6f133329@bhuda.mired.org> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <20120210105208.6f133329@bhuda.mired.org> Message-ID: <4F356CC9.6070006@egenix.com> Mike Meyer wrote: > On Fri, 10 Feb 2012 19:36:52 +0100 > "M.-A. Lemburg" wrote: >> In cases where the GIL does cause problems, it's usually better to >> consider changing the application design and use asynchronous processing >> with a single threaded design or a multi-process design where each of >> the processes only uses a low number of threads (20-50 per process). > > Just a warning: mixing threads and forks can be hazardous to your > sanity. In particular, forking a process that has threads running has > behaviors, problems and solutions that vary between Unix > variants. Best to make sure you've done all your forks before you > create a thread if you want your code to be portable. Right. Applications using such strategies will usually have long-running processes, so it's often better to spawn new processes than to use fork. This also helps if you want to bind processes to cores.
:::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From sturla at molden.no Fri Feb 10 20:54:35 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 10 Feb 2012 20:54:35 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3563C4.2050703@egenix.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> Message-ID: <4F3575FB.60700@molden.no> On 10.02.2012 19:36, M.-A. Lemburg wrote: > By the time we 100 core CPUs, we'll be measuring RAM in TB, so that > shouldn't be a problem ;-) Actually, Python is already great for those. They are called GPUs, and OpenCL is all about text processing. > In cases where the GIL does cause problems, it's usually better to > consider changing the application design and use asynchronous processing > with a single threaded design or a multi-process design where each of > the processes only uses a low number of threads (20-50 per process). The "GIL problem" is much easier to analyze than most Python developers using Linux might think: - Windows has no fork system call. SunOS used to have a very slow fork system call. The majority of Java developers worked with Windows or Sun, and learned to work with threads. For which the current summary is: - The GIL sucks because Windows has no fork. Which some might say is the equivalent of: - Windows sucks. Sturla From sturla at molden.no Fri Feb 10 20:57:31 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 10 Feb 2012 20:57:31 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <20120209104237.154be949@bhuda.mired.org> <4F341710.9030806@molden.no> Message-ID: <4F3576AB.40202@molden.no> On 10.02.2012 10:51, Serhiy Storchaka wrote: > What about os.fork()? MPI starts by spawning a group of empty processes. 
If you use these massively parallel computers, you have to play by the MPI rules. Sturla From sturla at molden.no Fri Feb 10 21:01:07 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 10 Feb 2012 21:01:07 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> Message-ID: <4F357783.3050300@molden.no> On 09.02.2012 22:05, Nick Coghlan wrote: > Have you even *tried* concurrent.futures > (http://docs.python.org/py3k/library/concurrent.futures)? Or the 2.x > backport on PyPI (http://pypi.python.org/pypi/futures)? Multiprocessing is fine, but it uses pickle for IPC, and this is inefficient. We need unpickled, type-specialized queues. Or a queue that has the interface of a binary file. Sturla From solipsis at pitrou.net Fri Feb 10 21:02:09 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 10 Feb 2012 21:02:09 +0100 Subject: [Python-ideas] multiprocessing IPC References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> <4F357783.3050300@molden.no> Message-ID: <20120210210209.277a50da@pitrou.net> On Fri, 10 Feb 2012 21:01:07 +0100 Sturla Molden wrote: > On 09.02.2012 22:05, Nick Coghlan wrote: > > > Have you even *tried* concurrent.futures > > (http://docs.python.org/py3k/library/concurrent.futures)? Or the 2.x > > backport on PyPI (http://pypi.python.org/pypi/futures)? > > Multiprocessing is fine, but is uses pickle for IPC and this is > inefficient. We need unpickled, type-specialized queues. Or a queue that > has the interface of a binary file. If you have any concrete idea for that, don't hesitate to post it on the bug tracker, or here under a separate thread (this thread is a train wreck). Regards Antoine.
From mwm at mired.org Fri Feb 10 22:15:52 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 10 Feb 2012 13:15:52 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F357783.3050300@molden.no> References: <4F340A81.60300@pearwood.info> <10381712-394F-47F9-986D-8D4A7679CC69@gmail.com> <4F341419.6030808@molden.no> <4F357783.3050300@molden.no> Message-ID: <20120210131552.487b1f9d@bhuda.mired.org> On Fri, 10 Feb 2012 21:01:07 +0100 Sturla Molden wrote: > On 09.02.2012 22:05, Nick Coghlan wrote: > > > Have you even *tried* concurrent.futures > > (http://docs.python.org/py3k/library/concurrent.futures)? Or the 2.x > > backport on PyPI (http://pypi.python.org/pypi/futures)? > > Multiprocessing is fine, but is uses pickle for IPC and this is > inefficient. We need unpickled, type-specialized queues. Or a queue that > has the interface of a binary file. In what way does the mmap module fail to provide your binary file interface? http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From mal at egenix.com Fri Feb 10 23:31:59 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 10 Feb 2012 23:31:59 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3575FB.60700@molden.no> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> Message-ID: <4F359ADF.3060202@egenix.com> Sturla Molden wrote: > On 10.02.2012 19:36, M.-A. Lemburg wrote: >> In cases where the GIL does cause problems, it's usually better to >> consider changing the application design and use asynchronous processing >> with a single threaded design or a multi-process design where each of >> the processes only uses a low number of threads (20-50 per process). 
> > The "GIL problem" is much easier to analyze than most Python developers using Linux might think: > > - Windows has no fork system call. SunOS used to have a very slow fork system call. The majority of > Java developers worked with Windows or Sun, and learned to work with threads. > > For which the current summary is: > > - The GIL sucks because Windows has no fork. > > Which some might say is the equivalent of: > > - Windows sucks. I'm not sure why you think you need os.fork() in order to work with multiple processes. Spawning processes works just as well and, often enough, is all you really need to get the second variant working. The first variant doesn't need threads at all, but can not always be used since it requires all application components to play along nicely with the async approach. I forgot to mention a third variant: use a multi-process design with single threaded asynchronous processing in each process. This third variant is becoming increasingly popular, esp. if you have to handle lots and lots of individual requests with relatively low need for data sharing between the requests. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 10 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From jimjjewett at gmail.com Sat Feb 11 00:33:42 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 10 Feb 2012 18:33:42 -0500 Subject: [Python-ideas] Py3 unicode impositions Message-ID: On Fri, Feb 10, 2012 at 3:41 AM, Stephen J. Turnbull wrote: > Terry Reedy writes: > > > > In python 2 there was no such a strong imposition [of Unicode > > > awareness on users]. > > Nor is there in 3.x. > Sorry, Terry, but you're basically wrong here. True, if one sticks to > pure ASCII, there's no difference to notice, but that's just not > possible for people who live outside of the U.S., or who share text > with people outside of the U.S. They need currency symbols, they > have friends whose names have little dots on them. Every single > one of those is a backtrace waiting to happen. A backtrace on > f = open('text-file.txt') > for line in f: pass > is an imposition. That doesn't happen in 2.x (for the wrong reasons, > but it's very convenient 95% of the time). I may be missing something, but as best I can tell (1) That uses an implicit encoding of None. (2) encoding=None is documented as being platform-dependent. Are you saying that some (many? all?) platforms make a bad choice there? Does that only happen when sys.getdefaultencoding() != sys.getfilesystemencoding(), or when one of them gives bad information? (FWIW, on a mostly ASCII windows machine, the default is utf-8 but the filesystem encoding is mbcs, so merely being different doesn't always provoke problems.) Would it cause problems to make the default be whatever locale returns, or whatever it returns the first time open is called?
-jJ From sturla at molden.no Sat Feb 11 00:36:15 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 11 Feb 2012 00:36:15 +0100 Subject: [Python-ideas] multiprocessing IPC Message-ID: <4F35A9EF.7030309@molden.no> On 10.02.2012 22:15, Mike Meyer wrote: > In what way does the mmap module fail to provide your binary file > interface? References: <4F35A9EF.7030309@molden.no> Message-ID: On Feb 10, 2012, at 6:36 PM, Sturla Molden wrote: > On 10.02.2012 22:15, Mike Meyer wrote: >> In what way does the mmap module fail to provide your binary file interface? > The short answer is that BSD mmap creates an anonymous kernel object. When working with multiprocessing for a while, one comes to the conclusion that we really need named kernel objects. > > Here are two simple fail cases for anonymous kernel objects: > > - Process A spawns/forks process B. > - Process B creates an object, one of the attributes is a lock. > - Fail: This object cannot be communicated back to process A. B inherits from A, A does not inherit from B. > > - Process A spawns/forks a process pool. > - Process A creates an object, one of the attributes is a lock. > - Fail: This object cannot be communicated to the pool. They do not inherit new handles from A after they are started. > > All of multiprocessing's IPC classes suffer from this! > > Solution: > > Use named kernel objects for IPC, pickle the name. > > I made a shared memory array for NumPy that works like this -- implemented by memory mapping from the paging file on Windows, System V IPC on Linux. Underneath is an extension class that allocates a shared memory buffer. When pickled it encodes the kernel name, not its content, and unpickling opens the object given its name. > > There is another drawback too: > > The speed of pickle. For example, sharing NumPy arrays is not faster with shared memory, because the overhead from pickle completely dominates the time needed for IPC. That is why I want a type-specialized or a binary channel.
Making this from the named shared memory class I already have is a no-brainer. > > So that is my other objection against multiprocessing. > > That is: > > 1. Object sharing by handle inheritance fails when kernel objects must be passed back to the parent process or to a process pool. We need IPC objects that have a name in the kernel, so they can be created and shared after the fact. > > 2. IPC with multiprocessing is too slow due to pickle. We need something that does not use pickle. (E.g. shared memory, but not by means of mmap.) It might be that the pipe or socket in multiprocessing will do this (I have not looked at it carefully enough), but they still don't have > > Proof of concept: > > http://dl.dropbox.com/u/12464039/sharedmem-feb12-2009.zip > > Dependency on Cython and NumPy should probably be removed, never mind that. Important part is this: > > sharedmemory_sysv.pyx (Linux) > sharedmemory_win.pyx and ntqueryobject.c (Windows) > > Finally, I'd like to say that I think Python's standard lib should support high-performance asynchronous I/O for concurrency. That is not poll/select (on Windows it does not even work properly). Rather, I want IOCP on Windows, epoll on Linux, and kqueue on Mac. (Yes I know about twisted.) There should also be a requirement that it works with multiprocessing. E.g. if we open a process pool, the processes should be able to use the same IOCP. In other words, some highly scalable asynchronous I/O that works with multiprocessing. > > So ... As far as I am concerned, the only thing worth keeping in multiprocessing is multiprocessing.Process and multiprocessing.Pool. The rest doesn't do what we want. > > > Sturla > Sturla, I think I've talked to you before - patches to improve multiprocessing from you are definitely welcome, and needed. I disagree with tossing as much out as you are suggesting - managers are pretty useful, for example, but the entire team and especially me would welcome patches to improve things.
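For reference, the "pickle the name, not the content" design Sturla sketches eventually appeared in the standard library as multiprocessing.shared_memory (Python 3.8); a single-process illustration of attaching by name:

```python
from multiprocessing import shared_memory

# Create a named block; the name (a short string) is all another
# process would need in order to attach.
shm = shared_memory.SharedMemory(create=True, size=1024)
shm.buf[:5] = b"hello"

# Attach by name -- this could just as well happen in a different
# process that received only shm.name.
other = shared_memory.SharedMemory(name=shm.name)
payload = bytes(other.buf[:5])
print(payload)  # -> b'hello'

other.close()
shm.close()
shm.unlink()
```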
Jesse From tjreedy at udel.edu Sat Feb 11 01:07:51 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 10 Feb 2012 19:07:51 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 2/10/2012 3:41 AM, Stephen J. Turnbull wrote: > Terry Reedy writes: > >>> In python 2 there was no such a strong imposition [of Unicode >>> awareness on users]. The claim is that Python3 imposes a large burden on users that Python2 does not. >> Nor is there in 3.x. I view that claim as FUD, at least for many users, and at least until the persons making the claim demonstrate it. In particular, I claim that people who use Python2 knowing nothing of unicode do not need to know much more to do the same things in Python3. And, if someone uses Python2 with full knowledge of Unicode, that Python3 cannot impose any extra burden. Since I am claiming negatives, the burden of proof is on those who claim otherwise. > Sorry, Terry, but you're basically wrong here. This is not a nice way to start a response, especially when you go on to admit that I was right as to the user case I discussed. Here is what you clipped. >> If one only uses the ascii subset, the usage of 3.x strings is transparent. > True, if one sticks to pure ASCII, there's no difference to notice, Which is a restatement of what you clipped. In another post I detailed the *small* amount (one paragraph) that I believe such people need to know to move to Python3. I have not seen this minimum laid out before and I think it would be useful to help such people move to Python3 without FUD fear. > but that's just not possible for people who live outside of the U.S., Who *already* have to know about more than ascii to use Python2. The question is whether they have to learn *substantially* more to use Python3. > or who share text with people outside of the U.S.
> They need currency symbols, they have friends whose names > have little dots on them. OK, real-life example. My wife has colleagues in China. They interchange emails (utf-8 encoded) with project budgets and some Chinese characters. Suppose she asks me to use Python to pick out ? renminbi/yuan figures and convert to dollars. What 'strong imposition' does Python3 make to learn things I would not have to know to do the same thing in Python2? > Every single one of those is a backtrace waiting to happen. > A backtrace on > f = open('text-file.txt') > for line in f: pass I do not consider that adding an encoding argument to make the same work in Python3 to be "a strong imposition of unicode awareness". Do you? In order to do much other than pass, I believe one typically needs to know the encoding of the file, even in Python2. And of course, knowing about and using the one unicode byte encoding is *much* easier than knowing about and using the 100 or so non-unicode (or unicode subset) encodings. To me, Python3's s = open('text.txt', encoding='utf-8').read() is easier and simpler than either Python2 version below (and please pardon any errors as I never actually did this) import codecs s = codecs.open('text.txt', encoding='utf-8').read() or f = open('text.txt') s = unicode(f.read(), 'utf-8') -- Terry Jan Reedy From anacrolix at gmail.com Sat Feb 11 03:24:30 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sat, 11 Feb 2012 10:24:30 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Threading is a tool (the most popular, and most flexible tool) for concurrency and parallelism. Compared to forking, multiprocessing, shared memory, mmap, and dozens of other auxiliary OS concepts it's also the easiest. Not all problems are clearly chunkable or fit some alternative parallelism pattern. Threading is arguably the cheapest method for parallelism, as we've heard throughout this thread.
Just because it can be dangerous is no reason to discourage it. Many alternatives are equally as dangerous, more difficult and less portable. Python is a very popular language. Someone mentioned earlier that popularity shouldn't be an argument for features but here it's fair ground. If Python 3 had unrestrained threading, this transition plunge would not be happening. People would be flocking to it for their free lunches. The lack of unrestrained single-process parallelism is the #1 reason not to choose Python for a future project. Note that certain fields use alternative parallelism like MPI, and whoopee for them, these aren't applicable to general programming. Nor is the old argument "write a C extension". Except for old stooges who can't let go of curly braces, most people agree Python is the best mainstream language, but the implementation is holding it back. The GIL has to go if CPython is to remain viable in the future for non-trivial applications. The current transition is like VB when .NET came out: everyone switched to C# rather than upgrade to VB.NET, because it was wiser to switch to the better language than to pay the high upgrade cost. Unfortunately the Python 3 ship has sailed, and presumably the GIL has to remain until 4.x at the least. Given this, it seems there is some wisdom in the current head-in-the-sand advice: It's too hard to remove the GIL so just use some other mechanism if you want parallelism, but it's misleading to suggest they're superior as described above. So with that in mind, can the following changes occur in Python 3 without breaking spec? - Replace the ref-counting with another GC? - Remove the GIL? If not, should these be relegated to Python 4 and alternate implementation discussions? -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tjreedy at udel.edu Sat Feb 11 04:20:15 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 10 Feb 2012 22:20:15 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Matt: directing a threading rant at me because I posted about unicode, a completely different subject, is bizarre. I have not said a word on this thread, and hardly ever on any other thread, about threading, concurrency, and the GIL. I have no direct interest in these subjects. But since you directed this at me, I will answer. On 2/10/2012 9:24 PM, Matt Joiner wrote: ... > So with that in mind, can the following changes occur in Python 3 > without breaking spec? > > - Replace the ref-counting with another GC? > - Remove the GIL? If you had paid attention to this thread and others, you would know 1. These are implementation details not in the spec. 2. There are other implementations without these. 3. People have attempted the changes you want for CPython. But so far, both would have substantial negative impacts on many CPython users, including me. 4. You are free to try to improve on previous work. As to the starting subject of this thread: I switched to Python 1.3, just before 1.4, when Python was an obscure language in the Tiobe 20s. I thought then and still do that it was best for *me*, regardless of what others decided for themselves. So while I am pleased that its usage has risen considerably, I do not mind that it has (relatively) plateaued over the last 5 years. And I am not panicked that an up wiggle was followed by a down wiggle.
Turnbull) Date: Sat, 11 Feb 2012 12:32:20 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87vcnevyd7.fsf@uwakimon.sk.tsukuba.ac.jp> Terry Reedy writes: > > Sorry, Terry, but you're basically wrong here. > > This is not a nice way to start a response, especially when you go on to > admit that I was right as the the user case I discussed. Here is what > you clipped. The point is that the user case you discuss is a toy case. Of course the problem goes away if you get to define the problem away. I don't know of any nice way to say that. > In another post I detailed the *small* amount (one paragraph) that > I believe such people need to know to move to Python3. I have not > seen this minimum laid out before and I think it would be useful to > help such people move to Python3 without FUD fear. I'll go back and take a look at it. It probably is useful. But I don't think it deals with the real issue. The problem is that without substantially more knowledge than what you describe as the minimum, the fear, uncertainty, and doubt is *real*. Anybody who follows Mailman, for example, is going to hear (even today, though much less frequently than 3 years ago, and only for installations with ancient Mailman from 2006 or so) of weird Unicode errors that cause messages to be "lost". Hearing that Python 3 requires everything be decoded to Unicode is not going to give innocent confidence. There's also a lot of FUD being created out of whole cloth, as well, such as the alleged inefficiency of recoding ASCII into Unicode, etc., which doesn't matter for most applications. The problem is that the FUD based on real issues that you don't understand gives credibility to the FUD that somebody made up. > OK, real-life example. My wife has colleagues in China. They interchange > emails (utf-8 encoded) with project budgets and some Chinese characters. > Suppose she asks me to use Python to pick out ? 
renminbi/yuan figures > and convert to dollars. What 'strong imposition' does Python3 make to > learn things I would not have to know to do the same thing in > Python2? None. The FUD is not about *processing* non-ASCII. It's about non-ASCII horking your process even though you have no intention of processing it. > I do not consider that adding an encoding argument to make the same work > in Python3 to be "a strong imposition of unicode awareness". Do > you? Yes, I do. If you get it wrong, you will still get a fatal UnicodeError. > In order to do much other than pass, I believe one typically needs > to know the encoding of the file, even in Python2. The gentleman once again seems to be suffering from a misconception. Quite often you need to know nothing about the encoding of a file, except that the parts you care about are ASCII-encoded. For example, in an American programming shop git log | ./count-files-touched-per-day.py will founder on 'Óscar Fuentes' as author, unless you know what coding system is used, or know enough to use latin-1 (because it's effectively binary, not because it's the actual encoding). > And of course, knowing about and using the one unicode byte > encoding is *much* easier than knowing about and using the 100 or > so non-unicode (or unicode subset) encodings. > > To me, Python3's > > s = open('text.txt', 'utf-8').read() > > is easier and simpler than either Python2 Indeed, it is. But we're not talking about dealing with Unicode; we're talking about why somebody who really only wants to deal with ASCII needs to know more about Unicode in Python 3 than in Python 2. > (and please pardon any errors as I never actually did this) > > import codecs > s = codecs.open('text.txt', 'utf-8').read() > > or > > f = open('text.txt') > s = unicode(f.read, 'utf-8') The reason why Unicode is part of the FUD is that in Python 2 you never needed to do that, unless you wanted to deal with a non-English language.
With Python 3 you need to deal with the codec, always, or risk a UnicodeError simply because some Spaniard's name gets mentioned by somebody who cares about orthography. From anacrolix at gmail.com Sat Feb 11 04:40:39 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sat, 11 Feb 2012 11:40:39 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: I'm asking if it'd actually be accepted in 3. I know well, and have seen how quickly things are blocked and rejected in core (dabeaz and shannon's attempts come to mind). I'm well familiar with previous attempts. As an example consider that replacing ref counting would probably change the API, but is a prerequisite for performant removal of the GIL. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sat Feb 11 05:12:13 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 11 Feb 2012 13:12:13 +0900 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: Message-ID: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> Jim Jewett writes: > Are you saying that some (many? all?) platforms make a bad choice there? No. I'm saying that whatever choice is made (except for 'latin-1' because it accepts all bytes regardless of the actual encoding of the data, or PEP 383 "errors='surrogateescape'" for the same reason, both of which are unacceptable defaults for production code *for the same reason*), there is data that will cause that idiom to fail on Python 3 where it would not on Python 2. This is especially the case if you work with older text data on Mac or modern Linux where UTF-8 is used, because you're almost certain to run into Latin-1-encoded files. My favorite example is ChangeLogs, which broke my Gentoo package manager when I experimented with using Python 3 as the default Python. 
Most packages would work fine, but for some reason some Python program in the PMS was actually reading the ChangeLogs, and sometimes they'd be impure ASCII (I don't recall whether it was utf-8 or latin-1), giving a fatal UnicodeError and everything grinds to a halt. That is reason enough for the naive to embrace fear, uncertainty, and doubt about Python 3's use of Unicode. The fact is that with a little bit of knowledge, you can almost certainly get more reliable (and in case of failure, more debuggable) results from Python 3 than from Python 2. But people are happy to deal with the devil they know, even though it's more noxious than the devil they don't. Counteracting FUD with words generally doesn't work IME, unless the words are a "magic spell" that reduces the unknown to the known. From cmjohnson.mailinglist at gmail.com Sat Feb 11 06:04:07 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Fri, 10 Feb 2012 19:04:07 -1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <87vcnevyd7.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <87vcnevyd7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <50BA6538-76D0-4B1B-8C2A-6DBEB9B1B94B@gmail.com> On Feb 10, 2012, at 5:32 PM, Stephen J. Turnbull wrote: > will founder on '?scar Fuentes' as author, unless you know what coding > system is used, or know enough to use latin-1 (because it's > effectively binary, not because it's the actual encoding). Or just use errors="surrogateescape". I think we should tell people who are scared of unicode and refuse to learn how to use it to just add an errors="surrogateescape" keyword to their file open arguments. Obviously, it's the wrong thing to do, but it's wrong in the same way that Python 2 bytes are wrong, so if you're absolutely committed to remaining ignorant of encodings, you can continue to do that. 
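[A minimal sketch of the errors="surrogateescape" idiom Carl describes; the file name and contents are invented for the demo:]

```python
import os
import tempfile

# A changelog line whose author name is Latin-1 encoded, i.e. not
# valid UTF-8 (file name and contents are invented for the demo).
raw = b'* 2012-02-11  \xd3scar Fuentes\n'
path = os.path.join(tempfile.mkdtemp(), 'ChangeLog')
with open(path, 'wb') as f:
    f.write(raw)

# Strict UTF-8 would raise UnicodeDecodeError on the \xd3 byte;
# surrogateescape smuggles it through as a lone surrogate instead.
with open(path, encoding='utf-8', errors='surrogateescape') as f:
    line = f.read()

print(line.startswith('*'))  # the ASCII parts remain usable: True

# Encoding back with the same handler restores the original bytes.
assert line.encode('utf-8', 'surrogateescape') == raw
```

[As Carl says, this is "wrong" in the same way Python 2 bytes are wrong: the undecodable bytes survive untouched, but any attempt to print or re-encode them strictly will still blow up.]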
From ncoghlan at gmail.com Sat Feb 11 07:15:52 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 Feb 2012 16:15:52 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sat, Feb 11, 2012 at 1:40 PM, Matt Joiner wrote: > I'm asking if it'd actually be accepted in 3. Why is that relevant? If free threading is the all-singing all dancing wonderment you believe: 1. Fork CPython 2. Make it free-threaded (while retaining backwards compatibility with all the C extensions out there!) 3. Watch the developer hordes flock to your door (after all, it's the lack of free-threading that has held Python's growth back for the last two decades, so everyone will switch in a heartbeat the second you, or anyone else, publishes a free-threaded alternative where all their C extensions work. Right?). > I know well, and have seen how > quickly things are blocked and rejected in core (dabeaz and shannon's > attempts come to mind). I'm well familiar with previous attempts. If that's what you think happened, then no, you're not familiar with them at all. python-dev has just a few simple rules for accepting a free-threading patch: 1. All current third party C extension modules must continue to work (ask the folks working on Ironclad for IronPython and cpyext for PyPy how much fun *that* requirement is) 2. Calls to builtin functions and methods should remain atomic (the Jython experience can help a lot here) 3. The performance impact on single threaded scripts must be minimal (which basically means eliding all the locks in single-threaded mode the way CPython currently does with the GIL, but then creating those elided locks in the correct state when Python's threading support gets initialised) That's it, that's basically all the criteria we have for accepting a free-threading patch. 
However, while most people are quite happy to say "Hey, someone should make CPython free-threaded!", they're suddenly far less interested when faced with the task of implementing it *themselves* while preserving backwards compatibility (and if you think the Python 2 -> Python 3 transition is rough going, you are definitely *not* prepared for the enormity of the task of trying to move the entire C extension ecosystem away from the refcounting APIs. The refcounting C API compatibility requirement is *not* optional if you want a free-threaded CPython to be the least bit interesting in real world terms). When we can't even get enough volunteers willing to contribute back their fixes and workarounds for the known flaws in multiprocessing, do people think there is some magical concurrency fairy that will sprinkle free threading pixie dust over CPython and the GIL will be gone? Removing the GIL *won't* be fun. Just like fixing multiprocessing, or making improvements to the GIL itself, it will be a long, tedious grind dealing with subtleties of the locking and threading implementations on Windows, Linux, Mac OS X, *BSD, Solaris and various other platforms where CPython is supported (or at least runs). For extra fun, try to avoid breaking the world for CPython variants on platforms that don't even *have* threading (e.g. PyMite). And your reward for all that effort? A CPython with better support for what is arguably one of the *worst* approaches to concurrency that computer science has ever invented. If a fraction of the energy that people put into asking for free threading was instead put into asking "how can we make inter-process communication better?", we'd no doubt have a good shared object implementation in the mmap module by now (and someone might have actually responded to Jesse's request for a new multiprocessing maintainer when circumstances forced him to step down). 
But no, this is the internet: it's much easier to incessantly berate python-dev for pointing out that free threading would be extraordinarily hard to implement correctly and isn't the panacea that many folks seem to think it is than it is to go *do* something that's more likely to offer a good return on the time investment required. My own personal wishlist for Python's concurrency support? * I want to see mmap2 up on PyPI, with someone working on fast shared object IPC that can then be incorporated into the stdlib's mmap module * I want to see multiprocessing2 on PyPI, with someone working on the long list of multiprocessing bugs on the python.org bug tracker (including adding support for Windows-style, non-fork based child processes on POSIX platforms) * I want to see progress on PEP 3153, so that some day we can have a "Python event loop" instead of a variety of framework specific event loops, as well as solid cross-platform async IO support in the stdlib. As Jesse said earlier, asking for free threading in CPython is like asking for free pie. Sure, free pie would be nice, but who's going to bake it? And what else could those people be doing with their time if they weren't busy baking pie? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan_ml at behnel.de Sat Feb 11 09:11:36 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 11 Feb 2012 09:11:36 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Matt Joiner, 11.02.2012 03:24: > Threading is a tool (the most popular, and most flexible tool) for > concurrency and parallelism. Compared to forking, multiprocessing, shared > memory, mmap, and dozens of other auxiliary OS concepts it's also the > easiest. Sure, "easy" as in "nothing is easier to get wrong". You did read my post on this matter, right?
I've yet to see a piece of non-trivially parallel code that uses threading and is known to be completely safe under all circumstances. And I've seen a lot. > Not all problems are clearly chunkable or fit some alternative > parallelism pattern. Threading is arguably the cheapest method for > parallelism, as we've heard throughout this thread. Wrong again. Threading can be pretty expensive in terms of unexpected data dependencies, and it certainly is in terms of debugging time. Debugging spurious threading issues is amongst the hardest problems for a programmer. > Just because it can be dangerous is no reason to discourage it. Many > alternatives are equally as dangerous, more difficult and less portable. Seriously - how is running separate processes less portable than threading? > Python is a very popular language. Someone mentioned earlier that popularity > shouldn't be an argument for features but here it's fair ground. If Python > 3 had unrestrained threading Note that this is not a "Python 2 vs. Python 3" issue. In fact, it has nothing to do with Python 3 in particular. [stripped some additional garbage] Stefan
Threading is the most common due to Windows issues (historically, Unix parallelism used multiple processes and the switch happened with the advent of multiplatform tools, which focused on threading due to Windows's poor performances and high overhead with processes), and it is also the easiest tool *to start using*, because you just say "start a thread". Which is equivalent to saying grenades are the easiest tool to handle conversations because you just pull the pin. Threads are by far the hardest concurrency tool to use because they throw out the window all determinism in the whole program, and that determinism then needs to be reclaimed through (very) careful analysis and the use of locks or other such sub-tools. And the flexibility claim is complete nonsense. Oh, and so are your comparisons, "shared memory" and "mmap" are not comparable to threading since they *are used* by and in threading. And forking and multiprocessing are the same thing, only the initialization call changes. Finally, multiprocessing has a far better upgrade path (as e.g. Erlang demonstrates): if your non-deterministic points are well delineated and your interfaces to other concurrent execution points are well defined, scaling from multiple cores to multiple machines becomes possible. > Not all problems are clearly chunkable or fit some alternative > parallelism pattern. Threading is arguably the cheapest method for > parallelism, as we've heard throughout this thread. > > Just because it can be dangerous is no reason to discourage it. Of course it is, just as manual memory management is "discouraged". > Many alternatives are equally as dangerous, more difficult and less portable. The main alternative to threading is multiprocessing (via fork or via starting new processes does not matter), it is significantly less dangerous, it is only more difficult in that you can't take extremely dangerous shortcuts and it is just as portable (if not more). 
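[The multiprocessing alternative described above can be sketched with the stdlib Pool; the work function is invented and CPU-bound only as a toy:]

```python
from multiprocessing import Pool

def work(n):
    # CPU-bound toy task (invented); each call runs in a separate worker
    # process, so no single interpreter's GIL serializes the whole job.
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    pool = Pool(processes=4)  # roughly one worker per core
    try:
        print(pool.map(work, [10, 100, 1000]))  # [285, 328350, 332833500]
    finally:
        pool.close()
        pool.join()
```

[The interface between the concurrent parts is exactly the well-delineated kind Masklinn describes: picklable arguments in, picklable results out, no shared mutable state.]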
> Python is a very popular language. Someone mentioned earlier that popularity > shouldn't be an argument for features but here it's fair ground. If Python > 3 had unrestrained threading, this transition plunge would not be > happening. Threading is a red herring, nobody fundamentally cares about threading, what users want is a way to exploit their cores. If `multiprocessing` was rock-solid and easier to use `threading` could just be killed and nobody would care. And we'd probably find ourselves in a far better world. From p.f.moore at gmail.com Sat Feb 11 11:40:20 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 11 Feb 2012 10:40:20 +0000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 11 February 2012 04:12, Stephen J. Turnbull wrote: > This is especially the case if you work with older text data on Mac or > modern Linux where UTF-8 is used, because you're almost certain to run > into Latin-1-encoded files. My favorite example is ChangeLogs, which > broke my Gentoo package manager when I experimented with using Python > 3 as the default Python. Most packages would work fine, but for some > reason some Python program in the PMS was actually reading the > ChangeLogs, and sometimes they'd be impure ASCII (I don't recall > whether it was utf-8 or latin-1), giving a fatal UnicodeError and > everything grinds to a halt. > > That is reason enough for the naive to embrace fear, uncertainty, and > doubt about Python 3's use of Unicode. My concern about Unicode in Python 3 is that the principle is, you specify the right encoding. But often, I don't *know* the encoding ;-( Text files, like changelogs as a good example, generally have no marker specifying the encoding, and they can have all sorts (depending on where the package came from).
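[One practical answer for a mostly-ASCII file of unknown encoding — decoding as ISO-8859-1, a trick that also comes up later in the thread — can be sketched like this; the file contents are invented for the demo:]

```python
import os
import tempfile

# A mostly-ASCII changelog in an unknown encoding (contents invented):
raw = b'* fixed crash\nauthor: \xd3scar\n* updated docs\n'
path = os.path.join(tempfile.mkdtemp(), 'ChangeLog')
with open(path, 'wb') as f:
    f.write(raw)

# ISO-8859-1 maps every byte 0x00-0xFF to a code point, so decoding can
# never fail; ASCII stays ASCII and other bytes round-trip untouched.
with open(path, encoding='iso-8859-1') as f:
    stars = [line for line in f if line.startswith('*')]

print(stars)  # ['* fixed crash\n', '* updated docs\n']
```

[The decoded non-ASCII characters may be mojibake if the real encoding wasn't Latin-1, but the ASCII parts are processed correctly and re-encoding as ISO-8859-1 reproduces the original bytes exactly.]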
Worse, I am on Windows and changelogs usually come from Unix developers - so I'm not familiar with the common conventions ("well, of course it's in UTF-8, that's what everyone uses"...) In Python 2, I can ignore the issue. Sure, I can end up with mojibake, but for my uses, that's not a disaster. Mostly-readable works. But in Python 3, I get an error and can't process the file. I can just use latin-1, or surrogateescape. But that doesn't come naturally to me yet. Maybe it will in time... Or maybe there's a better solution I don't know about yet. To be clear - I am fully in favour of the Python 3 approach, and I completely support the idea that people should know the encodings of the stuff they are working with (I've seen others naively make encoding mistakes often enough to know that when it matters, it really does matter). But having to worry, not so much about the encoding to use, but rather about the fact that Python is asking you a question you can't answer, is a genuine stumbling block. And from what I've seen, it's at the root of the problems many people have with Unicode in Python 3. I'm not arguing for changes to the default behaviour of Python 3. But if we had a good place to put it, a FAQ entry about "what to do if I need to process a file whose encoding I don't know" would be useful. And certainly having a standard answer that people could give when the question comes up (something practical, not a purist answer like "all files have an encoding, so you should find out") would help. Paul. From p.f.moore at gmail.com Sat Feb 11 11:47:44 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 11 Feb 2012 10:47:44 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 11 February 2012 00:07, Terry Reedy wrote: >>> Nor is there in 3.x. > I view that claim as FUD, at least for many users, and at least until the > persons making the claim demonstrate it.
In particular, I claim that people > who use Python2 knowing nothing of unicode do not need to know much more to > do the same things in Python3. Concrete example, then. I have a text file, in an unknown encoding (yes, it does happen to me!) but opening in an editor shows it's mainly-ASCII. I want to find all the lines starting with a '*'. The simple with open('myfile.txt') as f: for line in f: if line.startswith('*'): print(line) fails with encoding errors. What do I do? Short answer, grumble and go and use grep (or in more complex cases, awk) :-( Paul. From ubershmekel at gmail.com Sat Feb 11 11:51:52 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Sat, 11 Feb 2012 12:51:52 +0200 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Feb 11, 2012 12:41 PM, "Paul Moore" wrote > if we had a good place to put it, a FAQ entry about "what to do if I > need to process a file whose encoding I don't know" would be useful. > And certainly having a standard answer that people could give when the > question comes up (something practical, not a purist answer like "all > files have an encoding, so you should find out") would help. I think if the bytes type behaved exactly like python2's string it would have been the best option. When you work with "wb" or "rb" you get quite a hint that you're doing it wrong. But devs would have a viable ambiguous *string* type (vs bytes and their integer cells). Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Sat Feb 11 13:33:44 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 11 Feb 2012 13:33:44 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Paul Moore, 11.02.2012 11:47: > On 11 February 2012 00:07, Terry Reedy wrote: >>>> Nor is there in 3.x. 
>> >> I view that claim as FUD, at least for many users, and at least until the >> persons making the claim demonstrate it. In particular, I claim that people >> who use Python2 knowing nothing of unicode do not need to know much more to >> do the same things in Python3. > > Concrete example, then. > > I have a text file, in an unknown encoding (yes, it does happen to > me!) but opening in an editor shows it's mainly-ASCII. I want to find > all the lines starting with a '*'. The simple > > with open('myfile.txt') as f: > for line in f: > if line.startswith('*'): > print(line) > > fails with encoding errors. What do I do? Short answer, grumble and go > and use grep (or in more complex cases, awk) :-( Or just use the ISO-8859-1 encoding. Stefan From masklinn at masklinn.net Sat Feb 11 13:41:19 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 11 Feb 2012 13:41:19 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> On 2012-02-11, at 13:33 , Stefan Behnel wrote: > Paul Moore, 11.02.2012 11:47: >> On 11 February 2012 00:07, Terry Reedy wrote: >>>>> Nor is there in 3.x. >>> >>> I view that claim as FUD, at least for many users, and at least until the >>> persons making the claim demonstrate it. In particular, I claim that people >>> who use Python2 knowing nothing of unicode do not need to know much more to >>> do the same things in Python3. >> >> Concrete example, then. >> >> I have a text file, in an unknown encoding (yes, it does happen to >> me!) but opening in an editor shows it's mainly-ASCII. I want to find >> all the lines starting with a '*'. The simple >> >> with open('myfile.txt') as f: >> for line in f: >> if line.startswith('*'): >> print(line) >> >> fails with encoding errors. What do I do? Short answer, grumble and go >> and use grep (or in more complex cases, awk) :-( > > Or just use the ISO-8859-1 encoding. 
It's true that it requires handling encodings upfront where Python 2 allowed you to play fast-and-loose though. And using latin-1 in that context looks and feels weird/icky, the file is not encoded using latin-1, the encoding just happens to work to manipulate bytes as ascii text + non-ascii stuff. From stefan_ml at behnel.de Sat Feb 11 13:53:40 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 11 Feb 2012 13:53:40 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> Message-ID: Masklinn, 11.02.2012 13:41: > On 2012-02-11, at 13:33 , Stefan Behnel wrote: >> Paul Moore, 11.02.2012 11:47: >>> On 11 February 2012 00:07, Terry Reedy wrote: >>>>>> Nor is there in 3.x. >>>> >>>> I view that claim as FUD, at least for many users, and at least until the >>>> persons making the claim demonstrate it. In particular, I claim that people >>>> who use Python2 knowing nothing of unicode do not need to know much more to >>>> do the same things in Python3. >>> >>> Concrete example, then. >>> >>> I have a text file, in an unknown encoding (yes, it does happen to >>> me!) but opening in an editor shows it's mainly-ASCII. I want to find >>> all the lines starting with a '*'. The simple >>> >>> with open('myfile.txt') as f: >>> for line in f: >>> if line.startswith('*'): >>> print(line) >>> >>> fails with encoding errors. What do I do? Short answer, grumble and go >>> and use grep (or in more complex cases, awk) :-( >> >> Or just use the ISO-8859-1 encoding. > > It's true that it requires handling encodings upfront where Python 2 allowed you to play fast-and-loose though. Well, except for the cases where that didn't work. Remember that implicit encoding behaves in a platform dependent way in Python 2, so even if your code runs on your machine, that doesn't mean it will work for anyone else. > And using latin-1 in that context looks and feels weird/icky, the file is not > encoded using latin-1, the encoding just happens to work to manipulate bytes as > ascii text + non-ascii stuff. Correct. That's precisely the use case described above. Besides, it's perfectly possible to process bytes in Python 3. You just have to open the file in binary mode and do the processing at the byte string level. But if you don't care (and if most of the data is really ASCII-ish), using the ISO-8859-1 encoding in and out will work just fine for problems like the above. Stefan From sturla at molden.no Sat Feb 11 14:18:50 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 11 Feb 2012 14:18:50 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4F366ABA.8090903@molden.no> On 11.02.2012 03:24, Matt Joiner wrote: > Threading is a tool (the most popular, and most flexible tool) for > concurrency and parallelism. Compared to forking, multiprocessing, shared > memory, mmap, and dozens of other auxiliary OS concepts it's also the > easiest. I see you really know your stuff. > Not all problems are clearly chunkable or fit some alternative > parallelism pattern. Then they don't fit threading either. Sturla From sturla at molden.no Sat Feb 11 15:01:47 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 11 Feb 2012 15:01:47 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4F3674CB.6060201@molden.no> On 11.02.2012 07:15, Nick Coghlan wrote: > Why is that relevant? If free threading is the all-singing all dancing > wonderment you believe: > > 1. Fork CPython > 2. Make it free-threaded (while retaining backwards compatibility with > all the C extensions out there!) > 3.
Watch the developer hordes flock to your door (after all, it's the > lack of free-threading that has held Python's growth back for the last > two decades, so everyone will switch in a heartbeat the second you, or > anyone else, publishes a free-threaded alternative where all their C > extensions work. Right?). There are several solutions to this, I think. One is to use one interpreter per thread, and share no data between them, similar to tcl and PerlFork. The drawback is developers who forget to duplicate file handles, so one interpreter can close a handle used by another. Another solution is transactional memory. Consider a database with commit and rollback. Not sure how to fit this with C extensions though, but one could in theory build a multithreaded interpreter like that. > If a fraction of the energy that people put into asking for free > threading was instead put into asking "how can we make inter-process > communication better?", we'd no doubt have a good shared object > implementation in the mmap module by now (and someone might have > actually responded to Jesse's request for a new multiprocessing > maintainer when circumstances forced him to step down). I think I already explained why BSD mmap is a dead end. We need named kernel objects (System V IPC or Unix domain sockets) as their names can be communicated between processes. There are also reasons to prefer SysV message queues over shared memory (Sys V or BSD), such as thread safety, i.e. access is synchronized by the kernel. SysV message queues also have atomic read/write, unlike sockets, and they are generally faster than pipes. With sockets we have to ensure that the correct number of bytes were read or written, which is a PITA for any IPC use (or any other messaging for that matter). In the meantime, take a look at ZeroMQ (actually written ØMQ). ZeroMQ also has atomic read/write messages.
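To make the "correct number of bytes" point concrete: a stream socket's recv() may legitimately return fewer bytes than requested, so message boundaries have to be rebuilt by hand. A minimal length-prefix framing sketch (the helper names are illustrative, not any proposed stdlib API):

```python
import socket
import struct

def send_msg(sock, payload):
    # Prefix each message with a 4-byte big-endian length, then push it all out.
    sock.sendall(struct.pack('>I', len(payload)) + payload)

def recv_exact(sock, n):
    # recv() may return short reads; loop until exactly n bytes have arrived.
    chunks = []
    while n:
        chunk = sock.recv(n)
        if not chunk:
            raise EOFError('socket closed mid-message')
        chunks.append(chunk)
        n -= len(chunk)
    return b''.join(chunks)

def recv_msg(sock):
    (length,) = struct.unpack('>I', recv_exact(sock, 4))
    return recv_exact(sock, length)

a, b = socket.socketpair()
send_msg(a, b'one complete message, however recv() chunks it')
print(recv_msg(b))
```

This is exactly the bookkeeping that message-oriented transports such as SysV queues or ØMQ do for you.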
Sturla From p.f.moore at gmail.com Sat Feb 11 15:29:34 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 11 Feb 2012 14:29:34 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> Message-ID: On 11 February 2012 12:41, Masklinn wrote: >>> with open('myfile.txt') as f: >>> for line in f: >>> if line.startswith('*'): >>> print(line) >>> >>> fails with encoding errors. What do I do? Short answer, grumble and go >>> and use grep (or in more complex cases, awk) :-( >> >> Or just use the ISO-8859-1 encoding. > > It's true that requires to handle encodings upfront where Python 2 allowed you > to play fast-and-lose though. > > And using latin-1 in that context looks and feels weird/icky, the file is not > encoded using latin-1, the encoding just happens to work to manipulate bytes as > ascii text + non-ascii stuff. To be honest, I'm fine with the answer "use latin1" for this case. Practicality beats purity and all that. But as you say, it feels wrong somehow. I suspect that errors=surrogateescape is the better "I don't really care" option. And I still maintain it would be useful for combating FUD if there was a commonly-accepted idiom for this. Interestingly, on my Windows PC, if I open a file using no encoding in Python 3, I seem to get code page 1252: Python 3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> f = open("unicode.txt") >>> f.encoding 'cp1252' >>> So actually, on this PC, I can't really provoke these sorts of decoding error problems (CP1252 accepts all bytes, it's basically latin1).
Whether this is a good thing or a bad thing, I'm not sure :-) Paul From sturla at molden.no Sat Feb 11 16:10:03 2012 From: sturla at molden.no (Sturla Molden) Date: Sat, 11 Feb 2012 16:10:03 +0100 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <4F35A9EF.7030309@molden.no> References: <4F35A9EF.7030309@molden.no> Message-ID: <4F3684CB.3090502@molden.no> Den 11.02.2012 00:36, skrev Sturla Molden: > > > Proof of concept: > > http://dl.dropbox.com/u/12464039/sharedmem-feb12-2009.zip > Sorry, wrong version. Use this instead: http://dl.dropbox.com/u/12464039/sharedmem.zip Sturla From solipsis at pitrou.net Sat Feb 11 16:27:08 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 11 Feb 2012 16:27:08 +0100 Subject: [Python-ideas] multiprocessing IPC References: <4F35A9EF.7030309@molden.no> Message-ID: <20120211162708.03111da7@pitrou.net> On Sat, 11 Feb 2012 00:36:15 +0100 Sturla Molden wrote: > > Finally, I'd like to say that I think Python's standard lib should > support high-performance asynchronous I/O for concurrency. That is not > poll/select (on Windows it does not even work properly). Rather, I want > IOCP on Windows, epoll on Linux, and kqueue on Mac. (Yes I know about > twisted.) This is not trivial (especially the IOCP part, if I consider the amount of code Twisted has for that). > There should also be a requirement that it works with > multiprocessing. E.g. if we open a process pool, the processes should be > able to use the same IOCP. In other words some highly scalable > asynchronous I/O that works with multiprocessing. Ouch. Regards Antoine. 
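Paul's cp1252 observation earlier in the thread is worth making concrete: open() without an encoding= argument consults locale.getpreferredencoding(), so the same script can behave differently from machine to machine, and passing an explicit encoding removes that platform dependence. A small sketch (the file name and contents are invented for illustration):

```python
import locale
import os
import tempfile

# The encoding open() falls back on when none is given -- 'cp1252' on many
# Windows boxes, usually 'UTF-8' elsewhere.
print(locale.getpreferredencoding(False))

# With an explicit encoding, the behaviour is the same on every platform.
path = os.path.join(tempfile.mkdtemp(), 'myfile.txt')
with open(path, 'w', encoding='latin-1') as f:
    f.write('* starred line\nother line \xe9\n')

with open(path, encoding='latin-1') as f:
    stars = [line for line in f if line.startswith('*')]
print(stars)  # ['* starred line\n']
```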
From masklinn at masklinn.net Sat Feb 11 17:18:34 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 11 Feb 2012 17:18:34 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> Message-ID: <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> On 2012-02-11, at 13:53 , Stefan Behnel wrote: > Well, except for the cases where that didn't work. Remember that implicit > encoding behaves in a platform dependent way in Python 2, so even if your > code runs on your machine doesn't mean it will work for anyone else. Sure, I said it allowed you, not that this allowance actually worked. >> And using latin-1 in that context looks and feels weird/icky, the file is not >> encoded using latin-1, the encoding just happens to work to manipulate bytes as >> ascii text + non-ascii stuff. > > Correct. That's precisely the use case described above. Yes, but now instead of just ignoring that stuff you have to actively and knowingly lie to Python to get it to shut up. > Besides, it's perfectly possible to process bytes in Python 3. You just > have to open the file in binary mode and do the processing at the byte > string level. I think that's the route which should be taken, but (and I'll readily admit not to have followed the current state of this story) I'd understood manipulations of bytes-as-ascii-characters-and-stuff to be far more annoying (in Python 3) than string manipulation even for simple use cases. From tjreedy at udel.edu Sat Feb 11 17:24:38 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 11 Feb 2012 11:24:38 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <87vcnevyd7.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <87vcnevyd7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 2/10/2012 10:32 PM, Stephen J. 
Turnbull wrote: The issue is whether Python 3 has a "strong imposition of Unicode awareness" that Python 2 does not. If the OP only meant awareness of the fact that something called 'unicode' exists, then I suppose that could be argued. I interpreted the claim as being about some substantive knowledge of unicode. In any case, the claim that I disagree with is not about people's reactions to Python 3 or about human psychology and the propensity to stick with the known. In response to Jim Jewett, you wrote > The fact is that with a little bit of knowledge, you can almost > certainly get more reliable (and in case of failure, more debuggable) > results from Python 3 than from Python 2. That is pretty much my counterclaim, with the note that the 'little bit of knowledge' is mostly about non-unicode encodings and the change to some Python details. > The point is that the user case you discuss is a toy case. Thanks for dismissing me and perhaps a hundred thousand users as 'toy cases'. > the problem goes away if you get to define the problem away. Doing case analysis, starting with the easiest cases, is not defining the problem away. It is, rather, an attempt to find the 'little bit of knowledge' needed in various cases. In your response, you went on to write > Counteracting FUD with words generally doesn't work > unless the words are a "magic spell" that reduces the unknown to > the known. Exactly, and finding the Python 3 version of the magic spells needed in various cases, so they can be documented and publicized, is what I have been trying to do. For ascii-only use, the magic spell is 'ascii' in bytes() calls. For some other uses, it is 'encoding=latin-1' in open(), str(), and bytes() calls, and perhaps elsewhere. Neither of these constitutes substantial 'unicode awareness'. > I don't know of any nice way to say that. There was no need to say it.
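The latin-1 spell works because Latin-1 maps each of the 256 byte values to the Unicode code point with the same number, so decoding can never fail and re-encoding restores the original bytes exactly. A quick demonstration (the sample bytes are invented):

```python
# Mostly-ASCII data containing some bytes that are not even valid UTF-8.
raw = b'* first entry \xe9\xff\nindented detail\n* second entry\n'

text = raw.decode('latin-1')          # cannot fail: one byte -> one code point
assert text.encode('latin-1') == raw  # and the round-trip is lossless

# Ordinary string processing then works on the ASCII parts unchanged.
stars = [line for line in text.splitlines() if line.startswith('*')]
print(len(stars))  # 2
```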
-- Terry Jan Reedy From tjreedy at udel.edu Sat Feb 11 17:44:59 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 11 Feb 2012 11:44:59 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 2/11/2012 5:47 AM, Paul Moore wrote: > On 11 February 2012 00:07, Terry Reedy wrote: >>>> Nor is there in 3.x. >> >> I view that claim as FUD, at least for many users, and at least until the >> persons making the claim demonstrate it. In particular, I claim that people >> who use Python2 knowing nothing of unicode do not need to know much more to >> do the same things in Python3. > > Concrete example, then. > > I have a text file, in an unknown encoding (yes, it does happen to > me!) but opening in an editor shows it's mainly-ASCII. I want to find > all the lines starting with a '*'. The simple > > with open('myfile.txt') as f: > for line in f: > if line.startswith('*'): > print(line) > > fails with encoding errors. What do I do? Good example. I believe adding ", encoding='latin-1'" to open() is sufficient. (And from your response elsewhere to Stephen, you seem to know that.) This should be in the tutorial if not already. But in reference to what I wrote above, knowing that magic phrase is not 'knowledge of unicode'. And I include it in the 'not much more knowledge' needed for Python 3. -- Terry Jan Reedy From masklinn at masklinn.net Sat Feb 11 18:00:17 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 11 Feb 2012 18:00:17 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 2012-02-11, at 17:44 , Terry Reedy wrote: > On 2/11/2012 5:47 AM, Paul Moore wrote: >> On 11 February 2012 00:07, Terry Reedy wrote: >>>>> Nor is there in 3.x. >>> >>> I view that claim as FUD, at least for many users, and at least until the >>> persons making the claim demonstrate it. 
In particular, I claim that people >>> who use Python2 knowing nothing of unicode do not need to know much more to >>> do the same things in Python3. >> >> Concrete example, then. >> >> I have a text file, in an unknown encoding (yes, it does happen to >> me!) but opening in an editor shows it's mainly-ASCII. I want to find >> all the lines starting with a '*'. The simple >> >> with open('myfile.txt') as f: >> for line in f: >> if line.startswith('*'): >> print(line) >> >> fails with encoding errors. What do I do? > > Good example. I believe adding ", encoding='latin-1'" to open() is sufficient. Why not open the file in binary mode in stead? (and replace `'*'` by `b'*'` in the startswith call) From tjreedy at udel.edu Sat Feb 11 18:25:49 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 11 Feb 2012 12:25:49 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> Message-ID: On 2/11/2012 7:53 AM, Stefan Behnel wrote: > Masklinn, 11.02.2012 13:41: >> On 2012-02-11, at 13:33 , Stefan Behnel wrote: >>> Paul Moore, 11.02.2012 11:47: >>>> On 11 February 2012 00:07, Terry Reedy wrote: >>>>>>> Nor is there in 3.x. >>>>> >>>>> I view that claim as FUD, at least for many users, and at least until the >>>>> persons making the claim demonstrate it. In particular, I claim that people >>>>> who use Python2 knowing nothing of unicode do not need to know much more to >>>>> do the same things in Python3. >>>> >>>> Concrete example, then. >>>> >>>> I have a text file, in an unknown encoding (yes, it does happen to >>>> me!) but opening in an editor shows it's mainly-ASCII. I want to find >>>> all the lines starting with a '*'. The simple >>>> >>>> with open('myfile.txt') as f: >>>> for line in f: >>>> if line.startswith('*'): >>>> print(line) >>>> >>>> fails with encoding errors. What do I do? 
Short answer, grumble and go >>>> and use grep (or in more complex cases, awk) :-( >>> >>> Or just use the ISO-8859-1 encoding. >> >> It's true that requires to handle encodings upfront where Python 2 allowed you >> to play fast-and-lose though. > > Well, except for the cases where that didn't work. Remember that implicit > encoding behaves in a platform dependent way in Python 2, so even if your > code runs on your machine doesn't mean it will work for anyone else. > > >> And using latin-1 in that context looks and feels weird/icky, the file is not >> encoded using latin-1, the encoding just happens to work to manipulate bytes as >> ascii text + non-ascii stuff. > > Correct. That's precisely the use case described above. > > Besides, it's perfectly possible to process bytes in Python 3. You just > have to open the file in binary mode and do the processing at the byte > string level. But if you don't care (and if most of the data is really > ASCII-ish), using the ISO-8859-1 encoding in and out will work just fine > for problems like the above. If one has ascii text + unspecified 'other stuff', one can either process as 'polluted text' or as 'bytes with some ascii character codes'. Since (as I just found out) one can iterate binary mode files by line just as with text mode, I am not sure what the tradeoffs are. I would guess it is mostly whether one wants to process a sequence of characters or a sequence of character codes (ints). -- Terry Jan Reedy From tjreedy at udel.edu Sat Feb 11 18:43:41 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 11 Feb 2012 12:43:41 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 2/11/2012 12:00 PM, Masklinn wrote: > > On 2012-02-11, at 17:44 , Terry Reedy wrote: > >> On 2/11/2012 5:47 AM, Paul Moore wrote: >>> I have a text file, in an unknown encoding (yes, it does happen >>> to me!) but opening in an editor shows it's mainly-ASCII. 
I want >>> to find all the lines starting with a '*'. The simple >>> >>> with open('myfile.txt') as f: >>> for line in f: >>> if line.startswith('*'): print(line) >>> >>> fails with encoding errors. What do I do? >> >> Good example. I believe adding ", encoding='latin-1'" to open() is >> sufficient. > > Why not open the file in binary mode in stead? (and replace `'*'` by > `b'*'` in the startswith call) When I wrote that response, I thought that 'for line in f' would not work for binary-mode files. I then opened IDLE, experimented with 'rb', and discovered otherwise. So the remaining issue is how one wants the unknown encoding bytes to appear when printed -- as hex escapes, or as arbitrary but more readable non-ascii latin-1 chars. -- Terry Jan Reedy From stefan_ml at behnel.de Sat Feb 11 20:35:28 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 11 Feb 2012 20:35:28 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: Masklinn, 11.02.2012 17:18: > On 2012-02-11, at 13:53 , Stefan Behnel wrote: >> Well, except for the cases where that didn't work. Remember that implicit >> encoding behaves in a platform dependent way in Python 2, so even if your >> code runs on your machine doesn't mean it will work for anyone else. > > Sure, I said it allowed you, not that this allowance actually worked. > >>> And using latin-1 in that context looks and feels weird/icky, the file is not >>> encoded using latin-1, the encoding just happens to work to manipulate bytes as >>> ascii text + non-ascii stuff. >> >> Correct. That's precisely the use case described above. > > Yes, but now instead of just ignoring that stuff you have to actively and > knowingly lie to Python to get it to shut up. 
The advantage is that it becomes explicit what you are doing. In Python 2, without any encoding, you are implicitly assuming that the encoding is Latin-1, because that's how you are processing it. You're just not spelling it out anywhere, thus leaving it to the innocent reader to guess what's happening. In Python 3, and in better Python 2 code (using codecs.open(), for example), you'd make it clear right in the open() call that Latin-1 is the way you are going to process the data. >> Besides, it's perfectly possible to process bytes in Python 3. You just >> have to open the file in binary mode and do the processing at the byte >> string level. > > I think that's the route which should be taken Oh, absolutely not. When it's text, it's best to process it as Unicode. Stefan From masklinn at masklinn.net Sat Feb 11 20:46:52 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 11 Feb 2012 20:46:52 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: On 2012-02-11, at 20:35 , Stefan Behnel wrote: > >> Yes, but now instead of just ignoring that stuff you have to actively and >> knowingly lie to Python to get it to shut up. > > The advantage is that it becomes explicit what you are doing. In Python 2, > without any encoding, you are implicitly assuming that the encoding is > Latin-1, because that's how you are processing it. You're just not spelling > it out anywhere, thus leaving it to the innocent reader to guess what's > happening. In Python 3, and in better Python 2 code (using codecs.open(), > for example), you'd make it clear right in the open() call that Latin-1 is > the way you are going to process the data. I'm not sure going from "ignoring it" to "explicitly lying about it" is a great step forward. 
latin-1 is not "the way you are going to process the data" in this case, it's just the easiest way to get Python to shut up and open the damn thing. >>> Besides, it's perfectly possible to process bytes in Python 3. You just >>> have to open the file in binary mode and do the processing at the byte >>> string level. >> >> I think that's the route which should be taken > > Oh, absolutely not. When it's text, it's best to process it as Unicode. Except it's not processed as text, it's processed as "stuff with ascii characters in it". Might just as well be cp-1252, or UTF-8, or Shift JIS (which is kinda-sorta-extended-ascii but not exactly), and while using an ISO-8859 will yield unicode data that's about the only thing you can say about it and the actual result will probably be mojibake either way. By processing it as bytes, it's made explicit that this is not known and decoded text (which is what unicode strings imply) but that it's some semi-arbitrary ascii-compatible encoding and that's the extent of the developer's knowledge and interest in it. From stefan_ml at behnel.de Sat Feb 11 21:08:38 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 11 Feb 2012 21:08:38 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: Masklinn, 11.02.2012 20:46: > On 2012-02-11, at 20:35 , Stefan Behnel wrote: >> >>> Yes, but now instead of just ignoring that stuff you have to actively and >>> knowingly lie to Python to get it to shut up. >> >> The advantage is that it becomes explicit what you are doing. In Python 2, >> without any encoding, you are implicitly assuming that the encoding is >> Latin-1, because that's how you are processing it. You're just not spelling >> it out anywhere, thus leaving it to the innocent reader to guess what's >> happening. 
In Python 3, and in better Python 2 code (using codecs.open(), >> for example), you'd make it clear right in the open() call that Latin-1 is >> the way you are going to process the data. > > I'm not sure going from "ignoring it" to "explicitly lying about it" is a > great step forward. latin-1 is not "the way you are going to process the data" > in this case, it's just the easiest way to get Python to shut up and open the > damn thing. > >>>> Besides, it's perfectly possible to process bytes in Python 3. You just >>>> have to open the file in binary mode and do the processing at the byte >>>> string level. >>> >>> I think that's the route which should be taken >> >> Oh, absolutely not. When it's text, it's best to process it as Unicode. > > Except it's not processed as text, it's processed as "stuff with ascii > characters in it". Might just as well be cp-1252, or UTF-8, or Shift JIS Well, you are still processing it as text because you are (again, implicitly) assuming those ASCII characters to be just that: ASCII encoded characters. You couldn't apply the same byte processing algorithm to UCS2 encoded text or a compressed gzip file, for example, at least not with a useful outcome. Mind you, I'm not regarding any text semantics here. I'm not considering whether the thus decoded data results in French, Danish, German or other human words, or in completely incomprehensible garbage. That's not relevant. What is relevant is that the program assumes an identity mapping from 1 byte to 1 character to work correctly, which, speaking in Unicode terms, implies Latin-1 decoding. Therefore my advice to make that assumption explicit. Stefan From p.f.moore at gmail.com Sun Feb 12 00:14:23 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 11 Feb 2012 23:14:23 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 11 February 2012 17:00, Masklinn wrote: >> Good example. 
I believe adding ", encoding='latin-1'" to open() is sufficient. > > Why not open the file in binary mode in stead? (and replace `'*'` by `b'*'` in > the startswith call) In my view, that's less scalable to more complex cases. It's likely you'll hit things you need to do that don't translate easily to bytes sooner than if you stick in a string-only world. A simple example, check for a regex rather than a simple starting character. The problem I have with encoding="latin-1" is that in many cases I *know* that's a lie. From what's been said in this discussion so far, I think that the "better" way to say "I know this file contains mostly ASCII, but there's some other bits I'm not sure about but don't care too much as long as they round-trip cleanly" is encoding="ascii",errors="surrogateescape". But as we've seen here, that's not the idiom that gets recommended by everyone (the "One Obvious Way", if you like). I suspect that if the community did embrace a "one obvious way", that would reduce the "Python 3 makes me need to know Unicode" FUD that's around. But as long as people get 3 different answers when they ask the question, there's going to be uncertainty and doubt (and hence, probably, fear...) Paul. PS I'm pretty confident that I have *my* answer now (ascii/surrogateescape). So this thread was of benefit to me, if nothing else, and my thanks for that. From cs at zip.com.au Sun Feb 12 00:18:06 2012 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 12 Feb 2012 10:18:06 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F341561.3050409@pearwood.info> References: <4F341561.3050409@pearwood.info> Message-ID: <20120211231805.GA7853@cskk.homeip.net> On 10Feb2012 05:50, Steven D'Aprano wrote: | Python 4.x (Python 4000) is pure vapourware. It it irresponsible to tell | people to stick to Python 2.7 (there will be no 2.8) in favour of something | which may never exist. 
| | http://www.python.org/dev/peps/pep-0404/ Please tell me this PEP number is deliberate! -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ Once I reached adulthood, I never had enemies until I posted to Usenet. - Barry Schwartz From p.f.moore at gmail.com Sun Feb 12 00:24:04 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 11 Feb 2012 23:24:04 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: On 11 February 2012 19:46, Masklinn wrote: >>>> Besides, it's perfectly possible to process bytes in Python 3. You just >>>> have to open the file in binary mode and do the processing at the byte >>>> string level. >>> >>> I think that's the route which should be taken >> >> Oh, absolutely not. When it's text, it's best to process it as Unicode. > > Except it's not processed as text, it's processed as "stuff with ascii > characters in it". Might just as well be cp-1252, or UTF-8, or Shift JIS > (which is kinda-sorta-extended-ascii but not exactly), and while using > an ISO-8859 will yield unicode data that's about the only thing you can > say about it and the actual result will probably be mojibake either way. No, not at all. It *is* text. I *know* it's text. I know that it is encoded in an ASCII-superset (because I can read it in a text editor and *see* that it is). What I *don't* know is what those funny bits of mojibake I see in the text editor are. But I don't really care. Yes, I could do some analysis based on the surrounding text and confirm whether it's latin-1, utf-8, or something similar. But it honestly doesn't matter to me, as all I care about is parsing the file to find the change authors, and printing their names (to re-use the "manipulating a ChangeLog file" example). 
And even if it did matter, the next file might be in a different ASCII-superset encoding, but I *still* won't care because the parsing code will be exactly the same. Saying "it's bytes" is even more of a lie than "it's latin-1". The honest truth is "it's an ASCII superset", and that's all I need to know to do the job manually, so I'd like to write code to do the same job without needing to lie about what I know. I'm now 100% convinced that encoding="ascii",errors="surrogateescape" is the way to say this in code. Paul. From mwm at mired.org Sun Feb 12 02:52:00 2012 From: mwm at mired.org (Mike Meyer) Date: Sat, 11 Feb 2012 20:52:00 -0500 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <4F35A9EF.7030309@molden.no> References: <4F35A9EF.7030309@molden.no> Message-ID: <20120211205200.2667c68f@bhuda.mired.org> On Sat, 11 Feb 2012 00:36:15 +0100 Sturla Molden wrote: > Den 10.02.2012 22:15, skrev Mike Meyer: > > In what way does the mmap module fail to provide your binary file > > interface? The short answer is that BSD mmap creates an anonymous kernel object. First, I didn't ask about "BSD mmap", I asked about the "mmap module". They aren't the same thing. > When working with multiprocessing for a while, one comes to the > conclusion that we really need named kernel objects. And both the BSD mmap (at least in recent systems) and the mmap module provide objects with names in the file system space. IIUC, while there are systems that won't let you create anonymous objects (like early versions of the mmap module), there aren't any - at least any longer - that won't let you create named objects. > Here are two simple fail cases for anonymous kernel objects: [elided, since the restriction doesn't exist] > All of multiprocessing's IPC classes suffer from this! Some of them may. The one I asked about doesn't. > Solution: > > Use named kernel objects for IPC, pickle the name. You don't need to pickle the name if you use mmap's native name system - it's just a string.
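To make the "name is just a string" point concrete, here is a sketch with two independent mappings of one temporary file standing in for two processes (shared access is the mmap module's default, so writes through one view are visible through the other):

```python
import mmap
import os
import tempfile

SIZE = 4096

# Create and size the named backing object; the "name" is just its path.
fd, path = tempfile.mkstemp()
os.write(fd, b'\x00' * SIZE)

# Two independent opens/maps of the same name, standing in for two processes.
f1 = os.open(path, os.O_RDWR)
f2 = os.open(path, os.O_RDWR)
m1 = mmap.mmap(f1, SIZE)
m2 = mmap.mmap(f2, SIZE)

m1[:5] = b'hello'     # write through one mapping...
seen = bytes(m2[:5])  # ...and read it back through the other
print(seen)           # b'hello'

for m in (m1, m2):
    m.close()
for f in (fd, f1, f2):
    os.close(f)
os.unlink(path)
```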
> There is another drawback too: > > The speed of pickle. For example, sharing NumPy arrays with pickle is > not faster with shared memory. The overhead from pickle completely > dominate the time needed for IPC . That is why I want a type specialized > or a binary channel. Making this from the named shared memory class I > already have is a no-brainer. > So that is my other objection against multiprocessing. > > 1. Object sharing by handle inheritance fails when kernel objects must > be passed back to the parent process or to a process pool. We need IPC > objects that have a name in the kernel, so they can be created and > shared in retrospect. We've already got that one. You just need to learn how to use it. > 2. IPC with multiprocessing is too slow due to pickle. We need something > that does not use pickle. (E.g. shared memory, but not by means of > mmap.) It might be that the pipe or socket in multiprocessing will do > this (I have not looked at it carefully enough), but they still don't have Since you can use pickle, you're only dealing with small amounts of data. There are better performing serialization tools available (or they can easily be created if you have to deal with large amounts of data), and those work fine for a large variety of problems. If they aren't fast enough, neither a socket nor a pipe will solve the basic issue of needing to serialize the data in order to communicate it. This isn't a problem with mmap per se, and it's not a problem that anything that can be accurately described as a "file" - as in your "binary file interface" - is going to solve. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From cmjohnson.mailinglist at gmail.com Sun Feb 12 03:27:27 2012 From: cmjohnson.mailinglist at gmail.com (Carl M.
Johnson) Date: Sat, 11 Feb 2012 16:27:27 -1000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> On Feb 11, 2012, at 12:40 AM, Paul Moore wrote: > In Python 2, I can ignore the issue. Sure, I can end up with mojibake, > but for my uses, that's not a disaster. Mostly-readable works. But in > Python 3, I get an error and can't process the file. > > I can just use latin-1, or surrogateescape. But that doesn't come > naturally to me yet. Maybe it will in time... Or maybe there's a > better solution I don't know about yet. I'm confused what you're asking for. Setting errors to surrogateescape or encoding to Latin-1 causes Python 3 to behave the exact same way as Python 2: it's doing the "wrong" thing and may result in mojibake, but at least it isn't screwing up anything new so long as the stuff you add to the file is in ASCII. The only way to make Python 3 slightly more like Python 2 would be to set errors="surrogateescape" by default instead of asking the programmer to know to use it. I think that would be going too far, but it could be done. I think it would be simpler though to just publicize errors="surrogateescape" more. "Dear people who don't care about encodings and don't want to take the time to get them right, just put errors='surrogateescape' into your open commands and Python 3 will behave almost exactly like Python 2. The end." Is that really so hard? I'm confused about what else people want. From greg at krypto.org Sun Feb 12 03:27:19 2012 From: greg at krypto.org (Gregory P. 
Smith) Date: Sat, 11 Feb 2012 18:27:19 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120210183801.59921627@pitrou.net> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <20120210183801.59921627@pitrou.net> Message-ID: On Fri, Feb 10, 2012 at 9:38 AM, Antoine Pitrou wrote: > On Fri, 10 Feb 2012 08:52:16 -0600 > Massimo Di Pierro > wrote: >> The way I see it is not whether Python has threads, fibers, coroutines, etc. >> The problem is that in 5 years we going to have on the market CPUs with >> 100 cores > > This is definitely untrue. No CPU maker has plans for a general-purpose > 100-core CPU. Intel already has immediate plans for 10 core cpus, those have well functioning HT so they should be considered 20 core. Two socket boards are quite common, there's 40 cores. 4+ socket boards exist bringing your total to 80+ cores connected to a bucket of dram on a single motherboard. These are the types of systems in data centers being made available to people to run their computationally intensive software on. That counts as general purpose in my book. -gps From anacrolix at gmail.com Sun Feb 12 03:33:10 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 12 Feb 2012 10:33:10 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <20120210183801.59921627@pitrou.net> Message-ID: Damn straight. On Feb 12, 2012 10:29 AM, "Gregory P. Smith" wrote: > On Fri, Feb 10, 2012 at 9:38 AM, Antoine Pitrou > wrote: > > On Fri, 10 Feb 2012 08:52:16 -0600 > > Massimo Di Pierro > > wrote: > >> The way I see it is not whether Python has threads, fibers, coroutines, > etc. > >> The problem is that in 5 years we going to have on the market CPUs with > >> 100 cores > > > > This is definitely untrue. No CPU maker has plans for a general-purpose > > 100-core CPU. 
> > Intel already has immediate plans for 10 core cpus, those have well > functioning HT so they should be considered 20 core. Two socket > boards are quite common, there's 40 cores. 4+ socket boards exist > bringing your total to 80+ cores connected to a bucket of dram on a > single motherboard. These are the types of systems in data centers > being made available to people to run their computationally intensive > software on. That counts as general purpose in my book. > > -gps > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sun Feb 12 03:33:02 2012 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 11 Feb 2012 18:33:02 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3575FB.60700@molden.no> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> Message-ID: On Fri, Feb 10, 2012 at 11:54 AM, Sturla Molden wrote: > > - Windows has no fork system call. SunOS used to have a very slow fork > system call. The majority of Java developers worked with Windows or Sun, and > learned to work with threads. > > For which the current summary is: > > - The GIL sucks because Windows has no fork. > > Which some might say is the equivalent of: > > - Windows sucks. Please do not claim that fork() semantics and copy-on-write are good things to build off of... They are not. fork() was designed in a world *before threads* existed. It simply can not be used reliably in a process that uses threads and tons of real world practical C and C++ software that Python programs need to interact with, be embedded in or use via extension modules these days uses threads quite effectively. 
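The fork-with-threads hazard described above is exactly what a "spawn a fresh interpreter" mode avoids, and as a historical note, multiprocessing did later grow selectable start methods (Python 3.4): the "spawn" method execs a new interpreter, as Windows' CreateProcess does, instead of fork()ing a possibly-threaded parent. A minimal sketch, assuming Python 3.4+:

```python
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    # "spawn" starts each worker as a freshly exec'd interpreter,
    # so no locks or other state held by parent threads is cloned.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

The price of spawn is that arguments and the target function must be picklable and module import side effects run again in each worker; that trade-off is the heart of this sub-thread.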
The multiprocessing module on posix would be better off if it offered a windows CreateProcess() work-a-like mode that spawns a *new* python interpreter process rather than depending on fork(). The fork() means multithreaded processes cannot reliably use the multiprocessing module (and those other threads could come from libraries or C/C++ extension modules that you cannot control within the scope of your own software that desires to use multiprocessing). This is likely not hard to implement, if nobody has done it already, as I believe the windows support already has to do much the same thing today. -gps From greg at krypto.org Sun Feb 12 03:39:41 2012 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 11 Feb 2012 18:39:41 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <6142B265-EAE6-4D04-8A2D-8F289344A06E@masklinn.net> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <6142B265-EAE6-4D04-8A2D-8F289344A06E@masklinn.net> Message-ID: On Sat, Feb 11, 2012 at 12:26 AM, Masklinn wrote: > > Finally, multiprocessing has a far better upgrade path (as e.g. Erlang > demonstrates): if your non-deterministic points are well delineated and > your interfaces to other concurrent execution points are well defined, > scaling from multiple cores to multiple machines becomes possible. +10 :) From ericsnowcurrently at gmail.com Sun Feb 12 04:10:22 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 11 Feb 2012 20:10:22 -0700 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> Message-ID: On Sat, Feb 11, 2012 at 7:27 PM, Carl M. Johnson wrote: > > On Feb 11, 2012, at 12:40 AM, Paul Moore wrote: > >> In Python 2, I can ignore the issue. Sure, I can end up with mojibake, >> but for my uses, that's not a disaster. Mostly-readable works. 
But in >> Python 3, I get an error and can't process the file. >> >> I can just use latin-1, or surrogateescape. But that doesn't come >> naturally to me yet. Maybe it will in time... Or maybe there's a >> better solution I don't know about yet. > > I'm confused what you're asking for. Setting errors to surrogateescape or encoding to Latin-1 causes Python 3 to behave the exact same way as Python 2: it's doing the "wrong" thing and may result in mojibake, but at least it isn't screwing up anything new so long as the stuff you add to the file is in ASCII. The only way to make Python 3 slightly more like Python 2 would be to set errors="surrogateescape" by default instead of asking the programmer to know to use it. I think that would be going too far, but it could be done. I think it would be simpler though to just publicize errors="surrogateescape" more. > > "Dear people who don't care about encodings and don't want to take the time to get them right, just put errors='surrogateescape' into your open commands and Python 3 will behave almost exactly like Python 2. The end." So something like this: import functools, builtins open = builtins.open = functools.partial(open, encoding="ascii", errors="surrogateescape") -eric From gahtune at gmail.com Sun Feb 12 04:09:58 2012 From: gahtune at gmail.com (Gabriel AHTUNE) Date: Sun, 12 Feb 2012 11:09:58 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: 2012/2/11 Paul Moore > On 11 February 2012 00:07, Terry Reedy wrote: > >>> Nor is there in 3.x. > > > > I view that claim as FUD, at least for many users, and at least until the > > persons making the claim demonstrate it. In particular, I claim that > people > > who use Python2 knowing nothing of unicode do not need to know much more > to > > do the same things in Python3. > > Concrete example, then. > > I have a text file, in an unknown encoding (yes, it does happen to > me!) 
but opening in an editor shows it's mainly-ASCII. I want to find > all the lines starting with a '*'. The simple > > with open('myfile.txt') as f: > for line in f: > if line.startswith('*'): > print(line) > > fails with encoding errors. What do I do? Short answer, grumble and go > and use grep (or in more complex cases, awk) :-( > > Paul. I just looked at the Python 3 documentation ( http://docs.python.org/release/3.1.3/library/functions.html#open); there is an "errors" parameter to the open function. When set to "ignore" or "replace" it will solve your problem. Another way is to try to guess the encoding programmatically (I found the chardet module http://pypi.python.org/pypi/chardet) and use it to decode your file with an unknown encoding. Then why not make a value "auto" available for the "encoding" parameter, which makes "open" call a detector before opening and throw an error when the guess is below a certain confidence level. Gabriel AHTUNE -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmjohnson.mailinglist at gmail.com Sun Feb 12 04:19:41 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sat, 11 Feb 2012 17:19:41 -1000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> Message-ID: On Feb 11, 2012, at 5:10 PM, Eric Snow wrote: > So something like this: > > import functools, builtins > open = builtins.open = functools.partial(open, encoding="ascii", > errors="surrogateescape") We could pack it in and call it something like "python2open". 
:-) From merwok at netwok.org Sun Feb 12 04:30:50 2012 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Sun, 12 Feb 2012 04:30:50 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120211231805.GA7853@cskk.homeip.net> References: <4F341561.3050409@pearwood.info> <20120211231805.GA7853@cskk.homeip.net> Message-ID: <4F37326A.7010904@netwok.org> Le 12/02/2012 00:18, Cameron Simpson a écrit : > On 10Feb2012 05:50, Steven D'Aprano wrote: > | Python 4.x (Python 4000) is pure vapourware. It is irresponsible to tell > | people to stick to Python 2.7 (there will be no 2.8) in favour of something > | which may never exist. > | > | http://www.python.org/dev/peps/pep-0404/ > > Please tell me this PEP number is deliberate! It is, sir! At first the number was taken by the virtualenv PEP with no special meaning, just the next number in sequence, but when Barry wrote up the 2.8 Unrelease PEP and took the number 405, the occasion was too good to be missed and the numbers were swapped. Cheers From sturla at molden.no Sun Feb 12 04:46:00 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 04:46:00 +0100 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <20120211205200.2667c68f@bhuda.mired.org> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> Message-ID: <4F3735F8.10607@molden.no> Den 12.02.2012 02:52, skrev Mike Meyer: > First, I didn't ask about "BSD mmap", I asked about the "mmap module". > They aren't the same thing. Take a look at the implementation. >> When working with multiprocessing for a while, one comes to the >> conclusion that we really need named kernel objects. > And both the BSD mmap (at least in recent systems) and the mmap module > provide objects with names in the file system space. 
IIUC, while there > are systems that won't let you create anonymous objects (like early > versions of the mmap module), there aren't any - at least any longer - > that won't let you create named objects. Sure, you can memory map named files. You can even memory map from /dev/shm on a system that supports it, if you are willing to reserve some RAM for ramdisk. But apart from that, show me how you would use the mmap module to make named shared memory on Linux or Windows. No, memory mapping file object -1 or 0 don't count, you get an anonymous memory mapping. Here is a task for you to try: 1. start a process 2. in the new process, create some shared memory (use the mmap module) 3. make the parent process get access to it (should be easy, right?) Can you do this? No? Then try the same thing with a lock (multiprocessing.Lock) or an event. Show me how you would code this. > > > Use named kernel objects for IPC, pickle the name. > You don't need to pickle the name if you use mmap's native name system > - it's just a string. Sure, multiprocessing does not pickle strings objects. Or whatever. Have you ever looked at the code? > Since can use pickle, you're only dealing with small amounts of > data. What on earth are you talking about? Every object passed in the "args" keyword argument to multiprocessing.Process is pickled. Same thing for any object you pass to multiprocessing.Queue. Look at the code. Sturla From pyideas at rebertia.com Sun Feb 12 05:17:31 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Sat, 11 Feb 2012 20:17:31 -0800 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> Message-ID: On Sat, Feb 11, 2012 at 7:19 PM, Carl M. Johnson wrote: > On Feb 11, 2012, at 5:10 PM, Eric Snow wrote: >> So something like this: >> >> ? ?import functools, builtins >> ? 
?open = builtins.open = functools.partial(open, encoding="ascii", >> errors="surrogateescape") > > We could pack it in and call it something like "python2open". :-) Or just add a keyword-only argument to open(): americentric=True :-P From ncoghlan at gmail.com Sun Feb 12 05:19:12 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2012 14:19:12 +1000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> Message-ID: On Sun, Feb 12, 2012 at 1:19 PM, Carl M. Johnson wrote: > > On Feb 11, 2012, at 5:10 PM, Eric Snow wrote: > >> So something like this: >> >> ? ?import functools, builtins >> ? ?open = builtins.open = functools.partial(open, encoding="ascii", >> errors="surrogateescape") > > > We could pack it in and call it something like "python2open". :-) An open_ascii() builtin isn't as crazy as it may initially sound - it's not at all uncommon to have a file that's almost certainly in some ASCII compatible encoding like utf-8, latin-1 or one of the other extended ASCII encodings, but you don't know which one specifically. By offering open_ascii(), we'd be making it trivial to process such files without blowing up (or having to figure out exactly *which* ASCII compatible encoding you have). When you wrote them back to disk, if you'd added any non-ASCII chars of your own, you'd get a UnicodeEncodeError, but any encoded data from the original would be reproduced in the original encoding. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From cs at zip.com.au Sun Feb 12 05:34:11 2012 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 12 Feb 2012 15:34:11 +1100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20120212043411.GA442@cskk.homeip.net> On 11Feb2012 13:12, Stephen J. 
Turnbull wrote: | Jim Jewett writes: | > Are you saying that some (many? all?) platforms make a bad choice there? | | No. I'm saying that whatever choice is made (except for 'latin-1' | because it accepts all bytes regardless of the actual encoding of the | data, or PEP 383 "errors='surrogateescape'" for the same reason, both | of which are unacceptable defaults for production code *for the same | reason*), there is data that will cause that idiom to fail on Python 3 | where it would not on Python 2. But... By your own argument here, the failing is on the part of Python 2 becuase it is passing when it should fail, because it is effectively using the equivalent of 'latin-1'. And you say right there that that is unacceptable. At least with Python 3 you find out early that you're doing something dodgy. Disclaimer: I may be talking our my arse here; my personal code is all Python 2 at present because I haven't found an idle weekend (or, more likely, week) to spend getting it python 3 ready (meaning parsing ok but probably failing a bunch of tests to start with). I do know that in Python 2 I've tripped over a heap of unicode versus latin-1/maybe-ascii text issues and python unicode-vs-str issues just recently in Python 2 and a lot of the ambiguity I've been juggling would be absent in Python 3 (because at least all the strings will be unicode and I can concentrate on the encoding/decode stuff instead). [...snip...] | The fact is that with a little bit of knowledge, you can almost | certainly get more reliable (and in case of failure, more debuggable) | results from Python 3 than from Python 2. That's my hope. | But people are happy to | deal with the devil they know, even though it's more noxious than the | devil they don't. 
Not me :-) I speak as one who once moved to MH mail folders and vi-with-a-few-macros as a mail reader just to break my use of the mail reader I had been using:-( Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ No system, regardless of how sophisticated, can repeal the laws of physics or overcome careless driving actions. - Mercedes Benz From ncoghlan at gmail.com Sun Feb 12 05:34:38 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2012 14:34:38 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: On Sun, Feb 12, 2012 at 9:24 AM, Paul Moore wrote: > Saying "it's bytes" is even more of a lie than "it's latin-1". The > honest truth is "it's an ASCII superset", and that's all I need to > know to do the job manually, so I'd like to write code to do the same > job without needing to lie about what I know. I'm now 100% convinced > that encoding="ascii",errors="surrogateescape" is the way to say this > in code. I created http://bugs.python.org/issue13997 to suggest codifying this explicitly as an open_ascii() builtin. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From steve at pearwood.info Sun Feb 12 06:03:01 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 12 Feb 2012 16:03:01 +1100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4F374805.9000606@pearwood.info> Paul Moore wrote: > My concern about Unicode in Python 3 is that the principle is, you > specify the right encoding. But often, I don't *know* the encoding ;-( > Text files, like changelogs as a good example, generally have no > marker specifying the encoding, and they can have all sorts (depending > on where the package came from). 
Worse, I am on Windows and changelogs > usually come from Unix developers - so I'm not familiar with the > common conventions ("well, of course it's in UTF-8, that's what > everyone uses"...) But you obviously do know the convention -- use UTF-8. > In Python 2, I can ignore the issue. Sure, I can end up with mojibake, > but for my uses, that's not a disaster. Mostly-readable works. But in > Python 3, I get an error and can't process the file. > > I can just use latin-1, or surrogateescape. But that doesn't come > naturally to me yet. Maybe it will in time... Or maybe there's a > better solution I don't know about yet. So why don't you use UTF-8? As for those who actually don't know the convention, isn't it better to teach them the convention "use UTF-8, unless dealing with legacy data" rather than to avoid dealing with the issue by using errors='surrogateescape'? I'd hate for "surrogateescape" to become the One Obvious Way for dealing with unknown encodings, because this is 2012 and people should be more savvy about non-ASCII characters by now. I suppose it's marginally better than just throwing them away with errors='ignore', but still. I recently bought a book from Amazon UK. It was £12 not \udcc2\udca312. This isn't entirely a rhetorical question. I'm not on Windows, so perhaps there's a problem I'm unaware of. -- Steven From steve at pearwood.info Sun Feb 12 06:26:24 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 12 Feb 2012 16:26:24 +1100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> Message-ID: <4F374D80.7030309@pearwood.info> Nick Coghlan wrote: > On Sun, Feb 12, 2012 at 1:19 PM, Carl M. 
Johnson > wrote: >> On Feb 11, 2012, at 5:10 PM, Eric Snow wrote: >> >>> So something like this: >>> >>> import functools, builtins >>> open = builtins.open = functools.partial(open, encoding="ascii", >>> errors="surrogateescape") >> >> We could pack it in and call it something like "python2open". :-) > > An open_ascii() builtin isn't as crazy as it may initially sound - > it's not at all uncommon to have a file that's almost certainly in > some ASCII compatible encoding like utf-8, latin-1 or one of the other > extended ASCII encodings, but you don't know which one specifically. To me, "open_ascii" suggests either: - it opens ASCII files, and raises an error if they are not ASCII; or - it opens non-ASCII files, and magically translates their content to ASCII using some variant of "The Unicode Hammer" recipe: http://code.activestate.com/recipes/251871-latin1-to-ascii-the-unicode-hammer/ We should not be discouraging developers from learning even the most trivial basics of Unicode. I'm not suggesting that we try to force people to become Unicode experts (they wouldn't, even if we tried) but making this a built-in is dumbing things down too much. I don't believe that it is an imposition for people to explicitly use open(filename, 'ascii', 'surrogateescape') if that's what they want. If they want open_ascii, let them define this at the top of their modules: open_ascii = (lambda name: open(name, encoding='ascii', errors='surrogateescape')) A one liner, if you don't mind long lines. I'm not entirely happy with the surrogateescape solution, but I can see it's possibly the least worst *simple* solution for the case where you don't know the source encoding. (Encoding guessing heuristics are awesome but hardly simple.) So put the recipe in the FAQs, in the docs, and the docstring for open[1], and let people copy and paste the recipe. That's a pretty gentle introduction to Unicode. [1] Which is awfully big and complex in Python 3.1, but that's another story. 
-- Steven From ncoghlan at gmail.com Sun Feb 12 07:01:42 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 Feb 2012 16:01:42 +1000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <4F374D80.7030309@pearwood.info> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> <4F374D80.7030309@pearwood.info> Message-ID: On Sun, Feb 12, 2012 at 3:26 PM, Steven D'Aprano wrote: > I'm not entirely happy with the surrogateescape solution, but I can see it's > possibly the least worst *simple* solution for the case where you don't know > the source encoding. (Encoding guessing heuristics are awesome but hardly > simple.) So put the recipe in the FAQs, in the docs, and the docstring for > open[1], and let people copy and paste the recipe. That's a pretty gentle > introduction to Unicode. Yeah, it didn't take long for me to come back around to that point of view, so I morphed http://bugs.python.org/issue13997 into a docs bug about clearly articulating the absolute bare minimum knowledge of Unicode needed to process text in a robust cross-platform manner in Python 3 instead. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From mwm at mired.org Sun Feb 12 09:02:07 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 12 Feb 2012 03:02:07 -0500 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <4F3735F8.10607@molden.no> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> Message-ID: <20120212030207.7cd2a5dc@bhuda.mired.org> On Sun, 12 Feb 2012 04:46:00 +0100 Sturla Molden wrote: > Den 12.02.2012 02:52, skrev Mike Meyer: > > First, I didn't ask about "BSD mmap", I asked about the "mmap module". > > They aren't the same thing. > Take a look at the implementation. True, but we're talking about an API, not a specific implementation. 
> >> When working with multiprocessing for a while, one comes to the > >> conclusion that we really need named kernel objects. > > And both the BSD mmap (at least in recent systems) and the mmap module > > provide objects with names in the file system space. IIUC, while there > > are systems that won't let you create anonymous objects (like early > > versions of the mmap module), there aren't any - at least any longer - > > that won't let you create named objects. > Sure, you can memory map named files. You can even memory map from > /dev/shm on a system that supports it, if you are willing to reserve > some RAM for ramdisk. And that's *not* the anonymous kernel object you complained about getting from mmap. > But apart from that, show me how you would use the mmap module to make > named shared memory on Linux or Windows. No, memory mapping file object > -1 or 0 don't count, you get an anonymous memory mapping. The linux mmap has the same arguments as the BSD one, so I'd expect it to work the same. I expect that the Python core will have made the semantics work properly on Windows, but don't really care, and don't have a Windows system to test it on. And that's why I'm talking about the API, not the implementation. > Here is a task for you to try: > > 1. start a process > 2. in the new process, create some shared memory (use the mmap module) > 3. make the parent process get access to it (should be easy, right?) > Can you do this? No? Works exactly like I'd expect it to. > Show me how you would code this. 
Here's the code that creates the shared file:

    share_name = '/tmp/xyzzy'
    with open(share_name, 'wb') as f:
        f.write(b'hello')

Here's the code for the child:

    with open(share_name, 'r+b') as f:
        share = mmap(f.fileno(), 0)
    share[:5] = b'gone\n'

Here's the code for the parent:

    child = Process(target=proc)
    child.start()
    with open(share_name, mode='r+b') as f:
        share = mmap(f.fileno(), 0)
    while share[0] == ord('h'):
        sleep(1)
    print('main:', share.readline())

> > > Use named kernel objects for IPC, pickle the name. > > You don't need to pickle the name if you use mmap's native name system > > - it's just a string. > Sure, multiprocessing does not pickle strings objects. Or whatever. Have > you ever looked at the code? I didn't say multiprocessing wouldn't pickle the name, *or* anything else about the multiprocessing module. I said *you* didn't need to pickle it. And I didn't. Did you read what I wrote? > Every object passed in the "args" keyword argument to > multiprocessing.Process is pickled. Same thing for any object you pass > to multiprocessing.Queue. Yes, but we're not talking about multiprocessing.Queue. We're talking about mmap. multiprocessing.Queue doesn't use mmap. For that, you want to use multiprocessing.Value and multiprocessing.Array. > Look at the code. Look at the text. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From p.f.moore at gmail.com Sun Feb 12 13:54:13 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 12 Feb 2012 12:54:13 +0000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <4F374805.9000606@pearwood.info> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> Message-ID: On 12 February 2012 05:03, Steven D'Aprano wrote: > Paul Moore wrote: > >> My concern about Unicode in Python 3 is that the principle is, you >> specify the right encoding. 
But often, I don't *know* the encoding ;-( >> Text files, like changelogs as a good example, generally have no >> marker specifying the encoding, and they can have all sorts (depending >> on where the package came from). Worse, I am on Windows and changelogs >> usually come from Unix developers - so I'm not familiar with the >> common conventions ("well, of course it's in UTF-8, that's what >> everyone uses"...) > > > > > But you obviously do know the convention -- use UTF-8. No. I know that a lot of Unix people advocate UTF-8, and I gather it's rapidly becoming standard in the Unix world. But I work on Windows, and UTF-8 is not the standard there. I have no idea if UTF-8 is accepted cross-platform, or if it's just what has grown as most ChangeLog files are written on Unix and Unix users don't worry about what's convenient on Windows (no criticism there, just acknowledgement of a fact). And I have seen ChangeLog files with non-UTF-8 encodings of names in them. I have no idea if that's a bug or just a preference - and anyway, "be permissive in what you accept" applies... Get beyond ChangeLog files and it's anybody's guess. My PC has text files from many, many places (some created on my PC, some created by others on various flavours and ages of Unix , and some downloaded from who-knows-where on the internet). Not one of them comes with an encoding declaration. Of course every file is encoded in some way. But it's incredibly naive to assume the user knows that encoding. Hey, I still have to dump out the content of files to check the line ending convention when working in languages other than Python - universal newlines saves me needing to care about that, why is it so disastrous to consider having something similar for encodings? >> In Python 2, I can ignore the issue. Sure, I can end up with mojibake, >> but for my uses, that's not a disaster. Mostly-readable works. But in >> Python 3, I get an error and can't process the file. 
>> >> I can just use latin-1, or surrogateescape. But that doesn't come >> naturally to me yet. Maybe it will in time... Or maybe there's a >> better solution I don't know about yet. > > So why don't you use UTF-8? Decoding errors. > As far as those who actually don't know the convention, isn't it better to > teach them the convention "use UTF-8, unless dealing with legacy data" > rather than to avoid dealing with the issue by using > errors='surrogateescape'? Fair comment. My point here is that I *am* dealing with "legacy" data in your sense. And I do so on a day to day basis. UTF-8 is very, very rare in my world (Windows). Latin-1 (or something close) is common. There is no cross-platform standard yet. And probably won't be until Windows moves to UTF-8 as the standard encoding. Which ain't happening soon. > I'd hate for "surrogateescape" to become the One Obvious Way for dealing > with unknown encodings, because this is 2012 and people should be more savvy > about non-ASCII characters by now. I suppose it's marginally better than > just throwing them away with errors='ignore', but still. I think people are much more aware of the issues, but cross-platform handling remains a hard problem. I don't wish to make assumptions, but your insistence that UTF-8 is a viable solution suggests to me that you don't know much about the handling of Unicode on Windows. I wish I had that luxury... > I recently bought a book from Amazon UK. It was £12 not \udcc2\udca312. £12 in what encoding? :-) > This isn't entirely a rhetorical question. I'm not on Windows, so perhaps > there's a problem I'm unaware of. I think that's the key here. Even excluding places that don't use the Roman alphabet, Windows encoding handling is complex. CP1252, CP850, Latin-1, Latin-14 (Euro zone), UTF-16, BOMs. All are in use on my PC to some extent. And that's even without all this foreign UTF-8 I get from the Unix guys :-) Apart from the blasted UTF-16, all of it's "ASCII most of the time". Paul. 
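The encoding zoo Paul lists is easy to demonstrate: the same short string encodes to different bytes under each codec, which is why "mostly ASCII" files from different Windows tools agree on the ASCII range and disagree everywhere else. A small illustration (not from the thread):

```python
s = "£12"  # one non-ASCII character plus ASCII digits

for codec in ("cp1252", "cp850", "latin-1", "utf-8", "utf-16"):
    data = s.encode(codec)
    # every codec here encodes '1' and '2' identically,
    # but each has its own idea of where '£' lives
    print(f"{codec:8} {data!r}")
```

Running this shows, for example, that '£' is 0xA3 in cp1252/latin-1, 0x9C in the old DOS cp850, two bytes (0xC2 0xA3) in UTF-8, and that UTF-16 prefixes the whole string with a BOM — five mutually incompatible byte sequences for three characters.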
From stefan_ml at behnel.de Sun Feb 12 14:33:13 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 12 Feb 2012 14:33:13 +0100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> Message-ID: Paul Moore, 12.02.2012 13:54: > Latin-1, Latin-14 (Euro zone) OT-remark: I assume you meant ISO8859-15 (aka. Latin-9) here. However, that's not for the "Euro zone", it's just Latin-1 with the Euro character wangled in and a couple of other changes. It still lacks characters that are commonly used by languages within the Euro zone, e.g. the Slovenian language (a Slavic descendant), but also Gaelic or Welsh. https://en.wikipedia.org/wiki/ISO/IEC_8859-15#Coverage https://en.wikipedia.org/wiki/ISO/IEC_8859-1#Languages_commonly_supported_but_with_incomplete_coverage Stefan From ubershmekel at gmail.com Sun Feb 12 14:33:42 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Sun, 12 Feb 2012 15:33:42 +0200 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> Message-ID: On Sun, Feb 12, 2012 at 2:54 PM, Paul Moore wrote: > On 12 February 2012 05:03, Steven D'Aprano wrote: > > Paul Moore wrote: > > > >> My concern about Unicode in Python 3 is that the principle is, you > >> specify the right encoding. But often, I don't *know* the encoding ;-( > >> Text files, like changelogs as a good example, generally have no > >> marker specifying the encoding, and they can have all sorts (depending > >> on where the package came from). Worse, I am on Windows and changelogs > >> usually come from Unix developers - so I'm not familiar with the > >> common conventions ("well, of course it's in UTF-8, that's what > >> everyone uses"...) > > > > > > > > > > But you obviously do know the convention -- use UTF-8. > > No. 
I know that a lot of Unix people advocate UTF-8, and I gather it's > rapidly becoming standard in the Unix world. But I work on Windows, > and UTF-8 is not the standard there. I have no idea if UTF-8 is > accepted cross-platform, or if it's just what has grown as most > ChangeLog files are written on Unix and Unix users don't worry about > what's convenient on Windows (no criticism there, just acknowledgement > of a fact). And I have seen ChangeLog files with non-UTF-8 encodings > of names in them. I have no idea if that's a bug or just a preference > - and anyway, "be permissive in what you accept" applies... > > Windows NT started with UCS-2 and from Windows 2000 it's UTF-16 internally. It was an uplifting thought that unicode is just 2 bytes per letter, so they did a huge refactoring of the entire windows API (ReadFileA/ReadFileW etc) thinking they won't have to worry about it again. Nowadays windows INTERNALS have the worst of all worlds - a variable char-length, uncommon unicode format, and twice the API to maintain. Notepad can open and save utf-8 files perfectly, much like most other windows programs. UTF-8 is the internet standard and I suggest we keep that fact crystal clear. UTF-8 is the go-to codec, it is the convention. It's ok to use other codecs for whatever reasons, constraints, use cases, etc. But these are all exceptions to the convention - UTF-8. Yuval (Also a windows dev) -------------- next part -------------- An HTML attachment was scrubbed... URL: From shibturn at gmail.com Sun Feb 12 14:52:20 2012 From: shibturn at gmail.com (shibturn) Date: Sun, 12 Feb 2012 13:52:20 +0000 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <4F3735F8.10607@molden.no> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> Message-ID: On 12/02/2012 3:46am, Sturla Molden wrote: > 1. start a process > 2. in the new process, create some shared memory (use the mmap module) > 3.
make the parent process get access to it (should be easy, right?) As Mike says, on Unix you can just create a file in /tmp to back an mmap. On Linux, posix mmaps created with shm_open() seem to be normal files on a tmpfs file system, usually /dev/shm. Since /tmp is also usually a tmpfs file system on Linux, I assume this would be equivalent in terms of overhead. On Windows you can use the tagname argument of mmap.mmap(). Maybe a BinaryBlob wrapper class could be created which lets an mmap be "pickled by reference". Managing lifetime and reliable cleanup might be awkward though. If the pickle overhead is the problem you could try Connection.send_bytes() and Connection.recv_bytes(). I suppose Queue objects could grow put_bytes() and get_bytes() methods too. Or a BytesQueue class could be created. > Can you do this? No? > > Then try the same thing with a lock (multiprocessing.Lock) or an event. I have a patch (http://bugs.python.org/issue8713) to make multiprocessing on Unix work with fork+exec which has to do this because semaphores cannot be inherited across exec. Making sure all the named semaphores get removed if the program terminates abnormally is a bit awkward though. It could be modified to make them picklable in general. On Windows dealing with "named objects" is easier since they are refcounted by the operating system and deleted when no more processes have handles for them. If you make a feature request at bugs.python.org I might work on a patch.
Cheers sbt From sturla at molden.no Sun Feb 12 15:15:46 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 15:15:46 +0100 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <20120212030207.7cd2a5dc@bhuda.mired.org> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> <20120212030207.7cd2a5dc@bhuda.mired.org> Message-ID: <4F37C992.3090101@molden.no> Den 12.02.2012 09:02, skrev Mike Meyer: > True, but we're talking about an API, not a specific implementation. You have been complaining about the GIL which is a specific implementation. I am talking about how multiprocessing actually works, i.e. implementation. > >> But apart from that, show me how you would use the mmap module to make >> named shared memory on Linux or Windows. No, memory mapping file object >> -1 or 0 don't count, you get an anonymous memory mapping. > The linux mmap has the same arguments as the BSD one, so I'd expect it > to work the same. It calls BSD mmap in the implementation on Linux. It calls CreateFileMapping and MapViewOfFile on Windows. > Works exactly like I'd expect it to. > >> Show me how you would code this. > Here's the code that creates the shared file: > > share_name = '/tmp/xyzzy' > with open(share_name, 'wb') as f: > f.write(b'hello') > > Here's the code for the child: > > with open(share_name, 'r+b') as f: > share = mmap(f.fileno(), 0) > share[:5] = b'gone\n' > > Here's the code for the parent: > > child = Process(target=proc) > child.start() > with open(share_name, mode='r+b') as f: > share = mmap(f.fileno(), 0) > while share[0] == ord('h'): > sleep(1) > print('main:', share.readline()) Here you are memory mapping a temporary file, not shared memory. On Linux, shared memory with mmap does not have a share_name. It has fileno -1. So go ahead and replace f.fileno() with -1 and see if it still works for you. 
This is how mmap is used for shared memory on Linux: shm = mmap.mmap(-1, 4096) os.fork() See how the fork comes after the mmap. Which means it must always be allocated in the parent process. That is why we need an implementation with System V IPC instead of mmap. > Yes, but we're not talking about multiprocessing.Queue. We're talking > about mmap. multiprocessing.Queue doesn't use mmap. For that, you want > to us multiprocessing.Value and multiprocessing.Array. Pass multiprocessing.Value or multiprocessing.Array to multiprocessing.Queue and see what happens. And while you are at it, pass multiprocessing.Lock to multiprocessing.Queue and see what happens as well. Contemplate how we can pass an object with a lock as a message between two processes. Should we change the implementation? And then, look up the implementation for multiprocessing.Value and Array and see if (and how) they use mmap. Perhaps you just told me to use mmap instead of mmap. Sturla From sturla at molden.no Sun Feb 12 15:35:19 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 15:35:19 +0100 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> Message-ID: <4F37CE27.5070908@molden.no> Den 12.02.2012 14:52, skrev shibturn: > > As Mike says, on Unix you can just create a file in /tmp to back an > mmap. On Linux, posix mmaps created with shm_open() seem to be normal > files on a tmpfs file system, usually /dev/shm. Since /tmp is also > usually a tmpfs file system on Linux, I assume this whould be > equivalent in terms of overhead. Mark did not use shm_open, he memory mapped from disk. > I have a patch (http://bugs.python.org/issue8713) to make > multiprocessing on Unix work with fork+exec which has to do this > because semaphores cannot be inherited across exec. 
Making sure all > the named semaphores get removed if the program terminates abnormally > is a bit awkward though. It could be modified to make them picklable > in general. > > On Windows dealing with "named objects" is easier since they are > refcounted by the operating system and deleted when no more processes > have handles for them. > > If you make a feature request at bugs.python.org I might work on a patch. Cleaning up SysV ipc semaphores and shared memory is similar (semctl instead of shmctl to get reference count). And then we need a monkey patch for os._exit. Look at the Cython code here: http://dl.dropbox.com/u/12464039/sharedmem.zip Sturla From shibturn at gmail.com Sun Feb 12 16:20:11 2012 From: shibturn at gmail.com (shibturn) Date: Sun, 12 Feb 2012 15:20:11 +0000 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <4F37CE27.5070908@molden.no> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> <4F37CE27.5070908@molden.no> Message-ID: On 12/02/2012 2:35pm, Sturla Molden wrote: > Mark did not use shm_open, he memory mapped from disk. But if his /tmp is a tmpfs file system (which it usually is on Linux) then I think it is entirely equivalent. Or he could create the file in /dev/shm instead. Below is Blob class which seems to work. Note that the process which created the blob needs to wait for the other process to unpickle it before allowing it to be garbage collected.
import multiprocessing as mp from multiprocessing.util import Finalize, get_temp_dir import mmap, sys, os, itertools class Blob(object): _counter = itertools.count() def __init__(self, length, name=None): self.length = length if sys.platform == 'win32': if name is None: name = 'blob-%s-%d' % (os.getpid(), next(self._counter)) self.name = name self.mmap = mmap.mmap(-1, length, self.name) else: if name is None: self.name = '%s/blob-%s-%d' % (get_temp_dir(), os.getpid(), next(self._counter)) flags = os.O_RDWR | os.O_CREAT | os.O_EXCL else: self.name = name flags = os.O_RDWR fd = os.open(self.name, flags, 0o600) try: if name is None: os.ftruncate(fd, length) Finalize(self, os.unlink, (self.name,), exitpriority=0) self.mmap = mmap.mmap(fd, length) finally: os.close(fd) def __reduce__(self): return Blob, (self.length, self.name) def child(conn): b = Blob(20) b.mmap[:5] = "hello" conn.send(b) conn.recv() # wait for acknowledgement before # allowing garbage collection if __name__ == '__main__': conn, child_conn = mp.Pipe() p = mp.Process(target=child, args=(child_conn,)) p.start() b = conn.recv() conn.send(None) # acknowledge receipt print repr(b.mmap[:]) From p.f.moore at gmail.com Sun Feb 12 17:30:59 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 12 Feb 2012 16:30:59 +0000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> Message-ID: On 12 February 2012 13:33, Stefan Behnel wrote: > Paul Moore, 12.02.2012 13:54: >> Latin-1, Latin-14 (Euro zone) > > OT-remark: I assume you meant ISO8859-15 (aka. Latin-9) here. However, > that's not for the "Euro zone", it's just Latin-1 with the Euro character > wangled in and a couple of other changes. It still lacks characters that > are commonly used by languages within the Euro zone, e.g. the Slovenian > language (a Slavic descendant), but also Gaelic or Welsh. 
> > https://en.wikipedia.org/wiki/ISO/IEC_8859-15#Coverage > > https://en.wikipedia.org/wiki/ISO/IEC_8859-1#Languages_commonly_supported_but_with_incomplete_coverage Yes, sorry. I misremembered and was sloppy in my wording. My apologies, and thanks for the correction. Paul From sturla at molden.no Sun Feb 12 21:33:30 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 21:33:30 +0100 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> <4F37CE27.5070908@molden.no> Message-ID: <4F38221A.5050208@molden.no> Den 12.02.2012 16:20, skrev shibturn: > > But if his /tmp is a tmpfs file system (which it usually is on Linux) > then I think it is entirely equivalent. Or he could create the file > in /dev/shm instead. It seems that on Linux /tmp is backed by shared memory. Which sounds rather strange to a Windows user, as the raison d'etre for tempfiles is temporary storage space that goes beyond physical RAM. I've also read that the use of ftruncate in this context can result in SIGBUS. > > Below is Blob class which seems to work. Note that the process which > created the blob needs to wait for the other process to unpickle it > before allowing it to be garbage collected. > I would look at kernel refcounts before unlinking. (But I am not that familiar with Linux.) Sturla From mwm at mired.org Sun Feb 12 21:56:11 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 12 Feb 2012 15:56:11 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> Message-ID: Sorry for the late reply, but this itch finally got to me... > Please do not claim that fork() semantics and copy-on-write are good > things to build off of...
They work just fine for large classes of problems that require hundreds or thousands of cores. > They are not. fork() was designed in a > world *before threads* existed. This is wrong. While the "name" thread may not have existed when fork() was created, the *concept* of concurrent execution in a shared address space predates the creation of Unix by a good decade. Most notably, Multics - what the creators of Unix were working on before they did Unix - at least discussed the idea, though it may never have been implemented (a common fate of Multics features). Also notable is that Unix introduced the then ground-breaking idea of having the command processor create a new process to run user programs. Before Unix, user commands were run in the process (and hence address space) of the command processor. Running things in what is now called "the background" (which this architecture made a major PITA) gave you concurrent execution in a shared address space - what we today call threads. The reason those systems did this was because creating a process was *expensive*. That's also why the Multics folks looked at threads. The Unix fork/exec pair was cheap and flexible, allowing the creation of a command processor that supported easy backgrounding, pipes, and IO redirection. Fork has since gotten more expensive, in spite of the ongoing struggles to keep it cheap. > It simply can not be used reliably in > a process that uses threads and tons of real world practical C and C++ > software that Python programs need to interact with, be embedded in or > use via extension modules these days uses threads quite effectively. Personally, I find that the fact that threads can't be used reliably in a process that forks makes threads bad things to build off of. After all, there's tons of real world practical software in many languages that Python needs to interact with that uses fork effectively.
> The multiprocessing module on posix would be better off if it offered > a windows CreateProcess() work-a-like mode that spawns a *new* python > interpreter process rather than depending on fork(). While it's a throwback to the 60s, it would make using threads and processes more convenient, but I don't need it. Why don't you submit a patch? References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: On 11 February 2012 21:24, Paul Moore wrote: > What I *don't* know is what those funny bits of > mojibake I see in the text editor are. > So, do yourself and us, "the rest of the world", a favor, and open the file in binary mode. Also, I'd suggest you and anyone picky about encoding read http://www.joelonsoftware.com/articles/Unicode.html so you can finally have in your mind that *** ASCII is not text ***. It used to be text when to get to non-[A-Z|a-z] text you had to have someone recording a file on a tape, pack it in the luggage, and take a plane "overseas" to the U.S.A. That is not the case anymore, and that, as far as I understand, is the reasoning for Python 3 to default to unicode. Anyone can work "ignoring text" and treating bytes as bytes, opening a file in binary mode. You can use "os.linesep" instead of a hard-coded "\n" to overcome linebreaking. (Of course you might accidentally break a line inside a multi-byte character in some encoding, since you prefer to ignore them altogether, but it should be rare). js -><- -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sturla at molden.no Sun Feb 12 23:14:51 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 23:14:51 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> Message-ID: <4F3839DB.7080804@molden.no> Den 12.02.2012 21:56, skrev Mike Meyer: > > While it's a throwback to the 60s, it would make using threads and > processes more convenient, but I don't need it. Why don't you submit a > patch? I suppose the Windows implementation would do this on Linux as well? At least it uses the subprocess module to spawn a new process. Though I am not sure how subprocess interacts with threads in Linux. Sturla From mwm at mired.org Sun Feb 12 23:14:50 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 12 Feb 2012 17:14:50 -0500 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <4F37C992.3090101@molden.no> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> <20120212030207.7cd2a5dc@bhuda.mired.org> <4F37C992.3090101@molden.no> Message-ID: <20120212171450.20366678@bhuda.mired.org> On Sun, 12 Feb 2012 15:15:46 +0100 Sturla Molden wrote: > Den 12.02.2012 09:02, skrev Mike Meyer: > > True, but we're talking about an API, not a specific implementation. > You have been complaining about the GIL which is a specific implementation. No, I haven't. To me, the GIL is one of the minor reasons to avoid using threads in Python. I doubt that I've mentioned it at all. Given how much attention you pay to details, I no longer care about getting an answer to my question, as I suspect that it will have as much accuracy as that statement. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. 
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From sturla at molden.no Sun Feb 12 23:20:05 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 23:20:05 +0100 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <20120212171450.20366678@bhuda.mired.org> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> <20120212030207.7cd2a5dc@bhuda.mired.org> <4F37C992.3090101@molden.no> <20120212171450.20366678@bhuda.mired.org> Message-ID: <4F383B15.2060305@molden.no> Den 12.02.2012 23:14, skrev Mike Meyer: > > True, but we're talking about an API, not a specific implementation. >> You have been complaining about the GIL which is a specific implementation. > No, I haven't. To me, the GIL is one of the minor reasons to avoid > using threads in Python. I doubt that I've mentioned it at all. > > Given how much attention you pay to details, I no longer care about > getting an answer to my question, as I suspect that it will have as > much accuracy as that statement. > > My apologies, I was confusing you with Matt Joiner. Sturla From mwm at mired.org Sun Feb 12 23:21:05 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 12 Feb 2012 17:21:05 -0500 Subject: [Python-ideas] The concurrency discussion is off-topic! Message-ID: <20120212172105.2c98f820@bhuda.mired.org> Please take the concurrency discussion to: http://mail.python.org/mailman/listinfo/concurrency-sig -- Mike Meyer http://www.mired.org/ Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From sturla at molden.no Sun Feb 12 23:24:32 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 23:24:32 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> Message-ID: <4F383C20.1050205@molden.no> Den 12.02.2012 21:56, skrev Mike Meyer: > The reason those systems did this was because creating a process was > *expensive*. That's also why the Multics folks looked at threads. The > Unix fork/exec pair was cheap and flexible, allowing the creation of a > command processor that supported easy backgrounding, pipes, and IO > redirection. Fork has since gotten more expensive, in spite of the > ongoing struggles to keep it cheap. The "expensive" argument is also why the Windows API has no fork, although the Windows NT-kernel supports it. (There is even a COW fork in Windows' SUA.) I think fork() is the one function I have missed most when programming for Windows. It is the best reason to use SUA or Cygwin instead of the Windows API. Sturla From sturla at molden.no Sun Feb 12 23:31:10 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 23:31:10 +0100 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: <20120212172105.2c98f820@bhuda.mired.org> References: <20120212172105.2c98f820@bhuda.mired.org> Message-ID: <4F383DAE.7000702@molden.no> Den 12.02.2012 23:21, skrev Mike Meyer: > Please take the concurrency discussion to: > > http://mail.python.org/mailman/listinfo/concurrency-sig > It might have diverged into something off-topic. But it started up as a response to Jesse Noller on improvement of multiprocessing's IPC objects. That is, e.g. being able to send an object with a mp.Lock across a mp.Queue. That is not off-topic AFAIK.
I think it is important with discussion and feedback on how these objects should work. Sturla From sturla at molden.no Sun Feb 12 23:33:00 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 12 Feb 2012 23:33:00 +0100 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: <20120212172105.2c98f820@bhuda.mired.org> References: <20120212172105.2c98f820@bhuda.mired.org> Message-ID: <4F383E1C.90905@molden.no> Den 12.02.2012 23:21, skrev Mike Meyer: > Please take the concurrency discussion to: > > http://mail.python.org/mailman/listinfo/concurrency-sig > It seems that list has nearly zero traffic. Why post to a list that nobody reads? Sturla From mwm at mired.org Sun Feb 12 23:42:53 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 12 Feb 2012 17:42:53 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3839DB.7080804@molden.no> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> <4F3839DB.7080804@molden.no> Message-ID: <20120212174253.156c3660@bhuda.mired.org> [Replies have been sent to concurrency-sig at python.org] On Sun, 12 Feb 2012 23:14:51 +0100 Sturla Molden wrote: > Den 12.02.2012 21:56, skrev Mike Meyer: > > While it's a throwback to the 60s, it would make using threads and > > processes more convenient, but I don't need it. Why don't you submit a > > patch? > I suppose the Windows implementation would do this on Linux as well? At > least it uses the subprocess module to spawn a new process. Though I am > not sure how subprocess interacts with threads in Linux. subprocess and threads interact *really* badly on Unix systems. Python is missing the tools needed to deal with this situation properly. See http://bugs.python.org/issue6923. Just another of the minor reasons not to use threads in Python. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. 
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From mwm at mired.org Sun Feb 12 23:47:16 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 12 Feb 2012 17:47:16 -0500 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: <4F383E1C.90905@molden.no> References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> Message-ID: <20120212174716.2dc90c95@bhuda.mired.org> On Sun, 12 Feb 2012 23:33:00 +0100 Sturla Molden wrote: Apologies for the blank response you got. > Den 12.02.2012 23:21, skrev Mike Meyer: > > Please take the concurrency discussion to: > > http://mail.python.org/mailman/listinfo/concurrency-sig > It seems that list has nearly zero traffic. Why post to a list that > nobody reads? Because that way, we won't be annoying the people who don't care about concurrency with an off-topic discussion. If you're interested in concurrency in Python, you should be reading that list. Given the amount of discussion here, I was surprised at how quite that list was. I suspect many of those here didn't know about it, and set about to correct that. Most of this discussion should be there, and then when that SIG has thrashed out a proposal for a change, it can be brought back here. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From anacrolix at gmail.com Mon Feb 13 01:06:03 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Mon, 13 Feb 2012 08:06:03 +0800 Subject: [Python-ideas] The concurrency discussion is off-topic! 
In-Reply-To: <4F383E1C.90905@molden.no> References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> Message-ID: +1, that list is dead On Feb 13, 2012 6:33 AM, "Sturla Molden" wrote: > Den 12.02.2012 23:21, skrev Mike Meyer: > >> Please take the concurrency discussion to: >> >> http://mail.python.org/**mailman/listinfo/concurrency-**sig >> >> > It seems that list has nearly zero traffic. Why post to a list that nobody > reads? > > Sturla > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Mon Feb 13 01:13:36 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Mon, 13 Feb 2012 08:13:36 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120212174253.156c3660@bhuda.mired.org> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> <4F3839DB.7080804@molden.no> <20120212174253.156c3660@bhuda.mired.org> Message-ID: This attitude is exemplary of the status quo in Python on threads: Pretend they don't exist or you'll get hurt. On Feb 13, 2012 6:45 AM, "Mike Meyer" wrote: > [Replies have been sent to concurrency-sig at python.org] > > On Sun, 12 Feb 2012 23:14:51 +0100 > Sturla Molden wrote: > > Den 12.02.2012 21:56, skrev Mike Meyer: > > > While it's a throwback to the 60s, it would make using threads and > > > processes more convenient, but I don't need it. Why don't you submit a > > > patch? > > I suppose the Windows implementation would do this on Linux as well? At > > least it uses the subprocess module to spawn a new process. Though I am > > not sure how subprocess interacts with threads in Linux. > > subprocess and threads interact *really* badly on Unix > systems. 
Python is missing the tools needed to deal with this > situation properly. See http://bugs.python.org/issue6923. > > Just another of the minor reasons not to use threads in Python. > > -- > Mike Meyer http://www.mired.org/ > Independent Software developer/SCM consultant, email for more information. > > O< ascii ribbon campaign - stop html mail - www.asciiribbon.org > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shibturn at gmail.com Mon Feb 13 01:31:42 2012 From: shibturn at gmail.com (shibturn) Date: Mon, 13 Feb 2012 00:31:42 +0000 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <4F38221A.5050208@molden.no> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> <4F37CE27.5070908@molden.no> <4F38221A.5050208@molden.no> Message-ID: On 12/02/2012 8:33pm, Sturla Molden wrote: > It seems that on Linux /tmp is backed by shared memory. > > Which sounds rather strange to a Windows user, as the raison d'etre for > tempfiles is temporary storage space that goes beyond physial RAM. In reality /tmp is backed by swap space, so physical RAM does not impose a limit. Anonymous mmaps are also backed by swap space. > I've also read that the use of ftruncate in this context can result in > SIGBUS. Isn't that if you truncate the file to a smaller size *after* it has been mapped. As far as I am aware, using ftruncate to set the length *before* it can be mapped for the first time is standard practice and harmless. >> Below is Blob class which seems to work. Note that the process which >> created the blob needs to wait for the other process to unpickle it >> before allowing it to be garbage collected. >> > > I would look at kernel refcounts before unlinking. (But I am not that > familiar with Linux.) 
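The create-then-size-then-map sequence shibturn describes can be shown in a few lines; a minimal sketch (assuming a platform where ftruncate() can extend a file, such as Linux — the temp-file name and sizes are invented for illustration):

```python
import mmap, os, tempfile

# The pattern described above: create the backing file, size it with
# ftruncate() *before* the first mmap() call, then map it.
fd, path = tempfile.mkstemp()
try:
    os.ftruncate(fd, 4096)       # set the length before mapping
    m = mmap.mmap(fd, 4096)      # first mapping happens after sizing
    m[:5] = b"hello"             # write through the mapping
    snapshot = bytes(m[:5])
    m.close()
finally:
    os.close(fd)
    os.unlink(path)
```

Truncating the file *smaller* after mapping is what risks SIGBUS on a later access; sizing it up front avoids that case entirely.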
Even if you have automatic refcounting like on Windows, you still need to cope with lifetime management issues. If you put an object on a queue it may be a long time before the target process will unpickle the object and increase its refcount, and you must not decref the object until it has, or else it will disappear. I don't know how to get the ref count for a file descriptor on Unix. (And posix shared memory does not seem to get a refcount either, even though System V shared memory does.) sbt From mwm at mired.org Mon Feb 13 01:36:20 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 12 Feb 2012 19:36:20 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> <4F3839DB.7080804@molden.no> <20120212174253.156c3660@bhuda.mired.org> Message-ID: <20120212193620.65adad41@bhuda.mired.org> On Mon, 13 Feb 2012 08:13:36 +0800 Matt Joiner wrote: > This attitude is exemplary of the status quo in Python on threads: Pretend > they don't exist or you'll get hurt. Yup. After all, the answer to the question "Which modules in the standard library are thread-safe?" is "threading, queue, logging and functools" (at least, that's my best guess). Any effort to "fix" threading in Python is pretty much doomed until the authoritative answer to that question includes most of the standard library.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From sturla at molden.no Mon Feb 13 01:41:48 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 13 Feb 2012 01:41:48 +0100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> <4F3839DB.7080804@molden.no> <20120212174253.156c3660@bhuda.mired.org> Message-ID: <4F385C4C.2060600@molden.no> On 13.02.2012 01:13, Matt Joiner wrote: > This attitude is exemplary of the status quo in Python on threads: Pretend > they don't exist or you'll get hurt. It's more the status quo on threads anywhere. Sturla From shibturn at gmail.com Mon Feb 13 01:42:41 2012 From: shibturn at gmail.com (shibturn) Date: Mon, 13 Feb 2012 00:42:41 +0000 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> <4F37CE27.5070908@molden.no> <4F38221A.5050208@molden.no> Message-ID: On 13/02/2012 12:31am, shibturn wrote: > Isn't that if you truncate the file to a smaller size *after* it has > been mapped. As far as I am aware, using ftruncate to set the length > *before* it can be mapped for the first time is standard practice and > harmless. Ah, on some Unixes ftruncate() limits the size of the file, but will not increase it. sbt From greg.ewing at canterbury.ac.nz Sun Feb 12 22:43:03 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 13 Feb 2012 10:43:03 +1300 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: <4F383267.2060600@canterbury.ac.nz> Paul Moore wrote: > I'd like to write code to do the same > job without needing to lie about what I know.
I'm now 100% convinced > that encoding="ascii",errors="surrogateescape" is the way to say this > in code. Perhaps there should be a more shortwinded way of spelling this? -- Greg From pyideas at rebertia.com Mon Feb 13 01:50:34 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 12 Feb 2012 16:50:34 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F383267.2060600@canterbury.ac.nz> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <4F383267.2060600@canterbury.ac.nz> Message-ID: On Sun, Feb 12, 2012 at 1:43 PM, Greg Ewing wrote: > Paul Moore wrote: >> I'd like to write code to do the same >> job without needing to lie about what I know. I'm now 100% convinced >> that encoding="ascii",errors="surrogateescape" is the way to say this >> in code. > > Perhaps there should be a more shortwinded way of > spelling this? See http://bugs.python.org/issue13997 , mentioned earlier in the thread. Cheers, Chris From mwm at mired.org Mon Feb 13 01:53:16 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 12 Feb 2012 19:53:16 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F385C4C.2060600@molden.no> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> <4F3839DB.7080804@molden.no> <20120212174253.156c3660@bhuda.mired.org> <4F385C4C.2060600@molden.no> Message-ID: <20120212195316.66eab1a5@bhuda.mired.org> On Mon, 13 Feb 2012 01:41:48 +0100 Sturla Molden wrote: > On 13.02.2012 01:13, Matt Joiner wrote: > > This attitude is exemplary of the status quo in Python on threads: Pretend > > they don't exist or you'll get hurt. > It's more the status quo on threads anywhere. Not (quite) true. There are a few fringe languages that have embraced threading and been built (or worked over) from the ground up to work well with it.
I haven't seen any that let you mix multiprocessing and threading safely, though, so the attitude there is "pretend fork doesn't exist or you'll get hurt." These are the places where I've seen safe (as in, I trusted them as much as I'd have trusted a version written using processes) non-trivial (as in, they were complex enough that if they'd been written in a mainstream language like Python, I wouldn't have trusted them) threaded applications. I strongly believe we need better concurrency solutions in Python. I'm not convinced that threading is the best general solution, because threading is like the GIL: a kludge that solves the problem by fixing *everything*, whether it needs it or not, and at very high cost. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ctb at msu.edu Mon Feb 13 03:35:12 2012 From: ctb at msu.edu (C. Titus Brown) Date: Sun, 12 Feb 2012 18:35:12 -0800 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: <4F383E1C.90905@molden.no> References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> Message-ID: <20120213023512.GE27683@idyll.org> On Sun, Feb 12, 2012 at 11:33:00PM +0100, Sturla Molden wrote: > On 12.02.2012 23:21, Mike Meyer wrote: >> Please take the concurrency discussion to: >> >> http://mail.python.org/mailman/listinfo/concurrency-sig > > It seems that list has nearly zero traffic. Why post to a list that > nobody reads? It's the right place to discuss these things: concurrency-sig: Discussion of concurrency issues in python. and presumably you won't be e-mailing as many people who *aren't* interested in concurrency. Python-ideas is rapidly becoming the *wrong* place for this discussion: Python-ideas: This list is to contain discussion of speculative language ideas for Python for possible inclusion into the language.
If an idea gains traction it can then be discussed and honed to the point of becoming a solid proposal to put to either python-dev or python-3000 as appropriate. So, whether or not it was the right place to begin with, could you please move it to concurrency-sig? thanks, --titus (moderator) -- C. Titus Brown, ctb at msu.edu From tjreedy at udel.edu Mon Feb 13 04:41:15 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 12 Feb 2012 22:41:15 -0500 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> Message-ID: On 2/12/2012 7:54 AM, Paul Moore wrote: > No. I know that a lot of Unix people advocate UTF-8, and I gather it's > rapidly becoming standard in the Unix world. But I work on Windows, Unicode and utf-8 is a standard for the world, not Unix. It surpassed us-ascii as the most used character encoding for the WWW about 4 years ago. https://en.wikipedia.org/wiki/ASCII XML is unicode based. I think it fair to say that UTF-8 (and UTF-16) are preferred encodings, as 'Encodings other than UTF-8 and UTF-16 will not necessarily be recognized by every XML parser' https://en.wikipedia.org/wiki/Xml#Encoding_detection OpenDocument is one of many xml-based formats. Any modern database program that intends to store arbitrary text must store unicode (or at least the BMP subset). So any text-oriented Windows program that gets input from the rest of the world has to handle unicode and at least the utf-8 encoding thereof. My impression is that Windows itself now uses unicode for text storage. It is a shame that it still somewhat hides that by using limited subset codepage facades. None of this minimizes the problem of dealing with text in the multiplicity of national and language encodings. But none of that is the fault of unicode, and unicode makes dealing with multiple encodings at the same time much easier.
It is too bad that unicode was only developed in the 1990s instead of the 1960s. -- Terry Jan Reedy From stephen at xemacs.org Mon Feb 13 04:55:37 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 12:55:37 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <50BA6538-76D0-4B1B-8C2A-6DBEB9B1B94B@gmail.com> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <87vcnevyd7.fsf@uwakimon.sk.tsukuba.ac.jp> <50BA6538-76D0-4B1B-8C2A-6DBEB9B1B94B@gmail.com> Message-ID: <87aa4nfkue.fsf@uwakimon.sk.tsukuba.ac.jp> Carl M. Johnson writes: > On Feb 10, 2012, at 5:32 PM, Stephen J. Turnbull wrote: > > > will founder on 'Óscar Fuentes' as author, unless you know what > > coding system is used, or know enough to use latin-1 (because > > it's effectively binary, not because it's the actual encoding). > > Or just use errors="surrogateescape". I think we should tell people > who are scared of unicode and refuse to learn how to use it to just > add an errors="surrogateescape" keyword to their file open > arguments. Obviously, it's the wrong thing to do, but it's wrong in > the same way that Python 2 bytes are wrong, so if you're absolutely > committed to remaining ignorant of encodings, you can continue to > do that. No, it's not the same as Python 2, and it's *subtly* the wrong thing to do, too. surrogateescape is intended to roundtrip on input from a specific API to unchanged output to that same API, and that's all it is guaranteed to do. Less pedantically, if you use latin-1, the internal representation is valid Unicode but (partially) incorrect content. No UnicodeErrors. If you use errors="surrogateescape", any code that insists on valid Unicode will crash. Here I'm talking about a use case where the user believes that as long as the ASCII content is correct they will get correct output. It's arguable that using errors="surrogateescape" is a better approach, *because* of the possibility of a validity check. I tend to think not.
But that's a different argument from "same as Python 2". From stephen at xemacs.org Mon Feb 13 05:03:29 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 13:03:29 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> Message-ID: <878vk7fkha.fsf@uwakimon.sk.tsukuba.ac.jp> Masklinn writes: > > Or just use the ISO-8859-1 encoding. > > It's true that it requires handling encodings upfront where Python 2 > allowed you to play fast-and-loose though. > > And using latin-1 in that context looks and feels weird/icky, the > file is not encoded using latin-1, the encoding just happens to > work to manipulate bytes as ascii text + non-ascii stuff. So give latin-1 an additional name. Emacsen use "raw-text" (there's also binary, but raw-text will do a loose equivalent of universal newlines for you, binary doesn't). You could also use a name more exact and less English-biased like "ascii-compatible-bytes". Same codec, name denotes different semantics. From stephen at xemacs.org Mon Feb 13 05:43:35 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 13:43:35 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <87vcnevyd7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <877gzrfimg.fsf@uwakimon.sk.tsukuba.ac.jp> Terry Reedy writes: > On 2/10/2012 10:32 PM, Stephen J. Turnbull wrote: > > The issue is whether Python 3 has a "strong imposition of Unicode > awareness" that Python 2 does not. If the OP only meant awareness of the > fact that something called 'unicode' exists, then I suppose that could > be argued. I interpreted the claim as being about some substantive > knowledge of unicode.
I interpreted the claim as being about changing their coding practice, including maintaining existing scripts and modules that deal with textual input that people may need/want to transition to Python 3. As Paul Moore pointed out, adding "encoding='latin-1'" to their scripts doesn't come naturally to everyone. I'm sure that at a higher level, that's the stance you intend to take, too. I think there's a disconnect between that high-level stance, and the interpretation that it's about "substantive knowledge of Unicode". > In any case, the claim that I disagree with is not about people's > reactions to Python 3 or about human psychology and the propensity > to stick with the known. OK. But then I think you are failing to deal with the problem, because I think *that* is the problem. Python 3 doesn't lack simple idioms for making (most naive, near-English) processing look like Python 2 to a greater or lesser extent. The question is which of those idioms we should teach, and AFAICS what's controversial about that depends on human psychology, not on the admitted facts about Python 3. > In response to Jim Jewett, you wrote > > The fact is that with a little bit of knowledge, you can almost > > certainly get more reliable (and in case of failure, more debuggable) > > results from Python 3 than from Python 2. > > That is pretty much my counterclaim, with the note that the 'little > bit of knowledge' is mostly about non-unicode encodings and the > change to some Python details. And my counterrebuttal is "true -- but that's not what these users want, and they probably don't need it." That is, they don't want to debug a crash when they don't care what happens to non-ASCII in their mostly-ASCII, nearly-readable-as-English byte streams.
I claim that the practical use case for these users is *not* 6-sigma-pure ASCII. You, too, will occasionally see Mr. Fuentes or even his Israeli sister-in-law show up in your "pure ASCII, or so I thought" texts. Better-than-Ivory-soap-pure *is* a "toy" case. Only in one's own sandbox can that be guaranteed. Otherwise, Python 3 needs to be instructed to prepare for (occasional) non-ASCII. > Exactly, and finding the Python 3 version of the magic spells > needed in various cases, so they can be documented and publicized, > is what I have been trying to do. For ascii-only use, the magic > spell is 'ascii' in bytes() calls. Except that AFAIK Python 3 already handles pure ASCII pretty much automatically. But pure ASCII doesn't exist for most people any more, even in Kansas; that magic spell will crash. 'latin-1' is a much better spell (except for people who want to crash in appropriate circumstances -- but AFAIK in the group whose needs this thread addresses, they are a tiny minority). > > I don't know of any nice way to say that. > > There was no need to say it. Maybe not, but I think there was. Some of your well-intended recommendations are unrealistic, and letting them pass would be a disservice to the users we are *both* trying to serve. From stephen at xemacs.org Mon Feb 13 05:50:04 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 13:50:04 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <8762fbfibn.fsf@uwakimon.sk.tsukuba.ac.jp> Masklinn writes: > Why not open the file in binary mode instead? (and replace `'*'` > by `b'*'` in the startswith call) This will often work, but it's task-dependent. In particular, I believe not just `.startswith()`, but general regexps work with either bytes or str in Python 3. But other APIs may not, and you're going to need to prefix *all* literals (including those in modules your code imports!) with `b`.
So you import a module that does exactly what you want, and be stymied by a TypeError because the module wants Unicode. This would not happen with Python 2, and there's the rub. From stephen at xemacs.org Mon Feb 13 06:04:40 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 14:04:40 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> Masklinn writes: > Except it's not processed as text, it's processed as "stuff with ascii > characters in it". Might just as well be cp-1252, or UTF-8, or Shift JIS > (which is kinda-sorta-extended-ascii but not exactly), and while using > an ISO-8859 will yield unicode data that's about the only thing you can > say about it and the actual result will probably be mojibake either > way. That's the coding pedant's way to look at it. However, people who speak only ASCII or Latin 1 are in general not going to see it that way. The ASCII speakers are a pretty clear-cut case. Using 'latin-1' as the codec, almost all things they can do with a 100% ASCII program and a sanely-encoded text (which leaves out Shift JIS, Big 5, and maybe some obsolete Vietnamese encodings, but not much else AFAIK) will pass through the non-ASCII verbatim, or delete it. Latin 1 speakers are harder, because they might do things like convert accented characters to their base, which would break multibyte characters in Asian languages. Still, one suspects that they mostly won't care terribly much about that (if they did, they'd be interested in using Unicode properly, and it would be worth investing the small amount of time required to learn a couple of recipes). 
> By processing it as bytes, it's made explicit that this is not > known and decoded text (which is what unicode strings imply) but > that it's some semi-arbitrary ascii-compatible encoding and that's > the extent of the developer's knowledge and interest in it. No, decoding with 'latin-1' is a far better approach for this purpose. If the name bothers you, give it an alias like 'semi-arbitrary-ascii-compatible'. The problem is that for many operations, b'*' and 'trailing text' are incompatible. Try concatenating them, or testing one against the other with .startswith(), or whatever. Such literals are buried in many modules, and you will lose if you're using bytes because those modules generally assume you're working with str. From stephen at xemacs.org Mon Feb 13 06:12:56 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 14:12:56 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> Message-ID: <8739affh9j.fsf@uwakimon.sk.tsukuba.ac.jp> Paul Moore writes: > I'm now 100% convinced that > encoding="ascii",errors="surrogateescape" is the way to say this in > code. It probably is, for you. If that ever gives you a UnicodeError, you know how to find out how to deal with it. And it probably won't. That may also be a good universal default for Python 3, as it will pass through non-ASCII text unchanged, while raising an error if the program tries to manipulate it (or hand it to a module that validates). (encoding='latin-1' definitely is not a good default.) But I'm not sure of that, and the current approach of using the preferred system encoding is probably better. I don't think either argument applies to everybody who needs such a recipe, though. Many will be best served with encoding='latin-1' by some name. 
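The "encoding='latin-1' by some name" idea can be implemented today with a small codecs search function; the alias name below is only the one floated in this thread, not an existing codec:

```python
import codecs

# Codec lookup lower-cases the requested name and may normalize
# hyphens to underscores, so accept both spellings of the alias.
ALIASES = {'ascii-compatible-bytes', 'ascii_compatible_bytes'}

def _search(name):
    if name in ALIASES:
        return codecs.lookup('latin-1')  # same codec, clearer intent
    return None  # defer to the other registered search functions

codecs.register(_search)

# Every byte value 0-255 now round-trips under the new name:
data = bytes(range(256))
text = data.decode('ascii-compatible-bytes')
assert text.encode('ascii-compatible-bytes') == data
```

Registration is process-global, so code reading `open(path, encoding='ascii-compatible-bytes')` then documents "bytes passed through untouched" without changing latin-1's behaviour at all.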
From ncoghlan at gmail.com Mon Feb 13 06:16:09 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Feb 2012 15:16:09 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <8762fbfibn.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <8762fbfibn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Feb 13, 2012 at 2:50 PM, Stephen J. Turnbull wrote: > Masklinn writes: > > > Why not open the file in binary mode instead? (and replace `'*'` > > by `b'*'` in the startswith call) > > This will often work, but it's task-dependent. In particular, I > believe not just `.startswith()`, but general regexps work with either > bytes or str in Python 3. But other APIs may not, and you're going to > need to prefix *all* literals (including those in modules your code > imports!) with `b`. So you import a module that does exactly what you > want, and be stymied by a TypeError because the module wants Unicode. > > This would not happen with Python 2, and there's the rub. The other trap is APIs like urllib.parse which explicitly refuse the temptation to guess when it comes to bytes data, and decode it as "ascii+strict". If you want it to do something else that's more permissive (e.g. "latin-1" or "ascii+surrogateescape") then you *have* to decode it to Unicode yourself before handing it over. Really, Python 3 forces programmers to learn enough about Unicode to be able to make the choice between the 4 possible options for processing ASCII-compatible encodings: 1. Process them as binary data. This is often *not* going to be what you want, since many text processing APIs will either only accept Unicode, or only pure ASCII, or require you to supply encoding+errors if you want them to process binary data. 2. Process them as "latin-1". This is the answer that completely bypasses all Unicode integrity checks. If you get fed non-ASCII data, you *will* silently produce gibberish as output. 3. Process them as "ascii+surrogateescape".
This is the *right* answer if you plan solely to manipulate the text and then write it back out in the same encoding as was originally received. You will get errors if you try to write a string with escaped characters out to a non-ascii channel or an ascii channel without surrogateescape enabled. To write such strings to non-ascii channels (e.g. sys.stdout), you need to remember to use something like "ascii+replace" to mask out the values with unknown encoding first. You may still get hard to debug UnicodeEncodeError exceptions when handed data in a non-ASCII compatible encoding (like UTF-16 or UTF-32), but your odds of silently corrupting data are fairly low. 4. Get a third party encoding guessing library and use that instead of waving away the problem of ASCII-incompatible encodings. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Mon Feb 13 06:24:54 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 14:24:54 +0900 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <9CBC5148-A454-4FE3-9F2E-18A7FCB27CE7@gmail.com> <4F374D80.7030309@pearwood.info> Message-ID: <871upzfgpl.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > Yeah, it didn't take long for me to come back around to that point of > view, so I morphed http://bugs.python.org/issue13997 into a docs bug > about clearly articulating the absolute bare minimum knowledge of > Unicode needed to process text in a robust cross-platform manner in > Python 3 instead. +1 I think (as I've said more verbosely elsewhere) that there are two common use cases, corresponding to two different definitions of "robust text processing". (1) Use cases where you would rather risk occasionally corrupting non-ASCII text than risk *any* UnicodeErrors at all *anywhere*. They use encoding='latin-1'.
(2) Use cases where you do not want to deal with encodings just to "pass through" non-ASCII text, but do want that text preserved enough to be willing to risk (rare) UnicodeErrors or validation errors from pedantic Unicode-oriented modules. They use encoding='ascii', errors='surrogateescape'. From stephen at xemacs.org Mon Feb 13 06:42:00 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 14:42:00 +0900 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> Message-ID: <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> Paul Moore writes: > > But you obviously do know the convention -- use UTF-8. > > No. I know that a lot of Unix people advocate UTF-8, and I gather it's > rapidly becoming standard in the Unix world. But I work on Windows, > and UTF-8 is not the standard there. I have no idea if UTF-8 is > accepted cross-platform, It is. All of Microsoft's programs (and I suppose most third-party software, too) that I know of will happily import UTF-8-encoded text, and produce it as well. Most Microsoft-specific file formats (eg, Word) use UTF-16 internally, but they can't be read by most text-oriented programs, so in practice they're app/octet-strm. The problem is the one you point out: files you receive from third parties are still fairly likely to be in a non-Unicode encoding. > Fair comment. My point here is that I *am* dealing with "legacy" data > in your sense. And I do so on a day to day basis. UTF-8 is very, very > rare in my world (Windows). Latin-1 (or something close) is common. > > There is no cross-platform standard yet. And probably won't be until > Windows moves to UTF-8 as the standard encoding. Which ain't happening > soon. True. But for personal use, and for communicating with people you have some influence over, you can use/recommend UTF-8 safely as far as I know.
I occasionally get asked by Japanese people why files I send in UTF-8 are broken; it invariably turns out that they sent me a file in Shift JIS that contained a non-JIS (!) character and my software translated it to REPLACEMENT CHARACTER before sending as UTF-8. > I think people are much more aware of the issues, but cross-platform > handling remains a hard problem. I don't wish to make assumptions, but > your insistence that UTF-8 is a viable solution suggests to me that > you don't know much about the handling of Unicode on Windows. I wish I > had that luxury... I don't understand what you mean by that. Windows doesn't make handling any non-Unicode encodings easy, in my experience, except for the local code page. So, OK, if you're in a monolingual Windows environment (eg, the typical Japanese office), everybody uses a common legacy encoding for file exchange (including URLs and MIME filename= :-(, in particular Shift JIS), and only that encoding works well (ie, without the assistance of senior tech support personnel). Handling Unicode, though, isn't really an issue; all of Microsoft's programs happily deal with UTF-8 and UTF-16 (in its several varieties). > And that's even without all this foreign UTF-8 I get from the Unix > guys :-) Apart from the blasted UTF-16, all of it's "ASCII most of > the time". Indeed. Do you really see UTF-16 in files that you process with Python? From stephen at xemacs.org Mon Feb 13 06:49:19 2012 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Mon, 13 Feb 2012 14:49:19 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F383267.2060600@canterbury.ac.nz> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <4F383267.2060600@canterbury.ac.nz> Message-ID: <87y5s7e10g.fsf@uwakimon.sk.tsukuba.ac.jp> Greg Ewing writes: > Paul Moore wrote: > > I'd like to write code to do the same > > job without needing to lie about what I know. I'm now 100% convinced > > that encoding="ascii",errors="surrogateescape" is the way to say this > > in code. > > Perhaps there should be a more shortwinded way of > spelling this? Yes! However, I don't think this 1.5-liner needs to be a built-in. (The 1.5-liner for 'open_as_ascii_compatible' was posted elsewhere.) There's also the issue of people who strongly prefer sloppy encoding and Read My Lips: No UnicodeErrors. I disagree with them in all purity, but you know .... From ncoghlan at gmail.com Mon Feb 13 06:54:24 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 Feb 2012 15:54:24 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Feb 13, 2012 at 3:04 PM, Stephen J. Turnbull wrote: > The ASCII speakers are a pretty clear-cut case. Using 'latin-1' as > the codec, almost all things they can do with a 100% ASCII program and > a sanely-encoded text (which leaves out Shift JIS, Big 5, and maybe > some obsolete Vietnamese encodings, but not much else AFAIK) will pass > through the non-ASCII verbatim, or delete it. I'd hazard a guess that the non-ASCII compatible encoding most likely to be encountered outside Asia is UTF-16.
The choice is really between "never give me UnicodeErrors, but feel free to silently corrupt the data stream if I do the wrong thing with that data" (i.e. "latin-1") and "correctly handle any ASCII compatible encoding, but still throw UnicodeEncodeError if I'm about to emit corrupted data" ("ascii+surrogateescape"). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mwm at mired.org Mon Feb 13 06:57:40 2012 From: mwm at mired.org (Mike Meyer) Date: Mon, 13 Feb 2012 00:57:40 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120213023640.GF27683@idyll.org> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> <4F3839DB.7080804@molden.no> <20120212174253.156c3660@bhuda.mired.org> <20120212193620.65adad41@bhuda.mired.org> <20120213023640.GF27683@idyll.org> Message-ID: <20120213005740.0db1cb38@bhuda.mired.org> On Sun, 12 Feb 2012 18:36:40 -0800 "C. Titus Brown" wrote: > "All of them except subprocess, on some platforms" is the answer, AFAIK. Which > is kind of the point. Do you have any documentation to back this up? For instance, the collections and random modules are both known to have code in them that isn't thread safe. For the random module, you can check the docstring: Help on method gauss in module random: gauss(self, mu, sigma) method of random.Random instance Gaussian distribution. mu is the mean, and sigma is the standard deviation. This is slightly faster than the normalvariate() function. Not thread-safe without a lock around calls. For the collections module, I quote the functools module: lock = Lock() # needed because ordereddicts aren't threadsafe The argparse and pprint modules both use ordereddicts without either locking them or providing an explanation as to why they don't need to, which makes both of them suspect as well.
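A minimal sketch of the "lock around calls" the gauss docstring asks for (the seed, thread count, and sample sizes here are arbitrary):

```python
import random
import threading

rng = random.Random(42)
lock = threading.Lock()
results = []

def sample(n):
    for _ in range(n):
        # gauss() caches a second variate between calls, so
        # unsynchronized threads can corrupt its internal state;
        # serialize every call with the lock.
        with lock:
            results.append(rng.gauss(0.0, 1.0))

threads = [threading.Thread(target=sample, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(results) == 4000
```

normalvariate() keeps no state between calls, which is why the docstring singles out gauss() rather than the generator as a whole.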
Given those cases, I'm not willing to trust a simple assertion that a module is thread-safe, unless it's from the author or a primary maintainer of the module, or someone who's actually audited the module in question for thread safety. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ctb at msu.edu Mon Feb 13 07:03:50 2012 From: ctb at msu.edu (C. Titus Brown) Date: Sun, 12 Feb 2012 22:03:50 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120213005740.0db1cb38@bhuda.mired.org> References: <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> <4F3839DB.7080804@molden.no> <20120212174253.156c3660@bhuda.mired.org> <20120212193620.65adad41@bhuda.mired.org> <20120213023640.GF27683@idyll.org> <20120213005740.0db1cb38@bhuda.mired.org> Message-ID: <20120213060350.GA11284@idyll.org> On Mon, Feb 13, 2012 at 12:57:40AM -0500, Mike Meyer wrote: > On Sun, 12 Feb 2012 18:36:40 -0800 > "C. Titus Brown" wrote: > > > "All of them except subprocess, on some platforms" is the answer, AFAIK. Which > > is kind of the point. > > Do you have any documentation to back this up? For instance, The > collections and random module are both known to have code in them that > isn't thread safe. For the random module, you can check the docstring: > > Help on method gauss in module random: > > gauss(self, mu, sigma) method of random.Random instance > Gaussian distribution. > > mu is the mean, and sigma is the standard deviation. This is > slightly faster than the normalvariate() function. > > Not thread-safe without a lock around calls. > > For the collections module, I quote the functools module: > > lock = Lock() # needed because ordereddicts aren't threadsafe > > The argparse and pprint modules both use ordereddicts without either > locking them providing an explanation as to why they don't need to, > which makes both of them suspect as well. 
> > Given those cases, I'm not willing to trust a simple assertion that a > module is thread-safe, unless it's from the author or a primary > maintainer of the module, or someone who's actually audited the module > in question for thread safety. Good points; I was equating thread safety with not crashing, when I should have been thinking about consistency in other ways. thanks, --titus p.s. Why did you take a private e-mail response and reply to it to the group? Bad netiquette & rather rude. (Private not because I object to being pointed out as being wrong, but because I'm tired of these long discussions being sent to python-ideas.) -- C. Titus Brown, ctb at msu.edu From stephen at xemacs.org Mon Feb 13 07:23:33 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 15:23:33 +0900 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <20120212043411.GA442@cskk.homeip.net> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <20120212043411.GA442@cskk.homeip.net> Message-ID: <87r4xzdzfe.fsf@uwakimon.sk.tsukuba.ac.jp> Cameron Simpson writes: > At least with Python 3 you find out early that you're doing something > dodgy. The point is that there is a use case for "doing something dodgy." See Paul Moore's subthread for an example and discussion. However, I think people who do something dodgy should be forced to make it explicit in their code. From stephen at xemacs.org Mon Feb 13 07:30:34 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 Feb 2012 15:30:34 +0900 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87pqdjdz3p.fsf@uwakimon.sk.tsukuba.ac.jp> Sorry for the self-reply, but this should be clarified. Stephen J. Turnbull writes: > know. 
I occasionally get asked by Japanese people why files I send in > UTF-8 are broken; it invariably turns out that they sent me a file in > Shift JIS that contained a non-JIS (!) character and my software > translated it to REPLACEMENT CHARACTER before sending as UTF-8. Ie, the breakage that you're likely to encounter in using UTF-8 wherever possible is *very* minor, and typically related to somebody else failing to conform to standards. From cs at zip.com.au Mon Feb 13 08:21:05 2012 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 13 Feb 2012 18:21:05 +1100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <87r4xzdzfe.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87r4xzdzfe.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20120213072105.GA8419@cskk.homeip.net> On 13Feb2012 15:23, Stephen J. Turnbull wrote: | Cameron Simpson writes: | > At least with Python 3 you find out early that you're doing something | > dodgy. | | The point is that there is a use case for "doing something dodgy." | See Paul Moore's subthread for an example and discussion. Yes. | However, I think people who do something dodgy should be forced to | make it explicit in their code. I think I agree here, too. -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ There are old climbers, and there are bold climbers; but there are no old bold climbers. From p.f.moore at gmail.com Mon Feb 13 09:12:43 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 13 Feb 2012 08:12:43 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <8739affh9j.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <8739affh9j.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 13 February 2012 05:12, Stephen J. Turnbull wrote: > Paul Moore writes: > > > I'm now 100% convinced that > > encoding="ascii",errors="surrogateescape" is the way to say this in > > code.
> It probably is, for you. If that ever gives you a UnicodeError, you > know how to find out how to deal with it. And it probably won't. And yet, after your earlier posting on latin-1, and your comments here, I'm less certain. Thank you so much :-) Seriously, I find these discussions about Unicode immensely useful. I now have a much better feel for how to deal with (and think about) text in "unknown but mostly ASCII" format, which can only be a good thing. > I don't think either argument applies to everybody who needs such a > recipe, though. Many will be best served with encoding='latin-1' by > some name. Probably the key question is, how do we encapsulate this debate in a simple form suitable for people to find out about *without* feeling like they "have to learn all about Unicode"? A note in the Unicode HOWTO seems worthwhile, but how to get people to look there? Given that this is people who don't want to delve too deeply into Unicode issues. Just to be clear, my reluctance to "do the right thing" was *not* because I didn't want to understand Unicode - far from it, I'm interested in, and inclined towards, "doing Unicode right". The problem is that I know enough to realise that "proper" handling of files where I don't know the encoding, and it seems to be inconsistent sometimes (both between files, and even on occasion within a file), is a seriously hard issue. And I don't want to get into really hard Unicode issues for what, in practical terms, is a simple problem as it's one-off code and minor corruption isn't really an issue. Paul.
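The "ascii + surrogateescape" recipe under discussion can be exercised in a few lines. A sketch with invented byte values:

```python
# Decode unknown-but-mostly-ASCII bytes without ever raising, while
# preserving the odd bytes for a lossless round trip.
data = b"caf\xe9 au lait"  # one non-ASCII byte of unknown encoding

text = data.decode("ascii", errors="surrogateescape")
# The stray byte becomes a lone surrogate instead of an error:
assert "\udce9" in text

# Encoding back with the same handler restores the bytes exactly:
assert text.encode("ascii", errors="surrogateescape") == data

# But a strict encoder still refuses to emit the smuggled byte, which
# is the UnicodeEncodeError safety net Nick describes above:
try:
    text.encode("utf-8")
except UnicodeEncodeError:
    print("refused")
```

This is exactly the trade-off in the thread: latin-1 would have decoded the same bytes silently and let them flow onward as ordinary (possibly wrong) characters.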
From ubershmekel at gmail.com Mon Feb 13 09:26:09 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Mon, 13 Feb 2012 10:26:09 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <8739affh9j.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Feb 13, 2012 10:13 AM, "Paul Moore" wrote: > > On 13 February 2012 05:12, Stephen J. Turnbull wrote: > > Paul Moore writes: > > > > > I'm now 100% convinced that > > > encoding="ascii",errors="surrogateescape" is the way to say this in > > > code. > > > > It probably is, for you. If that ever gives you a UnicodeError, you > > know how to find out how to deal with it. And it probably won't. > > And yet, after your earlier posting on latin-1, and your comments > here, I'm less certain. Thank you so much :-) > > Seriously, I find these discussions about Unicode immensely useful. I > now have a much better feel for how to deal with (and think about) > text in "unknown but mostly ASCII" format, which can only be a good > thing. > > > I don't think either argument applies to everybody who needs such a > > recipe, though. Many will be best served with encoding='latin-1' by > > some name. > > Probably the key question is, how do we encapsulate this debate in a > simple form suitable for people to find out about *without* feeling > like they "have to learn all about Unicode"? A note in the Unicode > HOWTO seems worthwhile, but how to get people to look there? Given > that this is people who don't want to delve too deeply into Unicode > issues. > > Just to be clear, my reluctance to "do the right thing" was *not* > because I didn't want to understand Unicode - far from it, I'm > interested in, and inclined towards, "doing Unicode right". 
The > problem is that I know enough to realise that "proper" handling of > files where I don't know the encoding, and it seems to be inconsistent > sometimes (both between files, and even on occasion within a file), is > a seriously hard issue. And I don't want to get into really hard > Unicode issues for what, in practical terms, is a simple problem as > it's one-off code and minor corruption isn't really an issue. > > Paul. Adding a url for help in the exception string that points to a python unicode faq sounds like a good idea. -------------- next part -------------- An HTML attachment was scrubbed... URL: From christopherreay at gmail.com Mon Feb 13 09:50:03 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 10:50:03 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <8739affh9j.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: +1 for the URL in the exception. Well in all exceptions Bringing the language into the 21st century. Great entry points for learning about the language. Whilst Google provides an excellent service in finding documentation, it seems that a programming language has other methods of defining entry points for learning, being a complex but (mostly) deterministic thing. So exceptions with URLs. The URLs point to a kind of "knowledge base wiki" sorts of things where the "What is your intent/usecase" can be matched up with the deterministic state we know the interpreter is in. With something like encodings, which can be happily ignored by someone until poof, suddenly they just have mush, finding out things like "It's possible printing the string to the screen is giving the error", and "There are libraries which guess encodings" and "latin-1" is a magic bullet can take many many days of searching.
Also it may be possible, from this perspective, to show ways that the developer can gather more deterministic information about his interpreter's state to narrow down his intent for the Knowledge Base (e.g. if it's a print statement that throws the error, it's possible the program doesn't have any encoding issues, except debugging statements). The encoding issue here is a great example of this because of the complexity and mobility of encodings (i.e. they've changed a lot). There must be other good examples which can fire up equally strong and informative discussion on "options" and their limitations and benefits. I'd be very interested in formalising the idea of a "KnowledgeBase Wiki thing", maybe there already is one... -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmjohnson.mailinglist at gmail.com Mon Feb 13 10:19:15 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sun, 12 Feb 2012 23:19:15 -1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <8739affh9j.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Feb 12, 2012, at 10:50 PM, Christopher Reay wrote: > +1 for the URL in the exception. Well in all exceptions > > Bringing the language into the 21st century. > Great entry points for learning about the language. That's not a bad idea. We might want to use some kind of URL shortener for length and future proofing though. If the site changes, we can have redirection of the short URLs updated.
Something like http://pyth.on/e1234 --> http://docs.python.org/library/exceptions.html From p.f.moore at gmail.com Mon Feb 13 12:14:42 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 13 Feb 2012 11:14:42 +0000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 13 February 2012 05:42, Stephen J. Turnbull wrote: > Paul Moore writes: > > > > But you obviously do know the convention -- use UTF-8. > > > > No. I know that a lot of Unix people advocate UTF-8, and I gather it's > > rapidly becoming standard in the Unix world. But I work on Windows, > > and UTF-8 is not the standard there. I have no idea if UTF-8 is > > accepted cross-platform, > > It is. All of Microsoft's programs (and I suppose most third-party > software, too) that I know of will happily import UTF-8-encoded text, > and produce it as well. Most Microsoft-specific file formats (eg, > Word) use UTF-16 internally, but they can't be read by most > text-oriented programs, so in practice they're app/octet-strm. If I create a new text file in Notepad or Vim on my PC, it's not created in UTF-8 by default. Vim uses Latin-1, and Notepad uses "ANSI" (which I'm pretty sure translates to CP1252, but there are so few differences between this and latin-1 that I can't easily test this at the moment). If I do "chcp" on a console window, I get codepage 850, and in CMD, echo a?b >file.txt encodes the file in CP850. echo a?b >file.txt in Powershell creates little-endian UTF-16 with a BOM. The out-file cmdlet in Powershell (which lets me specify an encoding to override the UTF-16 of the standard redirection) says this about the encoding parameter: -Encoding Specifies the type of character encoding used in the file.
Valid values are "Unicode", "UTF7", "UTF8", "UTF32", "ASCII", "BigEndianUnicode", "Default", and "OEM". "Unicode" is the default. "Default" uses the encoding of the system's current ANSI code page. "OEM" uses the current original equipment manufacturer code page identifier for the operating system. With this I can at least get UTF-8 (with BOM). But it's a long way from simple to do so... Basically, in my experience, Windows users are not likely to produce UTF-8 formatted files unless they make specific efforts to do so. I have heard anecdotal evidence that attempts to set the configuration on Windows to produce UTF-8 by default hit significant issues. So don't expect to see Windows users producing UTF-8 by default anytime soon. > The problem is the one you point out: files you receive from third > parties are still fairly likely to be in a non-Unicode encoding. And, if I don't concentrate, I produce non-UTF8 files myself. The good news is that Python 3 generally works fine with files I produce myself, as it follows the system encoding. >python Python 3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import locale >>> locale.getpreferredencoding() 'cp1252' Near enough, as the only character I tend to use is ?, and latin-1 and cp1252 concur on that (and I know what CP850 ? signs look like in latin-1/cp1252, so I can spot that particular error). Of course, that means that processing UTF-8 always needs me to explicitly set the encoding. Which in turn means that (if I care - back to the original point) I need to go checking for non-ASCII characters, do a quick hex dump to check they look like utf-8 and set the encoding. Or go with the default and risk mojibake (cp1252 is not latin-1 AIUI, so won't roundtrip bytes). Or go the "don't care" route.
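The "quick hex dump" check described here can be automated crudely: try strict UTF-8 first and fall back to the Windows default. A sketch only, with an invented function name; real charset detection (as done by libraries such as chardet) is considerably harder:

```python
def guess_decode(raw: bytes) -> str:
    """Try strict UTF-8, then the Windows default cp1252, then latin-1
    (which maps every byte value, so it never fails).  A crude
    heuristic, nothing like a real charset detector."""
    for encoding in ("utf-8", "cp1252"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1")

# UTF-8 input survives the strict first pass; cp1252 input fails it
# (a lone 0xEF byte is an incomplete UTF-8 sequence) and falls through.
print(guess_decode("na\u00efve".encode("utf-8")))
print(guess_decode("na\u00efve".encode("cp1252")))
```

This works because well-formed UTF-8 rarely occurs by accident: most cp1252 text with accented characters fails a strict UTF-8 decode, so order of trial matters.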
All of this simply because I feel that it's impolite to corrupt someone's name in my output just because they have an accented letter in their name :-) As I say: - I know what to do - It can be a lot of work - Frankly, the damage is minor (these are usually personal or low-risk scripts) - The temptation to say "stuff it" and get on with my life is high - It frustrates me that Python by default tempts me to *not* do the right thing Maybe the answer is to have some form of encoding-detection function in the standard library. It doesn't have to be 100% accurate, and it certainly shouldn't be used anywhere by default, but it would be available for people who want to do the right thing without over-engineering things totally. > True. But for personal use, and for communicating with people you > have some influence over, you can use/recommend UTF-8 safely as far I > know. I occasionally get asked by Japanese people why files I send in > UTF-8 are broken; it invariably turns out that they sent me a file in > Shift JIS that contained a non-JIS (!) character and my software > translated it to REPLACEMENT CHARACTER before sending as UTF-8. Maybe it's different in Japan, where character sets are more of a common knowledge issue? But if I tried to say to one of my colleagues that the spooled output of a SQL query they sent me (from a database with one encoding, through a client with no real encoding handling beyond global OS-level defaults) didn't use UTF-8, I'd get a blank look at best. I've had to debug encoding issues for database programmers only to find that they don't even know what encodings are about - and they are writing multilingual applications! (Before someone says, yes, of course this is terrible, and shouldn't happen - but it does, and these are the places I get weirdly-encoded text files from...) > > I think people are much more aware of the issues, but cross-platform > > handling remains a hard problem. I don't wish to make assumptions, but
> your insistence that UTF-8 is a viable solution suggests to me that > > you don't know much about the handling of Unicode on Windows. I wish I > > had that luxury... > > I don't understand what you mean by that. Windows doesn't make > handling any non-Unicode encodings easy, in my experience, except for > the local code page. So, OK, if you're in a monolingual Windows > environment (eg, the typical Japanese office), everybody uses a common > legacy encoding for file exchange (including URLs and MIME filename= > :-(, in particular Shift JIS), and only that encoding works well (ie, > without the assistance of senior tech support personnel). Handling > Unicode, though, isn't really an issue; all of Microsoft's programs > happily deal with UTF-8 and UTF-16 (in its several varieties). What I was trying to say was that typical Windows environments (where people don't interact often with Unix utilities, or if they do it's with ASCII characters almost exclusively) hide the details of Unicode from the end user to the extent that they don't know what's going on under the hood, and don't need to care. Much like Python 2, I guess :-) > Indeed. Do you really see UTF-16 in files that you process with > Python? Powershell generates it. See above. But no, not often, and it's easy to fix. Meh, for easy read cmd /c "iconv -f utf-16 -t utf-8 u1 >u2" or set-content u2 (get-content u1) -encoding utf8 if I don't mind a BOM. No, Unicode on Windows isn't easy :-( Paul From christopherreay at gmail.com Mon Feb 13 12:41:57 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 13:41:57 +0200 Subject: [Python-ideas] The concurrency discussion is off-topic!
In-Reply-To: <20120213023512.GE27683@idyll.org> References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> Message-ID: Its not like there is a huge amount of traffic in python-ideas But if its annoying people ill sign up to concurrency sig How many mailing lists do I need to sign up to to make sure Im not missing something I might be intersted in. Email "Subject" fields, I find, are quite useful -------------- next part -------------- An HTML attachment was scrubbed... URL: From christopherreay at gmail.com Mon Feb 13 12:42:53 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 13:42:53 +0200 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> Message-ID: Also, coming up with ideas for new ways of doing things should include significant discussions of what is already there On 13 February 2012 13:41, Christopher Reay wrote: > Its not like there is a huge amount of traffic in python-ideas > > But if its annoying people ill sign up to concurrency sig > > How many mailing lists do I need to sign up to to make sure Im not missing > something I might be intersted in. > > Email "Subject" fields, I find, are quite useful > -- Be prepared to have your predictions come true -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Feb 13 13:10:36 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 13 Feb 2012 13:10:36 +0100 Subject: [Python-ideas] The concurrency discussion is off-topic! References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> Message-ID: <20120213131036.40aed9c2@pitrou.net> On Mon, 13 Feb 2012 13:41:57 +0200 Christopher Reay wrote: > Its not like there is a huge amount of traffic in python-ideas Are you kidding? 
The idiotic "TIOBE -3%" discussion thread is probably in the hundred of answers now. That's for a completely vacuous thread launched 4 days ago by a well-known troll. python-ideas is not a playground for people with an opinion. It's a communication tool for core development. Thanks Antoine. From christopherreay at gmail.com Mon Feb 13 13:21:17 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 14:21:17 +0200 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: <20120213131036.40aed9c2@pitrou.net> References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> <20120213131036.40aed9c2@pitrou.net> Message-ID: Hmm. Stimulating people to express what they believe the major hurdles to uptake of use of the language are, discuss their solutions. 100 mails in 4 days is light traffic afaik. But im fairly new to this, so ill bow out. Still, I thought it was itneresting and informative discussion with lots of specific information. Where would you suggest discussion should have taken place? -- Be prepared to have your predictions come true -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Feb 13 13:26:28 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 13 Feb 2012 13:26:28 +0100 Subject: [Python-ideas] The concurrency discussion is off-topic! References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> <20120213131036.40aed9c2@pitrou.net> Message-ID: <20120213132628.6658bfa7@pitrou.net> Hello, > Still, I thought it was itneresting and informative discussion with > lots of specific information. Not really. The subjects discussed there, e.g. the GIL and multithreading, have already been rehashed countless times. Perhaps you may find them interesting if you haven't really followed the mailing-lists in the past. 
> Where would you suggest discussion should have taken place? Well, in that case, it should really have been /dev/null :( Regards Antoine. From christopherreay at gmail.com Mon Feb 13 13:44:58 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 14:44:58 +0200 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: <20120213132628.6658bfa7@pitrou.net> References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> <20120213131036.40aed9c2@pitrou.net> <20120213132628.6658bfa7@pitrou.net> Message-ID: lol why, then, are people with experience on the lists using this as a space to express themselves? -------------- next part -------------- An HTML attachment was scrubbed... URL: From breamoreboy at yahoo.co.uk Mon Feb 13 13:50:24 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Mon, 13 Feb 2012 12:50:24 +0000 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> Message-ID: On 13/02/2012 11:41, Christopher Reay wrote: > Its not like there is a huge amount of traffic in python-ideas > > But if its annoying people ill sign up to concurrency sig > > How many mailing lists do I need to sign up to to make sure Im not missing > something I might be intersted in. There are 326 listed under gmane.comp.python, please enjoy them all :) > > Email "Subject" fields, I find, are quite useful > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Cheers. Mark Lawrence. From solipsis at pitrou.net Mon Feb 13 14:01:16 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 13 Feb 2012 14:01:16 +0100 Subject: [Python-ideas] The concurrency discussion is off-topic! 
References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> <20120213131036.40aed9c2@pitrou.net> <20120213132628.6658bfa7@pitrou.net> Message-ID: <20120213140116.46469912@pitrou.net> On Mon, 13 Feb 2012 14:44:58 +0200 Christopher Reay wrote: > lol > > why, then, are people with experience on the lists using this as a space to > express themselves? Ask them. From christopherreay at gmail.com Mon Feb 13 14:07:59 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 15:07:59 +0200 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: <20120213140116.46469912@pitrou.net> References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> <20120213131036.40aed9c2@pitrou.net> <20120213132628.6658bfa7@pitrou.net> <20120213140116.46469912@pitrou.net> Message-ID: Hey, Sturla, Mike and other people who clearly know what they are talking about and this ecosystem, why are we using this space for this discussion? Felt all kind of warm and fuzzy and community like to me. On 13 February 2012 15:01, Antoine Pitrou wrote: > On Mon, 13 Feb 2012 14:44:58 +0200 > Christopher Reay > wrote: > > lol > > > > why, then, are people with experience on the lists using this as a space > to > > express themselves? > > Ask them. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Be prepared to have your predictions come true -------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Mon Feb 13 14:10:28 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Mon, 13 Feb 2012 21:10:28 +0800 Subject: [Python-ideas] The concurrency discussion is off-topic! 
In-Reply-To: References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> Message-ID: Clearly more are required. On Feb 13, 2012 8:50 PM, "Mark Lawrence" wrote: > On 13/02/2012 11:41, Christopher Reay wrote: > >> Its not like there is a huge amount of traffic in python-ideas >> >> But if its annoying people ill sign up to concurrency sig >> >> How many mailing lists do I need to sign up to to make sure Im not missing >> something I might be intersted in. >> > > There are 326 listed under gmane.comp.python, please enjoy them all :) > > >> Email "Subject" fields, I find, are quite useful >> >> ______________________________**_________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/**mailman/listinfo/python-ideas >> > > -- > Cheers. > > Mark Lawrence. > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jnoller at gmail.com Mon Feb 13 14:36:58 2012 From: jnoller at gmail.com (Jesse Noller) Date: Mon, 13 Feb 2012 08:36:58 -0500 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> Message-ID: On Sunday, February 12, 2012 at 7:06 PM, Matt Joiner wrote: > +1, that list is dead > On Feb 13, 2012 6:33 AM, "Sturla Molden" wrote: > > Den 12.02.2012 23:21, skrev Mike Meyer: > > > Please take the concurrency discussion to: > > > > > > http://mail.python.org/mailman/listinfo/concurrency-sig > > > > It seems that list has nearly zero traffic. Why post to a list that nobody reads? > > > > Sturla That list is dead because no one posts to it with good ideas and brainstorming, or to discuss issues like this, not because it's dead. 
Do I need to put up a signpost that says "Oh god please come discuss patches and the like"? From christopherreay at gmail.com Mon Feb 13 14:50:34 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 15:50:34 +0200 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> Message-ID: On 13 February 2012 15:36, Jesse Noller wrote: > > > On Sunday, February 12, 2012 at 7:06 PM, Matt Joiner wrote: > > > +1, that list is dead > > On Feb 13, 2012 6:33 AM, "Sturla Molden" sturla at molden.no)> wrote: > > > Den 12.02.2012 23:21, skrev Mike Meyer: > > > > Please take the concurrency discussion to: > > > > > > > > http://mail.python.org/mailman/listinfo/concurrency-sig > > > > > > It seems that list has nearly zero traffic. Why post to a list that > nobody reads? > > > > > > Sturla > That list is dead because no one posts to it with good ideas and > brainstorming, or to discuss issues like this, not because it's dead. Do I > need to put up a signpost that says "Oh god please come discuss patches and > the like"? > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Be prepared to have your predictions come true -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Feb 13 15:02:11 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 14 Feb 2012 01:02:11 +1100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4F3917E3.10004@pearwood.info> Paul Moore wrote: > Maybe the answer is to have some form of encoding-detection function > in the standard library. 
It doesn't have to be 100% accurate, and it > certainly shouldn't be used anywhere by default, but it would be > available for people who want to do the right thing without > over-engineering things totally. Encoding guessers have their place, but they should only be used by those who know what they're getting themselves into. http://blogs.msdn.com/b/oldnewthing/archive/2004/03/24/95235.aspx http://blogs.msdn.com/b/oldnewthing/archive/2007/04/17/2158334.aspx Note that even Raymond Chen makes the classic error of conflating encodings (UTF-16) with Unicode. +0 on providing an encoding guesser, but -1 on making operate by default. -- Steven From techtonik at gmail.com Mon Feb 13 16:15:29 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 13 Feb 2012 17:15:29 +0200 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: <20120213131036.40aed9c2@pitrou.net> References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> <20120213131036.40aed9c2@pitrou.net> Message-ID: On Mon, Feb 13, 2012 at 3:10 PM, Antoine Pitrou wrote: > On Mon, 13 Feb 2012 13:41:57 +0200 > Christopher Reay > wrote: > > Its not like there is a huge amount of traffic in python-ideas > > Are you kidding? > > The idiotic "TIOBE -3%" discussion thread is probably in the hundred of > answers now. That's for a completely vacuous thread launched 4 days ago > by a well-known troll. > Well-known troll is +1 that proposal to write to an empty list sounds like "you're not welcome here with your multiprocessing". I guess you didn't mention that, so the problem probably that you can not handle the list traffic, and read all interesting Python ideas - not speaking about answering to all of them with your opinion. No problem with that either - nobody can. That's why there was proposal about Etherpad, which summaries would be as interesting for Python community as different blog posts from core devs. 
python-ideas is not a playground for people with an opinion. > It's a communication tool for core development. > Since many of (potential) core devs are not able to cope up with traffic in main lists, I'd propose to look for a better communication tool that at least allows easy selective subscription (like Google Groups) and makes sure interested parties have accessible instrument (for a reference to accessibility read Steve Yegge's rant at https://plus.google.com/112678702228711889851/posts/eVeouesvaVX) to subscribe and participate (search, tree with all mailing lists and one-button subscription). I've heard Pinax guys are rethinking their Tribes/Groups feature - https://groups.google.com/forum/?fromgroups#!topic/pinax-users/Wze7L2LlwjM- perhaps you should communicate with them. > Thanks P.S. I am not changing the subject of this thread to stop spawning another "completely vacuous thread". P.P.S. Too bad I can not be at PyCon this year. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ctb at msu.edu Mon Feb 13 16:18:54 2012 From: ctb at msu.edu (C. Titus Brown) Date: Mon, 13 Feb 2012 07:18:54 -0800 Subject: [Python-ideas] The concurrency discussion is off-topic! In-Reply-To: References: <20120212172105.2c98f820@bhuda.mired.org> <4F383E1C.90905@molden.no> <20120213023512.GE27683@idyll.org> <20120213131036.40aed9c2@pitrou.net> Message-ID: <20120213151854.GD15826@idyll.org> Please move any further discussion of both concurrency and how or where to discuss concurrency issues to concurrency-sig. Further discussion will be moderated. thanks all, --titus On Mon, Feb 13, 2012 at 05:15:29PM +0200, anatoly techtonik wrote: > On Mon, Feb 13, 2012 at 3:10 PM, Antoine Pitrou wrote: > > > On Mon, 13 Feb 2012 13:41:57 +0200 > > Christopher Reay > > wrote: > > > Its not like there is a huge amount of traffic in python-ideas > > > > Are you kidding? 
> > > > The idiotic "TIOBE -3%" discussion thread is probably in the hundred of > > answers now. That's for a completely vacuous thread launched 4 days ago > > by a well-known troll. > > > > Well-known troll is +1 that proposal to write to an empty list sounds like > "you're not welcome here with your multiprocessing". I guess you didn't > mention that, so the problem probably that you can not handle the list > traffic, and read all interesting Python ideas - not speaking about > answering to all of them with your opinion. No problem with that either - > nobody can. That's why there was proposal about Etherpad, which summaries > would be as interesting for Python community as different blog posts from > core devs. > > python-ideas is not a playground for people with an opinion. > > It's a communication tool for core development. > > > > Since many of (potential) core devs are not able to cope up with traffic in > main lists, I'd propose to look for a better communication tool that at > least allows easy selective subscription (like Google Groups) and makes > sure interested parties have accessible instrument (for a reference to > accessibility read Steve Yegge's rant at > https://plus.google.com/112678702228711889851/posts/eVeouesvaVX) to > subscribe and participate (search, tree with all mailing lists and > one-button subscription). I've heard Pinax guys are rethinking their > Tribes/Groups feature - > https://groups.google.com/forum/?fromgroups#!topic/pinax-users/Wze7L2LlwjM- > perhaps you should communicate with them. > > > > Thanks > > > P.S. I am not changing the subject of this thread to stop spawning another > "completely vacuous thread". > P.P.S. Too bad I can not be at PyCon this year. > -- > anatoly t. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- C. 
Titus Brown, ctb at msu.edu From mwm at mired.org Mon Feb 13 17:15:43 2012 From: mwm at mired.org (Mike Meyer) Date: Mon, 13 Feb 2012 11:15:43 -0500 Subject: [Python-ideas] multiprocessing IPC In-Reply-To: <4F38221A.5050208@molden.no> References: <4F35A9EF.7030309@molden.no> <20120211205200.2667c68f@bhuda.mired.org> <4F3735F8.10607@molden.no> <4F37CE27.5070908@molden.no> <4F38221A.5050208@molden.no> Message-ID: On Sun, Feb 12, 2012 at 3:33 PM, Sturla Molden wrote: > Den 12.02.2012 16:20, skrev shibturn: >> But if his /tmp is a tmpfs file system (which it usually is on Linux) then >> I think it is entirely equivalent. ?Or he could create the file in /dev/shm >> instead. > It seems that on Linux /tmp is backed by shared memory. > Which sounds rather strange to a Windows user, as the raison d'etre for > tempfiles is temporary storage space that goes beyond physial RAM. That's what /tmp was created for on Unix as well. But we've since added virtual memory for that same purpose. Modern kernel virtual address spaces are bigger than disks, and the IO and VM subsystem buffer caches have similar performance, and may even share buffers. So the major difference between memory-backed and fs-backed /tmp is that an fs-backed one survives a reboot, which creates security issues on multiuser systems. In theory, you could create a file on a memory-backed /tmp that's bigger than any data structure your process can hold. But modern software tends to use /tmp for things that need to be shared between processes (unix-domain sockets, lock files, etc), and legacy software is usually quite happy with a few tens of megabytes on /tmp. So it's rather common for a systems per-process virtual address limit to be bigger than /tmp. References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <8739affh9j.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Feb 13, 2012 at 11:19 AM, Carl M. 
Johnson < cmjohnson.mailinglist at gmail.com> wrote: > > On Feb 12, 2012, at 10:50 PM, Christopher Reay wrote: > > > +1 for the URL in the exception. Well in all exceptions > > > > Bringing the language into the 21st century. > > Great entry points for learning about the language. > > That's not a bad idea. We might want to use some kind of URL shortener for > length and future proofing though. If the site changes, we can have > redirection of the short URLs updated. Something like http://pyth.on/e1234 --> > http://docs.python.org/library/exceptions.html > > I think we can use wiki.python.org/ for hosting exception specific content. E.g. http://wiki.python.org/moin/PrintFails needs a lot of love and care. Microsoft actually has documentation for every single compiler and linker error that ever existed. Not that we have the same amount of resources at our disposal, but it is a nice concept. Concerning the shortened url's - I'd go with trustworthiness over compactness - http://python.org/s1 or http://python.org/s/1 would be better than http://pyth.on/1 imo. Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From christopherreay at gmail.com Mon Feb 13 19:08:18 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Mon, 13 Feb 2012 20:08:18 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <8739affh9j.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Entry Points: Google: Natural Language user searches based on "intent of code" Module Name/Function names: user wants more details on something he already knows exists Exception Name: Great, finds you the exception definition just like any other Class name. 
Googling for "UnicodeEncodingError Python" gives me a link to the 2.7 documentation, which says at the top "this is not yet updated for python 3" - I don't know how important this is. Googling for "UnicodeEncodingError Python 3" gives http://docs.python.org/release/3.0.1/howto/unicode.html This is a great document. It explains encoding very well. The unicode tutorial doesn't mention anything about the terminal output encoding to STDOUT, and whilst this is obvious after a while, it is not always clear that printing to the terminal is the cause of the attempt to encode as ascii during a print statement. To some extent, the unicode tutorial doesn't have the practical specifics that are being discussed in this thread, which is targeted at the "learning curve into Python". I think the most important points here are: The exception knows what version of Python it's from (which allows the language to make changes). It would be nice to have a wiki-type document targeted by the exception/error, with sections like: - "Python Official Docs" - Murgh, Fix This NOW, Don't care how dirty - Contributed Docs we have none and loved/stack overflow etc... - Discussions from python-dev / python-ideas - PEPs that apply. The point is that Google can't be responsible for making sure all these sections are laid out, obviously correct, or constant
> > And that's even without all this foreign UTF-8 I get from the Unix > > guys :-) Apart from the blasted UTF-16, all of it's "ASCII most of > > the time". > > Indeed. Do you really see UTF-16 in files that you process with > Python? I've only had one real use-case (and it was Java, but could easily be Python). We wanted to be able to export settings as a CSV file to be opened in Excel, modified and then re-imported. Turns out that if you want to open non-ascii CSV files in Excel, they must be encoded as (IIRC) UTF-16LE (i.e. without a BOM). I think you can save as other encodings, but that's the only one you can reliably open. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Tue Feb 14 01:39:40 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 13 Feb 2012 19:39:40 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> References: <4F34E393.9020105@hotpy.org> <6774A8D7-548B-4651-8879-1158621157E5@gmail.com> Message-ID: On Fri, Feb 10, 2012 at 9:52 AM, Massimo Di Pierro wrote: > The GC vs reference counting (RC) is the heart of the matter. > With RC every time a variable is allocated or deallocated you need > to lock the counter uh... if you need to lock it for allocation, that is an issue with the malloc, rather than refcounting. And if you need to lock it for deallocation, then your program already has a (possibly threading-race-condition-related) bug. The problem is that you need to lock the memory for writing every time you acquire or release a view of the object, even if you won't be modifying the object. (And this changing of the refcount makes copy-on-write copy too much.)
There are plenty of ways around that, mostly by using thread-local (or process-local or machine-local) proxies; the original object only gets one incref/decref from each remote thread; if sharable objects are delegated to a memory-controller thread, even better. Once you have the infrastructure for this, you could also more easily support "permanent" objects like None. The catch is that the overhead of having the refcount+pointer (even without the proxies) instead of just "refcount 4 bytes ahead" turns out to be pretty high, so those forks (and extensions, if I remember pyro http://irmen.home.xs4all.nl/pyro/ correctly) never really caught on. Maybe that will change when the number of cores that aren't already in use for other processes really does skyrocket. -jJ From ethan at stoneleaf.us Tue Feb 14 03:17:38 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 13 Feb 2012 18:17:38 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120213060350.GA11284@idyll.org> References: <4F3563C4.2050703@egenix.com> <4F3575FB.60700@molden.no> <4F3839DB.7080804@molden.no> <20120212174253.156c3660@bhuda.mired.org> <20120212193620.65adad41@bhuda.mired.org> <20120213023640.GF27683@idyll.org> <20120213005740.0db1cb38@bhuda.mired.org> <20120213060350.GA11284@idyll.org> Message-ID: <4F39C442.5030206@stoneleaf.us> C. Titus Brown wrote: >> p.s. Why did you take a private e-mail response and reply to it to the group? Perhaps he thought it was private by mistake. I'm glad he did, though, as I am learning quite a bit from this rather rambling thread. ~Ethan~ From stephen at xemacs.org Tue Feb 14 09:02:16 2012 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Tue, 14 Feb 2012 17:02:16 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > I'd hazard a guess that the non-ASCII compatible encoding mostly > likely to be encountered outside Asia is UTF-16. In other words, only people who insist on messing with application/octet-stream files (like Word ;-). They don't deserve the pain, but they're gonna feel it anyway. > The choice is really between "never give me UnicodeErrors, but feel > free to silently corrupt the data stream if I do the wrong thing > with that data" (i.e. "latin-1") Yes. > and "correctly handle any ASCII compatible encoding, but still > throw UnicodeEncodeError if I'm about to emit corrupted data" > ("ascii+surrogateescape"). Not if I understand what ascii+surrogateescape would do correctly. Yes, you can pass through verbatim, but AFAICS you would have to work quite hard to do anything to that stream that would cause a UnicodeError in your program, even though you corrupt it. (Eg, delete half of a multibyte EUC character.) The question is what happens if you run into a validating processor internally -- then you'll see an error (even though you're just passing it through verbatim!) 
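[Editorial aside, not part of the original message: the pass-through-then-error behaviour described above can be sketched in a few lines.]

```python
# Bytes in some unknown ASCII-compatible encoding (0xE9 is not ASCII).
raw = b"label: caf\xe9\n"

# Decoding with surrogateescape never raises; the stray byte is smuggled
# through as the lone surrogate U+DCE9.
text = raw.decode("ascii", errors="surrogateescape")
assert text == "label: caf\udce9\n"

# ASCII-only edits leave the escaped byte alone, and re-encoding with the
# same error handler reproduces the original byte exactly.
edited = text.replace("label", "name")
assert edited.encode("ascii", errors="surrogateescape") == b"name: caf\xe9\n"

# But a validating step -- e.g. a strict UTF-8 encode, as sys.stdout
# would perform -- refuses the surrogate and raises UnicodeEncodeError.
try:
    edited.encode("utf-8")
except UnicodeEncodeError as exc:
    print("strict encode failed:", exc.reason)
```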
From ncoghlan at gmail.com Tue Feb 14 09:45:24 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 14 Feb 2012 18:45:24 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Feb 14, 2012 at 6:02 PM, Stephen J. Turnbull wrote: > ?> and "correctly handle any ASCII compatible encoding, but still > ?> throw UnicodeEncodeError if I'm about to emit corrupted data" > ?> ("ascii+surrogateescape"). > > Not if I understand what ascii+surrogateescape would do correctly. > Yes, you can pass through verbatim, but AFAICS you would have to work > quite hard to do anything to that stream that would cause a > UnicodeError in your program, even though you corrupt it. ?(Eg, delete > half of a multibyte EUC character.) > > The question is what happens if you run into a validating processor > internally -- then you'll see an error (even though you're just > passing it through verbatim!) If you're only round-tripping (i.e. writing back out as "ascii+surrogateescape") it's very hard to corrupt your data stream with processing that assumes an ASCII compatible encoding (as you point out, you'd have to be splitting on arbitrary codepoints instead of searching for ASCII first). However, it's trivial to get an error when you go to encode the data stream without one of the silencing error handlers set. In particular, sys.stdout has error handling set to strict, which I believe is likely to throw UnicodeEncodeError if you try to feed a string containing surrogate escaped bytes to an encoding that can't handle them. (Of course, if sys.stdout.encoding is "UTF-8", then you're right, those characters will just be displayed as gibberish, as they would in the latin-1 case. 
I guess its only on Windows and in any other locations with a more restrictive default stdout encoding that errors are particularly likely). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From stephen at xemacs.org Tue Feb 14 10:36:54 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 14 Feb 2012 18:36:54 +0900 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Paul Moore writes: > Basically, In my experience, Windows users are not likely to produce > UTF-8 formatted files unless they make specific efforts to do so. Agreed. All I meant was that if you make the effort to do so, your Windows-based correspondents will be able to read it, and vice versa. > As I say: > - I know what to do > - It can be a lot of work > - Frankly, the damage is minor (these are usually personal or low-risk scripts) > - The temptation to say "stuff it" and get on with my life is high > - It frustrates me that Python by default tempts me to *not* do the right thing Please don't blame it on Python. Python tempts you because it offers the choice to do it right. There is no way that Python can do it right *for* you, not even all the resources Microsoft or Apple can bring to bear have managed to do it right (you can't get 100% even within an all-Windows or all-Mac shop, let alone cross-platform). Not yet; it requires your help. Thanks for caring! > Maybe it's different in Japan, where character sets are more of a > common knowledge issue? Mojibake is common knowledge in Japan; what to do about it requires a specialized technical background. 
> But if I tried to say to one of my colleagues that the spooled > output of a SQL query they sent me (from a database with one > encoding, through a client with no real encoding handling beyond > global OS-level defaults) didn't use UTF-8, I'd get a blank look at > best. Again, this is not the direction I have in mind (I'm thinking more in terms of the RightThinkingAmongUs using UTF-8 as much as possible, and whether the recipients will be able to read it -- AFAICT/IME they can), and you certainly shouldn't presume that your correspondents "should" "already" be using UTF-8. That would be seriously rude on Windows, where as you point out one has to do something rather contorted to produce UTF-8 in most applications. > What I was trying to say was that typical Windows environments (where > people don't interact often with Unix utilities, or if they do it's > with ASCII characters almost exclusively) hide the details of Unicode > from the end user to the extent that they don't know what's going on > under the hood, and don't need to care. Ah. If you're in a monolingual environment, yes, it works that way. But it works just well on Unix if you set LANG appropriately in your environment. > Much like Python 2, I guess :-) No, Python 2 is better and worse. Many protocols use magic numbers that look like ASCII-encoded English (eg, HTML tags). Python 2 is quite happy to process those magic numbers and the intervening content (as long as each stretch of non-ASCII is treated as an atomic unit), regardless of whether actual encoding matches local convention. (This is why the WSGI guys love Python 2 -- it can be multilingual without knowing the encoding!) On the other hand, the Windows environment will be more seamless (and allow useful processing of the "intervening content") as long as you stick to the local convention for encoding. From cmjohnson.mailinglist at gmail.com Tue Feb 14 12:39:42 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. 
Johnson) Date: Tue, 14 Feb 2012 01:39:42 -1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> On Feb 13, 2012, at 10:45 PM, Nick Coghlan wrote: > (Of course, if sys.stdout.encoding is "UTF-8", then you're right, those > characters will just be displayed as gibberish, as they would in the > latin-1 case. I guess its only on Windows and in any other locations > with a more restrictive default stdout encoding that errors are > particularly likely). I don't think that's right. I think that by default Python refuses to turn surrogate characters into UTF-8: >>> bytes(range(256)).decode("ascii", errors="surrogateescape") '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\udc80\udc81\udc82\udc83\udc84\udc85\udc86\udc87\udc88\udc89\udc8a\udc8b\udc8c\udc8d\udc8e\udc8f\udc90\udc91\udc92\udc93\udc94\udc95\udc96\udc97\udc98\udc99\udc9a\udc9b\udc9c\udc9d\udc9e\udc9f\udca0\udca1\udca2\udca3\udca4\udca5\udca6\udca7\udca8\udca9\udcaa\udcab\udcac\udcad\udcae\udcaf\udcb0\udcb1\udcb2\udcb3\udcb4\udcb5\udcb6\udcb7\udcb8\udcb9\udcba\udcbb\udcbc\udcbd\udcbe\udcbf\udcc0\udcc1\udcc2\udcc3\udcc4\udcc5\udcc6\udcc7\udcc8\udcc9\udcca\udccb\udccc\udccd\udcce\udccf\udcd0\udcd1\udcd2\udcd3\udcd4\udcd5\udcd6\udcd7\udcd8\udcd9\udcda\udcdb\udcdc\udcdd\udcde\udcdf\udce0\udce1\udce2\udce3\udce4\udce5\udce6\udce7\udce8\udce9\udcea\udceb\udcec\udced\udcee\udcef\udcf0\udcf1\udcf2\udcf3\udcf4\udcf5\udcf6\udcf7\udcf8\udcf9\udcfa\udcfb\udcfc\udcfd\udcfe\udcff' >>> _.encode("utf-8") Traceback (most recent call last): File "", 
line 1, in UnicodeEncodeError: 'utf-8' codec can't encode character '\udc80' in position 128: surrogates not allowed >>> _.encode("utf-8", errors="surrogateescape") b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff' OK, so concrete proposals: update the docs and maybe make a synonym for Latin-1 that makes it more semantically obvious that you're not really using it as Latin-1, just as a easy to pass through encoding. Anything else? Any bike shedding on the synonym? -- Carl Johnson From christopherreay at gmail.com Tue Feb 14 11:37:44 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Tue, 14 Feb 2012 12:37:44 +0200 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Web browsers can parse pages with multiple encodings seemingly perfectly into the correct display characters. A quick copy and paste produces UTF-8 encoded text in the clip board. (on linux) HOW DO THEY DO IT.. can we have their libraries? :) Some of the web pages I tried decoding made be pull my hair out. One I just cancelled the client. 
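[Editorial aside, not part of the original message: what browser detectors do can be crudely approximated in stdlib-only Python. Real detectors, such as Mozilla's (ported as the third-party "chardet" package), use byte-frequency statistics; simple trial decoding only gives the flavour.]

```python
def guess_encoding(data, candidates=("ascii", "utf-8", "latin-1")):
    """Return the first candidate encoding that decodes `data` cleanly.

    Crude: latin-1 accepts any byte sequence, so it acts as a fallback
    that never fails -- which is exactly why it can silently mislabel.
    """
    for enc in candidates:
        try:
            data.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None

print(guess_encoding(b"plain text"))                # ascii
print(guess_encoding("caf\u00e9".encode("utf-8")))  # utf-8
print(guess_encoding(b"\xe9t\xe9"))                 # latin-1 (a guess at best)
```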
-------------- next part -------------- An HTML attachment was scrubbed... URL: From pyideas at rebertia.com Tue Feb 14 15:38:04 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Tue, 14 Feb 2012 06:38:04 -0800 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Feb 14, 2012 at 2:37 AM, Christopher Reay wrote: > Web browsers can parse pages with multiple encodings seemingly perfectly > into the correct display characters. A quick copy and paste produces UTF-8 > encoded text in the clip board. (on linux) > > HOW DO THEY DO IT.. can we have their libraries? :) The "chardet" package is in fact a port of Mozilla's encoding guessing code. Cheers, Chris From dirkjan at ochtman.nl Tue Feb 14 15:54:50 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 14 Feb 2012 15:54:50 +0100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Feb 14, 2012 at 15:38, Chris Rebert wrote: > The "chardet" package is in fact a port of Mozilla's encoding guessing code. I thought at some point that it would be useful to have in the stdlib (I still do). It's already fairly successful on PyPI, after all, and it's very helpful when dealing with text of unknown character encoding. However, there are licensing issues. At one point I asked Van Lindberg to look into that... He forwarded me some email between him and Mozilla guys about this, but it was not yet conclusive. 
Cheers, Dirkjan From jimjjewett at gmail.com Tue Feb 14 21:04:05 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 14 Feb 2012 15:04:05 -0500 Subject: [Python-ideas] Unicode surrogateescape [was: Re: Python 3000 TIOBE -3%] Message-ID: On Mon, Feb 13, 2012 at 12:12 AM, Stephen J. Turnbull wrote: > Paul Moore writes: > ?> I'm now 100% convinced that > ?> encoding="ascii",errors="surrogateescape" is the way to say this in > ?> code. > That may also be a good universal default for Python 3, as it will > pass through non-ASCII text unchanged, while raising an error if the > program tries to manipulate it (or hand it to a module that > validates). ?(encoding='latin-1' definitely is not a good default.) > But I'm not sure of that, and the current approach of using the > preferred system encoding is probably better. The preferred system encoding is indeed better than universal ASCII. But is there a good reason not to change the default errorhandler to errors="surrogateescape"? errors="strict" is already well-documented, and the sort of people most eager to reject (rather than ignore) bad data also tend to be explicit about their use of defaults. And if the barrier is only backwards-compatibility, is there any reason not to at least recommend a recipe of errors="surrogateescape" for cases where you expect ASCII, but want to round-trip other data just in case? -jJ From cmjohnson.mailinglist at gmail.com Tue Feb 14 21:17:06 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Tue, 14 Feb 2012 10:17:06 -1000 Subject: [Python-ideas] Unicode surrogateescape [was: Re: Python 3000 TIOBE -3%] In-Reply-To: References: Message-ID: <04A64366-3F31-40AF-9E84-FFB3C3C1E690@gmail.com> On Feb 14, 2012, at 10:04 AM, Jim Jewett wrote: > But is there a good reason not to change the default errorhandler to > errors="surrogateescape"? It's a conflict in the Zen: > Errors should never pass silently. > Unless explicitly silenced. OK, so default to strict. 
But: > Although practicality beats purity. Hmm, so maybe do use surrogates. Then again: > In the face of ambiguity, refuse the temptation to guess. Grr, I'm not nearly Dutch enough to make sense of this logical conflict! From jimjjewett at gmail.com Tue Feb 14 21:20:23 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 14 Feb 2012 15:20:23 -0500 Subject: [Python-ideas] Unicode surrogateescape [was: Re: Python 3000 TIOBE -3%] Message-ID: On Mon, Feb 13, 2012 at 12:16 AM, Nick Coghlan wrote: > Really, Python 3 forces programmers ... > to make the choice between the 4 possible options for > processing ASCII-compatible encodings: > 1. Process them as binary data. [Code smell from lying; lots of pain from mismatch with external libraries.] > 2. Process them as "latin-1". [Code smell from lying; non-ASCII often turns to gibberish.] > 3. Process them as "ascii+surrogateescape". This is the *right* > answer if you plan solely to manipulate the text and then write it back > out in the same encoding as was originally received. [Note that the original "encoding" may well be internally inconsistent; I've often seen that in log files.] > You will get errors if you try to write a string with escaped > characters out to a non-ascii channel or an ascii channel > without surrogateescape enabled. ... (e.g. sys.stdout) Is there any reason not to enable surrogate escape by default? At least on the console/terminal? I can see an argument for replace or xmlcharreplace or something more complicated, but ... if I'm sending output to myself, I would rather see it (possibly with a mark indicating where it was corrupted) than to get my program aborted (strict) and *not* be told what data caused the problem. > 4. Get a third party encoding guessing library and use that instead of > waving away the problem of ASCII-incompatible encodings. And I do think this needs to stay 3rd-party; domain information matters, and n-gram guessing should not be subject to stability guarantees. 
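[Editorial aside, not part of the original message: option 3's write-back path can be sketched with an explicitly wrapped stream -- the way one might re-wrap sys.stdout.buffer -- so escaped bytes survive on output.]

```python
import io

# Stand-in for a binary output channel (e.g. sys.stdout.buffer).
buf = io.BytesIO()
out = io.TextIOWrapper(buf, encoding="ascii", errors="surrogateescape")

# A string carrying an escaped byte, as produced by surrogateescape decoding.
s = b"caf\xe9\n".decode("ascii", errors="surrogateescape")

out.write(s)  # would raise UnicodeEncodeError on a strict-errors stream
out.flush()
assert buf.getvalue() == b"caf\xe9\n"  # the original byte round-trips
```

On a stream left at the default errors="strict", the same write fails -- which is the error-on-corrupted-output behaviour described above.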
-jJ From p.f.moore at gmail.com Tue Feb 14 22:08:09 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 14 Feb 2012 21:08:09 +0000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 14 February 2012 09:36, Stephen J. Turnbull wrote: > ?> As I say: > ?> - I know what to do > ?> - It can be a lot of work > ?> - Frankly, the damage is minor (these are usually personal or low-risk scripts) > ?> - The temptation to say "stuff it" and get on with my life is high > ?> - It frustrates me that Python by default tempts me to *not* do the right thing > > Please don't blame it on Python. ?Python tempts you because it offers > the choice to do it right. ?There is no way that Python can do it > right *for* you, not even all the resources Microsoft or Apple can > bring to bear have managed to do it right (you can't get 100% even > within an all-Windows or all-Mac shop, let alone cross-platform). ?Not > yet; it requires your help. Point taken. I think my point is that I wish there was a more obvious way for me to tell Python that I just want to do it nearly right on this occasion (like "everything else" does) because I really don't need to care for now. I'm getting a lot closer to knowing how to do that as this thread progresses, though, which is why I think of this as more of an educational issue than anything else. Thinking about how I'd code something like "cat" naively in C (while ((i = getchar()) != EOF) { putchar(i); }), I guess encoding=latin1 is the way for Python to "work like everything else" in this context. So I suppose there's a question. Do we really want to document how to "do it wrong"? At first glance, obviously not. 
But if we don't, it seems that the "Python 3 forces you to know Unicode" meme thrives, and we keep getting bad press. Maybe we could add a note to the open() documentation, something like the following: """To open a file, you need to know its encoding. This is not always obvious, depending on where the file came from, among other things. Other tools can process files without knowing the encoding by assuming the bytes of the file map 1-1 to the first 256 Unicode characters. This can cause issues such as mojibake or corrupted data, but for casual use is sometimes sufficient. To get this behaviour in Python (with all the same risks and problems) you can use the "latin1" encoding, which maps bytes to unicode as described above. It is far, far better to use the correct encoding declaration, if at all possible, however.""" I have no real opinion on whether this is the right thing to do. Unfortunately (in a sense :-)) it doesn't matter much to me any more, as I now have the benefit of learning from this thread, so I'm no longer in the target audience of the comment :-) > Thanks for caring! Thanks for helping me learn! Paul From p.f.moore at gmail.com Tue Feb 14 22:10:45 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 14 Feb 2012 21:10:45 +0000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 14 February 2012 14:38, Chris Rebert wrote: > On Tue, Feb 14, 2012 at 2:37 AM, Christopher Reay > wrote: >> Web browsers can parse pages with multiple encodings seemingly perfectly >> into the correct display characters. A quick copy and paste produces UTF-8 >> encoded text in the clip board. (on linux) >> >> HOW DO THEY DO IT.. can we have their libraries? :) > > The "chardet" package is in fact a port of Mozilla's encoding guessing code. It seems to be Python 2 only. 
"Dive into Python 3" describes porting it to Python 3, but I don't know of an actual Python 3 version. Paul From dirkjan at ochtman.nl Tue Feb 14 22:34:31 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 14 Feb 2012 22:34:31 +0100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: (adding back the list...) On Tue, Feb 14, 2012 at 21:55, Jim Jewett wrote: > As useful as it could be, it needs to clearly be an "external > standard", and should probably stay external. I don't want a > situation where "Hey, this file was detected differently in Py3.4 and > Py3.5" is a regression bug, or there will never be room for > improvements. Well, there have been *very* limited functional changes in the Mozilla tree since about 2008, so I don't think there would be a great many changes. And there's a large test suite to make sure regressions are unlikely. I still think the benefit for the simple cases is tremendous: a simple one-liner that finds the correct encoding in many of the cases. There's also very little API surface. Also, there is an actual 3.x port, I have it installed... should be in the same tarball. Cheers, Dirkjan From jimjjewett at gmail.com Tue Feb 14 22:43:50 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 14 Feb 2012 16:43:50 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> Message-ID: On Tue, Feb 14, 2012 at 6:39 AM, Carl M.
Johnson wrote: > OK, so concrete proposals: update the docs and maybe make a > synonym for Latin-1 that makes it more semantically obvious that > you're not really using it as Latin-1, just as a easy to pass through > encoding. Anything else? Any bike shedding on the synonym? encoding="ascii-ish" # gets the sloppyness right encoding="passthrough" # I would like "ignore", if it wouldn't cause confusion with the errorhandler encoding="binpass" encoding="rawbytes" -jJ From tjreedy at udel.edu Tue Feb 14 23:29:24 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 14 Feb 2012 17:29:24 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> Message-ID: On 2/14/2012 6:39 AM, Carl M. Johnson wrote: > > On Feb 13, 2012, at 10:45 PM, Nick Coghlan wrote: > >> (Of course, if sys.stdout.encoding is "UTF-8", then you're right, those >> characters will just be displayed as gibberish, as they would in the >> latin-1 case. I guess its only on Windows and in any other locations >> with a more restrictive default stdout encoding that errors are >> particularly likely). > > I don't think that's right. 
I think that by default Python refuses to turn surrogate characters into UTF-8: > >>>> bytes(range(256)).decode("ascii", errors="surrogateescape") > '\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\udc80\udc81\udc82\udc83\udc84\udc85\udc86\udc87\udc88\udc89\udc8a\udc8b\udc8c\udc8d\udc8e\udc8f\udc90\udc91\udc92\udc93\udc94\udc95\udc96\udc97\udc98\udc99\udc9a\udc9b\udc9c\udc9d\udc9e\udc9f\udca0\udca1\udca2\udca3\udca4\udca5\udca6\udca7\udca8\udca9\udcaa\udcab\udcac\udcad\udcae\udcaf\udcb0\udcb1\udcb2\udcb3\udcb4\udcb5\udcb6\udcb7\udcb8\udcb9\udcba\udcbb\udcbc\udcbd\udcbe\udcbf\udcc0\udcc1\udcc2\udcc3\udcc4\udcc5\udcc6\udcc7\udcc8\udcc9\udcca\udccb\udccc\udccd\udcce\udccf\udcd0\udcd1\udcd2\udcd3\udcd4\udcd5\udcd6\udcd7\udcd8\udcd9\udcda\udcdb\udcdc\udcdd\udcde\udcdf > \udce0\udce1\udce2\udce3\udce4\udce5\udce6\udce7\udce8\udce9\udcea\udceb\udcec\udced\udcee\udcef\udcf0\udcf1\udcf2\udcf3\udcf4\udcf5\udcf6\udcf7\udcf8\udcf9\udcfa\udcfb\udcfc\udcfd\udcfe\udcff' While this is a Py3 str object, it is not unicode. Unicode only allows proper surrogate codeunit pairs. Py2 allowed mal-formed unicode objects and that was not changed in Py3 -- or 3.3. It seems appropriate that bytes that are meaningless to ascii should be translated to codeunits that are meaningless (by themselves) to unicode. >>>> _.encode("utf-8") > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > UnicodeEncodeError: 'utf-8' codec can't encode character '\udc80' in position 128: surrogates not allowed utf-8 only encodes proper unicode.
>>>> _.encode("utf-8", errors="surrogateescape") > b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff' The result is not utf-8, so it would be better to use 'ascii' rather than 'utf-8' in the expression. The above encodes to ascii + uninterpreted high-bit-set bytes. >>> s=bytes(range(256)).decode("ascii", errors="surrogateescape") >>> u=s.encode("utf-8", errors="surrogateescape") >>> a=s.encode("ascii", errors="surrogateescape") >>> u == a True -- Terry Jan Reedy From python at mrabarnett.plus.com Tue Feb 14 23:55:49 2012 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 14 Feb 2012 22:55:49 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> Message-ID: <4F3AE675.6010907@mrabarnett.plus.com> On 14/02/2012 21:43, Jim Jewett wrote: > On Tue, Feb 14, 2012 at 6:39 AM, Carl M.
Johnson > wrote: > >> OK, so concrete proposals: update the docs and maybe make a >> synonym for Latin-1 that makes it more semantically obvious that >> you're not really using it as Latin-1, just as a easy to pass through >> encoding. Anything else? Any bike shedding on the synonym? > > encoding="ascii-ish" # gets the sloppyness right > > encoding="passthrough" # I would like "ignore", if it wouldn't cause > confusion with the errorhandler > > encoding="binpass" > encoding="rawbytes" > encoding="mojibake" # :-) From barry at python.org Wed Feb 15 00:32:43 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 14 Feb 2012 18:32:43 -0500 Subject: [Python-ideas] Py3 unicode impositions References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20120214183243.3567f413@resist.wooz.org> On Feb 14, 2012, at 09:10 PM, Paul Moore wrote: >> The "chardet" package is in fact a port of Mozilla's encoding guessing code. > >It seems to be Python 2 only. "Dive into Python 3" describes porting >it to Python 3, but I don't know of an actual Python 3 version. We have a python3-chardet package in both Debian and Ubuntu, so the upstream does support Python 3 afaik. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From steve at pearwood.info Wed Feb 15 00:35:11 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 15 Feb 2012 10:35:11 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3AE675.6010907@mrabarnett.plus.com> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> Message-ID: <4F3AEFAF.5060107@pearwood.info> MRAB wrote: > On 14/02/2012 21:43, Jim Jewett wrote: >> On Tue, Feb 14, 2012 at 6:39 AM, Carl M. Johnson >> wrote: >> >>> OK, so concrete proposals: update the docs and maybe make a >>> synonym for Latin-1 that makes it more semantically obvious that >>> you're not really using it as Latin-1, just as a easy to pass through >>> encoding. Anything else? Any bike shedding on the synonym? >> >> encoding="ascii-ish" # gets the sloppyness right >> encoding="passthrough" # I would like "ignore", if it wouldn't cause >> confusion with the errorhandler "Ignore" won't do. Ignore what? Everything? Don't actually run an encoder? That doesn't even make sense! "Passthrough" is bad too, because it perpetrates the idea that ASCII characters are "plain text" which are bytes. Unicode strings, even those that are purely ASCII, are not strings of bytes (except in the sense that every data structure is a string of bytes). You can't just "pass bytes through" to turn them into Unicode. >> encoding="binpass" >> encoding="rawbytes" >> > encoding="mojibake" # :-) You have a smiley, but I think that's the best name I've seen yet. It's explicit in what you get -- mojibake. The only downside is that it's a little obscure. 
Not everyone knows what mojibake is called, or calls it mojibake, although I suppose we could add aliases to other terms such as Buchstabensalat and Krähenfüße if German users complain. But remind me again, why are we doing this? If you have to teach people the recipe open(filename, encoding='mojibake') why not just teach them the very slightly more complex recipe open(filename, encoding='ascii', errors='surrogateescape') which captures the user's intent ("I want ASCII, with some way of escaping errors so I don't have to deal with them") much more accurately. Sometimes brevity is *not* a virtue. -- Steven From mwm at mired.org Wed Feb 15 00:50:44 2012 From: mwm at mired.org (Mike Meyer) Date: Tue, 14 Feb 2012 18:50:44 -0500 Subject: [Python-ideas] Adding shm_open to mmap? Message-ID: <20120214185044.4c5ee513@bhuda.mired.org> One of the issues that showed up during the overlong TIOBE- thread and spinoffs is that there's no portable way to get a named shared memory segment (as distinguished from a disk-backed file) using the mmap module. Most unix variants provide a memory-backed file system that works for this, but its name changes from distro to distro and even installation to installation. It's not clear to me that non-Unix platforms provide such a file system. The Posix solution is shm_open, which accepts a name for rendezvous and returns a file descriptor suitable for passing to mmap. Passing the file descriptor to anything but fstat, ftruncate, close and mmap is undefined. We'd also need to add shm_unlink to remove the shared segment, as the object created by shm_open isn't necessarily visible in the file system name space. shm_open has five values that can be used in its flags argument, but those are shared with open and already available in the os module. This seems like a slam-dunk to me, but... 1) Is there some reason not to just add these two functions? 2) Are there any supported platforms with mmap and without shm_open/unlink?
3) Is this simple enough that a PEP isn't needed, just a patch in an issue? Thanks, http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ncoghlan at gmail.com Wed Feb 15 01:02:20 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 10:02:20 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> Message-ID: On Tue, Feb 14, 2012 at 9:39 PM, Carl M. Johnson wrote: > > On Feb 13, 2012, at 10:45 PM, Nick Coghlan wrote: > >> (Of course, if sys.stdout.encoding is "UTF-8", then you're right, those >> characters will just be displayed as gibberish, as they would in the >> latin-1 case. I guess its only on Windows and in any other locations >> with a more restrictive default stdout encoding that errors are >> particularly likely). > > I don't think that's right. I think that by default Python refuses to turn surrogate characters into UTF-8: Oops, that's what I get for posting without testing :) Still, your example clearly illustrates the point I was trying to make - that using "ascii+surrogateescape" is less likely to silently corrupt the data stream than using "latin-1", because attempts to encode it under the "strict" error handler will generally fail, even for an otherwise universal encoding like UTF-8. > OK, so concrete proposals: update the docs and maybe make a synonym for Latin-1 that makes it more semantically obvious that you're not really using it as Latin-1, just as a easy to pass through encoding. Anything else? Any bike shedding on the synonym? 
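The contrast Nick describes above — surrogate-escaped text round-trips unknown bytes but fails loudly on a strict re-encode, while latin-1 text silently transcodes — can be verified directly. A small illustration with made-up data:

```python
data = b"caf\xe9"  # latin-1 encoded text; not valid UTF-8 or ASCII

# ascii + surrogateescape: non-ASCII bytes become lone surrogates...
s = data.decode("ascii", errors="surrogateescape")   # 0xe9 -> '\udce9'
assert s.encode("ascii", errors="surrogateescape") == data  # round-trips

# ...and a strict encode fails loudly rather than corrupting the stream:
try:
    s.encode("utf-8")                # strict by default
    silently_passed = True
except UnicodeEncodeError:           # lone surrogates are rejected
    silently_passed = False
assert not silently_passed

# latin-1, by contrast, happily re-encodes the same data to *different* bytes:
assert data.decode("latin-1").encode("utf-8") != data
```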
I don't see any reason to obfuscate the use of "latin-1" as a workaround that maps 8-bit bytes directly to the corresponding Unicode code points. My proposal would be two-fold: Firstly, that we document three alternatives for working with arbitrary ASCII compatible encodings (from simplest to most flexible): 1. Use the "latin-1" encoding The latin-1 encoding accepts arbitrary binary data by mapping individual bytes directly to the first 256 Unicode code points. Thus, any sequence of bytes may be translated to a sequence of code points, effectively reproducing the behaviour of Python 2's 8-bit strings. If all data supplied is genuinely in an ASCII compatible encoding then this will work correctly. However, it fails badly if the supplied data is ever in an ASCII incompatible encoding, or if the decoded string is written back out using a different encoding. Using this option switches off *all* of Python 3's support for ensuring transcoding correctness - errors will frequently pass silently and result in corrupted output data rather than explicit exceptions. 2. Use the "ascii" encoding with the "surrogateescape" error handler This is the most correct approach that doesn't involve attempting to guess the string encoding. Behaviour if given data in an ASCII incompatible encoding is still unpredictable (and likely to result in data corruption). This approach retains most of Python 3's support for ensuring transcoding correctness, while still accepting any ASCII compatible encoding. If UnicodeEncodeErrors when displaying surrogate escaped strings are not desired, sys.stdout should also be updated to use the "backslashreplace" error handler. (see below) 3. Initially process the data as binary, using the "chardet" package from PyPI to guess the encoding This is the most correct option that can even cope with many ASCII incompatible encodings. Unfortunately, the chardet site is gone, since Mark Pilgrim took down his entire web presence. 
This (including the dead home page link from the PyPI entry) would need to be addressed before its use could be recommended in the official documentation (or, failing that, is there a properly documented alternative package available?) Secondly, that we make it easy to replace a TextIOWrapper with an equivalent wrapper that has only selected settings changed (e.g. encoding or errors). In 3.2, that is currently not possible, since the original "newline" argument is not made available as a public attribute. The closest we can get is to force universal newlines mode along with whatever other changes we want to make:

old = sys.stdout
sys.stdout = io.TextIOWrapper(old.buffer, old.encoding, "backslashreplace", None, old.line_buffering)

3.3 currently makes this even worse by accepting a "write_through" argument that isn't available for introspection. I propose that we make it possible to write the above as:

sys.stdout = sys.stdout.rewrap(errors="backslashreplace")

For the latter point, see http://bugs.python.org/issue14017 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Feb 15 01:06:12 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2012 00:06:12 +0000 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: <20120214183243.3567f413@resist.wooz.org> References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> <20120214183243.3567f413@resist.wooz.org> Message-ID: On 14 February 2012 23:32, Barry Warsaw wrote: > On Feb 14, 2012, at 09:10 PM, Paul Moore wrote: > >>> The "chardet" package is in fact a port of Mozilla's encoding guessing code. >> >>It seems to be Python 2 only. "Dive into Python 3" describes porting >>it to Python 3, but I don't know of an actual Python 3 version.
> > We have a python3-chardet package in both Debian and Ubuntu, so the upstream > does support Python 3 afaik. Found it. There's a "chardet2" package. Paul From ncoghlan at gmail.com Wed Feb 15 01:07:23 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 10:07:23 +1000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <20120214185044.4c5ee513@bhuda.mired.org> References: <20120214185044.4c5ee513@bhuda.mired.org> Message-ID: On Wed, Feb 15, 2012 at 9:50 AM, Mike Meyer wrote: > This seems like a slam-dunk to me, but... > > 1) Is there some reason not to just add these two functions? Not that I can see. Make sure to add an "Availability: Unix" marker in the relevant docs, though. > 2) Are there any supported platforms with mmap and without > shm_open/unlink? The safest option is probably to add a configure check so we only expose these APIs when the underlying platform offers them. There's a *ton* of examples of such checks to copy from :) > 3) Is this simple enough that a PEP isn't needed, just a patch in an > issue? Just a tracker issue will be fine - we expose additional posix APIs all the time without a PEP. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Wed Feb 15 01:05:18 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 15 Feb 2012 01:05:18 +0100 Subject: [Python-ideas] Adding shm_open to mmap? References: <20120214185044.4c5ee513@bhuda.mired.org> Message-ID: <20120215010518.048e3da2@pitrou.net> On Tue, 14 Feb 2012 18:50:44 -0500 Mike Meyer wrote: > > This seems like a slam-dunk to me, but... > > 1) Is there some reason not to just add these two functions? > > 2) Are there any supported platforms with mmap and without > shm_open/unlink? > > 3) Is this simple enough that a PEP isn't needed, just a patch in an > issue? A patch is enough.
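Until such a patch lands, the POSIX calls Mike describes can already be reached from Python through ctypes. The sketch below is an illustration only — POSIX-only, error handling simplified, and the segment name is made up — not the mmap API being proposed:

```python
import ctypes
import ctypes.util
import mmap
import os

def shm_round_trip(name=b"/py_shm_sketch", size=4096):
    """Create a named POSIX shared memory segment, write to it, read back."""
    libname = ctypes.util.find_library("rt") or ctypes.util.find_library("c")
    if libname is None:
        return None  # no locatable libc/librt on this platform
    lib = ctypes.CDLL(libname, use_errno=True)
    fd = lib.shm_open(name, os.O_CREAT | os.O_RDWR, 0o600)
    if fd < 0:
        return None  # shm_open unavailable or refused
    try:
        os.ftruncate(fd, size)        # new segments start with zero length
        m = mmap.mmap(fd, size)       # the fd is suitable for mmap, as Mike notes
        try:
            m.write(b"some bytes")
            m.seek(0)
            return m.read(10)
        finally:
            m.close()
    finally:
        os.close(fd)
        lib.shm_unlink(name)          # remove the name, analogous to os.unlink()

print(shm_round_trip())
```

A stdlib version would of course wrap the error handling properly (raising OSError from errno) instead of returning None.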
Note that this functionality is already available under Windows (though not really advertised in our docs), through the `tagname` parameter to mmap.mmap(): >>> import mmap >>> f = mmap.mmap(-1, 4096, "mysharedmem") >>> f.write(b"some bytes") And in another session: >>> import mmap >>> f = mmap.mmap(-1, 4096, "mysharedmem") >>> f.read(10) b'some bytes' See http://docs.python.org/dev/library/mmap.html and http://msdn.microsoft.com/en-us/library/windows/desktop/aa366551%28v=vs.85%29.aspx Regards Antoine. From greg at krypto.org Wed Feb 15 01:10:29 2012 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 14 Feb 2012 16:10:29 -0800 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> <20120214183243.3567f413@resist.wooz.org> Message-ID: oh good, this long thread has already started talking about encoding detection packages. now I don't have to bring it up. :) I suggest we link to one or more of these from the Python docs to their pypi project pages as a suggestion for users that need to deal with the real world of legacy data files in a variety of undeclared format rather than the internet world of utf-8 or bust. At some point it might be interesting to have a library like this in the stdlib along with a common API for other compatible libraries but I'm not sure any are ready for such a consideration. Is their behavior stable or still learning based on new inputs? 
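For reference, the detection libraries under discussion are typically used along these lines. This assumes the chardet API (a third-party package that may not be installed, so the sketch falls back to the byte-transparent latin-1 approach from earlier in the thread):

```python
def read_text_guessing(raw):
    """Decode bytes using a guessed encoding; treat the guess as a hint."""
    try:
        import chardet  # third-party; detect() returns a guess + confidence
        guess = chardet.detect(raw)   # e.g. {'encoding': 'ascii', 'confidence': 1.0}
        encoding = guess["encoding"] or "latin-1"
    except ImportError:
        encoding = "latin-1"          # byte-transparent fallback
    return raw.decode(encoding, errors="replace")

print(read_text_guessing(b"plain ascii text"))
```

Because the result is a guess, errors="replace" (or similar) is still advisable so a wrong guess degrades gracefully instead of raising.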
-gps From ben+python at benfinney.id.au Wed Feb 15 01:15:36 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 15 Feb 2012 11:15:36 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> Message-ID: <87haytyms7.fsf@benfinney.id.au> MRAB writes: > On 14/02/2012 21:43, Jim Jewett wrote: > > On Tue, Feb 14, 2012 at 6:39 AM, Carl M. Johnson > > wrote: > > > >> OK, so concrete proposals: update the docs and maybe make a > >> synonym for Latin-1 that makes it more semantically obvious that > >> you're not really using it as Latin-1, just as a easy to pass through > >> encoding. Anything else? Any bike shedding on the synonym? […] > encoding="mojibake" # :-) +1 If people want to remain wilfully ignorant of text encoding in the third millennium of our calendar, then a name like “mojibake” is clear about what they'll get, and will perhaps be publicly embarrassing enough that some proportion of programmers will decide to reduce their ignorance and use a specific encoding instead. -- \ “Science is a way of trying not to fool yourself. The first | `\ principle is that you must not fool yourself, and you are the | _o__) easiest person to fool.” —Richard P.
Feynman, 1964 | Ben Finney From python at mrabarnett.plus.com Wed Feb 15 01:44:19 2012 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 15 Feb 2012 00:44:19 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3AEFAF.5060107@pearwood.info> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> Message-ID: <4F3AFFE3.6070305@mrabarnett.plus.com> On 14/02/2012 23:35, Steven D'Aprano wrote: > MRAB wrote: >> On 14/02/2012 21:43, Jim Jewett wrote: >>> On Tue, Feb 14, 2012 at 6:39 AM, Carl M. Johnson >>> wrote: >>> >>>> OK, so concrete proposals: update the docs and maybe make a >>>> synonym for Latin-1 that makes it more semantically obvious that >>>> you're not really using it as Latin-1, just as a easy to pass through >>>> encoding. Anything else? Any bike shedding on the synonym? >>> >>> encoding="ascii-ish" # gets the sloppyness right >>> encoding="passthrough" # I would like "ignore", if it wouldn't cause >>> confusion with the errorhandler > > "Ignore" won't do. Ignore what? Everything? Don't actually run an encoder? > That doesn't even make sense! > > "Passthrough" is bad too, because it perpetrates the idea that ASCII > characters are "plain text" which are bytes. Unicode strings, even those that > are purely ASCII, are not strings of bytes (except in the sense that every > data structure is a string of bytes). You can't just "pass bytes through" to > turn them into Unicode. > > >>> encoding="binpass" >>> encoding="rawbytes" >>> >> encoding="mojibake" # :-) > > You have a smiley, but I think that's the best name I've seen yet. It's > explicit in what you get -- mojibake. > > The only downside is that it's a little obscure. 
Not everyone knows what > mojibake is called, or calls it mojibake, although I suppose we could add > aliases to other terms such as Buchstabensalat and Krähenfüße if German users > complain > Alternatively, "vreemdetekens" or "alfabetsoep"... > But remind me again, why are we doing this? If you have to teach people the > recipe > > open(filename, encoding='mojibake') > > why not just teach them the very slightly more complex recipe > > open(filename, encoding='ascii', errors='surrogateescape') > > which captures the user's intent ("I want ASCII, with some way of escaping > errors so I don't have to deal with them") much more accurately. Sometimes > brevity is *not* a virtue. > From grosser.meister.morti at gmx.net Wed Feb 15 01:46:58 2012 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Wed, 15 Feb 2012 01:46:58 +0100 Subject: [Python-ideas] map iterator In-Reply-To: References: Message-ID: <4F3B0082.8040403@gmx.net> On 02/09/2012 05:48 PM, Jerry Hill wrote: > On Thu, Feb 9, 2012 at 11:40 AM, Edward Lesmes > wrote: > > An iterator version of map should be available for large sets of data. > > > The python time machine strikes again. In python 2, this is available as itertools.imap. In python > 3, this is the default behavior of the map() function. > Same goes for zip, by the way. From shibturn at gmail.com Wed Feb 15 02:00:30 2012 From: shibturn at gmail.com (shibturn) Date: Wed, 15 Feb 2012 01:00:30 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <20120215010518.048e3da2@pitrou.net> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120215010518.048e3da2@pitrou.net> Message-ID: On 15/02/2012 12:05am, Antoine Pitrou wrote: > A patch is enough.
> > Note that this functionality is already available under Windows > (though not really advertised in our docs), through the `tagname` > parameter to mmap.mmap(): > >>>> import mmap >>>> f = mmap.mmap(-1, 4096, "mysharedmem") >>>> f.write(b"some bytes") > > And in another session: > >>>> import mmap >>>> f = mmap.mmap(-1, 4096, "mysharedmem") >>>> f.read(10) > b'some bytes' It's not quite the same functionality since the lifetime of tagnamed mmaps is managed through handle refcounting. In some cases that is an advantage compared to open()/unlink(), and in others a disadvantage. Also, a problem with tagname is that there is no way to check whether the returned mmap was created by another process -- unless you resort to something undocumented like:

from _multiprocessing import win32
f = mmap.mmap(-1, 4096, "mysharedmem")
if win32.GetLastError() == win32.ERROR_ALREADY_EXISTS:
    raise ValueError('tagname already exists')

sbt From stephen at xemacs.org Wed Feb 15 02:27:48 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 15 Feb 2012 10:27:48 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87fwecopgr.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > If you're only round-tripping (i.e. writing back out as > "ascii+surrogateescape") This is the only case that makes sense in this thread. We're talking about people coming from Python 2 who want an encoding-agnostic way to script ASCII-oriented operations for an ASCII-compatible environment, and not to learn about encodings at all. While my opinions on this are (probably obviously) informed by the WSGI discussion, this is not about making life come up roses for the WSGI folks.
They work in a sewer; life stinks for them, and all they can do about it is to hold their noses. This thread is about people who are not trying to handle sewage in a sanitary fashion, rather just cook a meal and ignore the occasional hairs that inevitably fall in. > However, it's trivial to get an error when you go to encode the data > stream without one of the silencing error handlers set. Sure, but getting errors is for people who want to learn how to do it right, not for people who just need to get a job done. Cf. the fevered opposition to giving "import cElementTree" a DeprecationWarning. > In particular, sys.stdout has error handling set to strict, which I > believe is likely to throw UnicodeEncodeError if you try to feed a > string containing surrogate escaped bytes to an encoding that can't > handle them. No, it should *always* throw a UnicodeEncodeError, because there are *no* encodings that can handle them -- they're not characters, so they can't be encoded. > (Of course, if sys.stdout.encoding is "UTF-8", then you're right, > those characters will just be displayed as gibberish, No, they will raise UnicodeEncodeError; that's why surrogateescape was invented, to work around the problem of what to do with bytes that the programmer knows are meaningful to somebody, but do not represent characters as far as Python can know: wideload:~ 10:06$ python3.2 Python 3.2 (r32:88445, Mar 20 2011, 01:56:57) [GCC 4.0.1 (Apple Inc. build 5490)] on darwin Type "help", "copyright", "credits" or "license" for more information. 
>>> s = b'\xff\xff'.decode('utf-8', errors='surrogateescape') >>> s.encode('utf-8',errors='strict') Traceback (most recent call last): File "<stdin>", line 1, in <module> UnicodeEncodeError: 'utf-8' codec can't encode character '\udcff' in position 0: surrogates not allowed >>> The reason I advocate 'latin-1' (preferably under an appropriate alias) is that you simply can't be sure that those surrogates won't be passed to some module that decides to emit information about them somewhere (eg, a warning or logging) -- without the protection of a "silencing error handler". Bang-bang! Python's silver hammer comes down upon your head! From ben+python at benfinney.id.au Wed Feb 15 02:39:10 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 15 Feb 2012 12:39:10 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <87fwecopgr.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <874nuszxhd.fsf@benfinney.id.au> "Stephen J. Turnbull" writes: > […] the WSGI folks. They work in a sewer; life stinks for them, and > all they can do about it is to hold their noses. This thread is about > people who are not trying to handle sewage in a sanitary fashion, > rather just cook a meal and ignore the occasional hairs that > inevitably fall in. […] > […] some module that decides to emit information about them somewhere > (eg, a warning or logging) -- without the protection of a "silencing > error handler". Bang-bang! Python's silver hammer comes down upon your > head! You have made me feel strange emotions with this message. I don't know what they are, but a combination of “sickened” and “admiring” and “nostalgia”, with a pinch of fear, seems close. Maybe this is what it's like to read poetry.
-- \ “[Entrenched media corporations will] maintain the status quo, | `\ or die trying. Either is better than actually WORKING for a | _o__) living.” -- ringsnake.livejournal.com, 2007-11-12 | Ben Finney From mwm at mired.org Wed Feb 15 03:25:39 2012 From: mwm at mired.org (Mike Meyer) Date: Tue, 14 Feb 2012 21:25:39 -0500 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: References: <20120214185044.4c5ee513@bhuda.mired.org> Message-ID: <20120214212539.7c5ffdef@bhuda.mired.org> On Wed, 15 Feb 2012 10:07:23 +1000 Nick Coghlan wrote: > On Wed, Feb 15, 2012 at 9:50 AM, Mike Meyer wrote: > > This seems like a slam-dunk to me, but... > > 1) Is there some reason not to just add these two functions? > Not that I can see. Make sure to add an "Availability: Unix" marker in > the relevant docs, though. I thought Windows was a Posix system? As such, it should have shm_open and shm_unlink, so the marker wouldn't be appropriate. Thanks, http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From stephen at xemacs.org Wed Feb 15 03:43:41 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 15 Feb 2012 11:43:41 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3AEFAF.5060107@pearwood.info> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> Message-ID: <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > MRAB wrote: > >> encoding="ascii-ish" # gets the sloppiness right +0.8 I'd prefer the more precise "ascii-compatible". Shift JIS is "ASCII-ish", but should not be decoded with this codec.
> > encoding="mojibake" # :-) > > You have a smiley, but I think that's the best name I've seen yet. It's > explicit in what you get -- mojibake. Explicit, but incorrect. Mojibake ("bake" means "change") is what you get when you use one encoding to encode characters, and another to decode them. Here, not only are we talking about using the same codec at both ends, but in fact it's inside out (we are decoding then encoding). This is GIGO, not mojibake. > why not just teach them the very slightly more complex recipe > > open(filename, encoding='ascii', errors='surrogateescape') > > which captures the user's intent ("I want ASCII, with some way of > escaping errors so I don't have to deal with them") much more > accurately. Why not? Because 'surrogateescape' does not express the user's intent. That user *will* have to deal with errors as soon as she invokes modules that validate their input, or include some portion of the text being treated in output of any kind, unless they use an error-suppressing handler themselves. Surrogates are errors in Unicode, and that's the way it should be. That's precisely why Martin felt it necessary to use this technique in PEP 383: to ensure that errors *will* occur unless you are very careful in handling strings produced with the surrogateescape handler active. It's arguable that most applications *should* want errors in these cases; I've made that argument myself. But it's quite clearly not the user's intent. From ncoghlan at gmail.com Wed Feb 15 04:07:54 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 13:07:54 +1000 Subject: [Python-ideas] Adding shm_open to mmap? 
In-Reply-To: <20120214212539.7c5ffdef@bhuda.mired.org> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> Message-ID: On Wed, Feb 15, 2012 at 12:25 PM, Mike Meyer wrote: > On Wed, 15 Feb 2012 10:07:23 +1000 > Nick Coghlan wrote: > >> On Wed, Feb 15, 2012 at 9:50 AM, Mike Meyer wrote: >> > This seems like a slam-dunk to me, but... >> > 1) Is there some reason not to just add these two functions? >> Not that I can see. Make sure to add an "Availability: Unix" marker in >> the relevant docs, though. > > I thought Windows was a Posix system? Not as far as I am aware - if it was, Cygwin wouldn't be needed as a compatibility layer to get POSIX software running. To get them to work properly on Windows, many modules that interface with the OS have to use the win32 API directly rather than relying on the native implementations of the POSIX APIs. > As such, it should have shm_open > and shm_unlink, so the marker wouldn't be appropriate. In this case, it sounds like Windows may already have a roughly equivalent mechanism in mmap, so cross-platform support may be feasible. If that's the case, a marker won't be needed. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Feb 15 04:22:02 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 13:22:02 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Feb 15, 2012 at 12:43 PM, Stephen J.
Turnbull wrote: > It's arguable that most applications *should* want errors in these > cases; I've made that argument myself. But it's quite clearly not the > user's intent. However, from a correctness point of view, it's a big step up from just saying "latin-1" (which effectively turns off *all* of the additional encoding related sanity checking Python 3 offers over Python 2). For many "I don't care about Unicode" use cases, using "ascii+surrogateescape" for your own I/O and setting "backslashreplace" on sys.stdout should cover you (and any exceptions you get will be warning you about cases where your original assumptions about not caring about Unicode validity have been proven wrong). If the logging module doesn't do it already, it should probably be defaulting to backslashreplace when encoding messages, too (for the same reason sys.stderr already defaults to that - you don't want your error reporting system failing to encode corrupted Unicode data). sys.stdin and sys.stdout are different due to the role they play in pipeline processing - for those, locale.getpreferredencoding()+"strict" is a more reasonable default (but we should make it easy to replace them with something more specific for a given application, hence http://bugs.python.org/issue14017) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mwm at mired.org Wed Feb 15 05:10:11 2012 From: mwm at mired.org (Mike Meyer) Date: Tue, 14 Feb 2012 23:10:11 -0500 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> Message-ID: <20120214231011.6fce4b3b@bhuda.mired.org> On Wed, 15 Feb 2012 13:07:54 +1000 Nick Coghlan wrote: > On Wed, Feb 15, 2012 at 12:25 PM, Mike Meyer wrote: > > On Wed, 15 Feb 2012 10:07:23 +1000 > > Nick Coghlan wrote: > > As such, it should have shm_open > > and shm_unlink, so the marker wouldn't be appropriate.
> In this case, it sounds like Windows may already have a roughly > equivalent mechanism in mmap, so cross-platform support may be > feasible. If that's the case, a marker won't be needed. The "tagname" feature in the windows version uses ref counting to free the shared segment when no one is using it. shm_open requires someone to call shm_unlink, but doesn't actually remove it until there are no more references to it. However you can't shm_open it again after shm_unlink'ing (expected on Unix, and verified on my FBSD box). We could sorta-kinda emulate the windows "tagname" behavior using shm_open. I'd prefer to provide shm_open on Windows if at all possible. The "sorta-kinda" bothers me. That would also allow for an application to exit and then resume work stored in a mapped segment (something I've done before). However, setting this up on Windows isn't something I can do. Thanks, http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From stephen at xemacs.org Wed Feb 15 05:12:58 2012 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Wed, 15 Feb 2012 13:12:58 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > using "ascii+surrogateescape" for your own I/O and setting > "backslashreplace" on sys.stdout should cover you (and any > exceptions you get will be warning you about cases where your > original assumptions about not caring about Unicode validity have > been proven wrong). Are you saying you know more than the user about her application? > If the logging module doesn't do it already, it should probably be > defaulting to backslashreplace when encoding messages, too See, *you* don't know whether it will raise, either, and that about an important stdlib module. Why should somebody who is not already a Unicode geek and is just using a module they've downloaded off of PyPI be required to audit its IO foibles? Really, I think use of 'latin1' in this context is covered by "consenting adults." We *should* provide an alias that says "all we know about this string is that the ASCII codes represent ASCII characters," and document that even if your own code is ASCII compatible (ie, treats runs of non-ASCII as opaque, atomic blobs), third party modules may corrupt the text. And use the word "corrupt"; all UnicodelyRightThinking folks will run away screaming. That statement about corrupting text is true in Python 2, and pre-PEP-393 Python 3, anyway (on Windows and UCS-2 builds elsewhere), you know, since they can silently slice a surrogate pair in half. 
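The surrogateescape behaviour being argued over above can be checked directly; a minimal sketch of the standard Python 3 semantics (the byte values here are arbitrary examples):

```python
# Undecodable bytes become lone surrogates instead of raising.
data = b'abc\xff'
s = data.decode('utf-8', errors='surrogateescape')
assert s == 'abc\udcff'

# Lone surrogates are not valid characters, so strict encoding fails
# with *any* codec, exactly as described above.
try:
    s.encode('utf-8')
except UnicodeEncodeError:
    pass
else:
    raise AssertionError('expected UnicodeEncodeError')

# Round-tripping with the same error handler restores the bytes exactly.
assert s.encode('utf-8', errors='surrogateescape') == data
```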
From cmjohnson.mailinglist at gmail.com Wed Feb 15 06:03:10 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Tue, 14 Feb 2012 19:03:10 -1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <16F229EE-4018-4E81-962A-8D48036F194F@gmail.com> If I can I would like to offer one argument for surrogateescape over latin-1 as the newbie approach. Suppose I am naively processing text files to create a webpage and one of my filters is a "smart quotes" filter to change "" to “”. Of course, there's no way to smarten quotes up if you don't know the encoding of your input or output files; you'll just make a mess. In this situation, Latin-1 lets you mojibake it up. If your input turns out not to have been Latin-1, the final result will be corrupted by the quote smartener. On the other hand, if you use encoding="ascii", errors="surrogateescape" Python will complain, because the smart quotes being added aren't ascii. In other words, surrogate escape forces naive users to stick to ASCII unless they can determine what encoding they want to use for their input/output. It's not perfect, but I think it strikes a better balance than letting the users shoot themselves in the foot. From ncoghlan at gmail.com Wed Feb 15 07:58:55 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 16:58:55 +1000 Subject: [Python-ideas] Adding shm_open to mmap?
In-Reply-To: <20120214231011.6fce4b3b@bhuda.mired.org> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> Message-ID: On Wed, Feb 15, 2012 at 2:10 PM, Mike Meyer wrote: > I'd prefer to provide shm_open on Windows if at all possible. The > "sorta-kinda" bothers me. That would also allow for an application to > exit and then resume work stored in a mapped segment (something I've > done before). However, setting this up on Windows isn't something I > can do. That's the purpose of the "Availability" markers in the docs - to allow a POSIX implementation to be added directly, then, if it's confirmed to work on Windows, or someone implements the necessary additional parts to make it work, the Availability restriction can be dropped. The OS interface on Windows is just too different for us to gate all OS service additions on having a working Windows version of the feature. (It's not *ideal* when that happens, of course, but it's a practical concession to the fact that our pool of Windows developers is significantly smaller than our pool of *nix and OS X developers). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Wed Feb 15 08:46:18 2012 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Wed, 15 Feb 2012 16:46:18 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <16F229EE-4018-4E81-962A-8D48036F194F@gmail.com> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> <16F229EE-4018-4E81-962A-8D48036F194F@gmail.com> Message-ID: <87aa4ko7xx.fsf@uwakimon.sk.tsukuba.ac.jp> Carl M. Johnson writes: > If I can I would like to offer one argument for surrogateescape > over latin-1 as the newbie approach. This isn't the newbie approach. What should be recommended to newbies is to use the default (which is locale-dependent, and therefore "usually" "good enough"), and live with the risk of occasional exceptions. If they get exceptions, or must avoid exceptions, learn about encodings or consult with someone who already knows.[1] *Neither* of the approaches discussed here is reliable for tasks like automatically processing email or uploaded files on the web, and neither should be recommended to people who aren't already used to encoding-agnostic processing in the Python 2 "str" style. So, now that you mention "newbies", I don't know what other people are discussing, but what I've been discussing here is an approach for people who are comfortable working around (or never experience!) the defects of Python 2's ASCII-compatible approach to handling varied encodings in a single program, and want a workalike for Python 3. The choice between the two is task-dependent. The encoding='latin1' method is for tasks where a little mojibake can be tolerated, but an exception would stop the show. 
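A minimal sketch of why the 'latin-1' method is exception-proof (standard Python 3 behaviour; the byte values are arbitrary):

```python
data = b'caf\xe9 \x80\xff'   # arbitrary bytes in some unknown encoding

# 'latin-1' decoding never raises: all 256 byte values map one-to-one
# onto the first 256 Unicode code points.
text = data.decode('latin-1')

# Re-encoding with 'latin-1' is a lossless round-trip...
assert text.encode('latin-1') == data

# ...but transcoding anywhere else silently produces mojibake
# rather than an error.
assert text.encode('utf-8') != data
```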
The errors='surrogateescape' method is for tasks where any mojibake at all is a disaster, but occasional exceptions can be handled as they arise. Footnotes: [1] When this damned term is over in a few weeks, I'll take a look at the tutorial-level docs and see if I can come up with a gentle approach for those who are finding out for the first time that the locale-dependent default isn't good enough for them. From ncoghlan at gmail.com Wed Feb 15 09:03:03 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 Feb 2012 18:03:03 +1000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Feb 15, 2012 at 2:12 PM, Stephen J. Turnbull wrote: > Nick Coghlan writes: > > > using "ascii+surrogateescape" for your own I/O and setting > > "backslashreplace" on sys.stdout should cover you (and any > > exceptions you get will be warning you about cases where your > > original assumptions about not caring about Unicode validity have > > been proven wrong). > > Are you saying you know more than the user about her application? No, I'm merely saying that at least 3 options (latin-1, ascii+surrogateescape, chardet2) should be presented clearly to beginners and the trade-offs explained.
For example:

Task: Process data in any ASCII compatible encoding
Unicode Awareness Care Factor: None
Approach: Specify encoding="latin-1"
Bytes/bytearray: data.decode("latin-1")
Text files: open(fname, encoding="latin-1")
Stdin replacement: sys.stdin = io.TextIOWrapper(sys.stdin.buffer, "latin-1")
Stdout replacement (pipeline): sys.stdout = io.TextIOWrapper(sys.stdout.buffer, "latin-1", line_buffering=True)
Stdout replacement (terminal): Leave it alone

By decoding with latin-1, an application won't get *any* Unicode decoding errors, as that encoding maps byte values directly to the first 256 Unicode code points. However, any output data generated by that application *will* be corrupted if the assumption of ASCII compatibility is violated, or if implicit transcoding to any encoding other than "latin-1" occurs (e.g. when writing to sys.stdout or a log file, communicating over a network socket, or serialising the string with the json module). This is the closest Python 3 comes to emulating the permissive behaviour of Python 2's 8-bit strings (implicit interoperation with byte sequences is still disallowed).

Task: Process data in any ASCII compatible encoding
Unicode Awareness Care Factor: Minimal
Approach: Use encoding="ascii" and errors="surrogateescape" (or, alternatively, errors="backslashreplace" for sys.stdout)
Bytes/bytearray: data.decode("ascii", errors="surrogateescape")
Text files: open(fname, encoding="ascii", errors="surrogateescape")
Stdin replacement: sys.stdin = io.TextIOWrapper(sys.stdin.buffer, "ascii", "surrogateescape")
Stdout replacement (pipeline): sys.stdout = io.TextIOWrapper(sys.stdout.buffer, "ascii", "surrogateescape", line_buffering=True)
Stdout replacement (terminal): sys.stdout = io.TextIOWrapper(sys.stdout.buffer, sys.stdout.encoding, "backslashreplace", line_buffering=True)

Using "ascii+surrogateescape" instead of "latin-1" is a small initial step into the Unicode-aware world.
It still lets an application process any ASCII-compatible encoding *without* having to know the exact encoding of the source data, but will complain if there is an implicit attempt to transcode the data to another encoding, or if the application inserts non-ASCII data into the strings before writing them out. Whether non-ASCII compatible encodings trigger errors or get corrupted will depend on the specifics of the encoding and how the program manipulates the data. The "backslashreplace" error handler (enabled by default for sys.stderr, optionally enabled as shown above for sys.stdout) can be useful to help ensure that printing out strings will not trigger UnicodeEncodeErrors (note: the *repr* of strings already escapes non-ASCII characters internally, such that repr(x) == ascii(x). Thus, UnicodeEncodeErrors will occur only when encoding the string itself using the "strict" error handler, or when another library performs equivalent validation on the string).

Task: Process data in any ASCII compatible encoding
Unicode Awareness Care Factor: High
Approach: Use binary APIs and the "chardet2" module from PyPI to detect the character encoding
Bytes/bytearray: data.decode(detected_encoding)
Text files: open(fname, encoding=detected_encoding)

The *right* way to process text in an unknown encoding is to do your best to derive the encoding from the data stream. The "chardet2" module on PyPI allows this. Refer to that module's documentation (WHERE?) for details. With this approach, transcoding to the default sys.stdin and sys.stdout encodings should generally work (although the default restrictive character set on Windows and in some locales may cause problems). -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From niki.spahiev at gmail.com Wed Feb 15 09:52:45 2012 From: niki.spahiev at gmail.com (Niki Spahiev) Date: Wed, 15 Feb 2012 10:52:45 +0200 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 14.02.2012 23:08, Paul Moore wrote: > Maybe we could add a note to the open() > documentation, something like the following: > > """To open a file, you need to know its encoding. This is not always > obvious, depending on where the file came from, among other things. > Other tools can process files without knowing the encoding by assuming > the bytes of the file map 1-1 to the first 256 Unicode characters. > This can cause issues such as mojibake or corrupted data, but for > casual use is sometimes sufficient. To get this behaviour in Python > (with all the same risks and problems) you can use the "latin1" > encoding, which maps bytes to unicode as described above. It is far, > far better to use the correct encoding declaration, if at all > possible, however.""" IMHO it's better to make 'unknown' an encoding alias for 'latin1'. This way one can find and change it later. Niki From shibturn at gmail.com Wed Feb 15 12:16:46 2012 From: shibturn at gmail.com (shibturn) Date: Wed, 15 Feb 2012 11:16:46 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <20120214231011.6fce4b3b@bhuda.mired.org> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> Message-ID: On 15/02/2012 4:10am, Mike Meyer wrote: > I'd prefer to provide shm_open on Windows if at all possible. The > "sorta-kinda" bothers me. That would also allow for an application to > exit and then resume work stored in a mapped segment (something I've > done before).
However, setting this up on Windows isn't something I > can do. Maybe creating a file using CreateFile and FILE_ATTRIBUTE_TEMPORARY would have a similar effect - it hints to the system to avoid flushing to the disk. (os.open and O_TEMPORARY would not work because that also causes the file to be removed when all handles are closed.) sbt From solipsis at pitrou.net Wed Feb 15 13:34:19 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 15 Feb 2012 13:34:19 +0100 Subject: [Python-ideas] Adding shm_open to mmap? References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> Message-ID: <20120215133419.230ea8e6@pitrou.net> On Tue, 14 Feb 2012 23:10:11 -0500 Mike Meyer wrote: > > I'd prefer to provide shm_open on Windows if at all possible. The > "sorta-kinda" bothers me. That would also allow for an application to > exit and then resume work stored in a mapped segment (something I've > done before). The original discussion was about shared memory with multiprocessing. In that context, automatic collection of shared memory areas shouldn't be a problem. Regards Antoine. From shibturn at gmail.com Wed Feb 15 14:25:14 2012 From: shibturn at gmail.com (shibturn) Date: Wed, 15 Feb 2012 13:25:14 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <20120215133419.230ea8e6@pitrou.net> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> Message-ID: On 15/02/2012 12:34pm, Antoine Pitrou wrote: > The original discussion was about shared memory with multiprocessing. > In that context, automatic collection of shared memory areas shouldn't > be a problem. One problem with automatic collection is if you want to put a reference to an mmap on a queue. The mmap is likely to be disposed of before the target process can unpickle it.
sbt From phd at phdru.name Wed Feb 15 14:39:12 2012 From: phd at phdru.name (Oleg Broytman) Date: Wed, 15 Feb 2012 17:39:12 +0400 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <87haytyms7.fsf@benfinney.id.au> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> Message-ID: <20120215133912.GA17040@iskra.aviel.ru> On Wed, Feb 15, 2012 at 11:15:36AM +1100, Ben Finney wrote: > If people want to remain wilfully ignorant of text encoding in the third > millennium This returns us to the very beginning of the thread. The original complaint was: Python3 requires users to learn too much about unicode, more than they really need. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From christopherreay at gmail.com Wed Feb 15 09:14:17 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Wed, 15 Feb 2012 10:14:17 +0200 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: +1000 Great, let's do that Will I be repetitive if I say "can we put a link in the "UnicodeDecodeError" docstring? At the top of that page have "FOR BEGINNERS" or "Mugh, just make this error go away, Now", and this info from Nick Also link to all the other tons and tons of stuff that exists on UnicodeDecoding...
Chardet does nothing like the complex character set decoding that any of the browsers accomplish. Also, it almost always calls "latin-1" encoded files "latin-2" and "latin-someOtherNumber", which actually doesn't work to decode the data. The browsers can translate seemingly untouchable mush of mixed char encodings into UTF-8 (on my linux box) without hiccupping. I tried to emulate their behaviour for almost a week before I gave up. To be fair, I was at that time a char set newbie, and I guess I still am, though my scraper works properly. Christopher -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon.sapin at kozea.fr Wed Feb 15 14:41:46 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Wed, 15 Feb 2012 14:41:46 +0100 Subject: [Python-ideas] Py3 unicode impositions In-Reply-To: References: <87ty2yvwiq.fsf@uwakimon.sk.tsukuba.ac.jp> <4F374805.9000606@pearwood.info> <87zkcne1cn.fsf@uwakimon.sk.tsukuba.ac.jp> <87k43poix5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4F3BB61A.7020602@kozea.fr> Le 14/02/2012 22:08, Paul Moore a écrit : > Thinking about how I'd code something like "cat" naively in C (while > ((i = getchar()) != EOF) { putchar(i); }), I guess encoding=latin1 is > the way for Python to "work like everything else" in this context. Hi, The Python equivalent to your C program is to use bytes without decoding at all: open a file with 'rb' mode, use sys.stdin.buffer, ... I think this is the right thing to do if you want to pass through unmodified text without knowing the encoding. Regards, -- Simon Sapin From christopherreay at gmail.com Wed Feb 15 13:39:57 2012 From: christopherreay at gmail.com (Christopher Reay) Date: Wed, 15 Feb 2012 14:39:57 +0200 Subject: [Python-ideas] Adding shm_open to mmap?
In-Reply-To: <20120215133419.230ea8e6@pitrou.net> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> Message-ID: Do the people here want to shift over to the concurrency mailing list? Would be nicer in there with a few more people -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ronny.Pfannschmidt at gmx.de Wed Feb 15 15:32:10 2012 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Wed, 15 Feb 2012 15:32:10 +0100 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases Message-ID: <4F3BC1EA.3030002@gmx.de> Hi, in my experience, for many cases __repr__ and __str__ can unconditionally be represented as a simple string formatting operation, so I would propose to add an extension to support simply declaring them in the form of new-style format strings. A basic implementation for __repr__ could look like:

class SelfFormatter(string.Formatter):

    def __init__(self, obj):
        self.__obj = obj
        string.Formatter.__init__(self)

    def get_value(self, key, args, kwargs):
        if isinstance(key, str) and hasattr(self.__obj, key):
            return getattr(self.__obj, key)
        return string.Formatter.get_value(self, key, args, kwargs)


class SimpleReprMixing(object):
    _repr_ = '<{__class__.__name__} at 0x{__id__:x}>'

    def __repr__(self):
        formatter = SelfFormatter(self)
        return formatter.vformat(self._repr_, (), {'__id__': id(self)})

-- Ronny Pfannschmidt From nathan.alexander.rice at gmail.com Wed Feb 15 16:34:45 2012 From: nathan.alexander.rice at gmail.com (Nathan Rice) Date: Wed, 15 Feb 2012 10:34:45 -0500 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: <4F3BC1EA.3030002@gmx.de> References: <4F3BC1EA.3030002@gmx.de> Message-ID: I think that a generic __repr__ has been reinvented more times than I can count.
I don't think a generic __str__ is a good thing, as it is supposed to be pretty and semantically meaningful. I don't really see anywhere in the standard library that such a feature would make sense though. I feel like Python's standard library bloat actually makes the good stuff harder to find, and a better approach would be to have a minimal "core" standard library with a few "official" battery pack style libs that are very prominently featured and available. Since you might find this useful, here is my old __repr__ recipe (which has several issues, but gets the job done for the most part):

def get_attributes(o):
    attributes = [(a, getattr(o, a)) for a in set(dir(o)).difference(dir(object)) if a[0] != "_"]
    return {a[0]: a[1] for a in attributes if not callable(a[1])}

class ReprMixin(object):

    def _format(self, v):
        if isinstance(v, (basestring, date, time, datetime)):
            v = "'%s'" % v
            return v.encode("utf-8", errors="ignore")
        else:
            return v

    def __repr__(self):
        attribute_string = ", ".join("%s=%s" % (k[0], self._format(k[1])) for k in get_attributes(self).items())
        return "%s(%s)" % (type(self).__name__, attribute_string)

There are similar recipes in SQLAlchemy, and I've seen them in a few other popular libs that I can't remember off the top of my head. Nathan From ubershmekel at gmail.com Wed Feb 15 17:20:41 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Wed, 15 Feb 2012 18:20:41 +0200 Subject: [Python-ideas] Generators' and iterators' __add__ method Message-ID: Wouldn't it be nice to add generators and iterators like we can do with lists?

    def f():
        yield 1
        yield 2
        yield 3

    def g():
        yield 4
        yield 5

    # today
    for item in itertools.chain(f(), g()):
        print(item)

    # proposal
    for item in f() + g():
        print(item)

What do you guys think? Yuval Greenfield -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Wed Feb 15 17:41:04 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2012 08:41:04 -0800 Subject: [Python-ideas] Generators' and iterators' __add__ method In-Reply-To: References: Message-ID: It's been proposed many times, but always stumbled on the fact that the iterator protocol doesn't have a standard implementation -- each object implementing __next__ would have to be modified separately to also support __add__. On Wed, Feb 15, 2012 at 8:20 AM, Yuval Greenfield wrote:
> Wouldn't it be nice to add generators and iterators like we can do with
> lists?
>
>     def f():
>         yield 1
>         yield 2
>         yield 3
>
>     def g():
>         yield 4
>         yield 5
>
>     # today
>     for item in itertools.chain(f(), g()):
>         print(item)
>
>     # proposal
>     for item in f() + g():
>         print(item)
>
> What do you guys think?
>
> Yuval Greenfield
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- --Guido van Rossum (python.org/~guido) From ubershmekel at gmail.com Wed Feb 15 17:50:42 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Wed, 15 Feb 2012 18:50:42 +0200 Subject: [Python-ideas] Generators' and iterators' __add__ method In-Reply-To: References: Message-ID: On Wed, Feb 15, 2012 at 6:41 PM, Guido van Rossum wrote:
> It's been proposed many times, but always stumbled on the fact that
> the iterator protocol doesn't have a standard implementation -- each
> object implementing __next__ would have to be modified separately to
> also support __add__.

If it isn't a bad idea then we can at least do generators and whatever we find in the standard lib. I think I extrapolate from your response that it isn't a bad idea. I'll work on a patch. Cheers, Yuval -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Wed Feb 15 17:54:52 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Feb 2012 08:54:52 -0800 Subject: [Python-ideas] Generators' and iterators' __add__ method In-Reply-To: References: Message-ID: It IS a bad idea. --Guido van Rossum (sent from Android phone) On Feb 15, 2012 8:50 AM, "Yuval Greenfield" wrote: > On Wed, Feb 15, 2012 at 6:41 PM, Guido van Rossum wrote: > >> It's been proposed many times, but always stumbled on the fact that >> the iterator protocol doesn't have a standard implementation -- each >> object implementing __next__ would have to be modified separately to >> also support __add__. >> >> >> > If it isn't a bad idea then we can at least do generators and whatever we > find in the standard lib. > > I think I extrapolate from your response that it isn't a bad idea. I'll > work on a patch. > > Cheers, > > Yuval > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ehlesmes at gmail.com Wed Feb 15 18:02:57 2012 From: ehlesmes at gmail.com (Edward Lesmes) Date: Wed, 15 Feb 2012 12:02:57 -0500 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: References: <4F3BC1EA.3030002@gmx.de> Message-ID: On Wed, Feb 15, 2012 at 10:34 AM, Nathan Rice < nathan.alexander.rice at gmail.com> wrote: > I feel like Python's standard library bloat actually makes > the good stuff harder to find, and a better approach would be to have > a minimal "core" standard library with a few "official" battery pack > style libs that are very prominently featured and available. +1, but who decides what? -- Edward Lesmes -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Feb 15 18:02:50 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 15 Feb 2012 18:02:50 +0100 Subject: [Python-ideas] Adding shm_open to mmap? 
References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> Message-ID: <20120215180250.21a05ddf@pitrou.net> On Wed, 15 Feb 2012 13:25:14 +0000 shibturn wrote:
> On 15/02/2012 12:34pm, Antoine Pitrou wrote:
> > The original discussion was about shared memory with multiprocessing.
> > In that context, automatic collection of shared memory areas shouldn't
> > be a problem.
>
> One problem with automatic collection is if you want to put a reference
> to an mmap on a queue. The mmap is likely to be disposed of before the
> target process can unpickle it.

Can you elaborate? I would think the general use case is to keep an mmap alive as long as you need it, so I don't understand why someone would destroy an mmap just after sending it to another process. Regards Antoine. From shibturn at gmail.com Wed Feb 15 18:43:25 2012 From: shibturn at gmail.com (shibturn) Date: Wed, 15 Feb 2012 17:43:25 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <20120215180250.21a05ddf@pitrou.net> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> Message-ID: On 15/02/2012 5:02pm, Antoine Pitrou wrote:
> On Wed, 15 Feb 2012 13:25:14 +0000
> Can you elaborate? I would think the general use case is to keep an
> mmap alive as long as you need it, so I don't understand why someone
> would destroy an mmap just after sending it to another process.

A process which creates an mmap may want to transfer ownership of the mmap to another process along a pipeline. For example:

1) Process A creates an mmap
2) Process A does some work on mmap
3) Process A puts mmap on a queue.
4) mmap gets garbage collected in process A.
5) Process B gets mmap from queue.
...

With refcounting the mmap will be destroyed at step 4.
With shm_open/shm_unlink, it would be Process B's responsibility to unlink the file. This is the scenario which Sturla Molden was concerned with, although he hadn't thought through the premature disposal issue. sbt P.S. I have posted a possible implementation of shm_open/shm_unlink for Windows at http://mail.python.org/pipermail/concurrency-sig/2012-February/000058.html From ethan at stoneleaf.us Wed Feb 15 19:22:08 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 15 Feb 2012 10:22:08 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <87fwecopgr.fsf@uwakimon.sk.tsukuba.ac.jp> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <87fwecopgr.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4F3BF7D0.3030702@stoneleaf.us> Stephen J. Turnbull wrote: > While my opinions on this are (probably obviously) informed by the > WSGI discussion, this is not about making life come up roses for the > WSGI folks. They work in a sewer; life stinks for them, and all they > can do about it is to hold their noses. This thread is about people > who are not trying to handle sewage in a sanitary fashion, rather just > cook a meal and ignore the occasional hairs that inevitably fall in. +1 Picturesque QOTW From stephen at xemacs.org Wed Feb 15 19:40:25 2012 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Thu, 16 Feb 2012 03:40:25 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <878vk4ndnq.fsf@uwakimon.sk.tsukuba.ac.jp> It seems we once again agree violently on the principles. I think our differences here are mostly due to me giving a lot of attention to audience and presentation, and you focusing on the content of what to say. Re: spin control: Nick Coghlan writes: > No, I'm merely saying that at least 3 options (latin-1, > ascii+surrogateescape, chardet2) should be presented clearly to > beginners and the trade-offs explained. Are you defining "beginner" as "Python 2 programmer experienced in a multilingual context but new to Python 3"? My point is that, by other definitions of "beginner", I don't think the tradeoffs can be usefully explained to beginners without substantial discussion of the issues involved in ASCII vs. the encoding Babel vs. Unicode. Only in extreme cases where the beginner only cares about *never* getting a Unicode error, or only cares about *never* getting mojibake, will they be able to get much out of this. Re: descriptions > Task: Process data in any ASCII compatible encoding > Unicode Awareness Care Factor: None I don't understand what "Unicode awareness" means here. The degree to which Python will raise Unicode errors? The awareness of the programmer? > Approach: Specify encoding="latin-1" [...] > first 256 Unicode code points. However, any output data generated by > that application *will* be corrupted As advice, I think this is mostly false. 
In particular, unless you do language-specific manipulations (transforming particular words and the like), the Latin-N family is going to be 6-sigma interoperable with Latin-1, and the rest of the ISO 8859 and Windows-125x family tolerably so. This is why it is so hard to root out the "Python 3 is just Unicode-me-harder by another name" meme. The most you should say here is that data *may* be corrupted and that, depending on the program, the risk *may* be non-negligible for non-Latin-1 data if you ever encounter it. > Using "ascii+surrogateescape" instead of "latin-1" is a small initial > step into the Unicode-aware world. It still lets an application > process any ASCII-compatible encoding *without* having to know the > exact encoding of the source data, but will complain if there is an > implicit attempt to transcode the data to another encoding, That last line would be better "attempt to validate the data, or output it without an error-suppressing handler (which may occur implicitly, in a module your program uses)." > or if the application inserts non-ASCII data into the strings > before writing them out. Whether non-ASCII compatible encodings > trigger errors or get corrupted will depend on the specifics of the > encoding and how the program manipulates the data. You can be a little more precise: Non-ASCII-compatible encodings will trigger errors in the same circumstances as ASCII-compatible encodings. They also likely to be corrupted, but depending on the specifics of the encoding and how the program manipulates the data. I don't know if it's worth the extra verbosity, though. 
> Task: Process data in any ASCII compatible encoding
> Unicode Awareness Care Factor: High
> Approach: Use binary APIs and the "chardet2" module from PyPI to
> detect the character encoding
> Bytes/bytearray: data.decode(detected_encoding)
> Text files: open(fname, encoding=detected_encoding)
>
> The *right* way to process text in an unknown encoding is to do your
> best to derive the encoding from the data stream.

The claim of "right" isn't good advice. The *right* way to process text is to insist on knowing the encoding in advance. If you have to process text in unknown encodings, then what is "right" will vary with the application. For one thing, accurate detection is generally impossible without advice from outside. Given the inaccuracy of automatic detection, I would often prefer to fall back to a generic ASCII-compatible algorithm that omits any processing that requires identifying non-ASCII characters or inserting non-ASCII characters into the text stream, rather than risk mojibake. In other cases, all of the significant processing is done on ASCII characters, and non-ASCII is simply passed through verbatim. Then if you need to process text in assorted encodings, the 'latin1' method is not merely acceptable, it is the obvious winning strategy. And to some extent the environment:

> [T]he default restrictive character set on Windows and in some
> locales may cause problems.

In sum, naive use of chardet is most likely most effective as a way to rule out non-ASCII-compatible encodings, which *can* be done rather accurately (Shift JIS, Big5, UTF-16, and UTF-32 all have characteristic patterns of use of non-ASCII octets).
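To make that "ruling out" concrete: BOMs and embedded NUL bytes give the wide encodings away almost immediately. A rough sketch (my own toy heuristic for illustration, not chardet's actual algorithm):

```python
import codecs

def obviously_not_ascii_compatible(data):
    """Return a rough label if `data` cannot be ASCII-compatible, else None.

    Toy heuristic for illustration only -- not a real detector.
    """
    # Check the 4-byte UTF-32 BOMs before UTF-16 (the LE forms share a prefix).
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # BOM-less UTF-16/32 text still scatters NUL bytes through ASCII runs.
    if 0 in data[:1024]:
        return "utf-16/utf-32?"
    return None
```

Anything that comes back None still needs one of the ASCII-compatible treatments discussed above.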
From ned at nedbatchelder.com Wed Feb 15 19:41:16 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 15 Feb 2012 13:41:16 -0500 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: References: <4F3BC1EA.3030002@gmx.de> Message-ID: <4F3BFC4C.5000202@nedbatchelder.com> On 2/15/2012 10:34 AM, Nathan Rice wrote: > I feel like Python's standard library bloat actually makes > the good stuff harder to find, and a better approach would be to have > a minimal "core" standard library with a few "official" battery pack > style libs that are very prominently featured and available. If the only problem is "hard to find," then you need a documentation re-organization, not a change to what is shipped where. --Ned. From p.f.moore at gmail.com Wed Feb 15 19:51:29 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2012 18:51:29 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: I really like a task-oriented approach like this. +1000 for this sort of thing in the docs. On 15 February 2012 08:03, Nick Coghlan wrote: > Task: Process data in any ASCII compatible encoding This is actually closest to how I think about what I'm doing, so thanks for spelling it out. > Unicode Awareness Care Factor: High I'm not entirely sure how to interpret this - "High level of interest in getting it right" or "High amount of investment in understanding Unicode needed"? Or something else? 
> Approach: Use binary APIs and the "chardet2" module from PyPI to
> detect the character encoding
>     Bytes/bytearray: data.decode(detected_encoding)
>     Text files: open(fname, encoding=detected_encoding)

If this is going into the Unicode FAQ or somewhere similar, it probably needs a more complete snippet of sample code. Without having looked for and read the chardet2 documentation, do I need to read the file once in binary mode (possibly only partially) to scan it for an encoding, and then start again "for real"? That's arguably a downside to this approach.

> The *right* way to process text in an unknown encoding is to do your
> best to derive the encoding from the data stream. The "chardet2"
> module on PyPI allows this. Refer to that module's documentation
> (WHERE?) for details.

There is arguably another, simpler approach, which is to pick a default encoding (probably what Python gives you by default) and add a command line argument to your program (or equivalent if your program isn't a command line app) to manually specify an alternative. That's probably more complicated than the naive user wanted to deal with when they started reading this summary, but may well not sound so bad by the time they get to this point :-)

> With this approach, transcoding to the default sys.stdin and
> sys.stdout encodings should generally work (although the default
> restrictive character set on Windows and in some locales may cause
> problems).

A couple of other tasks spring to mind:

Task: Process data in a file whose encoding I don't know
Unicode Understanding Needed: Medium-Low
Unicode Correctness: High
Approach: Use external tools to identify the encoding, then simply specify it when opening the file. On Unix, "file -i FILENAME" will attempt to detect the encoding, on Windows, XXX. If, and only if, this approach doesn't identify the encoding clearly, then the other options allow you to do the best you can.
(Needs a better description of what tools to use, and maybe a sample Python script using chardet2 as a fallback). This is actually the "right way", and should be highlighted as such. By describing it this way, it's also rather clear that it's *not hard*, once you get over the idea that you don't know how to get the encoding, because it's not specified in the file. Having read through and extended Nick's analysis to this point, I'm thinking that it actually fits my use cases fine (and correct Unicode handling no longer feels like such a hard problem to me :-)) Task: Process data in a file believed to have inconsistent encodings Unicode Understanding Needed: High Unicode Correctness: Low Approach: ??? Panic :-) This is the killer, but should be extremely rare. We don't need to explain what to do here, but maybe offer a simple strategy (1. Are you sure the file has mixed encodings? Have you checked twice? 2. If it's ASCII-compatible, can you work on a basis that you just pass the mixed-encoding bytes through unchanged? If so use one of the other recipes Nick explained. 3. Do you care about mojibake or corruption? Can you afford not to? 4. Are you a Unicode expert, or do you know one? :-)) I think something like this would be a huge benefit for the Unicode FAQ. I haven't got the time or expertise to write it, but I wish I did. If I get some spare time, I might well have a go anyway, but I can't promise. 
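Something along the lines of the following is what I have in mind -- treat detection as advisory, with an explicit user override winning (the function name and fallback behaviour here are my own invention; it assumes the chardet/chardet2 detect() interface):

```python
def choose_encoding(sample, override=None, default="utf-8"):
    """Pick an encoding for `sample` (bytes read in binary mode).

    An explicit user-supplied override always wins; detection is only a
    fallback, and `default` is the last resort if chardet isn't
    installed or comes up empty.
    """
    if override:
        return override
    try:
        import chardet  # the Python 3 fork was published as chardet2
    except ImportError:
        return default
    guess = chardet.detect(sample)
    return guess.get("encoding") or default
```

The caller would read a few KB in binary mode, call choose_encoding() on it, then reopen the file in text mode with the result -- which also answers my "read the file twice" question above.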
Paul From nathan.alexander.rice at gmail.com Wed Feb 15 19:54:15 2012 From: nathan.alexander.rice at gmail.com (Nathan Rice) Date: Wed, 15 Feb 2012 13:54:15 -0500 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: <4F3BFC4C.5000202@nedbatchelder.com> References: <4F3BC1EA.3030002@gmx.de> <4F3BFC4C.5000202@nedbatchelder.com> Message-ID: >> I feel like Python's standard library bloat actually makes >> the good stuff harder to find, and a better approach would be to have >> a minimal "core" standard library with a few "official" battery pack >> style libs that are very prominently featured and available. > > If the only problem is "hard to find," then you need a documentation > re-organization, not a change to what is shipped where. I think the documentation is pretty well organized overall. There is an issue of irreducible complexity though; someone that is searching for a specific thing wouldn't care, but a newer user trying to get their bearings on how this python thing works by browsing the standard lib probably would. Additionally, decoupling modules from the interpreter release schedule would probably be a good thing. 
Nathan From shibturn at gmail.com Wed Feb 15 20:53:16 2012 From: shibturn at gmail.com (shibturn) Date: Wed, 15 Feb 2012 19:53:16 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 15/02/2012 6:51pm, Paul Moore wrote: > Task: Process data in a file whose encoding I don't know > Unicode Understanding Needed: Medium-Low > Unicode Correctness: High > Approach: Use external tools to identify the encoding, then simply > specify it when opening the file. On Unix, "file -i FILENAME" will > attempt to detect the encoding, on Windows, XXX. If, and only if, this > approach doesn't identify the encoding clearly, then the other options > allow you to do the best you can. Don't recommend "file -i". I just tried it on the files in /usr/share/libtextcat/ShortTexts/. Basically, everything is identified as us-ascii, iso-8859-1 or unknown-8bit. 
Examples:

chinese-big5.txt: text/plain; charset=iso-8859-1
chinese-gb2312.txt: text/plain; charset=iso-8859-1
japanese-euc_jp.txt: text/plain; charset=iso-8859-1
korean.txt: text/plain; charset=iso-8859-1
arabic-windows1256.txt: text/plain; charset=iso-8859-1
georgian.txt: text/plain; charset=iso-8859-1
greek-iso8859-7.txt: text/plain; charset=iso-8859-1
hebrew-iso8859_8.txt: text/plain; charset=iso-8859-1
russian-windows1251.txt: text/plain; charset=iso-8859-1
ukrainian-koi8_r.txt: text/plain; charset=iso-8859-1

sbt From greg.ewing at canterbury.ac.nz Wed Feb 15 22:44:20 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2012 10:44:20 +1300 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3AE675.6010907@mrabarnett.plus.com> References: <871uq3xeq7.fsf@uwakimon.sk.tsukuba.ac.jp> <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> Message-ID: <4F3C2734.9060807@canterbury.ac.nz> MRAB wrote:
> encoding="mojibake" # :-)

+1 -- Greg From tjreedy at udel.edu Wed Feb 15 23:39:31 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 15 Feb 2012 17:39:31 -0500 Subject: [Python-ideas] Generators' and iterators' __add__ method In-Reply-To: References: Message-ID: On 2/15/2012 11:20 AM, Yuval Greenfield wrote:
> Wouldn't it be nice to add generators and iterators like we can do with
> lists?

That is simply not possible. list1+list2 is a list; tuple1+tuple2 is a tuple; list1+tuple2 is an error! What type would iterable1 + iterable2 be? Answer: a chain object!

> def f():
>     yield 1
>     yield 2
>     yield 3
>
> def g():
>     yield 4
>     yield 5
>
> # today
> for item in itertools.chain(f(), g()):
>     print(item)

The itertools module is the proper place for generic operations on iterables (and not just iterators!).
>>> import itertools as it
>>> list(it.chain([1,2,3], (4,5.6), range(7,10)))
[1, 2, 3, 4, 5.6, 7, 8, 9]

Chain 'adds' mixed types and is not limited to binary scope. If we were starting fresh today, we *might* consider making the functions that went into itertools into methods of an iterable ABC. But making every function a method is not Python's style. Indeed, exposing generic methods like __len__ as functions is more common. -- Terry Jan Reedy From anacrolix at gmail.com Wed Feb 15 23:52:55 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Thu, 16 Feb 2012 06:52:55 +0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120215133912.GA17040@iskra.aviel.ru> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> Message-ID: The thread was about reasons for a possible drop in popularity. Somehow the other reasons have been sabotaged, leaving only the unicode discussion still alive. On Feb 15, 2012 9:39 PM, "Oleg Broytman" wrote:
> On Wed, Feb 15, 2012 at 11:15:36AM +1100, Ben Finney wrote:
> > If people want to remain wilfully ignorant of text encoding in the third
> > millennium
>
> This returns us to the very beginning of the thread. The original
> complaint was: Python3 requires users to learn too much about unicode,
> more than they really need.
>
> Oleg.
> --
> Oleg Broytman http://phdru.name/ phd at phdru.name
> Programmers don't die, they just GOSUB without RETURN.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part -------------- An HTML attachment was scrubbed...
URL: From cs at zip.com.au Thu Feb 16 00:07:49 2012 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 16 Feb 2012 10:07:49 +1100 Subject: [Python-ideas] Unicode surrogateescape [was: Re: Python 3000 TIOBE -3%] In-Reply-To: <04A64366-3F31-40AF-9E84-FFB3C3C1E690@gmail.com> References: <04A64366-3F31-40AF-9E84-FFB3C3C1E690@gmail.com> Message-ID: <20120215230749.GA7352@cskk.homeip.net> On 14Feb2012 10:17, Carl M. Johnson wrote: | On Feb 14, 2012, at 10:04 AM, Jim Jewett wrote: | > But is there a good reason not to change the default errorhandler to | > errors="surrogateescape"? | | It's a conflict in the Zen: | | > Errors should never pass silently. | > Unless explicitly silenced. | | OK, so default to strict. But: Yes. | > Although practicality beats purity. | | Hmm, so maybe do use surrogates. Then again: No. Adding errors="surrogateescape" when needed is easy enough not to be impractical. (Also, it clearly flags in the code that we won't always get what we expect/hope.) | > In the face of ambiguity, refuse the temptation to guess. | | Grr, I'm not nearly Dutch enough to make sense of this logical conflict! I'm not Dutch either (I can never remember which way P and V go in semaphore operations, for example). However, the logic I would use is very simple: I should know the encoding of these bytes. If I don't, and I merely have to suck them in and spit them back out again as bytes undamaged (such as when reading filesystem filenames, which can often be treated as opaque tokens), use errors="surrogateescape". Otherwise, arrange to know the encoding (or have enough fiat to declare one, preferably utf-8). errors="surrogateescape" is for lossless but usually "blind" decode/encode. The rest of the time it would be better to know what you're doing. 
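A tiny round trip shows the mechanics (the sample bytes are made up):

```python
raw = b"name=caf\xe9\n"   # bytes in some unknown ASCII-compatible encoding
text = raw.decode("utf-8", errors="surrogateescape")
# The undecodable 0xE9 byte is smuggled through as the lone surrogate
# U+DCE9, so ASCII-level processing still works...
key, value = text.rstrip("\n").split("=")
# ...and the same handler on the way out restores the bytes exactly.
assert text == "name=caf\udce9\n"
assert key == "name"
assert text.encode("utf-8", errors="surrogateescape") == raw
```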
Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ We don't just *borrow* words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary. - James D. Nicoli From p.f.moore at gmail.com Thu Feb 16 00:15:18 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 15 Feb 2012 23:15:18 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 15 February 2012 19:53, shibturn wrote: > Don't recommend "file -i". Fair enough - I have no experience to comment one way or another. it was just something I'd seen mentioned in the thread. If there isn't a good standard encoding detector, maybe a small Python script using chardet2 would be the best thing to recommend... Paul. From steve at pearwood.info Thu Feb 16 00:26:50 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 16 Feb 2012 10:26:50 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> Message-ID: <4F3C3F3A.4090408@pearwood.info> Matt Joiner wrote: > The thread was reasons for a possible drop in popularity. Somehow the other > reasons have been sabotaged leaving only the unicode discussion still alive. Not so much sabotaged as ignored. 
Perhaps because we don't believe this alleged drop in popularity represents anything real, while the Unicode issue is a genuine problem that needs a solution. -- Steven From steve at pearwood.info Thu Feb 16 00:43:33 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 16 Feb 2012 10:43:33 +1100 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: <4F3BC1EA.3030002@gmx.de> References: <4F3BC1EA.3030002@gmx.de> Message-ID: <4F3C4325.2070000@pearwood.info> Ronny Pfannschmidt wrote:
> Hi,
>
> in my experience for many cases, __repr__ and __str__ can be
> unconditionally be represented as simple string formatting operation,

In my experience, not so much.

> so i would propose to add a extension to support simply declaring them
> in the form of newstyle format strings

Declare them how? What is your proposed API for using this new functionality? Before proposing an implementation, you should propose an interface.

> a basic implementation for __repr__ could look like:
>
> class SelfFormatter(string.Formatter):
>     def __init__(self, obj):
>         self.__obj = obj
>         string.Formatter.__init__(self)
>
>     def get_value(self, key, args, kwargs):
>         if isinstance(key, str) and hasattr(self.__obj, key):
>             return getattr(self.__obj, key)
>         return Formatter.get_value(self, key, args, kwargs)
>
> class SimpleReprMixing(object):
>     _repr_ = '<{__class__.__name__} at 0x{__id__!x}>'
>     def __repr__(self):
>         formatter = SelfFormatter(self)
>         return formatter.vformat(self._repr_, (), {'__id__':id(self)})

I don't think you need this just to get a generic "instance at id" string. If you inherit from object (and remember that all classes inherit from object in Python 3) you get this for free:

>>> class K(object):
...     pass
...
>>> k = K()
>>> str(k)
'<__main__.K object at 0xb746068c>'

-- Steven From ncoghlan at gmail.com Thu Feb 16 01:03:38 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 Feb 2012 10:03:38 +1000 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: References: <4F3BC1EA.3030002@gmx.de> Message-ID: On Thu, Feb 16, 2012 at 1:34 AM, Nathan Rice wrote:
> I think that a generic __repr__ has been reinvented more times than I
> can count. I don't think a generic __str__ is a good thing, as it is
> supposed to be a pretty, semantically meaningful. I don't really see
> anywhere in the standard library that such a feature would make sense
> though.

Python 3's reprlib already provides some tools for writing well-behaved __repr__ implementations (specifically, the reprlib.recursive_repr decorator that handles cycles in container representations). I actually have a recipe for simple "cls(arg1, arg2, arg3)" style __repr__ output on Stack Overflow: http://stackoverflow.com/questions/7072938/including-a-formatted-iterable-as-part-of-a-larger-formatted-string

> There is a similar recipes in SQL Alchemy, and I've seen them in a few
> other popular libs that I can't remember off the top of my head.

The unfortunate part of dict-based __repr__ implementations is that the order of the parameter display is technically arbitrary. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From greg.ewing at canterbury.ac.nz Thu Feb 16 02:37:12 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2012 14:37:12 +1300 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120215133912.GA17040@iskra.aviel.ru> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> Message-ID: <4F3C5DC8.707@canterbury.ac.nz> On 16/02/12 02:39, Oleg Broytman wrote: > On Wed, Feb 15, 2012 at 11:15:36AM +1100, Ben Finney wrote: >> If people want to remain wilfully ignorant of text encoding in the third >> millennium > > This returns us to the very beginning of the thread. The original > complain was: Python3 requires users to learn too much about unicode, > more than they really need. I don't think it's helpful to label everyone who wants to use the techniques being discussed here as lazy or ignorant. As we've seen, there are cases where you truly *can't* know the true encoding, and at the same time it *doesn't matter*, because all you want to do is treat the unknown bytes as opaque data. To tell someone in that position that they're being lazy is both wrong and insulting. It seems to me that what surrogateescape is effectively doing is creating a new data type that consists of a mixture of ASCII characters and raw bytes, and enables you to tell which is which. Maybe there should be a real data type like this, or a flag on the unicode type. The data would be stored in the same way as a latin1-decoded string, but anything with the high bit set would be regarded as a byte instead of a character. This might make it easier to interoperate with external libraries that expect well-formed unicode. 
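Even without a separate type, the two kinds of item are already distinguishable after a surrogateescape decode, since the escape mechanism parks undecodable bytes in a reserved range -- e.g. (sample bytes are mine):

```python
raw = b"ok:\xff\xfe"
s = raw.decode("ascii", errors="surrogateescape")

def is_raw_byte(ch):
    # surrogateescape maps bytes 0x80-0xFF to lone surrogates U+DC80-U+DCFF
    return 0xDC80 <= ord(ch) <= 0xDCFF

flags = [is_raw_byte(c) for c in s]
```

Here flags comes out [False, False, False, True, True]: the last two items are raw bytes, the rest characters. A real mixed type could make that distinction first-class instead of an ord() check.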
-- Greg From sturla at molden.no Thu Feb 16 02:40:43 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 16 Feb 2012 02:40:43 +0100 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> Message-ID: <6E24498E-EB73-46EA-9508-BC4279762B74@molden.no> > > P.S. I have posted a possible implementation of shm_open/shm_unlink for Windows at > > http://mail.python.org/pipermail/concurrency-sig/2012-February/000058.html > > A temporary file is not backed shared memory on Windows, but is a persistent file on disk. You have to mmap from the OS' paging file to get shared memory. Sturla From greg.ewing at canterbury.ac.nz Thu Feb 16 02:46:23 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2012 14:46:23 +1300 Subject: [Python-ideas] Generators' and iterators' __add__ method In-Reply-To: References: Message-ID: <4F3C5FEF.5090508@canterbury.ac.nz> On 16/02/12 05:20, Yuval Greenfield wrote: > Wouldn't it be nice to add generators and iterators like we can do with lists? > > for item in f() + g(): > print(item) No. Then every iterator would be expected to implement __add__, including all the ones already written. It would also clash with existing meanings of __add__ on some types, such as NumPy arrays. -- Greg From greg.ewing at canterbury.ac.nz Thu Feb 16 02:48:51 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2012 14:48:51 +1300 Subject: [Python-ideas] Generators' and iterators' __add__ method In-Reply-To: References: Message-ID: <4F3C6083.9080004@canterbury.ac.nz> On 16/02/12 05:50, Yuval Greenfield wrote: > If it isn't a bad idea then we can at least do generators and whatever we find > in the standard lib. It's only a good idea if it applies universally, otherwise code that relies on it would be fragile. 
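The concatenation being requested already exists today as itertools.chain, which works uniformly without requiring any iterator type to grow an __add__ method:

```python
import itertools

def f():
    yield 1
    yield 2

def g():
    yield 3
    yield 4

# Equivalent in effect to the proposed `for item in f() + g()`, but
# explicit and universal: it accepts generators, lists, files, or any
# mix of iterables, including types whose __add__ already means
# something else (e.g. NumPy arrays).
for item in itertools.chain(f(), g()):
    print(item)
```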
-- Greg From greg.ewing at canterbury.ac.nz Thu Feb 16 02:56:22 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Feb 2012 14:56:22 +1300 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> Message-ID: <4F3C6246.7000509@canterbury.ac.nz> On 16/02/12 06:43, shibturn wrote: > A process which creates an mmap may want to transfer ownership of the mmap to > another process along a pipeline. For example: > > 1) Process A creates an mmap > 2) Process A does some work on mmap > 3) Process A puts mmap on a queue. > 4) mmap gets garbage collected in process A. > 5) Process B gets mmap from queue. I don't know about Windows, but in Unix it's possible to send a file descriptor from one process to another over a unix-domain socket connection. So a refcounted anonymous mmap handover could be achieved this way: 1. Process A creates a temp file, mmaps it and unlinks it. 2. Process A sends the file descriptor to process B over a unix-domain socket. 3. Process B mmaps it. Even if process A closes its version of the fd right after sending it, the OS should keep it alive while it's in transit, I think. 
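The handover Greg outlines in steps 1-3 maps onto SCM_RIGHTS ancillary data; a minimal sketch using the socket.sendmsg()/recvmsg() calls added for Python 3.3, demonstrated inside one process via a socketpair (across real processes the same calls work over any unix-domain connection):

```python
import array
import os
import socket
import tempfile

def send_fd(sock, fd):
    # SCM_RIGHTS ancillary data makes the kernel duplicate the
    # descriptor into the process on the other end of the socket.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                           array.array("i", [fd]))])

def recv_fd(sock):
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    level, ctype, data = ancdata[0]
    fds.frombytes(data[:fds.itemsize])
    return fds[0]

left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
with tempfile.TemporaryFile() as f:
    f.write(b"shared")
    f.flush()
    send_fd(left, f.fileno())
    received = recv_fd(right)

# The original file object is closed now, but the duplicated
# descriptor keeps the (soon-to-be-unlinked) file alive.
os.lseek(received, 0, os.SEEK_SET)
print(os.read(received, 6))  # b'shared'
os.close(received)
left.close()
right.close()
```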
-- Greg From steve at pearwood.info Thu Feb 16 05:08:39 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 16 Feb 2012 15:08:39 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3C5DC8.707@canterbury.ac.nz> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> Message-ID: <20120216040839.GA3048@ando> On Thu, Feb 16, 2012 at 02:37:12PM +1300, Greg Ewing wrote: > On 16/02/12 02:39, Oleg Broytman wrote: > >On Wed, Feb 15, 2012 at 11:15:36AM +1100, Ben Finney wrote: > >>If people want to remain wilfully ignorant of text encoding in the third > >>millennium > > > > This returns us to the very beginning of the thread. The original > >complain was: Python3 requires users to learn too much about unicode, > >more than they really need. > > I don't think it's helpful to label everyone who wants to use the > techniques being discussed here as lazy or ignorant. As we've seen, > there are cases where you truly *can't* know the true encoding, > and at the same time it *doesn't matter*, because all you want to > do is treat the unknown bytes as opaque data. To tell someone in > that position that they're being lazy is both wrong and insulting. In fairness, this thread was originally started with the scenario "I'm reading files which are only mostly ASCII, but I don't want to learn about Unicode" rather than "I know about Unicode, but it doesn't help me in this situation because the encoding truly is unknown". So wilful ignorance does apply, at least in the use-case the thread started with. (If it helps, think of them as too busy to learn, not too lazy.) If you already know about Unicode, then you probably don't need to be given a simple recipe to follow, because you probably already have a solution that works for you. 
Which brings us back to the original use-case: "I have a file which is only mostly ASCII, and I don't care to learn about Unicode at this time to deal with it. I need a recipe I can follow that will do the right-thing so I can continue to ignore the issue for a little longer." I don't think that we should either insist that these people be forced to learn Unicode, nor expect to be able to solve every possible problem they might find. A couple of recipes in the FAQs, and discussion of why you might prefer one to the other, should be able to cover most simple cases: open(filename, encoding='ascii', errors='surrogateescape') open(filename, encoding='latin1') Both recipes hint at the wider world of encodings and error handlers, hence act as a non-threatening introduction to Unicode. -- Steven From storchaka at gmail.com Thu Feb 16 07:24:25 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 16 Feb 2012 08:24:25 +0200 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: References: <4F3BC1EA.3030002@gmx.de> Message-ID: 16.02.12 02:03, Nick Coghlan wrote: > The unfortunate part of dict-based __repr__ implementations is that > the order of the parameter display is technically arbitrary. Not for OrderedDict. From Ronny.Pfannschmidt at gmx.de Thu Feb 16 07:54:06 2012 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Thu, 16 Feb 2012 07:54:06 +0100 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: <4F3C4325.2070000@pearwood.info> References: <4F3BC1EA.3030002@gmx.de> <4F3C4325.2070000@pearwood.info> Message-ID: <4F3CA80E.1060208@gmx.de> On 02/16/2012 12:43 AM, Steven D'Aprano wrote: > Ronny Pfannschmidt wrote: >> Hi, >> >> in my experience for many cases, __repr__ and __str__ can be >> unconditionally be represented as simple string formatting operation, > > In my experience, not so much.
> > >> so i would propose to add a extension to support simply declaring them >> in the form of newstyle format strings > > Declare them how? What is your proposed API for using this new > functionality? Before proposing an implementation, you should propose an > interface. > > >> a basic implementation for __repr__ could look like: >> >> class SelfFormatter(string.Formatter): >> def __init__(self, obj): >> self.__obj = obj >> string.Formatter.__init__(self) >> >> def get_value(self, key, args, kwargs): >> if isinstance(key, str) and hasattr(self.__obj, key): >> return getattr(self.__obj, key) >> return Formatter.get_value(self, key, args, kwargs) >> >> class SimpleReprMixing(object): >> _repr_ = '<{__class__.__name__} at 0x{__id__!x}>' >> def __repr__(self): >> formatter = SelfFormatter(self) >> return formatter.vformat(self._repr_, (), {'__id__':id(self)}) > > > I don't think you need this just to get a generic "instance at id" > string. If you inherit from object (and remember that all classes > inherit from object in Python 3) you get this for free: > > >>> class K(object): > ... pass > ... > >>> k = K() > >>> str(k) > '<__main__.K object at 0xb746068c>' > > > seems like you completely missed that class-level attributes can easily be redefined by subclasses like class User(SimpleReprMixin): _repr_ = "" ... the implementation and the interface is pretty simple and straightforward From ncoghlan at gmail.com Thu Feb 16 08:04:11 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 Feb 2012 17:04:11 +1000 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: References: <4F3BC1EA.3030002@gmx.de> Message-ID: On Thu, Feb 16, 2012 at 4:24 PM, Serhiy Storchaka wrote: > 16.02.12 02:03, Nick Coghlan wrote: > >> The unfortunate part of dict-based __repr__ implementations is that >> the order of the parameter display is technically arbitrary. > > > Not for OrderedDict.
Yes, but relying on OrderedDict rules out using keyword arguments to make any API easier to use. At that point, it's generally simpler for people to write their own repr that does exactly what they want. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From techtonik at gmail.com Thu Feb 16 08:04:54 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 16 Feb 2012 10:04:54 +0300 Subject: [Python-ideas] Generators' and iterators' __add__ method In-Reply-To: <4F3C5FEF.5090508@canterbury.ac.nz> References: <4F3C5FEF.5090508@canterbury.ac.nz> Message-ID: On Thu, Feb 16, 2012 at 4:46 AM, Greg Ewing wrote: > On 16/02/12 05:20, Yuval Greenfield wrote: > >> Wouldn't it be nice to add generators and iterators like we can do with >> lists? >> > > > >> for item in f() + g(): >> print(item) >> > > No. Then every iterator would be expected to implement __add__, > including all the ones already written. > > It would also clash with existing meanings of __add__ on some > types, such as NumPy arrays. Good example for Python Ideas FAQ. -- anatoly t. From ncoghlan at gmail.com Thu Feb 16 08:13:47 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 Feb 2012 17:13:47 +1000 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: <4F3CA80E.1060208@gmx.de> References: <4F3BC1EA.3030002@gmx.de> <4F3C4325.2070000@pearwood.info> <4F3CA80E.1060208@gmx.de> Message-ID: On Thu, Feb 16, 2012 at 4:54 PM, Ronny Pfannschmidt wrote: > the implementation and the interface is pretty simple and straightforward However, the question is whether it's simple and straightforward enough to be worth standardising. There are a few common patterns that recur in repr implementations: - based (the object.__repr__ default) - cls(arg1, arg2...) positional argument based - cls(kwd1=arg1, kwd2=arg2...)
keyword argument based - a mixture of the previous two options Adding some helpers along those lines to reprlib may make sense, but a meaningful ReprMixin places stronger constraints on the relationship between class attributes and the desired repr output than is appropriate for the standard library. It's way too idiosyncratic across developers to be worth promoting in the stdlib (a project or domain specific class hierarchy is a different story, but that's not relevant for a general purpose ReprMixin proposal). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From Ronny.Pfannschmidt at gmx.de Thu Feb 16 08:20:06 2012 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Thu, 16 Feb 2012 08:20:06 +0100 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: References: <4F3BC1EA.3030002@gmx.de> <4F3C4325.2070000@pearwood.info> <4F3CA80E.1060208@gmx.de> Message-ID: <4F3CAE26.4020201@gmx.de> On 02/16/2012 08:13 AM, Nick Coghlan wrote: > On Thu, Feb 16, 2012 at 4:54 PM, Ronny Pfannschmidt > wrote: >> the implementation and the interface is pretty simple and straightforward > > However, the question is whether it's simple and straightforward > enough to be worth standardising. > > There are a few common patterns that recur in repr implementations: > > - based (the object.__repr__ default) > > - cls(arg1, arg2...) positional argument based > > - cls(kwd1=arg1, kwd2=arg2...) keyword argument based > > - a mixture of the previous two options > > Adding some helpers along those lines to reprlib may make sense, but a > meaningful ReprMixin places stronger constraints on the relationship > between class attributes and the desired repr output than is > appropriate for the standard library.
It's way too idiosyncratic > across developers to be worth promoting in the stdlib (a project or > domain specific class hierarchy is a different story, but that's not > relevant for a general purpose ReprMixin proposal). instead of the ReprMixin, how about a descriptor the Api would change to something like class User(object): __repr__ = FormatRepr(' its less concise, but actually more straightforward and better decoupled (thanks for hinting at the strong coupling in the class hierarchy) and ArgRepr and KwargRepr could be added in a similar fashion > Cheers, > Nick. > From stephen at xemacs.org Thu Feb 16 08:49:47 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 16 Feb 2012 16:49:47 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3C5DC8.707@canterbury.ac.nz> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> Message-ID: <877gznnrok.fsf@uwakimon.sk.tsukuba.ac.jp> Greg Ewing writes: > Maybe there should be a real data type [parallel to str and bytes > that mixes str and bytes], or a flag on the unicode type. -1. This is yesterday's problem. It still hurts today; we need workarounds. But it's going to be less and less important as time goes on, because nobody can afford one-locale software anymore, and the cheapest way to be multilocale is to process in Unicode, and insist on Unicode on input and output. The unknown encoding problem is not one with a generally acceptable solution. That's why Unicode was invented. To "solve" the problem by ensuring it doesn't occur in the first place.
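A minimal sketch of what the FormatRepr descriptor proposed by Ronny a couple of messages up could look like; the class body and the template string are assumptions, since the original example string was lost in transit:

```python
class FormatRepr:
    """Hypothetical descriptor building __repr__ from a format string.

    Template fields are resolved against the instance; the field
    names used here are illustrative, not from the original post.
    """

    def __init__(self, template):
        self.template = template

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # repr() looks __repr__ up on the type, which triggers this
        # descriptor; return a zero-argument callable bound to obj.
        def bound_repr():
            return self.template.format(self=obj,
                                        cls=type(obj).__name__,
                                        id=id(obj))
        return bound_repr

class User:
    __repr__ = FormatRepr("<{cls} {self.name!r} at 0x{id:x}>")

    def __init__(self, name):
        self.name = name

print(repr(User("alice")))  # e.g. <User 'alice' at 0x7f3a...>
```

Because it is a plain descriptor rather than a mixin, it composes with any class hierarchy, which is the decoupling argument made above.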
From p.f.moore at gmail.com Thu Feb 16 13:59:26 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 16 Feb 2012 12:59:26 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120216040839.GA3048@ando> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <20120216040839.GA3048@ando> Message-ID: On 16 February 2012 04:08, Steven D'Aprano wrote: > On 16/02/12 02:39, Oleg Broytman wrote: >> I don't think it's helpful to label everyone who wants to use the >> techniques being discussed here as lazy or ignorant. As we've seen, >> there are cases where you truly *can't* know the true encoding, >> and at the same time it *doesn't matter*, because all you want to >> do is treat the unknown bytes as opaque data. To tell someone in >> that position that they're being lazy is both wrong and insulting. > > In fairness, this thread was originally started with the scenario "I'm > reading files which are only mostly ASCII, but I don't want to learn > about Unicode" rather than "I know about Unicode, but it doesn't help me > in this situation because the encoding truly is unknown". So wilful > ignorance does apply, at least in the use-case the thread started with. > (If it helps, think of them as too busy to learn, not too lazy.) As the person who started the thread with this use case, I'd dispute that description of what I said. To restate it "I'm reading files which are mostly ASCII but not all. I know that I should identify the encoding, and what to do if I did know the encoding, but I'm not sure how to find out reliably what the encoding is. 
Also, the problem doesn't really warrant investing the time needed to research means of doing so - given that I don't need to process the non-ASCII, I just want to avoid decoding errors and not corrupt the data". I'm not lazy, I've just done a cost/benefit analysis and determined that my limited knowledge should be enough. Experience with other tools which aren't as strict as Python 3 on Unicode matters confirms that a "good enough" job does satisfy my needs. And I'm not willfully ignorant, I actually have a good feel for Unicode and the issues involved, and I certainly know what's right. I've just found that everything I've read assumes that "knowing the encoding" isn't hard - and my experience differs, so I don't know where to go for answers. Add to this the fact that I *know* I've seen supposed text files with mixed encoding content, and no-one has *ever* explained how to handle that (it's basically a damaged file, and so all the "right way to deal with Unicode" discussions ignore it) even though tools like grep and awk do a perfectly acceptable job to the level I care about. I'm very pleased with the way this thread has gone, because it has answered all of the questions I've had about "nearly-ASCII" text files. But there's no way I'd have expected to spend this much time, and involve this many other people with more knowledge than me, just to handle my original changelog-parsing problem that I could do in awk or Python 2 in about 5 minutes. Now, I could also do it in Python 3. But then, I couldn't. Hopefully the knowledge from this thread can be captured so that other people can avoid my dilemma. OK, so maybe I do feel somewhat insulted... Cheers, Paul. 
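For the record, the grep/awk-style job Paul describes does fit in a few lines of Python 3 using the errors='surrogateescape' recipe discussed in this thread; the changelog format and file contents below are invented for illustration:

```python
import os
import tempfile

def entries(path, tag="* "):
    # Decode as ASCII, but smuggle any undecodable bytes through as
    # surrogates so they can be re-encoded byte-for-byte later.
    with open(path, encoding="ascii", errors="surrogateescape") as f:
        for line in f:
            if line.startswith(tag):
                yield line[len(tag):].rstrip("\n")

# Hypothetical changelog containing one latin-1 author name.
tmp = tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False)
tmp.write(b"* fix parser (J\xf6rg)\n  details...\n* bump version\n")
tmp.close()

for entry in entries(tmp.name):
    print(entry.encode("ascii", "surrogateescape"))
# b'fix parser (J\xf6rg)'
# b'bump version'

os.unlink(tmp.name)
```

The non-ASCII bytes are never interpreted, only passed through, so nothing is corrupted and no encoding ever has to be guessed.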
From steve at pearwood.info Thu Feb 16 14:44:25 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 17 Feb 2012 00:44:25 +1100 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <20120216040839.GA3048@ando> Message-ID: <4F3D0839.2070802@pearwood.info> Paul Moore wrote: > On 16 February 2012 04:08, Steven D'Aprano wrote: >> On 16/02/12 02:39, Oleg Broytman wrote: >>> I don't think it's helpful to label everyone who wants to use the >>> techniques being discussed here as lazy or ignorant. As we've seen, >>> there are cases where you truly *can't* know the true encoding, >>> and at the same time it *doesn't matter*, because all you want to >>> do is treat the unknown bytes as opaque data. To tell someone in >>> that position that they're being lazy is both wrong and insulting. >> In fairness, this thread was originally started with the scenario "I'm >> reading files which are only mostly ASCII, but I don't want to learn >> about Unicode" rather than "I know about Unicode, but it doesn't help me >> in this situation because the encoding truly is unknown". So wilful >> ignorance does apply, at least in the use-case the thread started with. >> (If it helps, think of them as too busy to learn, not too lazy.) > > As the person who started the thread with this use case, I'd dispute > that description of what I said. I am sorry, I spoke poorly. Apologies if you feel I misrepresented you. To be honest, this thread has been so large, and so rambling, and covering so much ground, I have no idea what the *actual* first mention of encoding related issues was. The oldest I can find was Giampaolo Rodolà
on 9 Feb 2012 20:16:00 +0100: I bet a lot of people don't want to upgrade for another reason: unicode. The impression I got is that python 3 forces the user to use and *understand* unicode and a lot of people simply don't want to deal with that. two days before the first post from you mentioning encoding issues that I can find. Another mention of a similar use-case was by Stephen J Turnbull on 10 Feb 2012 17:41:21 +0900: True, if one sticks to pure ASCII, there's no difference to notice, but that's just not possible for people who live outside of the U.S., or who share text with people outside of the U.S. They need currency symbols, they have friends whose names have little dots on them. Every single one of those is a backtrace waiting to happen. A backtrace on f = open('text-file.txt') for line in f: pass is an imposition. That doesn't happen in 2.x (for the wrong reasons, but it's very convenient 95% of the time). This is what Victor's "locale" codec is all about. I think that's the wrong spelling for the feature, but there does need to be a way to express "don't bother me about Unicode" in most scripts for most people. We don't have a decent boilerplate for that yet. which I *paraphrased* as "I have text files that are mostly ASCII and I don't want to deal with Unicode yadda yadda yadda". But in any case, I expressed myself poorly, and I'm sorry about that. Regardless of who made the very first mention of the encoding problem in this thread, I think we should all be able to agree that laziness is *not* the only reason for having encoding problems. I thought I made it clear that I did not subscribe to that opinion. 
-- Steven From p.f.moore at gmail.com Thu Feb 16 15:47:58 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 16 Feb 2012 14:47:58 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3D0839.2070802@pearwood.info> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <20120216040839.GA3048@ando> <4F3D0839.2070802@pearwood.info> Message-ID: On 16 February 2012 13:44, Steven D'Aprano wrote: > But in any case, I expressed myself poorly, and I'm sorry about that. > > Regardless of who made the very first mention of the encoding problem in > this thread, I think we should all be able to agree that laziness is *not* > the only reason for having encoding problems. I thought I made it clear that > I did not subscribe to that opinion. Not a problem. Equally, my "I feel insulted" dig was uncalled for - it was the sort of semi-humorous comment that doesn't translate itself well in email. I think the debate here has been immensely useful, and I appreciate everyone's comments. Paul. 
From nathan.alexander.rice at gmail.com Thu Feb 16 15:55:27 2012 From: nathan.alexander.rice at gmail.com (Nathan Rice) Date: Thu, 16 Feb 2012 09:55:27 -0500 Subject: [Python-ideas] automation of __repr__/__str__ for all the common simple cases In-Reply-To: <4F3CAE26.4020201@gmx.de> References: <4F3BC1EA.3030002@gmx.de> <4F3C4325.2070000@pearwood.info> <4F3CA80E.1060208@gmx.de> <4F3CAE26.4020201@gmx.de> Message-ID: > instead of the ReprMixin, how about a descriptor > > the Api would change to something like > > class User(object): > __repr__ = FormatRepr(' > > its less concise, but actually more straightforward and better decoupled > (thanks for hinting at the strong coupling in the class hierarchy) > > and ArgRepr and KwargRepr could be added in a similar fashion +1 for not using inheritance (which honestly creates about as many problems as it solves)... Having some of the common use cases like this as descriptors that could be imported and used directly would be an improvement on the current interface of the lib IMHO. Nathan From stephen at xemacs.org Thu Feb 16 16:25:59 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 17 Feb 2012 00:25:59 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <20120216040839.GA3048@ando> Message-ID: <874nuqol4o.fsf@uwakimon.sk.tsukuba.ac.jp> Paul Moore writes: > Add to this the fact that I *know* I've seen supposed text files with > mixed encoding content, Heck, I've seen *file names* with mixed encoding content.
> and no-one has *ever* explained how to handle that (it's basically > a damaged file, and so all the "right way to deal with Unicode" > discussions ignore it) The right way to handle such a file is ad hoc: operate on the features you can identify, and treat runs of bytes of unknown encoding as atomic blobs. In practice, there is a generic such feature that supports many applications: runs of ASCII text. Which is the intuition all the pragmatists start with -- it's correct. > OK, so maybe I do feel somewhat insulted... I'm sorry you feel that way. (I've sided with the pragmatists in this thread, but on this issue I'm a purist at heart.) From sturla at molden.no Thu Feb 16 16:27:24 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 16 Feb 2012 16:27:24 +0100 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <6E24498E-EB73-46EA-9508-BC4279762B74@molden.no> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> <6E24498E-EB73-46EA-9508-BC4279762B74@molden.no> Message-ID: <4F3D205C.5020406@molden.no> On 16.02.2012 02:40, Sturla Molden wrote: > >> >> P.S. I have posted a possible implementation of shm_open/shm_unlink for Windows at >> >> http://mail.python.org/pipermail/concurrency-sig/2012-February/000058.html >> >> > > A temporary file is not backed shared memory on Windows, but is a persistent file on disk. You have to mmap from the OS' paging file to get shared memory. Hmm... It seems files created with the flag FILE_ATTRIBUTE_TEMPORARY are backed by memory if possible. Though MSDN does not say if it is shared memory that can be used for IPC. A blog article on MSDN from 2004 indicates that the combination FILE_ATTRIBUTE_TEMPORARY|FILE_FLAG_DELETE_ON_CLOSE is needed. The Windows systems programming book from MS Press does not mention FILE_ATTRIBUTE_TEMPORARY for temporary files.
So it seems most Windows programmers are actually creating permanent files in the temp file folder, rather than creating temporary files. So the cause for build-up of temporary files on Windows is actually a wide-spread programming error, not the fault of the operating system. It seems tempfile.NamedTemporaryFile will use FILE_ATTRIBUTE_TEMPORARY on Windows if called with delete=True. But it does not use FILE_FLAG_DELETE_ON_CLOSE as well, which probably is an error (particularly if the "delete" keyword argument should make sense). Sturla From ethan at stoneleaf.us Thu Feb 16 16:21:23 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 16 Feb 2012 07:21:23 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3C5DC8.707@canterbury.ac.nz> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> Message-ID: <4F3D1EF3.40203@stoneleaf.us> Greg Ewing wrote: > It seems to me that what surrogateescape is effectively doing is > creating a new data type that consists of a mixture of ASCII > characters and raw bytes, and enables you to tell which is which. How so? Sounds like this new data type assumes everything over 127 is a raw byte, but there are plenty of applications where values between 0 - 127 should be interpreted as raw bytes even when the majority are indeed just plain ascii. > Maybe there should be a real data type like this, or a flag on > the unicode type. The data would be stored in the same way as a > latin1-decoded string, but anything with the high bit set would > be regarded as a byte instead of a character. This might make it > easier to interoperate with external libraries that expect > well-formed unicode. I can see a data type that is easier to work with than bytes (ascii-string, anybody?
;) but I don't think we want to make it any kind of unicode -- once the text has been extracted from this ascii-string it should be converted to unicode for further processing, while any other non-convertible bytes should stay as bytes (or ascii-string, or whatever we call it). The above is not arguing with the 'latin-1' nor 'surrogateescape' techniques, but only commenting on a different data type with probably different uses. ~Ethan~ From p.f.moore at gmail.com Thu Feb 16 16:37:02 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 16 Feb 2012 15:37:02 +0000 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <874nuqol4o.fsf@uwakimon.sk.tsukuba.ac.jp> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <20120216040839.GA3048@ando> <874nuqol4o.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 16 February 2012 15:25, Stephen J. Turnbull wrote: >> OK, so maybe I do feel somewhat insulted... > > I'm sorry you feel that way. (I've sided with the pragmatists in this > thread, but on this issue I'm a purist at heart.) As I said elsewhere that was a lame attempt at a joke. My apologies. No-one has been anything but helpful in this thread, I was just reacting (a little) to the occasional characterisation I've noticed of people as "lazy" - your term "pragmatists" is much less emotive. (And it wasn't so much a personal reaction anyway, just an awareness that we need to be careful how we express things to people struggling with this) Paul.
From barry at python.org Thu Feb 16 17:07:11 2012 From: barry at python.org (Barry Warsaw) Date: Thu, 16 Feb 2012 11:07:11 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% References: <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> <16F229EE-4018-4E81-962A-8D48036F194F@gmail.com> <87aa4ko7xx.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20120216110711.284001db@resist.wooz.org> On Feb 15, 2012, at 04:46 PM, Stephen J. Turnbull wrote: >[1] When this damned term is over in a few weeks, I'll take a look at >the tutorial-level docs and see if I can come up with a gentle >approach for those who are finding out for the first time that the >locale-dependent default isn't good enough for them. I really hope you do this, but note that it would be very helpful to have guidelines and recommendations even for advanced, knowledgeable Python developers. I have participated in many discussions in various forums with other Python developers where genuine differences of opinion or experience lead to different solutions. It would be very helpful to point to a document and say "here are the best practices for your [application|library] as recommended by core Python experts in Unicode handling." Cheers, -Barry From stephen at xemacs.org Thu Feb 16 17:25:47 2012 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Fri, 17 Feb 2012 01:25:47 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3D1EF3.40203@stoneleaf.us> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <4F3D1EF3.40203@stoneleaf.us> Message-ID: <8739aaoid0.fsf@uwakimon.sk.tsukuba.ac.jp> Ethan Furman writes: > The above is not arguing with the 'latin-1' nor 'surrogateescape' > techniques, but only commenting on a different data type with probably > different uses. But there really aren't any uses that aren't equally well dealt with by 'surrogateescape' that I can see. You have to process it code unit by code unit (just like surrogateescape) and if you find a non- character code unit, you then have an ad hoc decision to make about what to do with it. surrogateescape makes one particular treatment blazingly efficient (namely, turning the surrogate back into a byte with no known meaning). What other treatment of a byte of by-definition unknown semantics deserves the blazing efficiency that a new (presumably builtin) type could give? From shibturn at gmail.com Thu Feb 16 17:51:58 2012 From: shibturn at gmail.com (shibturn) Date: Thu, 16 Feb 2012 16:51:58 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <6E24498E-EB73-46EA-9508-BC4279762B74@molden.no> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> <6E24498E-EB73-46EA-9508-BC4279762B74@molden.no> Message-ID: On 16/02/2012 1:40am, Sturla Molden wrote: > A temporary file is not backed shared memory on Windows, but is a > persistent file on disk. You have to mmap from the OS' paging file > to get shared memory. 
An mmap can certainly be used as shared memory when it is backed by a real file. Or are you saying that it would work but be much slower? Also, according to this msdn blog http://blogs.msdn.com/b/larryosterman/archive/2004/04/19/116084.aspx if you open a file using FILE_ATTRIBUTE_TEMPORARY and FILE_FLAG_DELETE_ON_CLOSE the file will not be flushed to the disk unless there is memory pressure. sbt From shibturn at gmail.com Thu Feb 16 17:56:46 2012 From: shibturn at gmail.com (shibturn) Date: Thu, 16 Feb 2012 16:56:46 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <4F3D205C.5020406@molden.no> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> <6E24498E-EB73-46EA-9508-BC4279762B74@molden.no> <4F3D205C.5020406@molden.no> Message-ID: On 16/02/2012 3:27pm, Sturla Molden wrote: > Hmm... > > It seems files created with the flag FILE_ATTRIBUTE_TEMPORARY is backed by memory if possible... I did not notice this message before I replied to your earlier one. sbt From julien at tayon.net Thu Feb 16 18:51:47 2012 From: julien at tayon.net (julien tayon) Date: Thu, 16 Feb 2012 18:51:47 +0100 Subject: [Python-ideas] Generators' and iterators' __add__ method In-Reply-To: References: <4F3C5FEF.5090508@canterbury.ac.nz> Message-ID: >> It would also clash with existing meanings of __add__ on some >> types, such as NumPy arrays. > > > Good example for Python Ideas FAQ. Well, it looks like a classical problem of disambiguation. We have more than one consistent behaviour for addition. (I know I am dense.) And these behaviours clash if not properly disambiguated. In a world where unicorns exist, we would select the behaviour of __add__ with a switch (let's say __add__ would be a dispatch table that would take the 'algebrae' as a parameter).
Still, in this world where unicorns exist and Perl6 would be in production (and acclaimed), this switch could be sensibly set according to the context. And everybody would be perfectly aware and taking no risks. In the actual world, it is pretty much a chimera or a dying foetus (musical joke) in the case of python since it would violate half the tao of python: ... Explicit is better than implicit. ... Complex is better than complicated. ... Special cases aren't special enough to break the rules. ... There should be one-- and preferably only one --obvious way to do it. As a result my suggestion for the FAQ should be to state that anyone wanting to submit an idea to the python community should 'import this' first and see it as deadly serious. I made the mistake twice :) PS I still see cases where having an __algebrae__ private member could be very helpful, because dynamic typing has its limits. hf, gl -- Jul From ethan at stoneleaf.us Thu Feb 16 20:34:16 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 16 Feb 2012 11:34:16 -0800 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <8739aaoid0.fsf@uwakimon.sk.tsukuba.ac.jp> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <4F3D1EF3.40203@stoneleaf.us> <8739aaoid0.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4F3D5A38.6070901@stoneleaf.us> Stephen J. Turnbull wrote: > Ethan Furman writes: >> The above is not arguing with the 'latin-1' nor 'surrogateescape' >> techniques, but only commenting on a different data type with probably >> different uses. > > But there really aren't any uses that aren't equally well dealt with > by 'surrogateescape' that I can see.
You have to process it code unit > by code unit (just like surrogateescape) and if you find a non- > character code unit, you then have an ad hoc decision to make about > what to do with it. > > surrogateescape makes one particular treatment blazingly efficient > (namely, turning the surrogate back into a byte with no known > meaning). What other treatment of a byte of by-definition unknown > semantics deserves the blazing efficiency that a new (presumably > builtin) type could give? It wasn't the 'unknown semantics' that I was responding to (latin-1 and surrogateescape deal with that just fine), but rather a new data type with a mixture of valid unicode (0-127) and raw bytes (128-255) -- I don't think that would be common enough to justify, and I can see confusion again creeping in when somebody (like myself ;) sees a datatype which seemingly supports a mixture of unicode and raw bytes only to find out that 'uni_raw(...)[5] != 32' because a u' ' was returned and an integer (or raw byte) was expected at that location. ~Ethan~ From sturla at molden.no Thu Feb 16 21:40:54 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 16 Feb 2012 21:40:54 +0100 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> <6E24498E-EB73-46EA-9508-BC4279762B74@molden.no> Message-ID: <4F3D69D6.2000809@molden.no> On 16.02.2012 17:51, shibturn wrote: > An mmap can certainly be used as shared memory when it is backed by a > real file. Or are you saying that it would work but be much slower? For FILE_ATTRIBUTE_TEMPORARY, I am not sure if the memory is shared or private. (I.e. if using it for IPC will involve disk access.) mmap can certainly be used for shared memory. 
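[The file-backed sharing described above is easy to demonstrate from a single process: two mappings of the same file stand in for two cooperating processes. This is an illustrative sketch added by the editor, not code from the thread.]

```python
import mmap
import os
import tempfile

SIZE = 4096
fd, path = tempfile.mkstemp()
try:
    os.ftruncate(fd, SIZE)
    # Two shared mappings of the same file; in real IPC these would
    # live in two different processes.
    writer = mmap.mmap(fd, SIZE)
    reader = mmap.mmap(fd, SIZE)

    writer[:5] = b"hello"   # write through one mapping...
    seen = reader[:5]       # ...and the other observes it immediately
    print(seen)             # b'hello'

    writer.close()
    reader.close()
finally:
    os.close(fd)
    os.unlink(path)
```

Whether the pages ever touch the disk is up to the OS page cache, which is exactly the FILE_ATTRIBUTE_TEMPORARY point being debated here.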
Sturla From shibturn at gmail.com Thu Feb 16 22:20:55 2012 From: shibturn at gmail.com (shibturn) Date: Thu, 16 Feb 2012 21:20:55 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <4F3C6246.7000509@canterbury.ac.nz> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> <4F3C6246.7000509@canterbury.ac.nz> Message-ID: On 16/02/2012 1:56am, Greg Ewing wrote:

> I don't know about Windows, but in Unix it's possible to send a
> file descriptor from one process to another over a unix-domain
> socket connection. So a refcounted anonymous mmap handover could
> be achieved this way:
>
> 1. Process A creates a temp file, mmaps it and unlinks it.
> 2. Process A sends the file descriptor to process B over a
>    unix-domain socket.
> 3. Process B mmaps it.
>
> Even if process A closes its version of the fd right after
> sending it, the OS should keep it alive while it's in transit,
> I think.

If the receiving process is expecting an fd then that certainly works. But making it work transparently with pickle is difficult. (multiprocessing.reduction tried making it transparent using a background thread to accept requests for fds from unpickling processes. But that functionality has been disabled.) On Windows one rather cleaner possibility is for the process pickling the handle to use DuplicateHandle() to copy the handle to the main process. Then the receiving process can copy the handle from the main process, removing it from the main process at the same time by using "dwOptions=DUPLICATE_CLOSE_SOURCE". Since the main process will not exit before its descendants, that will solve the keep-alive problem. (I have managed to produce a working example of this scheme for transferring a file handle.)
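[For reference, the Unix fd-passing step Greg describes can be sketched with `socket.sendmsg`/`recvmsg` and SCM_RIGHTS ancillary data. The demo below runs inside one process over a socketpair; helper names are invented for illustration, and modern Python (3.9+) also offers `socket.send_fds`/`recv_fds` for the same job.]

```python
import array
import os
import socket

def send_fd(sock, fd):
    # The descriptor rides along as SCM_RIGHTS ancillary data;
    # the single dummy byte just gives it a message to attach to.
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                           array.array("i", [fd]))])

def recv_fd(sock):
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    level, ctype, data = ancdata[0]
    fds.frombytes(data[:fds.itemsize])
    return fds[0]

# Hand a pipe's read end across a unix-domain socketpair.
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()
send_fd(left, r)
os.close(r)                 # the in-flight copy keeps the fd alive
received = recv_fd(right)   # a fresh descriptor for the same pipe
os.write(w, b"ping")
payload = os.read(received, 4)
print(payload)              # b'ping'
for f in (w, received):
    os.close(f)
left.close()
right.close()
```

Note the sender closing its copy right after sending, as Greg suggests: the kernel keeps the descriptor alive while it sits in the socket buffer.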
sbt From shibturn at gmail.com Thu Feb 16 22:31:49 2012 From: shibturn at gmail.com (shibturn) Date: Thu, 16 Feb 2012 21:31:49 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <4F3D69D6.2000809@molden.no> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> <6E24498E-EB73-46EA-9508-BC4279762B74@molden.no> <4F3D69D6.2000809@molden.no> Message-ID: On 16/02/2012 8:40pm, Sturla Molden wrote: > For FILE_ATTRIBUTE_TEMPORARY, I am not sure if the memory is shared or > private. (I.e. if using it for IPC will involve disk access.) Even if it is backed by a perfectly normal file, using an mmap for IPC does not require disk access if the relevant pages have not been evicted from memory. FILE_ATTRIBUTE_TEMPORARY only affects how eager the system is to flush modified data to the disk. sbt From mbarkhau at googlemail.com Fri Feb 17 01:06:52 2012 From: mbarkhau at googlemail.com (Manuel Barkhau) Date: Fri, 17 Feb 2012 01:06:52 +0100 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal Message-ID: Hi everybody, I'd like to suggest adopting something similar to the ScopeGuardStatement from the D programming language. A description of the D version can be found here: http://d.digitalmars.com/2.0/statement.html#ScopeGuardStatement It is also similar to golang's "defer" statement: http://golang.org/doc/go_spec.html#Defer_statements So these are roughly equivalent:

defer {block}        // in golang
scope(exit) {block}  // in D

I have written a context manager that approximates the behavior of the scope statement in D: http://ideone.com/vNmq8 The use of lambdas or nested functions doesn't look very nice however. So, on to the proposal. I think the "defer" keyword is more appropriate than "scope", and the function-like syntax of "scope(exit)" doesn't fit with the overall python syntax.
There are three ways to define a defer block.

- "defer: BLOCK", which is the same as "defer {BLOCK}" in golang or "scope(exit) {BLOCK}" in D.
- "defer EXPR as VAR: BLOCK", which is similar to "scope(failure)". It differs in that it specifies the exception that caused the failure and is only called for matching exceptions.
- "defer EXPR: BLOCK else: BLOCK", where the else BLOCK is executed when no exception occurs. This is similar to "scope(success)" and the existing except: else: construct in python.

As "defer:" is currently invalid syntax, there shouldn't be any code breakage from adding the new keyword.

Some rules:
- Deferred blocks are executed in the reverse lexical order in which they appear.
- If a function returns before reaching a defer statement, it will not be executed.
- If a defer block raises an error, a lexically earlier defer block may catch it.
- If multiple defer blocks raise errors or return results, the raise or return of the lexically earlier defer will mask the previous result or error.

Some example code:

>>> def ordering_example():
...     print(1)
...     defer: print(2)
...     defer: print(3)
...     print(4)
...
>>> ordering_example()
1
4
3
2

Handling exceptions:

>>> def defer_example():
...     # setup
...     defer:                 # always executed
...         # cleanup
...     defer Exception as e:  # executed if exception is raised
...         # handle exception
...     else:                  # executed if no exception is raised
...         # success code
...
...     # your usual code
...     # possibly raise exception

Equivalent using try/except/finally:

>>> def try_example():
...     # setup
...     try:
...         # your usual code
...         # possibly raise exception
...     except Exception as e:
...         # handle exception
...     else:
...         # success code
...     finally:
...         # cleanup

The nesting advantage becomes more apparent when more are required.
Here is an example from http://www.doughellmann.com/articles/how-tos/python-exception-handling/index.html

#!/usr/bin/env python

import sys
import traceback

def throws():
    raise RuntimeError('error from throws')

def cleanup():
    raise RuntimeError('error from cleanup')

def nested():
    try:
        throws()
    except Exception as original_error:
        try:
            raise
        finally:
            try:
                cleanup()
            except:
                pass  # ignore errors in cleanup

def main():
    try:
        nested()
        return 0
    except Exception as err:
        traceback.print_exc()
        return 1

if __name__ == '__main__':
    sys.exit(main())

Here are the equivalent of main and nested functions using defer:

def nested():
    defer RuntimeError: pass  # ignore errors in cleanup
    defer: cleanup()
    throws()

def main():
    defer Exception as err:
        traceback.print_exc()
        return 1
    else:
        return 0
    nested()

Notice that we don't even need "defer Exception as original_error: raise" after "defer: cleanup()" in order to preserve the stack trace. It will go up the call stack, so long as no defer handles it or masks it with another exception. This proposal would probably have had a better chance before the introduction of the "with" statement, but I still think it may be useful in cases where you don't want to write a context manager. Context managers may also not have access to the scope they are used in, which may be inconvenient in some cases. For code where try/except/finally would otherwise be required, I think the advantages make this proposal at least worth considering. You don't need to nest your normal code in a try block and you can place error handling code together with relevant sections, rather than further down in an except block. I'm sure there is much I have overlooked, possibly this is technically difficult and of course there is the minor task of implementation. But other than that what do you think?
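[The ordering behaviour proposed above can be reproduced without new syntax using `contextlib.ExitStack` (added in Python 3.3, after this thread); the sketch below is the editor's illustration of `ordering_example`, not code from the proposal.]

```python
from contextlib import ExitStack

def ordering_example():
    with ExitStack() as stack:
        print(1)
        stack.callback(print, 2)  # runs last: callbacks unwind LIFO
        stack.callback(print, 3)  # runs first
        print(4)

ordering_example()  # prints 1, 4, 3, 2, matching the proposed defer semantics
```

The LIFO unwinding of the stack gives exactly the "reverse lexical order" rule from the proposal.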
Manuel From tjreedy at udel.edu Fri Feb 17 02:18:07 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 16 Feb 2012 20:18:07 -0500 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <20120216040839.GA3048@ando> Message-ID: On 2/16/2012 7:59 AM, Paul Moore wrote: > Add to this the fact that I *know* I've seen supposed text files with > mixed encoding content, and no-one has *ever* explained how to handle > that (it's basically a damaged file, Before unicode, mixed encodings was the only way to have multi-lingual digital text (with multiple symbol systems) in one file. I presume such texts used some sort of language markup like , (or ), and , along with software that understood the markup. Such files were not broken, just the pre-unicode system of different codes for each language or nation. To handle such a file, the program, whatever the language, has to understand the custom markup, segment the bytes, and handle each segment appropriately. Crazy text that switches among unknown encodings without notice is a possibly unsolvable decryption problem. Such problems have no guaranteed algorithms, only heuristics. -- Terry Jan Reedy From steve at pearwood.info Fri Feb 17 02:25:41 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 17 Feb 2012 12:25:41 +1100 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: <4F3DAC95.8050804@pearwood.info> Manuel Barkhau wrote: > As "defer:" is currently invalid syntax, there shouldn't be any code > breakage from adding the new keyword. Of course there will be.
Every new keyword will break code that uses that word as a regular name:

defer = True
instance.defer = None

Both of which will become a SyntaxError if defer becomes a keyword. It's not even like "defer" is an uncommon word unlikely to be used anywhere. (Although I can't find any examples of it in the standard library.)

> Some rules:
> - Deferred blocks are executed in the reverse lexical order in which
> they appear.

Why in reverse order? This is unintuitive. If you write:

def func():
    defer: print(1)
    defer: print(2)
    defer: print(3)
    do_stuff()
    return

the output will be

3
2
1

Is this a deliberate design choice, or an accident of implementation that D and Go have followed? If it is deliberate, what is the rationale for it?

[...]

> The nesting advantage becomes more apparent when more are required. Here
> is an example from

I disagree. Nesting is an advantage, and the use of defer which eliminates that nesting is a MAJOR disadvantage of the concept. You seem to believe that nesting is a problem to be worked around. I call it a feature to be encouraged. With try...except/finally, the structure of which blocks are called, and when, is directly reflected in the nesting and indentation. With defer, that structure is gone. The reader has to try to recreate the execution order in their head. That is an enormous negative.

> http://www.doughellmann.com/articles/how-tos/python-exception-handling/index.html
>
> #!/usr/bin/env python
>
> import sys
> import traceback
>
> def throws():
>     raise RuntimeError('error from throws')
>
> def cleanup():
>     raise RuntimeError('error from cleanup')
>
> def nested():
>     try:
>         throws()
>     except Exception as original_error:
>         try:
>             raise
>         finally:
>             try:
>                 cleanup()
>             except:
>                 pass  # ignore errors in cleanup

I don't understand the point of that example. Wouldn't it be better written as this?
def nested():
    try:
        throws()
    finally:
        try:
            cleanup()
        except:
            pass

As far as I can tell, my version gives the same behaviour as yours:

py> main()
Traceback (most recent call last):
  File "", line 3, in main
  File "", line 3, in nested
  File "", line 2, in throws
RuntimeError: error from throws
1

(Tested in Python 2.5 with the obvious syntax changes.)

[...]

> Here are the equivalent of main and nested functions using defer:
>
> def nested():
>     defer RuntimeError: pass  # ignore errors in cleanup
>     defer: cleanup()
>     throws()

How is the reader supposed to know that pass will ignore errors in cleanup, and nothing else, without the comment? Imagine that the first defer line and the second are separated by a bunch of code:

def nested():
    defer RuntimeError: pass
    do_this()
    do_that()
    do_something_else()
    if flag:
        return
    if condition:
        defer: something()
    defer: cleanup()
    throws()

What is there to connect the first defer to the cleanup now? It seems to me that defer would let you write spaghetti code in a way which is really difficult (if not impossible) with try blocks. When considering a proposal, we should consider how it will be abused as well as how it will be used. -- Steven From ncoghlan at gmail.com Fri Feb 17 02:33:59 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 Feb 2012 11:33:59 +1000 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 10:06 AM, Manuel Barkhau wrote: > It is also similar to golangs "defer" statement: > http://golang.org/doc/go_spec.html#Defer_statements Since there have been a few proposals along these lines recently: Nothing is going to happen on the dedicated syntax front in the deferred execution space at least until I get contextlib.CallbackStack into Python 3.3 and we gather additional feedback on patterns of use (and, assuming that API addresses the relevant use cases the way I plan, these features will *never* need dedicated syntax).
A preliminary version of the API is available in the contextlib2 backport as ContextStack: http://contextlib2.readthedocs.org/en/latest/index.html#contextlib2.ContextStack

See the issue tracker for the changes that are planned in order to update that to the new CallbackStack API: https://bitbucket.org/ncoghlan/contextlib2/issue/8/rename-contextstack-to-callbackstack-and

> Some example code:
>
>>>> def ordering_example():
> ...     print(1)
> ...     defer: print(2)
> ...     defer: print(3)
> ...     print(4)

With ContextStack:

def ordering_example():
    with ContextStack() as stack:
        print(1)
        stack.register(print, 2)
        stack.register(print, 3)
        print(4)

With the planned CallbackStack API:

def ordering_example():
    with CallbackStack() as stack:
        print(1)
        stack.push(print, 2)
        stack.push(print, 3)
        print(4)

>>>> ordering_example()
> 1
> 4
> 3
> 2

Same output.

> The nesting advantage becomes more apparent when more are required. Here
> is an example from
> http://www.doughellmann.com/articles/how-tos/python-exception-handling/index.html
>
>     #!/usr/bin/env python
>
>     import sys
>     import traceback
>
>     def throws():
>         raise RuntimeError('error from throws')
>
>     def cleanup():
>         raise RuntimeError('error from cleanup')
>
>     def nested():
>         try:
>             throws()
>         except Exception as original_error:
>             try:
>                 raise
>             finally:
>                 try:
>                     cleanup()
>                 except:
>                     pass  # ignore errors in cleanup

Huh? That's a bizarre way to write it. A more sane equivalent would be

def nested():
    try:
        throws()
    except BaseException:
        try:
            cleanup()
        except:
            pass
        raise

>>> def throws():
...     1/0
...
>>> def nested():
...     try:
...         throws()
...     except BaseException:
...         try:
...             raise Exception
...         except:
...             pass
...         raise
...
>>> nested()
Traceback (most recent call last):
  File "", line 1, in
  File "", line 3, in nested
  File "", line 2, in throws
ZeroDivisionError: division by zero

However, this does raise a reasonable feature request for the planned contextlib2.CallbackStack API, so the above can be written as:

def _ignore_exception(*args):
    return True

def _cleanup_on_error(exc_type, exc_val, exc_tb):
    if exc_type is not None:
        cleanup()

def nested():
    with CallbackStack(callback_error=_ignore_exception) as stack:
        stack.push_exit(_cleanup_on_error)
        throws()

In Python 3 though, your better bet is often going to be just to let the cleanup exception fly - the __context__ attribute means the original exception and the full stack trace will be preserved automatically.

> I'm sure there is much I have overlooked, possibly this is technically
> difficult and of course there is the minor task of implementation. But
> other than that what do you think?

I think contextlib2 and PEP 3144 cover the use cases you have presented more cleanly and without drastic syntax changes. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Fri Feb 17 03:22:44 2012 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Fri, 17 Feb 2012 11:22:44 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <20120216110711.284001db@resist.wooz.org> References: <3A660961-784E-43BC-8EE5-EA5E71B44E5A@masklinn.net> <08E5748E-1A04-4986-A907-5D86B9C99711@masklinn.net> <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <4F3AEFAF.5060107@pearwood.info> <87ehtwolya.fsf@uwakimon.sk.tsukuba.ac.jp> <87bop0ohth.fsf@uwakimon.sk.tsukuba.ac.jp> <16F229EE-4018-4E81-962A-8D48036F194F@gmail.com> <87aa4ko7xx.fsf@uwakimon.sk.tsukuba.ac.jp> <20120216110711.284001db@resist.wooz.org> Message-ID: <871upunqq3.fsf@uwakimon.sk.tsukuba.ac.jp> Barry Warsaw writes: > I really hope you do this, but note that it would be very helpful to have > guidelines and recommendations even for advanced, knowledgeable Python > developers. > I have participated in many discussions in various forums with > other Python developers where genuine differences of opinion or > experience, leads to different solutions. It would be very helpful > to point to a document and say "here are the best practices for > your [application|library] as recommended by core Python experts in > Unicode handling." I'll see what I can do, but for *best practices* going beyond the level of Paul Moore's use case is difficult for the reasons elaborated elsewhere (by others as well as myself): basic Unicode handling is no harder than ASCII handling as long as everything is Unicode. So the real answer is to insist on valid Unicode for your text I/O, failing that, text labeled *as* text *with* an encoding[1], and failing that (or failing validation of the input), reject the input.[2] If that's not acceptable -- all too often it is not -- you're in a world of pain, and the solutions are going to be ad hoc. The WSGI folks will not find the solutions proposed for email acceptable, and vice versa. 
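[Both endpoints of the trade-off described above, "reject invalid input" versus the surrogateescape escape hatch discussed earlier in the thread, are visible in a few lines of Python 3. Editor's sketch, not code from the thread.]

```python
data = b"caf\xe9 ok"  # latin-1 bytes; not valid UTF-8

# The party line: validate, and reject what does not decode.
try:
    data.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
print("rejected:", rejected)  # rejected: True

# The pragmatist escape hatch: unknown bytes become lone surrogates...
text = data.decode("utf-8", errors="surrogateescape")
print(ascii(text))  # 'caf\udce9 ok'

# ...and re-encoding with the same handler restores the bytes exactly.
assert text.encode("utf-8", errors="surrogateescape") == data
```

The round-trip property in the last line is the "blazing efficiency" point from earlier in the thread: the smuggled byte comes back out unchanged.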
Something like the format Nick proposed, where the tradeoffs are described, would be useful, I guess. But the tradeoffs have to be made ad hoc. Footnotes: [1] Of course it's OK if these are implicitly labeled by requirements or defaults of a higher-level protocol. [2] This is the Unicode party line, of course. But it's really the only generally applicable advice. From mbarkhau at googlemail.com Fri Feb 17 03:25:36 2012 From: mbarkhau at googlemail.com (Manuel Barkhau) Date: Fri, 17 Feb 2012 03:25:36 +0100 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: > Every new keyword will break code that uses that word as a regular > name: Ah, my bad. I had assumed that the addition of the with statement didn't break anything and thought the only case I needed to look at was "defer: ...". > It seems to me that defer would let you write spaghetti code in a way > which is really difficult (if not impossible) with try blocks. Sure people can write spaghetti code with this, but who ever said it was appropriate for everything in the world? I also wasn't aware there were people so fond of writing try blocks, because to me they look fugly. Rather than wrapping all my code in a try block, I would rather write the code that deals with peripheral cases in a block, and continue on with the main code. > You seem to believe that nesting is a problem to be worked around. I > call it a feature to be encouraged. Bingo, I don't like nesting too much. > Is this a deliberate design choice, or an accident of implementation > that D and Go have followed? If it is deliberate, what is the > rationale for it? Yes, it's because of how they chose to do it, and I kept it that way, if nothing else, for familiarity. But I'm sure there is some reasoning behind it.

> Huh? That's a bizarre way to write it. A more sane equivalent would be

The example given by Doug is intended to preserve the original stack trace of the exception that is thrown by throws.

>     def nested():
>         try:
>             throws()
>         except BaseException:
>             try:
>                 cleanup()
>             except:
>                 pass
>             raise

This raises the exception thrown by cleanup. If you use "raise original_exception", the stack trace isn't preserved, which is what the article is about. But now that you mention it, I'm not sure the defer example I gave actually would produce the same stack trace either. Oh well, context managers it is then I guess. Thanks for the references Nick. Manuel From stephen at xemacs.org Fri Feb 17 04:42:25 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 17 Feb 2012 12:42:25 +0900 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <20120216040839.GA3048@ando> Message-ID: <87zkcim8gu.fsf@uwakimon.sk.tsukuba.ac.jp> Terry Reedy writes: > Before unicode, mixed encodings was the only way to have multi-lingual > digital text (with multiple symbol systems) in one file. There is a long-accepted standard for doing this, ISO 2022. IIRC it's available online from ISO now, and if not, ECMA 35 is the same. The X Compound Text standard (I think this is documented in the ICCCM) and the Motif Compound String are profiles of ISO 2022. If that is what Paul is seeing, then the iso-2022-jp codec might be good enough to decode the files he has, depending on which version of ISO-2022-JP is implemented. If not, iconv -f ISO-2022-JP-2 (or ISO-2022-JP-3) should work (at least for GNU's iconv implementation). > I presume such texts used some sort of language markup like > , (or ), and , along with > software that understood the markup. They would use encoding "markup" (specifically escape sequences).
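[Those escape sequences are directly visible from Python, which ships an iso2022_jp codec; an editor's illustration:]

```python
text = "ABC \u3042 XYZ"  # ASCII with one hiragana character mixed in
encoded = text.encode("iso2022_jp")
print(encoded)
# b'ABC \x1b$B$"\x1b(B XYZ' -- ESC $ B shifts into JIS X 0208,
# ESC ( B shifts back to ASCII
assert encoded.decode("iso2022_jp") == text
```

The codec inserts the designation escapes exactly where the character repertoire changes, which is the "switching among encodings with notice" case, the decodable one.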
Language is not enough, as all languages have had multiple encodings since the invention of ASCII (or EBCDIC, whichever came second ;-), and in many cases multilingual standards have evolved (Japanese, for example, includes Greek and Cyrillic alphabets in its JIS standard coded character set). More recently, many languages have several ISO 2022-based encodings (the ISO 8859 family is a conformant profile of ISO 2022, as are the EUC encodings for Asian languages; the Windows 125x code pages are non-conformant extensions of ASCII based on ISO 8859). > Crazy text that switches among unknown encodings without notice is a > possibly unsolvable decryption problem. True, and occasionally seen even today in Japan (cat(1) will produce such files easily, and any system for including files). From sven at marnach.net Fri Feb 17 18:14:01 2012 From: sven at marnach.net (Sven Marnach) Date: Fri, 17 Feb 2012 17:14:01 +0000 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: <4F3DAC95.8050804@pearwood.info> References: <4F3DAC95.8050804@pearwood.info> Message-ID: <20120217171401.GA3406@pantoffel-wg.de> Steven D'Aprano schrieb am Fr, 17. Feb 2012, um 12:25:41 +1100: > Why in reverse order? This is unintuitive. > [...] > Is this a deliberate design choice, or an accident of implementation > that D and Go have followed? If it is deliberate, what is the > rationale for it? Basically any cleanup mechanism I know of does the cleanups in the reverse order as the initialisations, be it destructor calls in C++, defer handlers in Go or nested 'with' statements in Python. Since the later initialised objects might depend on the previously defined objects, this is also the only sane choice. > With defer, that structure is gone. The reader has to try to > recreate the execution order in their head. That is an enormous > negative. I think "defer" has some definite advantages over try/except as far as readability is concerned. 
It places the cleanup code at the position the necessity for the cleanup occurs, and not way down in the code. Python's "with" statement does a similar thing, but it gets difficult to handle as soon as you try to *conditionally* add a cleanup handler -- we had this discussion before, and it lead to Nick's contextlib2. Cheers, Sven From dreamingforward at gmail.com Fri Feb 17 22:57:36 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Fri, 17 Feb 2012 14:57:36 -0700 Subject: [Python-ideas] doctest Message-ID: I find myself wanting to use doctest for some test-driven development, and find myself slightly frustrated and wonder if others would be interested in seeing the following additional functionality in doctest: 1. Execution context determined by outer-scope doctest defintions. 2. Smart Comparisons that will detect output of a non-ordered type (dict/set), lift and recast it and do a real comparison. Without #1, "literate testing" becomes awash with re-defining re-used variables which, generally, also detracts from exact purpose of the test -- this creates testdoc noise and the docs become less useful. Without #2, "readable docs" nicely co-aligning with "testable docs" tends towards divergence. Perhaps not enough developers use doctest to care, but I find it one of the more enjoyable ways to develop python code -- I don't have to remember test cases nor go through the trouble of setting up unittests. AND, it encourages agile development. Another user wrote a while back of even having a built-in test() method. Wouldn't that really encourage agile developement? And you wouldn't have to muddy up your code with "if __name__ == "__main__": import doctest, yadda yadda". Anyway... of course patches welcome, yes... 
;^)

mark

From ncoghlan at gmail.com  Fri Feb 17 23:12:15 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 18 Feb 2012 08:12:15 +1000
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen wrote:
> Anyway... of course patches welcome, yes... ;^)

Not really. doctest is for *testing code examples in docs*. If you try to use it for more than that, it's likely to drive you up the wall, so proposals to make it more than it is usually don't get a great reception (docs patches to make its limitations clearer are generally welcome, though). The stdlib solution for test-driven development is unittest (the vast majority of our own regression suite is written that way - only a small proportion uses doctest).

An interesting third-party alternative that has been created recently is behave: http://crate.io/packages/behave/

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From nathan.alexander.rice at gmail.com  Fri Feb 17 23:16:47 2012
From: nathan.alexander.rice at gmail.com (Nathan Rice)
Date: Fri, 17 Feb 2012 17:16:47 -0500
Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal
In-Reply-To:
References:
Message-ID:

> Since there have been a few proposals along these lines recently:
>
> Nothing is going to happen on the dedicated syntax front in the
> deferred execution space at least until I get contextlib.CallbackStack
> into Python 3.3 and we gather additional feedback on patterns of use
> (and, assuming that API addresses the relevant use cases the way I
> plan, these features will *never* need dedicated syntax).
>
> A preliminary version of the API is available in the contextlib2
> backport as ContextStack:
> http://contextlib2.readthedocs.org/en/latest/index.html#contextlib2.ContextStack
>
> See the issue tracker for the changes that are planned in order to
> update that to the new CallbackStack API:
> https://bitbucket.org/ncoghlan/contextlib2/issue/8/rename-contextstack-to-callbackstack-and
>
>> ... snip ...
>
> With ContextStack:
>
> def ordering_example():
>     with ContextStack() as stack:
>         print(1)
>         stack.register(print, 2)
>         stack.register(print, 3)
>         print(4)
>
>
> With the planned CallbackStack API:
>
> def ordering_example():
>     with CallbackStack() as stack:
>         print(1)
>         stack.push(print, 2)
>         stack.push(print, 3)
>         print(4)

Hi Nick,

I just wanted to chime in on this, because I understand the use cases and benefits of this but the code is very semantically opaque and imperative. I also feel like a lot of C programming concepts and semantics have leaked into the design. Additionally, I feel that there are some benefits to taking a step back and looking at this problem as part of a bigger picture.

Fundamentally, context managers are ways to convert a block of code into an event, with __enter__ analogous to "before_block" and __exit__ analogous to an "after_block". There are a couple of problems with context managers that I feel an event system handles more elegantly:

1.) Context is ambiguous. Context could be interpreted to mean a thread, a scope, a point in time, etc. Context managers only deal with the narrow problem of a block of code being run. This is succinctly described as an event.

2.) The context manager API requires you to fire events before and after the code block (yes, you can pass) and does not provide other options, such as (in an ideal world of python with well behaved threads) an event that is fired/in the active state concurrent to the block of code's execution.
There are a few ways to hack this behavior, but they're all bad, and interoperability between libraries is unlikely.

3.) If you want to extend context management for a particular piece of code, you have to modify the code to add another context manager, or monkey-patch the existing context manager. Modifying the code has some thorny issues; for instance, if you need to modify the context handling in a third-party lib, all of a sudden you have to fork the lib and manually patch every time you upgrade or redeploy. Monkey-patching is easy, but from a conceptual/readability perspective it is horrible. If the lib fires events, you can just register an action on the event in your code and live happily ever after.

4.) The way context managers are defined only allows you to describe a linear chain of events, because they are associated with a block of code, and the act of association precludes other context managers from firing events for that same block of code. Because of this, you have things like register and preserve that exist to add support for (weak) non-linearity.

5.) Going back to event concurrency and touching on non-linearity again, if I have two functions that I've asked to fire when an event occurs, this provides a strong clue to the interpreter that the given functions could potentially run in parallel. Of course, there would need to be other cues, but I don't think people want to be in the business of explicitly writing parallel code forever.

6.) User interface coders going back 30 years understand events pretty well, but will probably give you a blank stare for a second or two if you mention context managers.

I feel that "with"/context managers are an elegant solution to the simple problem; however, it seems like the generalized solution based on context managers is pretty awkward.
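[Editor's note: Nathan's mapping of __enter__/__exit__ onto "before_block"/"after_block" events could be sketched roughly as follows. EventedBlock and its event names are hypothetical illustrations of the idea, not an existing or proposed API.]

```python
class EventedBlock:
    """Hypothetical sketch: a context manager whose entry and exit are
    events that any number of listeners can hook, instead of a single
    hard-wired __enter__/__exit__ pair."""

    def __init__(self):
        self.handlers = {"before_block": [], "after_block": []}

    def on(self, event, handler):
        self.handlers[event].append(handler)

    def __enter__(self):
        for handler in self.handlers["before_block"]:
            handler()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Fire after_block handlers in reverse registration order,
        # mirroring the usual LIFO cleanup convention.
        for handler in reversed(self.handlers["after_block"]):
            handler(exc)
        return False  # never suppress exceptions

log = []
block = EventedBlock()
block.on("before_block", lambda: log.append("setup"))
block.on("after_block", lambda exc: log.append("cleanup"))
with block:
    log.append("body")
# log is now ["setup", "body", "cleanup"]
```

Because the handlers are plain lists, third-party code could append its own listeners without subclassing or monkey-patching, which is the interoperability property point 3 above is asking for.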
The right thing to do, in my opinion, would be to go back to the drawing board and design an event subsystem that maps to something like pi-calculus/interval temporal logic in a human/pythonic way. This will avoid the immediate issues like the necessary goofiness of contextlib2, and lay the groundwork for nice things like automatic parallelization.

Of course, I'm anal about getting things 100% right, and context managers are a very nice, simple, elegant 80% solution. If 99% of people are happy with the 80% solution, it is probably the right thing to do, even if it forces an ugly hack on the remaining 1%.

Take care,

Nathan

From christopherreay at gmail.com  Fri Feb 17 23:31:38 2012
From: christopherreay at gmail.com (Christopher Reay)
Date: Sat, 18 Feb 2012 00:31:38 +0200
Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal
In-Reply-To:
References:
Message-ID:

+1
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jeanpierreda at gmail.com  Sat Feb 18 01:43:00 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 17 Feb 2012 19:43:00 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Fri, Feb 17, 2012 at 4:57 PM, Mark Janssen wrote:
> I find myself wanting to use doctest for some test-driven development,
> and find myself slightly frustrated and wonder if others would be
> interested in seeing the following additional functionality in
> doctest:
>
> 1. Execution context determined by outer-scope doctest definitions.

I'm not sure what you mean, but it might be relevant that Sphinx lets you define multiple scopes for doctests. I feel like its approach is the right one, but it isn't reusable in Python docstrings. That said, I think users of doctest have moved away from embedded doctests in docstrings -- it encourages doctests to have way too many "examples" (test cases), which reduces their usefulness as documentation.

> 2.
Smart Comparisons that will detect output of a non-ordered type
> (dict/set), lift and recast it and do a real comparison.

I think it's better to just always use ast.literal_eval on the output as another form of testing for equivalence. This could break code, but probably not any code worth caring about. (In particular,

    >>> print 'r""'
    ""

would pass in a literal_eval-ing system, but not in some other system.)

> Without #1, "literate testing" becomes awash with re-defining re-used
> variables which, generally, also detracts from the exact purpose of the
> test -- this creates testdoc noise and the docs become less useful.
> Without #2, "readable docs" nicely co-aligning with "testable docs"
> tends towards divergence.
>
> Perhaps not enough developers use doctest to care, but I find it one
> of the more enjoyable ways to develop python code -- I don't have to
> remember test cases nor go through the trouble of setting up
> unittests. AND, it encourages agile development. Another user wrote
> a while back of even having a built-in test() method. Wouldn't that
> really encourage agile development? And you wouldn't have to muddy
> up your code with "if __name__ == "__main__": import doctest, yadda
> yadda".
>
> Anyway... of course patches welcome, yes... ;^)

Not exactly... doctest has no maintainer, and so no patches ever get accepted. If you want to improve it, you'll have to fork it. I hope you're that sort of person, because doctest can totally be improved. It suffers a lot from people thinking of what it is rather than what it could be. :(

I've in the past worked a bit on improving doctest in a fork I started. Its primary purpose was originally to add Cram-like "shell doctests" to doctest (see http://pypi.python.org/pypi/cram ), but since then I started working on other bits here and there. The work I've done is available at https://bitbucket.org/devin.jeanpierre/doctest2 (please forgive the presumptuous name -- I'm considering a rename to "lembas".)
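[Editor's note: the ast.literal_eval equivalence check Devin describes might look something like this minimal sketch. outputs_match is a hypothetical helper for illustration, not part of doctest.]

```python
import ast

def outputs_match(expected, actual):
    """Compare two doctest output strings structurally when both parse
    as Python literals, so unordered reprs like {1, 2, 3} and {3, 1, 2}
    compare equal; otherwise fall back to exact text comparison."""
    try:
        return ast.literal_eval(expected) == ast.literal_eval(actual)
    except (ValueError, SyntaxError):
        return expected == actual

# Dict/set output matches regardless of the ordering in the repr:
print(outputs_match("{'a': 1, 'b': 2}", "{'b': 2, 'a': 1}"))  # True
# Non-literal output falls back to plain string comparison:
print(outputs_match("hello world", "hello world"))            # True
print(outputs_match("hello", "world"))                        # False
```

A real doctest integration would hook this into the output checker rather than comparing raw strings, but the core idea is just the try/except around literal_eval shown here.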
The reason I've not worked on it recently is that the problems have gotten harder and my time has run short. I would be very open to collaboration or forking, although I also understand that a largeish expansion with redesigned internals created by an overworked student is probably not the greatest place to start. This is all assuming your intentions are to contribute rather than only suggest. Not that suggestions aren't welcome, I suppose, but maybe not here. doctest is not actively developed or maintained anywhere, as far as I know. (I want to say "except by me", because that'd make me seem all special and so on, but I haven't committed a thing in months.) Mostly, I feel a bit like this thread could accidentally spawn parallel / duplicated work, so I figured I'd put what I have out here. Please don't take it for more than it is, doctest2 is still a work in progress (and, worse, its source code is in the middle of two feature additions!) I definitely hope you help to make the doctest world better. I think it fills a role that should be filled, and its neglect is unfortunate. -- Devin From ianb at colorstudy.com Sat Feb 18 05:24:10 2012 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 17 Feb 2012 22:24:10 -0600 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On Feb 17, 2012 4:12 PM, "Nick Coghlan" wrote: > > On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen wrote: > > Anyway... of course patches welcome, yes... ;^) > > Not really. doctest is for *testing code example in docs*. If you try > to use it for more than that, it's likely to drive you up the wall, so > proposals to make it more than it is usually don't get a great > reception (docs patches to make it's limitations clearer are generally > welcome, though). The stdib solution for test driven development is > unittest (the vast majority of our own regression suite is written > that way - only a small proportion uses doctest). 
This pessimistic attitude is why doctest is challenging to work with at times, not anything to do with doctest's actual model. The constant criticisms of doctest keep contributors away, and keep its many resolvable problems from being resolved.

> An interesting third party alternative that has been created recently
> is behave: http://crate.io/packages/behave/

This style of test is why it's so sad that doctest is ignored and unmaintained. It's based on testing patterns developed by people who care to promote what they are doing, but I'm of the strong opinion that they are inferior to doctest.

Ian

From ianb at colorstudy.com  Sat Feb 18 05:25:02 2012
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 17 Feb 2012 22:25:02 -0600
Subject: [Python-ideas] Fwd: Re: doctest
In-Reply-To:
References:
Message-ID:

On Feb 17, 2012 3:58 PM, "Mark Janssen" wrote:
> I find myself wanting to use doctest for some test-driven development,
> and find myself slightly frustrated and wonder if others would be
> interested in seeing the following additional functionality in
> doctest:
>
> 1. Execution context determined by outer-scope doctest definitions
> 2. Smart Comparisons that will detect output of a non-ordered type
> (dict/set), lift and recast it and do a real comparison.
>
> Without #1, "literate testing" becomes awash with re-defining re-used
> variables which, generally, also detracts from the exact purpose of the
> test -- this creates testdoc noise and the docs become less useful.

I dunno... I find the discipline of defining your prerequisites to be a helpful feature of doctest (I find TestCase.setUp to be smelly). You can include a namespace in doctest invocations, but I'm guessing the problem is that you aren't able to give these settings when using some kind of test collector/runner? More flexible ways of defining doctest options (e.g., ELLIPSIS) would be helpful.
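[Editor's note: supplying a namespace and option flags to a doctest run, as Ian mentions, can be sketched with the stdlib API. The mean function and the data binding are made-up examples; in a whole-module run the same effect comes from doctest.testmod(extraglobs=..., optionflags=...).]

```python
import doctest

def mean(values):
    """Average of a non-empty sequence.

    >>> mean(data)          # 'data' comes from the globs we pass in
    2.0
    >>> mean(range(1000))   # doctest: +ELLIPSIS
    499...
    """
    return sum(values) / len(values)

# run_docstring_examples takes the example namespace explicitly, so the
# docstring can use names (here 'data') that the module never defines;
# optionflags enables directives such as ELLIPSIS for every example.
doctest.run_docstring_examples(
    mean,
    {"mean": mean, "data": [1, 2, 3]},
    optionflags=doctest.ELLIPSIS,
)
```

On success this prints nothing; any failing example is reported to stdout, which is why a collector/runner that builds its own invocation makes these settings awkward to reach.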
> Without #2, "readable docs" nicely co-aligning with "testable docs" > tends towards divergence. IMHO this could be more easily solved by replacing the standard repr with one that is more predictable. At least that would handle dictionaries, it becomes a bit more difficult for custom types. Also it diverges from being exactly like the console, but eh, I don't think that's a big advantage. Unfortunately plugging in a custom repr is kind of hard; Python has a way to specifically compile expressions into "print repr(expr)" (more or less) but no general way to get the value of expressions (while also handling statements). But if you wanted to try it, I did figure out a terrible hack for it. Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Feb 18 05:50:58 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 18 Feb 2012 15:50:58 +1100 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: <4F3F2E32.7070907@pearwood.info> Nick Coghlan wrote: > On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen wrote: >> Anyway... of course patches welcome, yes... ;^) > > Not really. doctest is for *testing code example in docs*. If you try > to use it for more than that, it's likely to drive you up the wall, Really? Not in my experience, although I admit I haven't tried to push the envelope too far. But I haven't had any problem with a literate programming model: * Use short, self-contained but not necessarily exhaustive examples in the code's docstrings (I don't try to give examples of *every* combination of good and bad data, special cases, etc. in the docstring). * Write extensive (ideally exhaustive) examples with explanatory text, in a separate text file. I generally do this to describe, explain and test the interface, rather than the implementation, but I see no reason why it wouldn't work for the implementation as well. 
It would require writing for the next maintainer rather than for a user of the library. In the external test text file(s), examples don't necessarily need to be self-contained. I have an entire document to create a test environment, if necessary, and can include extra functions, stubs, mocks, etc. as needed, without clashing with the primary purpose of docstrings to be *documentation* first and tests a distant second. If need be, test infrastructure can go into an external module, to be imported, rather than in-place in the doctest file. In my experience, this works well for algorithmic code that doesn't rely on external resources. If my tests require setting up and tearing down resources, I stick to unittest which has better setup/teardown support. (It would be hard to have *less* support for setup and teardown than doctest.) But otherwise, I haven't run into any problems with doctest other than the perennial "oops, I forgot to escape my backslashes!". -- Steven From guido at python.org Sat Feb 18 05:55:48 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 17 Feb 2012 20:55:48 -0800 Subject: [Python-ideas] Fwd: Re: doctest In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 8:25 PM, Ian Bicking wrote: > On Feb 17, 2012 3:58 PM, "Mark Janssen" wrote: >> I find myself wanting to use doctest for some test-driven development, >> and find myself slightly frustrated and wonder if others would be >> interested in seeing the following additional functionality in >> doctest: >> >> 1. Execution context determined by outer-scope doctest defintions >> 2. Smart Comparisons that will detect output of a non-ordered type >> (dict/set), lift and recast it and do a real comparison. >> >> Without #1, "literate testing" becomes awash with re-defining re-used >> variables which, generally, also detracts from exact purpose of the >> test -- this creates testdoc noise and the docs become less useful. > > I dunno... 
I find the discipline of defining your prerequisites to be a
> helpful feature of doctest (I find TestCase.setUp to be smelly). You can
> include a namespace in doctest invocations, but I'm guessing the problem is
> that you aren't able to give these settings when using some kind of test
> collector/runner? More flexible ways of defining doctest options (e.g.,
> ELLIPSIS) would be helpful.
>
>> Without #2, "readable docs" nicely co-aligning with "testable docs"
>> tends towards divergence.
>
> IMHO this could be more easily solved by replacing the standard repr with
> one that is more predictable. At least that would handle dictionaries; it
> becomes a bit more difficult for custom types. Also it diverges from being
> exactly like the console, but eh, I don't think that's a big advantage.
>
> Unfortunately plugging in a custom repr is kind of hard; Python has a way to
> specifically compile expressions into "print repr(expr)" (more or less) but
> no general way to get the value of expressions (while also handling
> statements). But if you wanted to try it, I did figure out a terrible hack
> for it.

Isn't sys.displayhook() usable for this purpose?

--
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info  Sat Feb 18 06:08:16 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 18 Feb 2012 16:08:16 +1100
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID: <4F3F3240.4090104@pearwood.info>

Mark Janssen wrote:
> I find myself wanting to use doctest for some test-driven development,
> and find myself slightly frustrated and wonder if others would be
> interested in seeing the following additional functionality in
> doctest:
>
> 1. Execution context determined by outer-scope doctest definitions.

Can you give an example of how you would like this to work?

> 2. Smart Comparisons that will detect output of a non-ordered type
> (dict/set), lift and recast it and do a real comparison.
I would love to see a doctest directive that accepted differences in output order, e.g. would match {1, 2, 3} and {3, 1, 2}. But I think that's a hard problem to solve in the general case. Should it match 123 and 312? I don't think so. Just coming up with a clear and detailed set of requirements for (e.g.) #doctest:+IGNORE_ORDER may be tricky. I'd like a #3 as well: an abbreviated way to spell doctest directives, because they invariably push my tests well past the 80 character mark. -- Steven From ncoghlan at gmail.com Sat Feb 18 16:30:58 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 19 Feb 2012 01:30:58 +1000 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: On Sat, Feb 18, 2012 at 8:16 AM, Nathan Rice wrote: > I just wanted to chime in on this, because I understand the use cases > and benefits of this but the code is very semantically opaque and > imperative. ?I also feel like a lot of C programming concepts and > semantics have leaked into the design. ?Additionally, I feel that > there are some benefits to taking a step back and looking at this > problem as part of a bigger picture. So... context managers are not a good fit for general event handling. Correct. Given that I agree with your basic point, I'm not sure what the rest of that had to do with anything, unless you heard the word "callback" and immediately assumed I was talking about general event handling rather than Go defer'ed style cleanup APIs (along with a replacement for the bug-prone, irredeemably flawed contextlib.nested API). I'm not - what I'm planning would be a terrible API for general event handling. Fortunately, it's just a replacement for contextlib.nested() as a tool for programmatic management of context managers. If you want nice clean callbacks for general event handling, Python doesn't currently provide that. 
(We certainly don't have anything that gets remotely close to the elegance of Ruby's blocks for that style of programming: http://www.boredomandlaziness.org/2011/10/correcting-ignorance-learning-bit-about.html) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From anacrolix at gmail.com Sat Feb 18 16:38:06 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sat, 18 Feb 2012 23:38:06 +0800 Subject: [Python-ideas] channel (synchronous queue) Message-ID: Recently (for some) the CSP style of channel has become quite popular in concurrency implementations. This kind of channel allows sends that do not complete until a receiver has actually taken the item. The existing queue.Queue would act like this if it didn't treat a queue size of 0 as infinite capacity. In particular, I find channels to have value when sending data between threads, where it doesn't make sense to proceed until some current item has been accepted. This is useful when items are not purely CPU bound, and so generators are not appropriate. I believe this rendezvous behaviour can be added to queue.Queue for the maxsize=0 case, with maxsize=None being the existing "infinite queue" behaviour. Additionally a close method, Closed exception and other usability features like an __iter__ for receiving until closed can be added. The stackless class linked below also has some other possible ideas for performance reasons that make a lot of sense. Existing code using queue.Queue would remain completely unaffected by such additions if the default maxsize value is changed to maxsize=None, and maxsize=0 is not being explicitly passed (it's currently the default). 
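[Editor's note: one way to approximate the rendezvous behaviour Matt describes with the existing stdlib is a maxsize-1 queue.Queue combined with join()/task_done(), so a send does not return until the receiver has taken and acknowledged the item. This Channel class is an illustrative sketch for the single-producer case, not a proposed API.]

```python
import queue
import threading

class Channel:
    """Sketch of a CSP-style synchronous channel: send() blocks until a
    receiver has actually taken and acknowledged the item."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def send(self, item):
        self._q.put(item)
        self._q.join()  # returns once the receiver calls task_done()

    def receive(self):
        item = self._q.get()
        self._q.task_done()
        return item

ch = Channel()
received = []

def consumer():
    received.append(ch.receive())

t = threading.Thread(target=consumer)
t.start()
ch.send("hello")  # does not return until consumer() has taken the item
t.join()
```

With several concurrent senders join() waits for *all* outstanding items, so this is only an approximation of true per-item rendezvous; a real implementation would use a Condition per handoff, much as the stackless and gevent channels linked below do.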
Here are a few links for some background and ideas:

http://gevent.org/gevent.queue.html#gevent.queue.Queue
http://www.disinterest.org/resource/stackless/2.6-docs-html/library/stackless/channels.html#the-channel-class
http://en.wikipedia.org/wiki/Communicating_sequential_processes#Comparison_with_the_Actor_Model
http://golang.org/doc/go_spec.html#Channel_types

From arnodel at gmail.com  Sat Feb 18 19:57:55 2012
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Sat, 18 Feb 2012 18:57:55 +0000
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To:
References:
Message-ID:

On 18 February 2012 15:38, Matt Joiner wrote:
> Recently (for some) the CSP style of channel has become quite popular
> in concurrency implementations. This kind of channel allows sends that
> do not complete until a receiver has actually taken the item. The
> existing queue.Queue would act like this if it didn't treat a queue
> size of 0 as infinite capacity.

I don't know if that's exactly what you have in mind, but you can implement a channel very simply with a threading.Barrier object (new in Python 3.2). I'm no specialist of concurrency at all, but it seems that this is what you are describing (what in the Go language is called a "synchronous channel", I think):

from itertools import count
from threading import Barrier

class Channel:
    def __init__(self):
        self._sync = Barrier(2)
        self._values = [None, None]
    def send(self, value=None):
        i = self._sync.wait()
        self._values[i] = value
        self._sync.wait()
        return self._values[1 - i]
    def get(self):
        return self.send()

Then with the following convenience function to start a function in a new thread:

from threading import Thread

def go(f, *args, **kwargs):
    thread = Thread(target=f, args=args, kwargs=kwargs)
    thread.start()
    return thread

You can have e.g. the scenario:

ch = Channel()

def produce(ch):
    for i in count():
        print("sending", i)
        ch.send(i)

def consume(ch, n):
    for i in range(n):
        print("getting", ch.get())

Giving you this:

>>> go(produce, ch)
sending 0
>>> go(consume, ch, 3)
getting 0
sending 1
getting 1
sending 2
getting 2
sending 3
>>> go(consume, ch, 5)
getting 3
sending 4
getting 4
sending 5
getting 5
sending 6
getting 6
sending 7
getting 7
sending 8
>>>

--
Arnaud

From solipsis at pitrou.net  Sat Feb 18 20:01:08 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 18 Feb 2012 20:01:08 +0100
Subject: [Python-ideas] channel (synchronous queue)
References:
Message-ID: <20120218200108.2b72ab9f@pitrou.net>

On Sat, 18 Feb 2012 23:38:06 +0800 Matt Joiner wrote:
> Recently (for some) the CSP style of channel has become quite popular
> in concurrency implementations. This kind of channel allows sends that
> do not complete until a receiver has actually taken the item. The
> existing queue.Queue would act like this if it didn't treat a queue
> size of 0 as infinite capacity.
>
> In particular, I find channels to have value when sending data between
> threads, where it doesn't make sense to proceed until some current
> item has been accepted. This is useful when items are not purely CPU
> bound, and so generators are not appropriate.

What is the point of processing the data in another thread, if you are going to block on the result anyway?

Antoine.

From nathan.alexander.rice at gmail.com  Sat Feb 18 23:33:16 2012
From: nathan.alexander.rice at gmail.com (Nathan Rice)
Date: Sat, 18 Feb 2012 17:33:16 -0500
Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal
In-Reply-To:
References:
Message-ID:

On Sat, Feb 18, 2012 at 10:30 AM, Nick Coghlan wrote:
> On Sat, Feb 18, 2012 at 8:16 AM, Nathan Rice wrote:
>> I just wanted to chime in on this, because I understand the use cases
>> and benefits of this but the code is very semantically opaque and
>> imperative.
?I also feel like a lot of C programming concepts and >> semantics have leaked into the design. ?Additionally, I feel that >> there are some benefits to taking a step back and looking at this >> problem as part of a bigger picture. > > So... context managers are not a good fit for general event handling. Correct. > > Given that I agree with your basic point, I'm not sure what the rest > of that had to do with anything, unless you heard the word "callback" > and immediately assumed I was talking about general event handling > rather than Go defer'ed style cleanup APIs (along with a replacement > for the bug-prone, irredeemably flawed contextlib.nested API). My point was more that I feel like you're hitting a point where the context manager as a programming and semantic construct is starting to stretch pretty thin. My gut feeling is that it might be more productive to let context managers alone (I think they're in an okay place with multiple managers in a single with statement) and start to examine the larger class of problems of which the deferred cleanup is a member. Events can unify a lot of concepts in python, while providing a much more elegant handle into third party code than is currently possible. For example... Decorators, descriptors and exceptions can all be unified neatly as events, and events let you reach into 3rd party code in a robust manner. I can't tell you the number of times I have had to subclass multiple things from a third party library to fix a small, unnecessarily limiting design decision. I've even run into this with authors who make very elegant libraries like Armin; nobody can predict all the use cases for their code. The best thing we can do is make it easy to work around such problems. I like the with statement in general, but if python is ever going to embrace events, the farther you travel along this path the more painful switching over is going to be down the line. 
> I'm not - what I'm planning would be a terrible API for general event > handling. Fortunately, it's just a replacement for contextlib.nested() > as a tool for programmatic management of context managers. If you want > nice clean callbacks for general event handling, Python doesn't > currently provide that. (We certainly don't have anything that gets > remotely close to the elegance of Ruby's blocks for that style of > programming: http://www.boredomandlaziness.org/2011/10/correcting-ignorance-learning-bit-about.html) I like ruby's blocks a lot. I don't think they don't drink enough of the koolaid though. Blocks can be a gateway to powerful macros (if you have first class expressions) and a mechanism for very elegant currying and partial function evaluation. I think something that is missing for me is a clear picture of where Python is going. I imagine between you, Guido, Martin, Anton, Georg and Raymond (apologies to any of the primary group I'm forgetting) there is some degree of tacit understanding. My perspective on python was framed by Peter Norvig's description of it as aspiring to be a humane reexamination of lisp, but lately I get the feeling the target would better be described as a 21st century pascal. Nathan From guido at python.org Sat Feb 18 23:47:15 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 18 Feb 2012 14:47:15 -0800 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: On Sat, Feb 18, 2012 at 2:33 PM, Nathan Rice wrote: > On Sat, Feb 18, 2012 at 10:30 AM, Nick Coghlan wrote: >> On Sat, Feb 18, 2012 at 8:16 AM, Nathan Rice >> wrote: >>> I just wanted to chime in on this, because I understand the use cases >>> and benefits of this but the code is very semantically opaque and >>> imperative. ?I also feel like a lot of C programming concepts and >>> semantics have leaked into the design. 
?Additionally, I feel that >>> there are some benefits to taking a step back and looking at this >>> problem as part of a bigger picture. >> >> So... context managers are not a good fit for general event handling. Correct. >> >> Given that I agree with your basic point, I'm not sure what the rest >> of that had to do with anything, unless you heard the word "callback" >> and immediately assumed I was talking about general event handling >> rather than Go defer'ed style cleanup APIs (along with a replacement >> for the bug-prone, irredeemably flawed contextlib.nested API). > > My point was more that I feel like you're hitting a point where the > context manager as a programming and semantic construct is starting to > stretch pretty thin. ?My gut feeling is that it might be more > productive to let context managers alone (I think they're in an okay > place with multiple managers in a single with statement) and start to > examine the larger class of problems of which the deferred cleanup is > a member. ?Events can unify a lot of concepts in python, while > providing a much more elegant handle into third party code than is > currently possible. ?For example... > > Decorators, descriptors and exceptions can all be unified neatly as > events, and events let you reach into 3rd party code in a robust > manner. ?I can't tell you the number of times I have had to subclass > multiple things from a third party library to fix a small, > unnecessarily limiting design decision. ?I've even run into this with > authors who make very elegant libraries like Armin; nobody can predict > all the use cases for their code. ?The best thing we can do is make it > easy to work around such problems. > > I like the with statement in general, but if python is ever going to > embrace events, the farther you travel along this path the more > painful switching over is going to be down the line. > >> I'm not - what I'm planning would be a terrible API for general event >> handling. 
Fortunately, it's just a replacement for contextlib.nested() >> as a tool for programmatic management of context managers. If you want >> nice clean callbacks for general event handling, Python doesn't >> currently provide that. (We certainly don't have anything that gets >> remotely close to the elegance of Ruby's blocks for that style of >> programming: http://www.boredomandlaziness.org/2011/10/correcting-ignorance-learning-bit-about.html) > > I like ruby's blocks a lot. I don't think they drink enough of > the koolaid though. Blocks can be a gateway to powerful macros (if > you have first class expressions) and a mechanism for very elegant > currying and partial function evaluation. > > I think something that is missing for me is a clear picture of where > Python is going. I imagine between you, Guido, Martin, Anton, Georg > and Raymond (apologies to any of the primary group I'm forgetting) > there is some degree of tacit understanding. My perspective on python > was framed by Peter Norvig's description of it as aspiring to be a > humane reexamination of lisp, but lately I get the feeling the target > would better be described as a 21st century pascal. Was that meant as an insult? Because it sounds to me like one. -- --Guido van Rossum (python.org/~guido) From p.f.moore at gmail.com Sat Feb 18 23:48:05 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 18 Feb 2012 22:48:05 +0000 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: On 18 February 2012 22:33, Nathan Rice wrote: > Events can unify a lot of concepts in python, while > providing a much more elegant handle into third party code than is > currently possible. You may have a point, but I find it hard to understand what you are getting at. Would you be able to propose a specific syntax/semantics to clarify what you're trying to express? (I think I get the general concept, but I can't see how you imagine it to work).
Thanks, Paul From cs at zip.com.au Sun Feb 19 00:05:25 2012 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 19 Feb 2012 10:05:25 +1100 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: <20120218200108.2b72ab9f@pitrou.net> References: <20120218200108.2b72ab9f@pitrou.net> Message-ID: <20120218230525.GA14740@cskk.homeip.net> On 18Feb2012 20:01, Antoine Pitrou wrote: | On Sat, 18 Feb 2012 23:38:06 +0800 | Matt Joiner wrote: | > Recently (for some) the CSP style of channel has become quite popular | > in concurrency implementations. This kind of channel allows sends that | > do not complete until a receiver has actually taken the item. The | > existing queue.Queue would act like this if it didn't treat a queue | > size of 0 as infinite capacity. | > | > In particular, I find channels to have value when sending data between | > threads, where it doesn't make sense to proceed until some current | > item has been accepted. This is useful when items are not purely CPU | > bound, and so generators are not appropriate. | | What is the point to process the data in another thread, if you are | going to block on the result anyway? Synchronisation. Shrug. I use synchronous channels myself; they are a fine basic facility. The problem with Queues et al is that they are inherently _asynchronous_ and you have to work hard to wrap locking around it when you want interlocking cogs. Also, it is perfectly reasonable in many circumstances to use a thread for algorithmic clarity, just like you might use a generator or a coroutine in suitable circumstances. Here one does it not so that some work may process in parallel but to cleanly write two algorithms that pass information between each other but are otherwise as separate as another pair of functions might be. The alternative may be a complicated interwoven event loop. Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ C makes it easy for you to shoot yourself in the foot.
C++ makes that harder, but when you do, it blows away your whole leg. - Bjarne Stroustrup From greg.ewing at canterbury.ac.nz Sun Feb 19 00:37:19 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 19 Feb 2012 12:37:19 +1300 Subject: [Python-ideas] Python 3000 TIOBE -3% In-Reply-To: <4F3D5A38.6070901@stoneleaf.us> References: <874nuvfhnb.fsf@uwakimon.sk.tsukuba.ac.jp> <87lio5onav.fsf@uwakimon.sk.tsukuba.ac.jp> <70089F52-E9AB-4C3D-97BA-88A5BC11B976@gmail.com> <4F3AE675.6010907@mrabarnett.plus.com> <87haytyms7.fsf@benfinney.id.au> <20120215133912.GA17040@iskra.aviel.ru> <4F3C5DC8.707@canterbury.ac.nz> <4F3D1EF3.40203@stoneleaf.us> <8739aaoid0.fsf@uwakimon.sk.tsukuba.ac.jp> <4F3D5A38.6070901@stoneleaf.us> Message-ID: <4F40362F.6060400@canterbury.ac.nz> Ethan Furman wrote: > I can see > confusion again creeping in when somebody (like myself ;) sees a > datatype which seemingly supports a mixture of unicode and raw bytes > only to find out that 'uni_raw(...)[5] != 32' because a u' ' was > returned and an integer (or raw byte) was expected at that location. I wasn't intending that an int would be returned when you index a non-char position. Indexing and slicing would always return another mixed-string object. -- Greg From steve at pearwood.info Sun Feb 19 00:52:57 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 19 Feb 2012 10:52:57 +1100 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: <4F4039D9.2010909@pearwood.info> Guido van Rossum wrote: >> I think something that is missing for me is a clear picture of where >> Python is going. I imagine between you, Guido, Martin, Anton, Georg >> and Raymond (apologies to any of the primary group I'm forgetting) >> there is some degree of tacit understanding.
My perspective on python >> was framed by Peter Norvig's description of it as aspiring to be a >> humane reexamination of lisp, but lately I get the feeling the target >> would better be described as a 21st century pascal. > > Was that meant as an insult? Because it sounds to me like one. I hope not. I like Pascal. It has nice, clean syntax (if a tad verbose, with the BEGIN/END tags) and straight-forward, simple semantics. Standard Pascal is somewhat lacking (e.g. no strings) but who uses standard Pascal? Without wishing to deny the strengths of C, I think the computing world would be a lot better if C was closer to Pascal than if Pascal had been closer to C. -- Steven From greg.ewing at canterbury.ac.nz Sun Feb 19 00:56:35 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 19 Feb 2012 12:56:35 +1300 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> <4F3C6246.7000509@canterbury.ac.nz> Message-ID: <4F403AB3.3030502@canterbury.ac.nz> shibturn wrote: > If the receiving process is expecting an fd then that certainly works. > But making it work transparently with pickle is difficult. Is making it work with pickle a requirement? The point of using shared memory is to avoid the need for serialising and deserialising. -- Greg From nathan.alexander.rice at gmail.com Sun Feb 19 00:57:52 2012 From: nathan.alexander.rice at gmail.com (Nathan Rice) Date: Sat, 18 Feb 2012 18:57:52 -0500 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: > Was that meant as an insult? Because it sounds to me like one. I'm sorry if my poor wording caused it to come across that way. 
Pascal was a very useful language, with a perspective that was different from its contemporaries because it was originally intended for educational purposes, rather than as an academic language like lisp or a hacker tool like c or fortran. I enjoy writing python a lot, and would prefer to use it rather than ruby/lisp/java/etc in most cases. My suggestions come from frustrations that occur when using python in areas where the right answer is probably just to use a different language. If I knew that what I wanted was at odds with the vision for python, I would have less of an issue just accepting circumstances, and would just get to work rather than sidetracking discussions on this list. Thanks, and again, sorry! Nathan From sturla at molden.no Sun Feb 19 01:19:15 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 19 Feb 2012 01:19:15 +0100 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: References: Message-ID: <4F404003.4060403@molden.no> On 18.02.2012 16:38, Matt Joiner wrote: > Recently (for some) the CSP style of channel has become quite popular > in concurrency implementations. This kind of channel allows sends that > do not complete until a receiver has actually taken the item. The > existing queue.Queue would act like this if it didn't treat a queue > size of 0 as infinite capacity. > > In particular, I find channels to have value when sending data between > threads, where it doesn't make sense to proceed until some current > item has been accepted. That is the most common cause of deadlock in number crunching code using MPI. Process A sends message to Process B, waits for B to receive Process B sends message to Process A, waits for A to receive ... and now we just wait ... I am really glad the queues on Python do not do this.
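The rendezvous behaviour discussed here — a send() that blocks until a receiver has taken the item — can be sketched with standard threading primitives. This is an editorial illustration rather than code from the thread; the Channel class and its attribute names are invented, and Sturla's deadlock warning still applies if two threads each send() to the other before receiving.

```python
import threading

class Channel:
    """CSP-style synchronous channel: send() blocks until a receiver
    takes the item. The locks serialize senders and receivers, so
    multiple threads may safely use either end."""
    def __init__(self):
        self._send_lock = threading.Lock()    # one rendezvous sender at a time
        self._recv_lock = threading.Lock()    # one receiver at a time
        self._ready = threading.Semaphore(0)  # signalled: an item is waiting
        self._taken = threading.Semaphore(0)  # signalled: the item was consumed
        self._item = None

    def send(self, item):
        with self._send_lock:
            self._item = item
            self._ready.release()   # offer the item to a receiver...
            self._taken.acquire()   # ...and block until it has been taken

    def get(self):
        with self._recv_lock:
            self._ready.acquire()   # wait for a sender's offer
            item = self._item
            self._taken.release()   # release the blocked sender
            return item

# One sender thread, one receiver: items are handed over strictly in order.
ch = Channel()
sender = threading.Thread(target=lambda: [ch.send(i) for i in range(3)])
sender.start()
received = [ch.get() for _ in range(3)]
sender.join()
assert received == [0, 1, 2]
```

Exactly as Sturla describes for synchronous MPI sends and receives, two threads that each call send() on a pair of such channels before calling get() will deadlock; a buffered queue.Queue sidesteps that particular trap.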
Sturla From ron3200 at gmail.com Sun Feb 19 01:27:23 2012 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 18 Feb 2012 18:27:23 -0600 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: <1329611243.27188.4.camel@Gutsy> On Sat, 2012-02-18 at 08:12 +1000, Nick Coghlan wrote: > On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen wrote: > > Anyway... of course patches welcome, yes... ;^) > > Not really. doctest is for *testing code example in docs*. If you try > to use it for more than that, it's likely to drive you up the wall, so > proposals to make it more than it is usually don't get a great > reception (docs patches to make it's limitations clearer are generally > welcome, though). The stdib solution for test driven development is > unittest (the vast majority of our own regression suite is written > that way - only a small proportion uses doctest). I love doctest for *testing while I develop code*. Cheers, Ron From anacrolix at gmail.com Sun Feb 19 01:30:10 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 19 Feb 2012 08:30:10 +0800 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: References: Message-ID: I'm not sure that your example allows for multiple senders and receivers to block at the same time. I'm also not sure why the senders are receiving values. It's definitely an interesting approach to use a Barrier but incomplete. On Feb 19, 2012 2:57 AM, "Arnaud Delobelle" wrote: > On 18 February 2012 15:38, Matt Joiner wrote: > > Recently (for some) the CSP style of channel has become quite popular > > in concurrency implementations. This kind of channel allows sends that > > do not complete until a receiver has actually taken the item. The > > existing queue.Queue would act like this if it didn't treat a queue > > size of 0 as infinite capacity. > > I don't know if that's exactly what you have in mind, but you can > implement a channel very simply with a threading.Barrier object (new > in Python 3.2). 
I'm no specialist of concurrency at all, but it seems > that this is what you are describing (what in the go language is > called a "synchronous channel" I think): > > from threading import Barrier > > class Channel: > def __init__(self): > self._sync = Barrier(2) > self._values = [None, None] > def send(self, value=None): > i = self._sync.wait() > self._values[i] = value > self._sync.wait() > return self._values[1 - i] > def get(self): > return self.send() > > Then with the following convenience function to start a function in a > new thread: > > from threading import Thread > > def go(f, *args, **kwargs): > thread = Thread(target=f, args=args, kwargs=kwargs) > thread.start() > return thread > > You can have e.g. the scenario: > > from itertools import count > > ch = Channel() > > def produce(ch): > for i in count(): > print("sending", i) > ch.send(i) > > def consume(ch, n): > for i in range(n): > print("getting", ch.get()) > > Giving you this: > > >>> go(produce, ch) > sending 0 > > >>> go(consume, ch, 3) > > getting 0 > sending 1 > getting 1 > sending 2 > getting 2 > sending 3 > >>> go(consume, ch, 5) > > getting 3 > sending 4 > getting 4 > sending 5 > getting 5 > sending 6 > getting 6 > sending 7 > getting 7 > sending 8 > >>> > > -- > Arnaud > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Sun Feb 19 01:31:56 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 19 Feb 2012 08:31:56 +0800 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: <20120218200108.2b72ab9f@pitrou.net> References: <20120218200108.2b72ab9f@pitrou.net> Message-ID: Cameron explained this better than I could. On Feb 19, 2012 3:05 AM, "Antoine Pitrou" wrote: > On Sat, 18 Feb 2012 23:38:06 +0800 > Matt Joiner wrote: > > Recently (for some) the CSP style of channel has become quite popular > > in concurrency implementations. This kind of channel allows sends that > > do not complete until a receiver has actually taken the item.
The > > existing queue.Queue would act like this if it didn't treat a queue > > size of 0 as infinite capacity. > > > > In particular, I find channels to have value when sending data between > > threads, where it doesn't make sense to proceed until some current > > item has been accepted. This is useful when items are not purely CPU > > bound, and so generators are not appropriate. > > What is the point to process the data in another thread, if you are > going to block on the result anyway? > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Sun Feb 19 01:39:16 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 19 Feb 2012 08:39:16 +0800 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: <4F404003.4060403@molden.no> References: <4F404003.4060403@molden.no> Message-ID: Yes, channels can allow for this, but as with locks directionality and ordering matter. Typically messages will only run in a particular direction. Nor will all channels be synchronous (they're a tool, not a panacea), they might be intermixed with infinite asynchronous queues as is commonplace at the moment. On Feb 19, 2012 8:19 AM, "Sturla Molden" wrote: > Den 18.02.2012 16:38, skrev Matt Joiner: > >> Recently (for some) the CSP style of channel has become quite popular >> in concurrency implementations. This kind of channel allows sends that >> do not complete until a receiver has actually taken the item. The >> existing queue.Queue would act like this if it didn't treat a queue >> size of 0 as infinite capacity. >> >> In particular, I find channels to have value when sending data between >> threads, where it doesn't make sense to proceed until some current >> item has been accepted. 
>> > > That is the most common cause of deadlock in number crunching code using > MPI. > > Process A sends message to Process B, waits for B to receive > Process B sends message to Process A, waits for A to receive > > ... and now we just wait ... > > I am really glad the queues on Python do not do this. > > Sturla > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Feb 19 01:48:16 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 18 Feb 2012 16:48:16 -0800 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: On Sat, Feb 18, 2012 at 3:57 PM, Nathan Rice wrote: >> Was that meant as an insult? Because it sounds to me like one. > > I'm sorry if my poor wording caused it to come across that way. > Pascal was a very useful language, with a perspective that was > different from its contemporaries because it was originally intended > for educational purposes, rather than as an academic language like > lisp or a hacker tool like c or fortran. I have no idea how old you are, or what your background is, so I don't know if you have all that from personal experience or from hearsay. I do know that for me, when I first learned Pascal on the Control Data mainframe in 1974, it was the ultimate hacker tool. (Well, penultimate. Assembler was the ultimate. But even then it was a last resort.) Pascal was also developed by an academic. I never got much out of Lisp. So I guess it's a matter of perspective. > I enjoy writing python a lot, and would prefer to use it rather than > ruby/lisp/java/etc in most cases. My suggestions come from > frustrations that occur when using python in areas where the right > answer is probably just to use a different language.
If I knew that > what I wanted was at odds with the vision for python, I would have > less of an issue just accepting circumstances, and would just get to > work rather than sidetracking discussions on this list. > > Thanks, and again, sorry! I strongly recommend that you stick to describing your use cases and tentatively exploring possible solutions, instead of trying to spout sweeping controversial statements. Those just get in the way of getting an exchange of ideas going. -- --Guido van Rossum (python.org/~guido) From shibturn at gmail.com Sun Feb 19 01:50:07 2012 From: shibturn at gmail.com (shibturn) Date: Sun, 19 Feb 2012 00:50:07 +0000 Subject: [Python-ideas] Adding shm_open to mmap? In-Reply-To: <4F403AB3.3030502@canterbury.ac.nz> References: <20120214185044.4c5ee513@bhuda.mired.org> <20120214212539.7c5ffdef@bhuda.mired.org> <20120214231011.6fce4b3b@bhuda.mired.org> <20120215133419.230ea8e6@pitrou.net> <20120215180250.21a05ddf@pitrou.net> <4F3C6246.7000509@canterbury.ac.nz> <4F403AB3.3030502@canterbury.ac.nz> Message-ID: On 18/02/2012 11:56pm, Greg Ewing wrote: > shibturn wrote: > >> If the receiving process is expecting an fd then that certainly works. >> But making it work transparently with pickle is difficult. > > Is making it work with pickle a requirement? The point of using > shared memory is to avoid the need for serialising and deserialising. > The point is to avoid having to pickle/unpickle the *data*. Being able to pickle/unpickle a *reference* to the data would be rather convenient. Then, for instance, you can put references to blocks of raw data on a queue.
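The distinction shibturn draws — pickling a *reference* to shared memory rather than the data itself — can be sketched with the multiprocessing.shared_memory module that eventually appeared in Python 3.8, long after this thread; the SharedRef wrapper below is an invented illustration, not an API that existed at the time.

```python
import pickle
from multiprocessing import shared_memory

class SharedRef:
    """Picklable reference to a named shared-memory block: only the
    (name, size) pair travels through pickle, never the bytes."""
    def __init__(self, name, size):
        self.name, self.size = name, size

    def __reduce__(self):
        # Serialize just the coordinates of the block, not its contents.
        return (SharedRef, (self.name, self.size))

    def attach(self):
        # Re-open the existing block by name on the receiving side.
        return shared_memory.SharedMemory(name=self.name)

# Create a block, write into it, and ship only a reference through pickle.
block = shared_memory.SharedMemory(create=True, size=16)
block.buf[:5] = b"hello"
ref = pickle.loads(pickle.dumps(SharedRef(block.name, 16)))
view = ref.attach()
assert bytes(view.buf[:5]) == b"hello"  # same memory, no data copied
view.close()
block.close()
block.unlink()
```

In practice such a reference is exactly the kind of thing that could be put on a multiprocessing queue, as suggested above.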
sbt From sturla at molden.no Sun Feb 19 01:59:20 2012 From: sturla at molden.no (Sturla Molden) Date: Sun, 19 Feb 2012 01:59:20 +0100 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: References: <4F404003.4060403@molden.no> Message-ID: <4F404968.5070000@molden.no> On 19.02.2012 01:39, Matt Joiner wrote: > > Yes, channels can allow for this, but as with locks directionality and > ordering matter. Typically messages will only run in a particular > direction. > Actually, it was only a synchronous MPI_Recv that did this in MPI, a synchronous MPI_Send would have been even worse. Which is why MPI got the asynchronous method MPI_Irecv... Sounds like you just want a barrier or a condition primitive. E.g. have the sender call .wait() on a condition and let the receiver call .notify() on the condition. Sturla From ncoghlan at gmail.com Sun Feb 19 02:27:12 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 19 Feb 2012 11:27:12 +1000 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: On Sun, Feb 19, 2012 at 9:57 AM, Nathan Rice wrote: > I enjoy writing python a lot, and would prefer to use it rather than > ruby/lisp/java/etc in most cases. My suggestions come from > frustrations that occur when using python in areas where the right > answer is probably just to use a different language. If I knew that > what I wanted was at odds with the vision for python, I would have > less of an issue just accepting circumstances, and would just get to > work rather than sidetracking discussions on this list.
The core problem comes down to the differences between Guido's original PEP 340 idea (which was much closer in power to Ruby's blocks, since it was a new looping construct that allowed 0-or-more executions of the contained block) and the more constrained with statement that is defined in PEP 343 (which will either execute the body once or throw an exception, distinguishing it clearly from both the looping constructs and if statements). The principle Guido articulated when making that decision was: "different forms of flow control should look different at the point of invocation". So, where a language like Ruby just defines one protocol (callbacks, supplemented by anonymous blocks that run directly in the namespace of the containing function) and uses it for pretty much *all* flow control (including all their loop constructs), Python works the other way around, defining *different* protocols for different patterns of invocation. This provides a gain in readability on the Python side. When you see any of the following in Python:

    @whatever()
    def f():
        pass

    with whatever():
        # Do something!

    for x in whatever():
        # Do something!

It places a lot of constraints on the nature of the object returned by "whatever()" - even without knowing anything else about it, you know the first must return a decorator, the second a context manager, and the third an iterable. If that's all you need to know at this point in time, you don't need to worry about the details - the local syntax tells you the important things you need to know about the flow control. In Ruby, though, all of them (assuming it isn't actually important that the function name be bound locally) could be written like this:

    whatever() do:
        # Do something!
    end

Is it a context manager? An iterable? Some other kind of callback? There's nothing in the syntax to tell you that - you're relying on naming conventions to provide that information (like the ".foreach" convention for iteration methods).
That approach can obviously work (otherwise Ruby wouldn't be as popular as it is), but it *does* make it harder to pick up a piece of code and understand the possible control flows without looking elsewhere. However, this decision to be explicit about flow control for the benefit of the *reader* brings with it a high *cost* on the Python side for the code *author*: where Ruby works by defining a nice syntax and semantics for callback based programming and building other language constructs on top of that, Python *doesn't currently have* a particularly nice general purpose native syntax for callback based programming. Decorators do work in many cases (especially simple callback registration), but they sometimes feel wrong because they're mainly designed to modify how a function is defined, not implement key program flow control constructs. However, their flexibility shouldn't be underestimated, and the CallbackStack API is designed to help Python developers push decorators and context managers closer to those limits *without* needing new language constructs. By decoupling the callback stack from the code layout, it gives you full *programmatic* control of the kinds of things context managers can help with when you know in advance exactly what you want to do. *If* CallbackStack proves genuinely popular (and given the number of proposals I have seen along these lines, and the feedback I have received on ContextStack to date, I expect it will), and people start to develop interesting patterns for using it, *then* we can start looking at the possibility of dedicated syntax to streamline particular use cases (just as the with statement itself was designed to streamline various use cases of the more general try statement). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Sun Feb 19 02:54:26 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 19 Feb 2012 11:54:26 +1000 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On Sat, Feb 18, 2012 at 10:43 AM, Devin Jeanpierre wrote: > I've in the past worked a bit on improving doctest in a fork I > started. Its primary purpose was originally to add Cram-like "shell > doctests" to doctest (see http://pypi.python.org/pypi/cram ), but > since then I started working on other bits here and there. The work > I've done is available at > https://bitbucket.org/devin.jeanpierre/doctest2 (please forgive the > presumptuous name -- I'm considering a rename to "lembas".) > > The reason I've not worked on it recently is that the problems have > gotten harder and my time has run short. I would be very open to > collaboration or forking, although I also understand that a largeish > expansion with redesigned internals created by an overworked student > is probably not the greatest place to start. > > This is all assuming your intentions are to contribute rather than > only suggest. Not that suggestions aren't welcome, I suppose, but > maybe not here. doctest is not actively developed or maintained > anywhere, as far as I know. (I want to say "except by me", because > that'd make me seem all special and so on, but I haven't committed a > thing in months.) > > Mostly, I feel a bit like this thread could accidentally spawn > parallel / duplicated work, so I figured I'd put what I have out here. > Please don't take it for more than it is, doctest2 is still a work in > progress (and, worse, its source code is in the middle of two feature > additions!) > > I definitely hope you help to make the doctest world better. I think > it fills a role that should be filled, and its neglect is unfortunate. Indeed, my apologies for my earlier crankiness (I should know by now to stay away from mailing lists at crazy hours of the morning). 
While it's obviously not the ideal, forking orphaned stdlib modules and publishing new versions on PyPI can be an *excellent* idea. The core development team is generally a fairly conservative bunch, so unless a module has a sufficiently active maintainer that feels entitled to make API design decisions, our default response to proposals is going to be "no". One of the *best* ways to change this is to develop a community around an enhanced version of the module - one of our reasons for switching to a DVCS for our development was to help make it easier for people to extract and merge stdlib updates while maintaining their own versions. Then, when you come to python-ideas to say "Hey, wouldn't this be a good idea?", it's possible to point to the PyPI version and say: - people have tried this and liked it - I've been maintaining this for a while now and would continue to do so for the standard library Some major (current or planned) updates to the Python 3.3 standard library occurred because folks decided the stdlib solutions were not in an acceptable state and set out to improve them (specifically, the packaging package came from the distutils2 fork, which continues as a backport to early Python versions, and MRAB's regex module has been approved for addition, although it hasn't actually been incorporated yet). In the past, other major additions like argparse came about that way. A few other stdlib modules have backports on PyPI by their respective stlib maintainers so we can try out new design concepts *before* committing to supporting them in the standard library. A published version of doctest2 that was designed to be suitable for eventual incorporation back into doctest itself (i.e. by maintaining backwards compatibility) sounds like it would be quite popular, and would route around the fact that enhancing it isn't high on the priority list for the current core development team. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ericsnowcurrently at gmail.com Sun Feb 19 03:27:56 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 18 Feb 2012 19:27:56 -0700 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On Sat, Feb 18, 2012 at 6:54 PM, Nick Coghlan wrote: > While it's obviously not the ideal, forking orphaned stdlib modules > and publishing new versions on PyPI can be an *excellent* idea. The > core development team is generally a fairly conservative bunch, so > unless a module has a sufficiently active maintainer that feels > entitled to make API design decisions, our default response to > proposals is going to be "no". One of the *best* ways to change this > is to develop a community around an enhanced version of the module - > one of our reasons for switching to a DVCS for our development was to > help make it easier for people to extract and merge stdlib updates > while maintaining their own versions. Then, when you come to > python-ideas to say "Hey, wouldn't this be a good idea?", it's > possible to point to the PyPI version and say: > - people have tried this and liked it > - I've been maintaining this for a while now and would continue to do > so for the standard library > > Some major (current or planned) updates to the Python 3.3 standard > library occurred because folks decided the stdlib solutions were not > in an acceptable state and set out to improve them (specifically, the > packaging package came from the distutils2 fork, which continues as a > backport to early Python versions, and MRAB's regex module has been > approved for addition, although it hasn't actually been incorporated > yet). In the past, other major additions like argparse came about that > way. > > A few other stdlib modules have backports on PyPI by their respective > stdlib maintainers so we can try out new design concepts *before* > committing to supporting them in the standard library.
> > A published version of doctest2 that was designed to be suitable for > eventual incorporation back into doctest itself (i.e. by maintaining > backwards compatibility) sounds like it would be quite popular, and > would route around the fact that enhancing it isn't high on the > priority list for the current core development team. Well said, Nick. That's worth putting in the devguide. -eric From guido at python.org Sun Feb 19 05:06:08 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 18 Feb 2012 20:06:08 -0800 Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal In-Reply-To: References: Message-ID: On Sat, Feb 18, 2012 at 5:27 PM, Nick Coghlan wrote: > On Sun, Feb 19, 2012 at 9:57 AM, Nathan Rice > wrote: >> I enjoy writing python a lot, and would prefer to use it rather than >> ruby/lisp/java/etc in most cases. My suggestions come from >> frustrations that occur when using python in areas where the right >> answer is probably just to use a different language. If I knew that >> what I wanted was at odds with the vision for python, I would have >> less of an issue just accepting circumstances, and would just get to >> work rather than sidetracking discussions on this list. > > The core problem comes down to the differences between Guido's > original PEP 340 idea (which was much closer in power to Ruby's > blocks, since it was a new looping construct that allowed 0-or-more > executions of the contained block) and the more constrained with > statement that is defined in PEP 343 (which will either execute the > body once or throw an exception, distinguishing it clearly from both > the looping constructs and if statements). > > The principle Guido articulated when making that decision was: > "different forms of flow control should look different at the point of > invocation".
> > So, where a language like Ruby just defines one protocol (callbacks, > supplemented by anonymous blocks that run directly in the namespace of > the containing function) and uses it for pretty much *all* flow > control (including all their loop constructs), Python works the other > way around, defining *different* protocols for different patterns of > invocation. > > This provides a gain in readability on the Python side. When you see > any of the following in Python: > > @whatever() > def f(): > pass > > with whatever(): > # Do something! > > for x in whatever(): > # Do something! > > It places a lot of constraints on the nature of the object returned by > "whatever()" - even without knowing anything else about it, you know > the first must return a decorator, the second a context manager, and > the third an iterable. If that's all you need to know at this point in > time, you don't need to worry about the details - the local syntax > tells you the important things you need to know about the flow > control. > > In Ruby, though, all of them (assuming it isn't actually important > that the function name be bound locally) could be written like this: > > whatever() do: > # Do something! > end > > Is it a context manager? An iterable? Some other kind of callback? > There's nothing in the syntax to tell you that - you're relying on > naming conventions to provide that information (like the ".foreach" > convention for iteration methods). That approach can obviously work > (otherwise Ruby wouldn't be as popular as it is), but it *does* make > it harder to pick up a piece of code and understand the possible > control flows without looking elsewhere.
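The constraint described in the passage above — that each construct demands a distinct protocol from the object returned by whatever() — can be made concrete with a small editorial sketch; the names logged, managed and countdown are invented for illustration and do not come from the thread.

```python
from contextlib import contextmanager

def logged(func):
    """A decorator: must accept a function and return a callable."""
    def wrapper(*args, **kwargs):
        print("calling", func.__name__)
        return func(*args, **kwargs)
    return wrapper

@contextmanager
def managed():
    """A context manager: must supply __enter__/__exit__ (via generator here)."""
    print("enter")
    yield "resource"
    print("exit")

def countdown(n):
    """An iterable: must supply __iter__ (generator functions do)."""
    while n:
        yield n
        n -= 1

@logged
def f():
    return 42

with managed() as resource:
    assert resource == "resource"

assert f() == 42
assert sum(countdown(3)) == 6  # 3 + 2 + 1
```

Swapping any of these objects into the wrong construct fails immediately, which is exactly the local-syntax guarantee being described.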
> However, this decision to be explicit about flow control for the benefit of the *reader* brings with it a high *cost* on the Python side for the code *author*: where Ruby works by defining a nice syntax and semantics for callback based programming and building other language constructs on top of that, Python *doesn't currently have* a particularly nice general purpose native syntax for callback based programming.
>
> Decorators do work in many cases (especially simple callback registration), but they sometimes feel wrong because they're mainly designed to modify how a function is defined, not implement key program flow control constructs. However, their flexibility shouldn't be underestimated, and the CallbackStack API is designed to help Python developers push decorators and context managers closer to those limits *without* needing new language constructs. By decoupling the callback stack from the code layout, it gives you full *programmatic* control of the kinds of things context managers can help with when you know in advance exactly what you want to do.
>
> *If* CallbackStack proves genuinely popular (and given the number of proposals I have seen along these lines, and the feedback I have received on ContextStack to date, I expect it will), and people start to develop interesting patterns for using it, *then* we can start looking at the possibility of dedicated syntax to streamline particular use cases (just as the with statement itself was designed to streamline various use cases of the more general try statement).

Very lucid explanation, Nick. (I also liked your blog post that you referenced in a previous message, which touches upon the same issues.)

Apparently I don't seem to like flow control constructs formed by "quoting" (in Lisp terms) a block of code and leaving its execution to some other party, with the exception of explicit function definitions.
Maybe a computer-literate psychoanalyst can do something with this...

To this day I am having trouble liking event-based architectures -- I do see a need for them, but I immediately want to hide their mechanisms and offer a *different* mechanism for most use cases. See e.g. the (non-thread-based) async functionality I added to the new App Engine datastore client, NDB: https://docs.google.com/document/pub?id=1LhgEnZXAI8xiEkFA4tta08Hyn5vo4T6HSGLFVrP0Jag . Deep down inside it has an event loop, but this is hidden by using Futures, which in turn are mostly wrapped in tasklets, i.e. yield-based coroutines. I expect that if I were to find a use for Twisted, I'd do most of my coding using its so-called inlineCallbacks mechanism (also yield-based coroutines). When I first saw Monocle, which offers a simplified coroutine-based API on top of (amongst others) Twisted, I thought it was a breath of fresh air (NDB is heavily influenced by it).

I've probably (implicitly) trained most key Python developers and users to think similarly, and Python isn't likely to morph into Ruby any time soon. It's easy enough to write an event-based architecture in Python (see Twisted and Tornado); but an event loop is never going to be the standard way to solve all your programming problems in Python.

I do kind of like the 'defer' idea that started this thread (even if I had syntactic quibbles with it that already came up before the thread was derailed), but I notice that it is a far cry from an event-driven architecture -- like the referenced counterparts in Go and D, 'defer' blocks are not anonymous functions that can be passed off to arbitrary other libraries for possibly later and/or repeated execution -- they are a way to specify out-of-order execution within the current scope, which "tames" them enough to be acceptable from my perspective.
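(To illustrate, with purely hypothetical code: a Go-style 'defer' can be approximated with a plain callback list and try/finally -- the cleanup is registered right next to the resource it guards, but runs at scope exit, in LIFO order.)

```python
def copy_first_line(src_path, dst_path):
    deferred = []            # callbacks to run when the scope exits, LIFO
    defer = deferred.append
    try:
        src = open(src_path)
        defer(src.close)     # registered right next to the open() it guards
        dst = open(dst_path, "w")
        defer(dst.close)
        dst.write(src.readline())
    finally:
        # run the deferred cleanups in reverse registration order
        for callback in reversed(deferred):
            callback()
```

This is exactly "rearranging the code somewhat and carefully using try/finally" -- the deferral stays within the current scope and is never handed to another library for later or repeated execution.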
Though they may also not be powerful enough to be convincing as a new feature, since you can do everything they can do by rearranging the code of your function somewhat and carefully using try/finally.

-- 
--Guido van Rossum (python.org/~guido)

From nathan.alexander.rice at gmail.com  Sun Feb 19 05:31:31 2012
From: nathan.alexander.rice at gmail.com (Nathan Rice)
Date: Sat, 18 Feb 2012 23:31:31 -0500
Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal
In-Reply-To: 
References: 
Message-ID: 

> The core problem comes down to the differences between Guido's original PEP 340 idea (which was much closer in power to Ruby's blocks, since it was a new looping construct that allowed 0-or-more executions of the contained block) and the more constrained with statement that is defined in PEP 343 (which will either execute the body once or throw an exception, distinguishing it clearly from both the looping constructs and if statements).
>
> The principle Guido articulated when making that decision was: "different forms of flow control should look different at the point of invocation".
>
> So, where a language like Ruby just defines one protocol (callbacks, supplemented by anonymous blocks that run directly in the namespace of the containing function) and uses it for pretty much *all* flow control (including all their loop constructs), Python works the other way around, defining *different* protocols for different patterns of invocation.
>
> This provides a gain in readability on the Python side. When you see any of the following in Python:
>
>    @whatever()
>    def f():
>        pass
>
>    with whatever():
>        # Do something!
>
>    for x in whatever():
>        # Do something!
>
> It places a lot of constraints on the nature of the object returned by "whatever()" - even without knowing anything else about it, you know the first must return a decorator, the second a context manager, and the third an iterable.
> If that's all you need to know at this point in time, you don't need to worry about the details - the local syntax tells you the important things you need to know about the flow control.

I can appreciate the intention there. That particular case isn't as big a deal from my perspective; my non-local code pain points tend to be centered around boneheaded uses of inheritance and dynamic modification of classes.

> In Ruby, though, all of them (assuming it isn't actually important that the function name be bound locally) could be written like this:
>
>    whatever() do:
>        # Do something!
>    end
>
> Is it a context manager? An iterable? Some other kind of callback? There's nothing in the syntax to tell you that - you're relying on naming conventions to provide that information (like the ".foreach" convention for iteration methods). That approach can obviously work (otherwise Ruby wouldn't be as popular as it is), but it *does* make it harder to pick up a piece of code and understand the possible control flows without looking elsewhere.

More often than not, when I am reading other people's code, I am debugging it (and thus have local/global context information) or just interested in nailing down a poorly documented corner of an API. I think if I were regularly in the habit of working with lots of undocumented code I would probably appreciate this more.

> However, this decision to be explicit about flow control for the benefit of the *reader* brings with it a high *cost* on the Python side for the code *author*: where Ruby works by defining a nice syntax and semantics for callback based programming and building other language constructs on top of that, Python *doesn't currently have* a particularly nice general purpose native syntax for callback based programming.
> Decorators do work in many cases (especially simple callback registration), but they sometimes feel wrong because they're mainly designed to modify how a function is defined, not implement key program flow control constructs. However, their flexibility shouldn't be underestimated, and the CallbackStack API is designed to help Python developers push decorators and context managers closer to those limits *without* needing new language constructs. By decoupling the callback stack from the code layout, it gives you full *programmatic* control of the kinds of things context managers can help with when you know in advance exactly what you want to do.
>
> *If* CallbackStack proves genuinely popular (and given the number of proposals I have seen along these lines, and the feedback I have received on ContextStack to date, I expect it will), and people start to develop interesting patterns for using it, *then* we can start looking at the possibility of dedicated syntax to streamline particular use cases (just as the with statement itself was designed to streamline various use cases of the more general try statement).

I wasn't suggesting syntax needs to change necessarily, all the pieces are already there. I see it more along the lines of function.Event, class.Event, module.Event, context_manager.Event, etc. It is a moot point though.

Thank you again for taking the time to clarify the rationale for me. It wasn't intuitive to me because it does not really address issues I have.

From nathan.alexander.rice at gmail.com  Sun Feb 19 06:14:37 2012
From: nathan.alexander.rice at gmail.com (Nathan Rice)
Date: Sun, 19 Feb 2012 00:14:37 -0500
Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal
In-Reply-To: 
References: 
Message-ID: 

> Apparently I don't seem to like flow control constructs formed by "quoting" (in Lisp terms) a block of code and leaving its execution to some other party, with the exception of explicit function definitions.
> Maybe a computer-literate psychoanalyst can do something with this...
>
> To this day I am having trouble liking event-based architectures -- I do see a need for them, but I immediately want to hide their mechanisms and offer a *different* mechanism for most use cases. See e.g. the (non-thread-based) async functionality I added to the new App Engine datastore client, NDB: https://docs.google.com/document/pub?id=1LhgEnZXAI8xiEkFA4tta08Hyn5vo4T6HSGLFVrP0Jag . Deep down inside it has an event loop, but this is hidden by using Futures, which in turn are mostly wrapped in tasklets, i.e. yield-based coroutines. I expect that if I were to find a use for Twisted, I'd do most of my coding using its so-called inlineCallbacks mechanism (also yield-based coroutines). When I first saw Monocle, which offers a simplified coroutine-based API on top of (amongst others) Twisted, I thought it was a breath of fresh air (NDB is heavily influenced by it).

The main attraction of events for me is that they are a decent model of computational flow that makes it easy to "reach into" other people's code. I won't argue against the statement that they can be less clear or convenient to work with in some cases than other mechanisms. My personal preference would be to have the more powerful mechanism as the underlying technology, and build simpler abstractions on top of that (kind of like @property vs manually creating a descriptor).

> I've probably (implicitly) trained most key Python developers and users to think similarly, and Python isn't likely to morph into Ruby any time soon. It's easy enough to write an event-based architecture in Python (see Twisted and Tornado); but an event loop is never going to be the standard way to solve all your programming problems in Python.

I agree that events can make code harder to follow in some cases. I feel the same way about message passing and channels versus method invocation.
In both cases I think there is an argument to be made for representing the simpler techniques as special cases which are emphasized for general use. I also understand not wanting to be stuck dealing with someone else's event or message passing fetish when it's not necessary (and they often aren't), and that is certainly a fair counterargument.

Thank you for clarifying your views somewhat, it was instructive. I enjoy writing python code in general, but I shouldn't let that lead me astray when it isn't the right tool for the job.

Take care,

Nathan

From anacrolix at gmail.com  Sun Feb 19 08:36:48 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Sun, 19 Feb 2012 15:36:48 +0800
Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal
In-Reply-To: 
References: 
Message-ID: 

Out of interest, do you see an alternative to events or message passing when they _are_ required? I'm in Guido's apparently minority camp in that I can't stand events. The only decent alternative I've seen is message passing.

On Feb 19, 2012 1:15 PM, "Nathan Rice" wrote:
> > Apparently I don't seem to like flow control constructs formed by "quoting" (in Lisp terms) a block of code and leaving its execution to some other party, with the exception of explicit function definitions. Maybe a computer-literate psychoanalyst can do something with this...
> >
> > To this day I am having trouble liking event-based architectures -- I do see a need for them, but I immediately want to hide their mechanisms and offer a *different* mechanism for most use cases. See e.g. the (non-thread-based) async functionality I added to the new App Engine datastore client, NDB: https://docs.google.com/document/pub?id=1LhgEnZXAI8xiEkFA4tta08Hyn5vo4T6HSGLFVrP0Jag . Deep down inside it has an event loop, but this is hidden by using Futures, which in turn are mostly wrapped in tasklets, i.e. yield-based coroutines.
> > I expect that if I were to find a use for Twisted, I'd do most of my coding using its so-called inlineCallbacks mechanism (also yield-based coroutines). When I first saw Monocle, which offers a simplified coroutine-based API on top of (amongst others) Twisted, I thought it was a breath of fresh air (NDB is heavily influenced by it).
>
> The main attraction of events for me is that they are a decent model of computational flow that makes it easy to "reach into" other people's code. I won't argue against the statement that they can be less clear or convenient to work with in some cases than other mechanisms. My personal preference would be to have the more powerful mechanism as the underlying technology, and build simpler abstractions on top of that (kind of like @property vs manually creating a descriptor).
>
> > I've probably (implicitly) trained most key Python developers and users to think similarly, and Python isn't likely to morph into Ruby any time soon. It's easy enough to write an event-based architecture in Python (see Twisted and Tornado); but an event loop is never going to be the standard way to solve all your programming problems in Python.
>
> I agree that events can make code harder to follow in some cases. I feel the same way about message passing and channels versus method invocation. In both cases I think there is an argument to be made for representing the simpler techniques as special cases which are emphasized for general use. I also understand not wanting to be stuck dealing with someone else's event or message passing fetish when it's not necessary (and they often aren't), and that is certainly a fair counterargument.
>
> Thank you for clarifying your views somewhat, it was instructive. I enjoy writing python code in general, but I shouldn't let that lead me astray when it isn't the right tool for the job.
>
> Take care,
>
> Nathan
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From storchaka at gmail.com  Sun Feb 19 10:25:39 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 19 Feb 2012 11:25:39 +0200
Subject: [Python-ideas] ScopeGuardStatement/Defer Proposal
In-Reply-To: <4F4039D9.2010909@pearwood.info>
References: <4F4039D9.2010909@pearwood.info>
Message-ID: 

19.02.12 01:52, Steven D'Aprano wrote:
> I hope not. I like Pascal. It has nice, clean syntax (if a tad verbose, with the BEGIN/END tags) and straight-forward, simple semantics. Standard Pascal is somewhat lacking (e.g. no strings) but who uses standard Pascal?

Python is not Pascal. For me it is the BASIC of nowadays. Really basic, simple and clear (even for non-specialists) language. Not old BASIC with line numbers, GOTO, GOSUB and 1- or 2-symbol identifiers, but a modern language with modules, structured programming, powerful basic data structures, OOP, first-class functions, automatic resource management etc.

From cs at zip.com.au  Sun Feb 19 11:05:12 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Sun, 19 Feb 2012 21:05:12 +1100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F404968.5070000@molden.no>
References: <4F404968.5070000@molden.no>
Message-ID: <20120219100512.GA31747@cskk.homeip.net>

On 19Feb2012 01:59, Sturla Molden wrote:
| On 19.02.2012 01:39, Matt Joiner wrote:
| > Yes, channels can allow for this, but as with locks directionality and ordering matter. Typically messages will only run in a particular direction.
|
| Actually, it was only a synchronous MPI_Recv that did this in MPI, a synchronous MPI_Send would have been even worse. Which is why MPI got the asynchronous method MPI_Irecv...
|
| Sounds like you just want a barrier or a condition primitive. E.g. have the sender call .wait() on a condition and let the receiver call .notify() the condition.

A condition is essentially a boolean (with waiting).
A channel is a value passing mechanism.
Sometimes you really do want a zero-storage Queue i.e. a channel.

Saying "but you could put a value in a shared variable and just use a condition" removes the abstraction/metaphor. If I was thinking that way more than once in some code I'd write a small class to do that. And it would be a channel!

Seriously, a channel is semantically equivalent to a zero-storage Queue, which is a mode not provided by the current Queue implementation.
--
Cameron Simpson DoD#743
http://www.cskk.ezoshosting.com/cs/

No good deed shall go unpunished! - David Wood

From jeanpierreda at gmail.com  Sun Feb 19 16:18:58 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 19 Feb 2012 10:18:58 -0500
Subject: [Python-ideas] doctest
In-Reply-To: 
References: 
Message-ID: 

On Sat, Feb 18, 2012 at 8:54 PM, Nick Coghlan wrote:
> A published version of doctest2 that was designed to be suitable for eventual incorporation back into doctest itself (i.e. by maintaining backwards compatibility) sounds like it would be quite popular, and would route around the fact that enhancing it isn't high on the priority list for the current core development team.

Heh, "quite popular". Whenever I mention doctest2, people think of doctest. And apparently people really dislike doctest. The way I try to address the immediate fear response is, "sure, doctest is terrible -- why do you think I'm forking it? ;)"; however, I think popularity would be difficult outside of the existing doctest user base.

P.S., some uninvited advice to would-be forkers:

- Make the starting commit of your repository identical to the original module that you're forking, to make tracking the original module easier.
- On that note, also write down the hg revision of the module that you're forking so that you can find later changes.
- Immediately change the name of your forked module so that unit tests only run against it rather than accidentally testing the original module. (Also, delete the original from your Python to be sure you edited the test cases right too. And, uh, don't forget the pyc.)

Maybe these are obvious to everyone else, but I'd never forked anything before, and so I made all those mistakes. The first dozen or two commits are full of sad things.

-- Devin

From guido at python.org  Sun Feb 19 16:53:14 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 19 Feb 2012 07:53:14 -0800
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <20120219100512.GA31747@cskk.homeip.net>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net>
Message-ID: 

How hard would it be to add Channel to the stdlib? Perhaps even in the threading module, which already has a bunch of different primitives like Lock, RLock, Condition, Event, Semaphore, Barrier.

On Sun, Feb 19, 2012 at 2:05 AM, Cameron Simpson wrote:
> On 19Feb2012 01:59, Sturla Molden wrote:
> | On 19.02.2012 01:39, Matt Joiner wrote:
> | > Yes, channels can allow for this, but as with locks directionality and ordering matter. Typically messages will only run in a particular direction.
> |
> | Actually, it was only a synchronous MPI_Recv that did this in MPI, a synchronous MPI_Send would have been even worse. Which is why MPI got the asynchronous method MPI_Irecv...
> |
> | Sounds like you just want a barrier or a condition primitive. E.g. have the sender call .wait() on a condition and let the receiver call .notify() the condition.
>
> A condition is essentially a boolean (with waiting).
> A channel is a value passing mechanism.
> Sometimes you really do want a zero-storage Queue i.e. a channel.
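(An aside, in hypothetical code: today's stdlib can approximate the zero-storage handoff by pairing put() with join(), so the producer blocks until the consumer has called task_done() -- not a true channel type, but the closest existing mode.)

```python
import queue
import threading

def rendezvous_put(q, item):
    """Producer side of an approximate rendezvous: block until the
    consumer has acknowledged the item with task_done()."""
    q.put(item)
    q.join()           # returns once the unfinished-task count drops to zero

received = []

def consumer(q):
    item = q.get()
    received.append(item)
    q.task_done()      # releases the producer blocked in q.join()

q = queue.Queue()
t = threading.Thread(target=consumer, args=(q,))
t.start()
rendezvous_put(q, "hello")   # blocks until the consumer acknowledges
t.join()
```

Note the acknowledgement only says get()/task_done() happened, not that anything was done with the value -- which is part of why the Queue class never grew a true zero-storage mode.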
>
> Saying "but you could put a value in a shared variable and just use a condition" removes the abstraction/metaphor. If I was thinking that way more than once in some code I'd write a small class to do that. And it would be a channel!
>
> Seriously, a channel is semantically equivalent to a zero-storage Queue, which is a mode not provided by the current Queue implementation.
> --
> Cameron Simpson DoD#743
> http://www.cskk.ezoshosting.com/cs/
>
> No good deed shall go unpunished! - David Wood
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

--
--Guido van Rossum (python.org/~guido)

From sturla at molden.no  Sun Feb 19 16:58:07 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 16:58:07 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: 
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net>
Message-ID: <4F411C0F.6030008@molden.no>

On 19.02.2012 16:53, Guido van Rossum wrote:
> How hard would it be to add Channel to the stdlib?

It might take 10 lines of code...

Sturla

From solipsis at pitrou.net  Sun Feb 19 17:01:03 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 19 Feb 2012 17:01:03 +0100
Subject: [Python-ideas] channel (synchronous queue)
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no>
Message-ID: <20120219170103.17f5a8b9@pitrou.net>

On Sun, 19 Feb 2012 16:58:07 +0100 Sturla Molden wrote:
> On 19.02.2012 16:53, Guido van Rossum wrote:
> > How hard would it be to add Channel to the stdlib?
>
> It might take 10 lines of code...

Even for multiprocessing? (I realize we didn't implement a Barrier in multiprocessing; patches welcome :-))

Regards

Antoine.
From sturla at molden.no  Sun Feb 19 17:58:53 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 17:58:53 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <20120219170103.17f5a8b9@pitrou.net>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net>
Message-ID: <4F412A4D.4090102@molden.no>

On 19.02.2012 17:01, Antoine Pitrou wrote:
> Even for multiprocessing? (I realize we didn't implement a Barrier in multiprocessing; patches welcome :-)) Regards Antoine.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

Here is a skeleton (replace self._data with some IPC mechanism for multiprocessing).

I'll post a barrier for multiprocessing asap, I happen to have one ;-)

Sturla


from threading import Lock, Event

class Channel(object):

    def __init__(self):
        self._writelock = Lock()
        self._readlock = Lock()
        self._new_data = Event()
        self._recv_data = Event()
        self._data = None

    def put(self, msg):
        with self._writelock:
            self._data = msg
            self._new_data.set()
            self._recv_data.wait()
            self._recv_data.clear()

    def get(self):
        with self._readlock:
            self._new_data.wait()
            msg = self._data
            self._data = None
            self._new_data.clear()
            self._recv_data.set()
            return msg

if __name__ == "__main__":

    from threading import Thread
    from sys import stdout

    def thread2(channel):
        for i in range(1000):
            msg = channel.get()
            stdout.flush()
            print "Thread 2 received '%s'\n" % msg,
            stdout.flush()

    def thread1(channel):
        for i in range(1000):
            stdout.flush()
            print "Thread 1 preparing to send 'message %d'\n" % i,
            stdout.flush()
            msg = channel.put(("message %d" % i,))
            stdout.flush()
            print "Thread 1 finished sending 'message %d'\n" % i,
            stdout.flush()

    channel = Channel()
    t2 = Thread(target=thread2, args=(channel,))
    t2.start()
    thread1(channel)
    t2.join()

From sturla at molden.no  Sun Feb 19
18:05:57 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 18:05:57 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F412A4D.4090102@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no>
Message-ID: <4F412BF5.90504@molden.no>

On 19.02.2012 17:58, Sturla Molden wrote:
> On 19.02.2012 17:01, Antoine Pitrou wrote:
>
>> Even for multiprocessing? (I realize we didn't implement a Barrier in multiprocessing; patches welcome :-)) Regards Antoine.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
> Here is a skeleton (replace self._data with some IPC mechanism for multiprocessing).
>
> I'll post a barrier for multiprocessing asap, I happen to have one ;-)

import multiprocessing as mp
from math import ceil, log

class Barrier(object):

    def __init__(self, numproc):
        self._events = [mp.Event() for n in range(numproc**2)]
        self._numproc = numproc

    def wait(self, rank):
        # loop log2(numproc) times, rounding up
        for k in range(int(ceil(log(self._numproc)/log(2)))):
            # send event to process (rank + 2**k) % numproc
            receiver = (rank + 2**k) % self._numproc
            evt = self._events[rank * self._numproc + receiver]
            evt.set()
            # wait for event from process (rank - 2**k) % numproc
            sender = (rank - 2**k) % self._numproc
            evt = self._events[sender * self._numproc + rank]
            evt.wait()
            evt.clear()

From solipsis at pitrou.net  Sun Feb 19 18:18:52 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 19 Feb 2012 18:18:52 +0100
Subject: [Python-ideas] channel (synchronous queue)
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no>
Message-ID:
<20120219181852.1bce007c@pitrou.net>

On Sun, 19 Feb 2012 17:58:53 +0100 Sturla Molden wrote:
>
> def put(self, msg):
>     with self._writelock:
>         self._data = msg
>         self._new_data.set()
>         self._recv_data.wait()
>         self._recv_data.clear()

This begs the question: what does it achieve? You know that the data has been "received" on the other side (i.e. get() has been called), but this doesn't tell you anything was done with the data, so: why is this a useful way to synchronize?

Regards

Antoine.

From sturla at molden.no  Sun Feb 19 18:27:23 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 18:27:23 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <20120219181852.1bce007c@pitrou.net>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net>
Message-ID: <4F4130FB.4040204@molden.no>

On 19.02.2012 18:18, Antoine Pitrou wrote:
> This begs the question: what does it achieve?
> You know that the data has been "received" on the other side (i.e. get() has been called), but this doesn't tell you anything was done with the data, so: why is this a useful way to synchronize?

I think it achieves nothing, except making deadlocks more likely.

Which is to say, I just wanted to prove how ridiculously simple Matt Joiner's complaint about a "channel" was.

The multiprocessing barrier on the other hand is quite useful. (Though the butterfly method is not the most efficient implementation of a barrier.)

Sturla

From sturla at molden.no  Sun Feb 19 18:43:58 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 18:43:58 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F413322.2070703@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no>
Message-ID: <4F4134DE.5070609@molden.no>

On 19.02.2012 18:36, Sturla Molden wrote:
>
> The multiprocessing barrier on the other hand is quite useful. (Though the butterfly method is not the most efficient implementation of a barrier.)

Oops... it is the dissemination barrier, not the butterfly barrier. It scales better for non-power-of-two numbers of threads.
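(As a sanity check on the dissemination pattern, in hypothetical code: in round k each rank signals (rank + 2**k) % n, and after ceil(log2(n)) rounds every process has transitively heard from every other -- including when n is not a power of two.)

```python
from math import ceil, log2

def dissemination_rounds(n):
    # number of rounds a dissemination barrier needs for n processes
    return int(ceil(log2(n))) if n > 1 else 0

def informed_sets(n):
    # simulate which ranks each process has transitively heard from
    knows = [{rank} for rank in range(n)]
    for k in range(dissemination_rounds(n)):
        nxt = [set(s) for s in knows]
        for rank in range(n):
            receiver = (rank + 2 ** k) % n   # rank signals this process
            nxt[receiver] |= knows[rank]
        knows = nxt
    return knows
```

After the final round every set has size n, which is why no process can leave the barrier before every other process has entered it.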
Sturla

From guido at python.org  Sun Feb 19 18:44:00 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 19 Feb 2012 09:44:00 -0800
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F413322.2070703@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no>
Message-ID: 

On Sun, Feb 19, 2012 at 9:36 AM, Sturla Molden wrote:
> On 19.02.2012 18:27, Sturla Molden wrote:
>> On 19.02.2012 18:18, Antoine Pitrou wrote:
>>> This begs the question: what does it achieve? You know that the data has been "received" on the other side (i.e. get() has been called), but this doesn't tell you anything was done with the data, so: why is this a useful way to synchronize?
>>
>> I think it achieves nothing, except making deadlocks more likely.
>
> Which is to say, I just wanted to prove how ridiculously simple Matt Joiner's complaint about a "channel" was.

I may be taking this out of context, but I have a really hard time understanding what you were trying to say. What does it mean for a complaint to be simple? Did you leave out a word in haste? (I know that happens a lot to me. :-)

> The multiprocessing barrier on the other hand is quite useful. (Though the butterfly method is not the most efficient implementation of a barrier.)

Glad to see some real code. It's probably time to move the code samples to the bug tracker where they can be reviewed and have a chance of getting incorporated into the next release.

-- 
--Guido van Rossum (python.org/~guido)

From sturla at molden.no  Sun Feb 19 19:01:31 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 19:01:31 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: 
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no>
Message-ID: <4F4138FB.8010103@molden.no>

On 19.02.2012 18:44, Guido van Rossum wrote:
> I may be taking this out of context, but I have a really hard time understanding what you were trying to say. What does it mean for a complaint to be simple? Did you leave out a word in haste? (I know that happens a lot to me. :-)

Sorry for the rude language. I meant I think it is a problem that does not belong in the standard library, but perhaps in a cookbook. It is ~20 lines of trivial code with objects already in the standard library.

Well, one could say the same thing about a queue too (it's just a deque and a lock), but it is very useful and commonly used, so there is a difference.

Sturla

From shibturn at gmail.com  Sun Feb 19 19:07:21 2012
From: shibturn at gmail.com (shibturn)
Date: Sun, 19 Feb 2012 18:07:21 +0000
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F412BF5.90504@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <4F412BF5.90504@molden.no>
Message-ID: 

On 19/02/2012 5:05pm, Sturla Molden wrote:
> from multiprocessing import Event
> from math import ceil, log
> ...

I presume rank is the index of the process? Sounds very MPIish.

One problem: multiprocessing's Event uses 5 semaphores. (Condition uses 4 and Lock, RLock, Semaphore use 1.) So your Barrier will use 5*numproc**2 semaphores.
--
--Guido van Rossum (python.org/~guido)

From sturla at molden.no  Sun Feb 19 19:01:31 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 19:01:31 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To:
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no>
Message-ID: <4F4138FB.8010103@molden.no>

Den 19.02.2012 18:44, skrev Guido van Rossum:
> I may be taking this out of context, but I have a really hard time
> understanding what you were trying to say. What does it mean for a
> complaint to be simple? Did you leave out a word in haste? (I know
> that happens a lot to me. :-)

Sorry for the rude language. I meant I think it is a problem that does not belong in the standard library, but perhaps in a cookbook. It is ~20 lines of trivial code with objects already in the standard library. Well, one could say the same thing about a queue too (it's just deque and a lock), but it is very useful and commonly used, so there is a difference.

Sturla

From shibturn at gmail.com  Sun Feb 19 19:07:21 2012
From: shibturn at gmail.com (shibturn)
Date: Sun, 19 Feb 2012 18:07:21 +0000
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F412BF5.90504@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <4F412BF5.90504@molden.no>
Message-ID:

On 19/02/2012 5:05pm, Sturla Molden wrote:
> from multiprocessing import Event
> from math import ceil, log
> ...

I presume rank is the index of the process? Sounds very MPIish.

One problem is that multiprocessing's Event uses 5 semaphores. (Condition uses 4 and Lock, RLock, Semaphore use 1). So your Barrier will use 5*numproc semaphores. This is likely to be a problem for those Unixes (such as oldish versions of FreeBSD) which allow a very limited number of semaphores.

It would probably be better to use something which has an API which is a closer match to threading.Barrier. The code below gets closer in API but does not implement reset() (which I think is pretty pointless anyway), and wait() returns None instead of an index. It is not properly tested though.

import multiprocessing as mp

class BrokenBarrierError(Exception):
    pass

class Barrier(object):

    def __init__(self, size):
        assert size > 0
        self.size = size
        self._lock = mp.Lock()
        self._entry_sema = mp.Semaphore(size-1)
        self._exit_sema = mp.Semaphore(0)
        self._broken_sema = mp.BoundedSemaphore(1)

    def wait(self, timeout=None):
        if self.broken:
            raise BrokenBarrierError
        try:
            if self._entry_sema.acquire(timeout=0):
                if not self._exit_sema.acquire(timeout=timeout):
                    self.abort()
            else:
                for i in range(self.size-1):
                    self._exit_sema.release()
                for i in range(self.size-1):
                    self._entry_sema.release()
        except:
            self.abort()
            raise
        if self.broken:
            raise BrokenBarrierError

    def abort(self):
        with self._lock:
            self._broken_sema.acquire(timeout=5)
            for i in range(self.size):
                self._entry_sema.release()
                self._exit_sema.release()

    def reset(self):
        raise NotImplementedError

    @property
    def broken(self):
        with self._lock:
            if not self._broken_sema.acquire(timeout=0):
                return True
            self._broken_sema.release()
            return False

##

import time, random

def child(b, l):
    for i in range(5):
        time.sleep(random.random()*5)
        with l:
            print i, "entering barrier:", mp.current_process().name
        b.wait()
        with l:
            print '\t', i, "exiting barrier:", mp.current_process().name

if __name__ == '__main__':
    b = Barrier(5)
    l = mp.Lock()
    for i in range(5):
        mp.Process(target=child, args=(b,l)).start()
    time.sleep(10)
    print("ABORTING")
    b.abort()

From luoyonggang at gmail.com  Sun Feb 19 19:08:13 2012
From: luoyonggang at gmail.com (=?UTF-8?B?572X5YuH5YiaKFlvbmdnYW5nIEx1bykg?=)
Date: Mon, 20 Feb 2012 02:08:13 +0800
Subject:
[Python-ideas] Lack a API WCHAR* PyUnicode_WCHAR_DATA(PyObject *o), This is important for OS dependent feature port.
Message-ID:

Py_UCS1* PyUnicode_1BYTE_DATA(PyObject *o)
Py_UCS2* PyUnicode_2BYTE_DATA(PyObject *o)
Py_UCS4* PyUnicode_4BYTE_DATA(PyObject *o)

Return a pointer to the canonical representation cast to UCS1, UCS2 or UCS4 integer types for direct character access. No checks are performed if the canonical representation has the correct character size; use PyUnicode_KIND() to select the right macro. Make sure PyUnicode_READY() has been called before accessing this. New in version 3.3.

PyUnicode_WCHAR_KIND
PyUnicode_1BYTE_KIND
PyUnicode_2BYTE_KIND
PyUnicode_4BYTE_KIND

Return values of the PyUnicode_KIND() macro. New in version 3.3.

int PyUnicode_KIND(PyObject *o)

Return one of the PyUnicode kind constants (see above) that indicate how many bytes per character this Unicode object uses to store its data. o has to be a Unicode object in the 'canonical' representation (not checked). New in version 3.3.

--
Yours sincerely,
Yonggang Luo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From p.f.moore at gmail.com  Sun Feb 19 19:08:41 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 19 Feb 2012 18:08:41 +0000
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F4138FB.8010103@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no>
Message-ID:

On 19 February 2012 18:01, Sturla Molden wrote:
> It is ~20 lines
> of trivial code with objects already in the standard library. Well, one
> could say the same thing about a queue too (it's just deque and a lock), but
> it is very useful and commonly used, so there is a difference.
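Sturla's parenthetical that a queue is "just deque and a lock" corresponds to roughly the following minimal sketch (illustrative only; the stdlib Queue additionally provides maxsize, task tracking, and multiple condition variables):

```python
import collections
import threading

class MiniQueue(object):
    """A minimal blocking FIFO: a deque guarded by a condition variable."""

    def __init__(self):
        self._items = collections.deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()          # wake one waiting consumer

    def get(self):
        with self._cond:
            while not self._items:       # guard against spurious wakeups
                self._cond.wait()
            return self._items.popleft()
```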
FWIW, I wouldn't have got this code right if I'd tried to write it. I'd have missed a lock or something. So it's possible that having it in the standard library avoids people like me writing buggy implementations. On the other hand, I can't imagine ever needing to use a channel object like this, so it would probably be worth having some real-world use cases to justify it.

Paul

From sturla at molden.no  Sun Feb 19 19:18:04 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 19:18:04 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To:
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <4F412BF5.90504@molden.no>
Message-ID: <4F413CDC.30707@molden.no>

Den 19.02.2012 19:07, skrev shibturn:
>
> One problem is that multiprocessing's Event uses 5 semaphores.
> (Condition uses 4 and Lock, RLock, Semaphore use 1). So your Barrier
> will use 5*numproc semaphores. This is likely to be a problem for
> those Unixes (such as oldish versions of FreeBSD) which allow a very
> limited number of semaphores.

I actually overallocated the number of events; only O(n log n) should be needed. So a dict could have been used for sparse storage instead. Still that is a lot of semaphores.

Sturla

From sturla at molden.no  Sun Feb 19 19:29:00 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 19:29:00 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To:
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <4F412BF5.90504@molden.no>
Message-ID: <4F413F6C.9000806@molden.no>

Den 19.02.2012 19:07, skrev shibturn:
>
> One problem is that multiprocessing's Event uses 5 semaphores.
> (Condition uses 4 and Lock, RLock, Semaphore use 1). So your Barrier
> will use 5*numproc semaphores.
It is of course trivial to implement a dissemination barrier in C, atomic read/write (and shared memory for multiprocessing). It would take O(n log2 n) amount of shared memory. One iteration of .wait() would take O(log2 n) time.

Sturla

From solipsis at pitrou.net  Sun Feb 19 19:29:21 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 19 Feb 2012 19:29:21 +0100
Subject: [Python-ideas] Lack a API WCHAR* PyUnicode_WCHAR_DATA(PyObject *o), This is important for OS dependent feature port.
References:
Message-ID: <20120219192921.0e366a41@pitrou.net>

> Lack a API WCHAR* PyUnicode_WCHAR_DATA(PyObject *o), This is
> important for OS dependent feature port.

Why can't you use the existing wchar_t functions:
http://docs.python.org/dev/c-api/unicode.html#wchar-t-support
?

Regards

Antoine.

From shibturn at gmail.com  Sun Feb 19 19:46:22 2012
From: shibturn at gmail.com (shibturn)
Date: Sun, 19 Feb 2012 18:46:22 +0000
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To:
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <4F412BF5.90504@molden.no>
Message-ID:

On 19/02/2012 6:07pm, shibturn wrote:
> 5*numproc semaphores. This is likely to be a problem for those Unixes
  ^^^^^^^^^
5*numproc**2

sbt

From guido at python.org  Sun Feb 19 20:04:45 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 19 Feb 2012 11:04:45 -0800
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To:
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no>
Message-ID:

On Sun, Feb 19, 2012 at 10:08 AM, Paul Moore wrote:
> On 19 February 2012 18:01, Sturla Molden wrote:
>> It is ~20 lines
>> of trivial code with objects already in the standard library. Well, one
>> could say the same thing about a queue too (it's just deque and a lock), but
>> it is very useful and commonly used, so there is a difference.
>
> FWIW, I wouldn't have got this code right if I'd tried to write it.
> I'd have missed a lock or something. So it's possible that having it
> in the standard library avoids people like me writing buggy
> implementations.

It would also encourage using it as an interface between libraries with different authors, which would not happen if it was just a recipe -- every author would implement their own version of the recipe, and they would not be API-compatible even if they did the same thing. Many of the existing primitives in threading.py are very simple combinations of the basic Lock; but that doesn't make it less valuable to have them.

Also, writing a performant Channel implementation for multiprocessing would hardly be a trivial job; it seems primitives don't make it into multiprocessing without first existing in threading.py. So all this suggests to me that there is no great harm in adding threading.Channel and it might open up some interesting new approaches to synchronization. That said, it certainly isn't a panacea; e.g.
some Go examples written using Channels are better done with coroutines instead of threads in Python. (IIUC Go intentionally blurs the difference, but that's not given to Python.)

> On the other hand, I can't imagine ever needing to
> use a channel object like this, so it would probably be worth having
> some real-world use cases to justify it.

I think Matt Joiner's original post hinted at some. Matt, could you elaborate? We may be only an inch away from getting this into the stdlib...

--
--Guido van Rossum (python.org/~guido)

From sturla at molden.no  Sun Feb 19 21:23:03 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 21:23:03 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To:
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no>
Message-ID: <4F415A27.5060605@molden.no>

Den 19.02.2012 20:04, skrev Guido van Rossum:
> Also, writing a performant Channel implementation for multiprocessing
> would hardly be a trivial job;

I think this should work :-)

Sturla

from multiprocessing import Lock, Event, Pipe

class Channel(object):

    def __init__(self):
        self._writelock = Lock()
        self._readlock = Lock()
        self._new_data = Event()
        self._recv_data = Event()
        self._conn1, self._conn2 = Pipe(False)

    def put(self, msg):
        with self._writelock:
            self._conn2.send(msg)
            self._new_data.set()
            self._recv_data.wait()
            self._recv_data.clear()

    def get(self):
        with self._readlock:
            self._new_data.wait()
            msg = self._conn1.recv()
            self._new_data.clear()
            self._recv_data.set()
            return msg

## -------------

def proc2(channel):
    from sys import stdout
    for i in range(1000):
        msg = channel.get()
        stdout.flush()
        print "Process 2 received '%s'\n" % msg,
        stdout.flush()

def proc1(channel):
    from sys import stdout
    for i in range(1000):
        stdout.flush()
        print "Process 1 preparing to send 'message %d'\n" % i,
        stdout.flush()
        msg = channel.put(("message %d" % i,))
        stdout.flush()
        print "Process 1 finished sending 'message %d'\n" % i,
        stdout.flush()

if __name__ == "__main__":
    from multiprocessing import Process
    channel = Channel()
    p2 = Process(target=proc2, args=(channel,))
    p2.start()
    proc1(channel)
    p2.join()

From sturla at molden.no  Sun Feb 19 21:51:03 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 21:51:03 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F415A27.5060605@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no> <4F415A27.5060605@molden.no>
Message-ID: <4F4160B7.9030409@molden.no>

If someone wants to write a PEP for Go'ish "channels" (or whatever), put it on the bug tracker; here is the example implementation (with a lock around stdout in the example code, stupid me...) It would still need a timeout argument. A unittest could check the output of the test code as it is known in advance.

I don't really see the usefulness of a "channel" primitive; for those who do, here is my contribution. I have only tested on Win64.

Sturla
-------------- next part --------------
A non-text attachment was scrubbed...
Name: channels.zip
Type: application/x-zip-compressed
Size: 2407 bytes
Desc: not available
URL:

From sturla at molden.no  Sun Feb 19 22:29:23 2012
From: sturla at molden.no (Sturla Molden)
Date: Sun, 19 Feb 2012 22:29:23 +0100
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To:
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no>
Message-ID: <4F4169B3.6050101@molden.no>

Den 19.02.2012 20:04, skrev Guido van Rossum:
> I think Matt Joiner's original post hinted at some. Matt, could you
> elaborate? We may be only an inch away from getting this into the
> stdlib...

One thing I could think of, is "atomic messaging" with multiple producers or consumers talking on the same channel. E.g. while process A sends a message to process B, process C cannot write and process D cannot read. So you always get a 1 to 1 conversation. But I am not sure why (or if) Go has this mechanism.

On the other hand, if we put in N**2 pipes (or channels), we could achieve the same atomicity of transaction by having an index for sender and receiver of a message. This is what MPI does in the functions MPI_Send and MPI_Recv. But then I will be scolded for using too many semaphores on FreeBSD again :-(

But there are some other useful mechanisms from MPI (and ØMQ) to consider as well. For example message broadcasting, message scatter and gather, and reductions. The latter is a reduce operation (e.g. add or multiply) on messages coming in from multiple processes. OpenMP also has reductions in the API. So there is a lot to be considered in the area of concurrency if we want to put in more classes in threading and multiprocessing.
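The reduction described here, folding one partial result per worker into a single value with an operator, can be sketched with an ordinary queue; the helper names below are hypothetical and this is not the MPI or OpenMP API:

```python
import operator
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

def parallel_reduce(partials, nworkers, op=operator.add):
    # Fold one partial result per worker into a single value; this is
    # the queue-based analogue of what MPI_Reduce does at the root rank.
    total = partials.get()
    for _ in range(nworkers - 1):
        total = op(total, partials.get())
    return total

def worker(rank, nworkers, data, partials):
    # Each worker reduces its own strided slice locally, then ships one number.
    partials.put(sum(data[rank::nworkers]))

data = list(range(100))
nworkers = 4
partials = queue.Queue()
threads = [threading.Thread(target=worker, args=(r, nworkers, data, partials))
           for r in range(nworkers)]
for t in threads:
    t.start()
for t in threads:
    t.join()
result = parallel_reduce(partials, nworkers)   # sums to 4950
```

Because addition is commutative and associative, the order in which partial results arrive on the queue does not matter; a non-commutative op would need the rank attached to each partial.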
But now I'll stop before someone tells me to take this to the concurrency list :-) Sturla From solipsis at pitrou.net Mon Feb 20 00:26:15 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 Feb 2012 00:26:15 +0100 Subject: [Python-ideas] channel (synchronous queue) References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no> <4F4169B3.6050101@molden.no> Message-ID: <20120220002615.46c57b2a@pitrou.net> On Sun, 19 Feb 2012 22:29:23 +0100 Sturla Molden wrote: > Den 19.02.2012 20:04, skrev Guido van Rossum: > > I think Matt Joiner's original post hinted at some. Matt, could you > > elaborate? We may be only an inch away from getting this into the > > stdlib... > > One thing I could think of, is "atomic messaging" with multiple > producers or consumers talking on the same channel. E.g. while process A > sends a message to process B, process C cannot write and process D > cannot read. So you always get a 1 to 1 conversation. What would be the point exactly? Regards Antoine. 
From massimo.dipierro at gmail.com  Mon Feb 20 00:38:41 2012
From: massimo.dipierro at gmail.com (Massimo Di Pierro)
Date: Sun, 19 Feb 2012 17:38:41 -0600
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F4169B3.6050101@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no> <4F4169B3.6050101@molden.no>
Message-ID: <569B5ADF-FF08-4604-9A0B-60185F7543BA@gmail.com>

On Feb 19, 2012, at 3:29 PM, Sturla Molden wrote:
> On the other hand, if we put in N**2 pipes (or channels), we could achieve the same atomicity of transaction by having an index for sender and receiver of a message. This is what MPI does in the functions MPI_Send and MPI_Recv. But then I will be scolded for using too many semaphores on FreeBSD again :-(

I like this a lot. Below is some toy code I use in my parallel algorithms class (I removed the global communications broadcast, scatter, gather, reduce and I removed logging, network topology constraints, and checks).

import os
import string
import cPickle

class PSim(object):
    def __init__(self, p):
        """
        forks p-1 processes and creates p*p pipes
        """
        self.nprocs = p
        self.pipes = {}
        for i in range(p):
            for j in range(p):
                self.pipes[i,j] = os.pipe()
        self.rank = 0
        for i in range(1,p):
            if not os.fork():
                self.rank = i
                break

    def send(self, j, data):
        s = cPickle.dumps(data)
        os.write(self.pipes[self.rank,j][1], string.zfill(str(len(s)),10))
        os.write(self.pipes[self.rank,j][1], s)

    def recv(self, j):
        size = int(os.read(self.pipes[j,self.rank][0],10))
        s = os.read(self.pipes[j,self.rank][0],size)
        data = cPickle.loads(s)
        return data

if __name__ == '__main__':
    comm = PSim(2)
    if comm.rank == 0:
        comm.send(1, 'hello world')
    else:
        print comm.recv(0)

It would be very useful to have something like these channels built-in.
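The 10-digit length header in PSim's send()/recv() is a general framing technique; here is a standalone sketch with hypothetical helper names, written version-agnostically rather than in PSim's Python-2 style, and looping to handle the short reads and writes that os.read()/os.write() are allowed to return:

```python
import os
import pickle

def _write_all(fd, data):
    # os.write() may write fewer bytes than asked once the pipe buffer
    # fills up, so loop until everything has been flushed.
    while data:
        n = os.write(fd, data)
        data = data[n:]

def _read_exact(fd, n):
    # os.read() may likewise return short, so loop until n bytes arrive.
    chunks = []
    while n:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("pipe closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def send_msg(wfd, obj):
    payload = pickle.dumps(obj)
    _write_all(wfd, str(len(payload)).zfill(10).encode("ascii"))
    _write_all(wfd, payload)

def recv_msg(rfd):
    size = int(_read_exact(rfd, 10).decode("ascii"))
    return pickle.loads(_read_exact(rfd, size))
```

As the next paragraph points out, a sender can still block once the message exceeds the OS pipe buffer, so in a single process the receiver has to drain the pipe concurrently for large payloads.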
Notice that using OS pipes has the problem of an OS-dependent size: send is non-blocking for small data sizes but becomes blocking for large data sizes. Using OS mkfifo or a multiprocessing Queue is better, but the OS limits the number of files open by one program.

From anacrolix at gmail.com  Mon Feb 20 01:34:15 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Mon, 20 Feb 2012 08:34:15 +0800
Subject: [Python-ideas] channel (synchronous queue)
In-Reply-To: <4F4138FB.8010103@molden.no>
References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no>
Message-ID:

Your implementation is incomplete.

On Feb 20, 2012 2:01 AM, "Sturla Molden" wrote:
> Den 19.02.2012 18:44, skrev Guido van Rossum:
>> I may be taking this out of context, but I have a really hard time
>> understanding what you were trying to say. What does it mean for a
>> complaint to be simple? Did you leave out a word in haste? (I know that
>> happens a lot to me. :-)
>
> Sorry for the rude language. I meant I think it is a problem that does not
> belong in the standard library, but perhaps in a cookbook. It is ~20 lines
> of trivial code with objects already in the standard library. Well, one
> could say the same thing about a queue too (it's just deque and a lock),
> but it is very useful and commonly used, so there is a difference.
>
> Sturla
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From sturla at molden.no Mon Feb 20 01:40:28 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 20 Feb 2012 01:40:28 +0100 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: <569B5ADF-FF08-4604-9A0B-60185F7543BA@gmail.com> References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no> <4F4169B3.6050101@molden.no> <569B5ADF-FF08-4604-9A0B-60185F7543BA@gmail.com> Message-ID: <4F41967C.90509@molden.no> Den 20.02.2012 00:38, skrev Massimo Di Pierro: > It would be very useful to have something like these channels > built-in. Notice that using OS pipes have the problem of a OS > dependent size. send is non-blocking for small data-size but becomes > blocking for large data sizes. Using OS mkfifo or multiprocessing > Queue is better but the OS limits the number of files open by one > program. Most MPI implementations use shared memory on localhost. In theory one could implement a queue (deque and lock) using a shared memory region (a file on /tmp or Windows equivalent). It would be extremely fast and could contain any number of "pipes" of arbitrary size. Sturla From sturla at molden.no Mon Feb 20 01:43:47 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 20 Feb 2012 01:43:47 +0100 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no> Message-ID: <4F419743.5040205@molden.no> Den 20.02.2012 01:34, skrev Matt Joiner: > > Your implementation is incomplete. > It does the basic communication you asked for. 
I know it is a featureless proof-of-concept, why don't you fill in the rest? (I don't really care.) Sturla From massimo.dipierro at gmail.com Mon Feb 20 01:44:38 2012 From: massimo.dipierro at gmail.com (Massimo Di Pierro) Date: Sun, 19 Feb 2012 18:44:38 -0600 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: <4F41967C.90509@molden.no> References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no> <4F4169B3.6050101@molden.no> <569B5ADF-FF08-4604-9A0B-60185F7543BA@gmail.com> <4F41967C.90509@molden.no> Message-ID: +1 On Feb 19, 2012, at 6:40 PM, Sturla Molden wrote: > Den 20.02.2012 00:38, skrev Massimo Di Pierro: >> It would be very useful to have something like these channels built-in. Notice that using OS pipes have the problem of a OS dependent size. send is non-blocking for small data-size but becomes blocking for large data sizes. Using OS mkfifo or multiprocessing Queue is better but the OS limits the number of files open by one program. > > Most MPI implementations use shared memory on localhost. In theory one could implement a queue (deque and lock) using a shared memory region (a file on /tmp or Windows equivalent). It would be extremely fast and could contain any number of "pipes" of arbitrary size. 
> > Sturla > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From anacrolix at gmail.com Mon Feb 20 01:57:41 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Mon, 20 Feb 2012 08:57:41 +0800 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> <4F4138FB.8010103@molden.no> <4F4169B3.6050101@molden.no> <569B5ADF-FF08-4604-9A0B-60185F7543BA@gmail.com> <4F41967C.90509@molden.no> Message-ID: I've created http://bugs.python.org/issue14059 for the multiprocessing.Barrier. I suggest a new thread be started to continue discussion on that. On Mon, Feb 20, 2012 at 8:44 AM, Massimo Di Pierro wrote: > +1 > > On Feb 19, 2012, at 6:40 PM, Sturla Molden wrote: > >> Den 20.02.2012 00:38, skrev Massimo Di Pierro: >>> It would be very useful to have something like these channels built-in. Notice that using OS pipes have the problem of a OS dependent size. send is non-blocking for small data-size but becomes blocking for large data sizes. Using OS mkfifo or multiprocessing Queue is better but the OS limits the number of files open by one program. >> >> Most MPI implementations use shared memory on localhost. In theory one could implement a queue (deque and lock) using a shared memory region (a file on /tmp or Windows equivalent). It would be extremely fast and could contain any number of "pipes" of arbitrary size. 
>> >> Sturla >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From anacrolix at gmail.com Mon Feb 20 01:59:26 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Mon, 20 Feb 2012 08:59:26 +0800 Subject: [Python-ideas] channel (synchronous queue) In-Reply-To: References: <4F404968.5070000@molden.no> <20120219100512.GA31747@cskk.homeip.net> <4F411C0F.6030008@molden.no> <20120219170103.17f5a8b9@pitrou.net> <4F412A4D.4090102@molden.no> <20120219181852.1bce007c@pitrou.net> <4F4130FB.4040204@molden.no> <4F413322.2070703@molden.no> Message-ID: I've created http://bugs.python.org/issue14060 for the possibility of a channel implementation. On Mon, Feb 20, 2012 at 1:44 AM, Guido van Rossum wrote: > On Sun, Feb 19, 2012 at 9:36 AM, Sturla Molden wrote: >> Den 19.02.2012 18:27, skrev Sturla Molden: >> >>> Den 19.02.2012 18:18, skrev Antoine Pitrou: >>>> >>>> This begs the question: what does it achieve? You know that the data has >>>> been "received" on the other side (i.e. get() has been called), but this >>>> doesn't tell you anything was done with the data, so: why is this an useful >>>> way to synchronize? >>> >>> >>> I think it achieves nothing, except making deadlocks more likely. >> >> >> Which is to say, I just wanted to prove how ridiculously simple Matt >> Joiner's complaint about a "channel" was. > > I may be taking this out of context, but I have a really hard time > understanding what you were trying to say. What does it mean for a > complaint to be simple? Did you leave out a word in haste? (I know > that happens a lot to me. :-) > >> The multiprocessing barrier on the other hand is quite useful. (Though the >> butterfly method is not the most efficient implementation of a barrier.) 
> > Glad to see some real code. It's probably time to move the code
> > samples to the bug tracker where they can be reviewed and have a
> > chance of getting incorporated into the next release.
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From techtonik at gmail.com  Mon Feb 20 10:47:17 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 20 Feb 2012 12:47:17 +0300
Subject: [Python-ideas] sys.path is a hack - bringing it back under control
Message-ID:

Hi,

I often find this in my scripts/projects that I run directly from checkout:

DEVPATH = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, DEVPATH)

This seems like a hack to me, because the process of sys.path modification is completely out of control for a Python application developer, which means it is easy to break an application and get lost. I don't remember the exact user story for that bad association with sys.path (perhaps Django issue #1908), but something makes me feel that I am not alone:
http://stackoverflow.com/questions/5500736/troubleshooting-python-sys-path

What I'd like to propose is some control/info over what modified sys.path. The simplest case:

1. make sys.path a list of pairs (path, file-that-added-the-path)
2. make sys.path read-only
3. add sys.path.add() method for modification
4. logger for sys.path.add() events (or recipe how to implement it in documentation)

This will help a lot. Limiting sys.path may cause a loss of some functionality if you need to remove some or replace it completely, but I don't know where the ability to reset() sys.path can be useful.
--
anatoly t.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From simon.sapin at kozea.fr  Mon Feb 20 11:06:34 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Mon, 20 Feb 2012 11:06:34 +0100
Subject: [Python-ideas] sys.path is a hack - bringing it back under control
In-Reply-To:
References:
Message-ID: <4F421B2A.6090100@kozea.fr>

Le 20/02/2012 10:47, anatoly techtonik a écrit :
>
> I often find this in my scripts/projects that I run directly from
> checkout:
>
> DEVPATH = os.path.dirname(os.path.abspath(__file__))
> sys.path.insert(0, DEVPATH)

Hi,

You shouldn't have to do that if you're running 'python something.py'

> As initialized upon program startup, the first item of this list,
> path[0], is the directory containing the script that was used to
> invoke the Python interpreter. If the script directory is not
> available (e.g. if the interpreter is invoked interactively or if the
> script is read from standard input), path[0] is the empty string,
> which directs Python to search modules in the current directory
> first.

http://docs.python.org/py3k/library/sys.html#sys.path

The trick is to place the script in the directory that you want in the path, i.e. next to top-level packages. But from your code above this seems to be the case already...

Regards,
--
Simon Sapin

From techtonik at gmail.com  Mon Feb 20 11:31:21 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 20 Feb 2012 13:31:21 +0300
Subject: [Python-ideas] sys.path is a hack - bringing it back under control
In-Reply-To: <4F421B2A.6090100@kozea.fr>
References: <4F421B2A.6090100@kozea.fr>
Message-ID:

On Mon, Feb 20, 2012 at 1:06 PM, Simon Sapin wrote:
>> I often find this in my scripts/projects that I run directly from
>> checkout:
>>
>> DEVPATH = os.path.dirname(os.path.abspath(__file__))
>> sys.path.insert(0, DEVPATH)
>
> You shouldn't have to do that if you're running 'python something.py'

But I did for some reason, and right now I can't even say if it was Windows, Linux, FreeBSD, PyPy, IPython, gdb or debugging from IDE.
As initialized upon program startup, the first item of this list, >> path[0], is the directory containing the script that was used to >> invoke the Python interpreter. If the script directory is not >> available (e.g. if the interpreter is invoked interactively or if the >> script is read from standard input), path[0] is the empty string, >> which directs Python to search modules in the current directory >> first. >> > > http://docs.python.org/py3k/**library/sys.html#sys.path > > The trick is to place the script in the directory that you want in the > path, ie. next to top-level packages. But from your code above this seems > to be the case already... > s/trick/hack/ and it will be just what I am saying. Not many Python projects use this structure. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Feb 20 11:38:58 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 Feb 2012 20:38:58 +1000 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 7:47 PM, anatoly techtonik wrote: > Hi, > > I often find this in my scripts/projects, that I run directly from checkout: > > DEVPATH = os.path.dirname(os.path.abspath(__file__)) > sys.path.insert(0, DEVPATH) PEP 395 describes my current plan to fix sys.path initialisation (however, I can't yet promise that it will make it into 3.3, since it doesn't even have a reference implementation yet, and I have several other things I want to get done first). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From luoyonggang at gmail.com Mon Feb 20 13:19:33 2012 From: luoyonggang at gmail.com (=?UTF-8?B?572X5YuH5YiaKFlvbmdnYW5nIEx1bykg?=) Date: Mon, 20 Feb 2012 20:19:33 +0800 Subject: [Python-ideas] Lack a API WCHAR* PyUnicode_WCHAR_DATA(PyObject *o), This is important for OS dependent feature port. 
In-Reply-To: <20120219192921.0e366a41@pitrou.net> References: <20120219192921.0e366a41@pitrou.net> Message-ID: 2012/2/20 Antoine Pitrou > > > Lack a API WCHAR* PyUnicode_WCHAR_DATA(PyObject *o), This is > > important for OS dependent feature port. > > Why can't you use the existing wchar_t functions: > http://docs.python.org/dev/c-api/unicode.html#wchar-t-support > ? > > Thanks, got it:). > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Yours sincerely, Yonggang Luo -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Mon Feb 20 14:11:20 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 20 Feb 2012 16:11:20 +0300 Subject: [Python-ideas] Personal Project Roadmap (Was: sys.path is a hack - bringing it back under control) Message-ID: On Mon, Feb 20, 2012 at 1:38 PM, Nick Coghlan wrote: > (however, I can't yet promise that it will make it into 3.3, since it > doesn't even have a reference implementation yet, and I have several > other things I want to get done first). > I think that the idea of a personal project roadmap would rock. If I'd like something to be done faster, I could look at these "other things" to see if I can help with some of them. In addition I could copy some stuff to my own list to say that I am also interested. Once the item reaches the top in somebody's list (or a critical mass is reached there), he opens a hangout with other people or schedules a time for discussion. The login method is a Python account. Items are either bugs from trackers or short inline notes in a tree-like structure. Will it improve the Python development process? -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From techtonik at gmail.com Mon Feb 20 14:18:22 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 20 Feb 2012 16:18:22 +0300 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 1:38 PM, Nick Coghlan wrote: > On Mon, Feb 20, 2012 at 7:47 PM, anatoly techtonik > wrote: > > Hi, > > > > I often find this in my scripts/projects, that I run directly from > checkout: > > > > DEVPATH = os.path.dirname(os.path.abspath(__file__)) > > sys.path.insert(0, DEVPATH) > > PEP 395 describes my current plan to fix sys.path initialisation > (however, I can't yet promise that it will make it into 3.3, since it > doesn't even have a reference implementation yet, and I have several > other things I want to get done first). > tl;dr :( The abstract doesn't give any valuable info. "proposes new mechanisms to eliminate some longstanding traps" doesn't say anything. Which mechanisms? What traps? I see there a mention of my problem with Django. How can it help to debug other sys.path problems? Do I really have to read a 15-page document to understand that? -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Feb 20 14:58:04 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 Feb 2012 23:58:04 +1000 Subject: [Python-ideas] Personal Project Roadmap (Was: sys.path is a hack - bringing it back under control) In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 11:11 PM, anatoly techtonik wrote: > I think that the idea of a personal project roadmap would rock. > If I'd like something to be done faster, I could look at these "other > things" to see if I can help with some of them. In addition I could copy > some stuff to my own list to say that I am also interested.
Once the item > reaches the top in somebody's list (or there a critical mass is reached), he > opens a hangout with other people or schedules a time for discussion. > > The login method is Python account. Items are either bugs from trackers or > short inline notes in a tree-like structure. > > Will it improve the Python development process? My own time spent on Python things certainly isn't that organised. I'll have a couple of items on the "do this next" list (e.g. PEP 394 was at the top of my list recently, and getting PEP 409 finalised now occupies that spot). However, I may switch to other things based on external events (e.g. the email I just sent proposing acceptance of PEP 3144 was based on Georg posting Peter's latest draft, the PEP 408 discussions a short while back that were prompted by Eli following up on article I'd written some time ago with a full PEP), or because I want to get them done while they're clear in my mind (e.g. the time I spent last weekend writing up my summary of the text file processing in Python 3 Unicode discussion was time that I had previously planned to spend working on either PEP 394, which had already been resolved by then, or on PEP 409). There's a few other things that I'd like to get up on PyPI soon (especially contextlib2.CallbackStack) so people can tinker with them for a few months before the first 3.3 beta, which means setting up CI for contextlib2 before I cut a new release. I also had an illuminating off-list discussion with the PEP 407 authors and the 3.4 RM that I want to write up as a new PEP before the language summit in a few week's time (even though I won't be there in person). Other things (like revamping the sequence docs to bring them into the modern Python era or fixing CPython's longstanding operand precedence bug for sequences implemented in C) have been postponed until after more of the API related changes are done. 
Then there's a whole cloud of "other things to do" (such as all the bugs I'm nosy on on the tracker, all the issues I created because I wanted to remember them but didn't have time to address immediately myself, my perennial efforts to try to make callback-based programming in Python feel less forced and awkward) that may attract my interest at any given point in time. A reference implementation for PEP 395 is definitely in the mix of things I want to get done, but I'm happy to postpone even thinking particularly hard about it until after the importlib bootstrapping effort (which appears to be progressing well) is complete. And all that's without even considering that I'm doing almost everything Python related in personal time rather than work time, so there's plenty of scope for life to intervene with higher priority interrupts :) My impression is that the other core devs work in a similar fashion - our personal Python to-do lists are vague, nebulous things, not well-formed long-term plans (except in particular cases, like specific PEPs we're working on). In important ways, Greg Kroah-Hartman's recent description of Linux kernel development applies to CPython, too: "We always say that Linux kernel development is 'evolution, not intelligent design,' in that solutions are found to problems as they come up, so making forecasts as to what is going to happen in the future is always quite difficult,". In the CPython case, it's a matter of solutions generally being achievable with the language *as it already exists* - the proposed changes are mostly just ways of reducing external dependencies, or allowing developers to achieve the same results while writing less code of their own. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ncoghlan at gmail.com Mon Feb 20 15:16:30 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 00:16:30 +1000 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 11:18 PM, anatoly techtonik wrote: > The abstract doesn't give any valuable info. "proposes new mechanisms to > eliminate some longstanding traps" doesn't say anything. Which mechanssms? > What traps? I see there a mention of my problem with Django. How can it help > to debug other sys.path problems? Do I really have to read 15 page document > to understand that? Perhaps you could try reading the Table of Contents, too. (Hopefully you don't find it too long - it's 20+ lines) Or else you may want to refrain from participating in language discussions if you aren't interested in understanding the topic in depth. No serious design discussion can possibly be held amongst people that are only willing to read a PEP abstract rather than the full PEP. (But then, it's been suggested many times in the past that you may get better responses if you don't make a habit of effectively calling the current core developers a bunch of incompetent idiots, and that doesn't appear to have had the slightest effect on your style of communication. Why should this be any different?). Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From julien at tayon.net Mon Feb 20 15:28:57 2012 From: julien at tayon.net (julien tayon) Date: Mon, 20 Feb 2012 15:28:57 +0100 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: Message-ID: Hello, I am a bit confused. Tests are best not located amongst code, but in a sub directory. I was strongly stated on #python to use unittest(2) or nose in order not to use the path hacks. 
So did Stack Overflow, when I googled it: http://stackoverflow.com/questions/61151/where-do-the-python-unit-tests-go Okay, unittest tries relative import by adding a dot in front of the name, but in the fallback, in the end, does it not use the sys.path hack? (my eyes may be old, my brain damaged by alcohol, but it looks very much this way). It looks like hiding dust under the carpet, and stating that by tabooing sys.path hack use and locating it in a very savant module, the problem gets solved. I am, honestly, just very candid on this one, and pretty puzzled. The sys.path hack looks to me a lot like coupling between classes, global variables, gotos, and other beasts. They may be needed and yet powerful; therefore great wisdom is needed to handle them carefully. PS (not sure I am 100% serious after this point) Why not create a module like __use_wisefully__ then? People would be warned by being compelled to write: from __use_wisefully__ import sys.path ? Cheers, -- Jul From techtonik at gmail.com Mon Feb 20 16:14:25 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 20 Feb 2012 18:14:25 +0300 Subject: [Python-ideas] Personal Project Roadmap (Was: sys.path is a hack - bringing it back under control) In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 4:58 PM, Nick Coghlan wrote: > On Mon, Feb 20, 2012 at 11:11 PM, anatoly techtonik > wrote: > > I think that the idea of a personal project roadmap would rock. > > If I'd like something to be done faster, I could look at these "other > > things" to see if I can help with some of them. In addition I could copy > > some stuff to my own list to say that I am also interested. Once the item > > reaches the top in somebody's list (or a critical mass is > reached there), he > > opens a hangout with other people or schedules a time for discussion. > > > > The login method is a Python account. Items are either bugs from trackers > or > > short inline notes in a tree-like structure.
> > > > Will it improve the Python development process? > > My own time spent on Python things certainly isn't that organised. > I'll have a couple of items on the "do this next" list (e.g. PEP 394 > was at the top of my list recently, and getting PEP 409 finalised now > occupies that spot). However, I may switch to other things based on > external events (e.g. the email I just sent proposing acceptance of > PEP 3144 was based on Georg posting Peter's latest draft, the PEP 408 > discussions a short while back that were prompted by Eli following up > on article I'd written some time ago with a full PEP), or because I > want to get them done while they're clear in my mind (e.g. the time I > spent last weekend writing up my summary of the text file processing > in Python 3 Unicode discussion was time that I had previously planned > to spend working on either PEP 394, which had already been resolved by > then, or on PEP 409). > > There's a few other things that I'd like to get up on PyPI soon > (especially contextlib2.CallbackStack) so people can tinker with them > for a few months before the first 3.3 beta, which means setting up CI > for contextlib2 before I cut a new release. I also had an illuminating > off-list discussion with the PEP 407 authors and the 3.4 RM that I > want to write up as a new PEP before the language summit in a few > week's time (even though I won't be there in person). Other things > (like revamping the sequence docs to bring them into the modern Python > era or fixing CPython's longstanding operand precedence bug for > sequences implemented in C) have been postponed until after more of > the API related changes are done. 
> > Then there's a whole cloud of "other things to do" (such as all the > bugs I'm nosy on on the tracker, all the issues I created because I > wanted to remember them but didn't have time to address immediately > myself, my perennial efforts to try to make callback-based programming > in Python feel less forced and awkward) that may attract my interest > at any given point in time. > > A reference implementation for PEP 395 is definitely in the mix of > things I want to get done, but I'm happy to postpone even thinking > particularly hard about it until after the importlib bootstrapping > effort (which appears to be progressing well) is complete. > > And all that's without even considering that I'm doing almost > everything Python related in personal time rather than work time, so > there's plenty of scope for life to intervene with higher priority > interrupts :) > > My impression is that the other core devs work in a similar fashion - > our personal Python to-do lists are vague, nebulous things, not > well-formed long-term plans (except in particular cases, like specific > PEPs we're working on). > > In important ways, Greg Kroah-Hartman's recent description of Linux > kernel development applies to CPython, too: "We always say that Linux > kernel development is 'evolution, not intelligent design,' in that > solutions are found to problems as they come up, so making forecasts > as to what is going to happen in the future is always quite > difficult,". In the CPython case, it's a matter of solutions generally > being achievable with the language *as it already exists* - the > proposed changes are mostly just ways of reducing external > dependencies, or allowing developers to achieve the same results while > writing less code of their own. > That's a good insight. That is an interesting info for new people, because if gives a picture which ideas can be interesting to tackle, because they have more contributors to help. -- anatoly t. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Feb 20 16:29:04 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 Feb 2012 16:29:04 +0100 Subject: [Python-ideas] Personal Project Roadmap (Was: sys.path is a hack - bringing it back under control) References: Message-ID: <20120220162904.57eb2dfb@pitrou.net> On Mon, 20 Feb 2012 23:58:04 +1000 Nick Coghlan wrote: > > My impression is that the other core devs work in a similar fashion - > our personal Python to-do lists are vague, nebulous things, not > well-formed long-term plans (except in particular cases, like specific > PEPs we're working on). Agreed. cheers Antoine. From techtonik at gmail.com Mon Feb 20 18:39:48 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 20 Feb 2012 20:39:48 +0300 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: Message-ID: On Mon, Feb 20, 2012 at 5:16 PM, Nick Coghlan wrote: > On Mon, Feb 20, 2012 at 11:18 PM, anatoly techtonik > wrote: > > The abstract doesn't give any valuable info. "proposes new mechanisms to > > eliminate some longstanding traps" doesn't say anything. Which > mechanssms? > > What traps? I see there a mention of my problem with Django. How can it > help > > to debug other sys.path problems? Do I really have to read 15 page > document > > to understand that? > > Perhaps you could try reading the Table of Contents, too. (Hopefully > you don't find it too long - it's 20+ lines) > I've read the ToC, but which of these parts answers the question: "How to make debugging sys.path problems easier?" Abstract Relationship with Other PEPs What's in a __name__? Traps for the Unwary Why are my imports broken? Importing the main module twice In a bit of a pickle Where's the source? 
Forkless Windows Qualified Names for Modules Alternative Names Eliminating the Traps Fixing main module imports inside packages Optional addition: command line relative imports Compatibility with PEP 382 Incompatibility with PEP 402 Potential incompatibilities with scripts stored in packages Fixing dual imports of the main module Fixing pickling without breaking introspection Fixing multiprocessing on Windows Explicit relative imports > Or else you may want to refrain from participating in language > discussions if you aren't interested in understanding the topic in > depth. No serious design discussion can possibly be held amongst > people that are only willing to read a PEP abstract rather than the > full PEP. I didn't want to offend anybody by giving an impression that what you're doing is not important. I realize that there are papers that people need to read, especially who are willing to participate in ideas discussion, but the point is that I'd like to have a simple answer for a simple proposal. I read the proposal. In the following order: PEP-0395: Abstract PEP-3155: Rationale (skimmed) PEP-3155: Proposal (reread several times, a lot of questions) PEP-3155: Discussion (skim, got a feeling that there should be a link to the actual discussion) PEP-3155: Naming choice PEP-3155: References (is still not clear what is `qualified name`) http://en.wikipedia.org/wiki/QName http://translate.google.com/#auto|ru|qualified%20name (got translation that it is 'full name' - that makes sense) PEP-3155: Naming choice (all right, the more intuitive 'full name' and 'path' are not really 'full name' and filesystem path, so the name is different) PEP-0395: Contents PEP-0395: Qualifed Names for Modules (started - "To make it feasible to fix these problems once and for all, it is proposed to add a new module level attribute: __qualname__" - which problems?) 
PEP-0395: Traps for the Unwary ("The overloading of the semantics of __name__, along with some historically associated behaviour in the initialisation of sys.path[0], has resulted in several traps for the unwary" - damn, how is this gonna help to debug sys.path problems? gave up, wrote a sad tl;dr smile) Now I hope it gives an overview what difficulties a person who is out-of-context has while trying to solve one tiny user story of debugging sys.path. I just want everything to be as much simplified as possible, possibly killing the fun for prose readers. Maybe I don't really want to think about complex PEP matters, because the idea is just an episode in the daily workflow. I'd also really prefer to keep complicated matters (e.g. discussions) around tiny user stories, that don't require much time to load into the brain and you can only concentrate on two or three of them that are conflicting. Proposal to read 15 page technical paper doesn't work well with this scenario, so if you just said - "Yes. You have to read that.", I'd reply "Well, ok. Next time then.". (But then, it's been suggested many times in the past that > you may get better responses if you don't make a habit of effectively > calling the current core developers a bunch of incompetent idiots, and > that doesn't appear to have had the slightest effect on your style of > communication. Why should this be any different?). > I am not an English writer, but I am interested to know where did this impression of me calling core developers a bunch of incompetent idiots is coming from. If anybody can quote concrete example and explain in private - I may have a chance to change something. My English is a result of learning legal and technical English texts, not love letters, and I may not possess the communication skills required to write proper letters in informal language (which also I prefer more than business stuff). 
I can write in third person without *you* or *I* other personal pronounce, but it takes more time to compose the proper form, so the note like this one can take an hour or more (it already took more), and time is that I really lack. Not me alone, though, but I may be too obsessed with saving someone else's time by placing too much attention to it, indeed. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Mon Feb 20 19:28:32 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 20 Feb 2012 13:28:32 -0500 Subject: [Python-ideas] doctest References: Message-ID: <20120220132832.76b772da@resist.wooz.org> On Feb 17, 2012, at 02:57 PM, Mark Janssen wrote: >I find myself wanting to use doctest for some test-driven development, >and find myself slightly frustrated and wonder if others would be >interested in seeing the following additional functionality in >doctest: FWIW, I think doctests are fantastic and I use them all the time. There are IMO a couple of things to keep in mind: - doctests are documentation first. Specifically, they are testable documentation. What better way to ensure that your documentation is accurate and up-to-date? (And no, I do not generally find skew between the code and the separate-file documentation.) - I personally dislike docstring doctests, and much prefer separate reST documents. These have several advantages, such as the ability to inject names into doctests globals (use with care though), and the ability to set up the execution context for doctests (see below). The fact that it's so easy to turn these into documentation with Sphinx is a huge win. Since so many people point this out, let me say that I completely agree that doctests are not a *replacement* for unittests, but they are a fantastic *complement* to unittests. 
When I TDD, I always start writing the (testable) documentation first, because if I cannot explain the component under test in clearly intelligible English, then I probably don't really understand what it is I'm trying to write. My doctests usually describe mostly the good path through the API. Occasionally I'll describe error modes if I think those are important for understanding how to use the code. However, for all those fuzzy corner cases, weird behaviors, bug fixes, etc., unittests are much better suited because ensuring you've fixed these problems and don't regress in the future doesn't help the narrative very much. >1. Execution context determined by outer-scope doctest defintions. Can you explain this one? For the separate-reST-document style I use, these are almost always driven by a test_documentation.py which ostensibly fits into the unittest framework. It searches for .rst files and builds up DocFileSuites around them. Using this style it is very easy to clean up resources, reset persistent state (e.g. reset the database after every doctest), call setUp and tearDown methods, and even correctly fiddle the __future__ state expected by doctests. I usually put all this in an additional_tests() method, such as: http://bazaar.launchpad.net/~barry/flufl.enum/trunk/view/head:/flufl/enum/tests/test_documentation.py So setting up context is as easy as writing a setUp() method and passing that to DocFileSuite. One thing that bums me out about this is that I haven't really made the bulk of additional_tests() very generic. I usually cargo cult most of this code into every package I write. :( >2. Smart Comparisons that will detect output of a non-ordered type >(dict/set), lift and recast it and do a real comparison. I'm of mixed mind with these. Yes, you must be careful with ordering, but I find it less readable to just sort() some dictionary output for example. 
What I've found much more useful is to iterate over the sorted keys of a dictionary and print the key/values pairs. This general pattern has a few advantages, such as the ability to add some filtering to the output if you don't care about everything, and more importantly, the ability to print most string values without their u'' prefix (for better py2/py3 compatibility from the same code base without the use of 2to3). Nested structures can be more problematic, but I've often found that as the output gets uglier, the narrative suffers, so that's a good time to re-evaluate your documentation! >Without #1, "literate testing" becomes awash with re-defining re-used >variables which, generally, also detracts from exact purpose of the >test -- this creates testdoc noise and the docs become less useful. >Without #2, "readable docs" nicely co-aligning with "testable docs" >tends towards divergence. > >Perhaps not enough developers use doctest to care, but I find it one >of the more enjoyable ways to develop python code -- I don't have to >remember test cases nor go through the trouble of setting up >unittests. AND, it encourages agile development. Another user wrote >a while back of even having a built-in test() method. Wouldn't that >really encourage agile developement? And you wouldn't have to muddy >up your code with "if __name__ == "__main__": import doctest, yadda >yadda". > >Anyway... of course patches welcome, yes... ;^) I've no doubt that doctests could be improved, but I actually find them quite usable as is, with just a little bit of glue code to get it all hooked up. As I say though, I'm biased against docstring doctests. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
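The sorted-keys pattern Barry describes — iterating over sorted keys and printing key/value pairs instead of echoing the dict itself — is what keeps doctest output deterministic across dict orderings. A minimal sketch (the `show` helper is hypothetical, just for illustration):

```python
import doctest


def show(mapping):
    """Print mapping items in sorted key order for stable doctest output.

    >>> show({'b': 2, 'a': 1, 'c': 3})
    a = 1
    b = 2
    c = 3
    """
    # sorted() removes the ordering dependency, so the expected
    # output above matches regardless of the dict's internal order.
    for key in sorted(mapping):
        print('%s = %s' % (key, mapping[key]))


if __name__ == '__main__':
    doctest.testmod()
```

As Barry notes, the same loop is also a convenient place to filter uninteresting keys or strip representation noise before it reaches the narrative.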
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From tjreedy at udel.edu Mon Feb 20 21:58:35 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 20 Feb 2012 15:58:35 -0500 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: Message-ID: On 2/20/2012 12:39 PM, anatoly techtonik wrote: > I am not an English writer, I am a native English (American) speaker/writer who can barely write in one other natural language (Spanish). So I have great sympathy for and appreciation for those who struggle with English. And I am willing to help those who wish to improve. > but I am interested to know where did this > impression of me calling core developers a bunch of incompetent idiots > is coming from. From the way you have written in the past. Nick may have been exaggerating a bit, but I have gotten similar impressions, though you have been writing better and more effectively recently. I think it was just a month ago that you were persuasive enough to get something added for 3.3. > If anybody can quote concrete example and explain in > private - I may have a chance to change something. Now that I know that it is not your intention to come across as antagonistic, I will try to do the above if I see bad examples in the future. > My English is a result of learning legal and technical English texts, The legal part in interesting. Legal English in adversarial situations is used to metaphorically club people -- or to confuse people. In reading the Argentina Python users list, I have noticed that conversational Spanish is not exactly the same as the formal Spanish I learned in classes some decades ago. But it is sometimes hard to know what is a sloppy error and what is an accepted idiom. 
-- Terry Jan Reedy From ncoghlan at gmail.com Mon Feb 20 23:44:11 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 08:44:11 +1000 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: Message-ID: On Tue, Feb 21, 2012 at 3:39 AM, anatoly techtonik wrote: > I've read the ToC, but which of these parts answers the question: "How to > make debugging sys.path problems easier?" Ah, my apologies. I must confess to having misread your original email (I paid too much attention to the first half, not enough to the latter). PEP 395 aims to avoid people feeling the need to mess with sys.path in the first place, thus reducing the likelihood of problems occurring at all. For *debugging* sys.path, as you say, the problem is figuring out who is messing it up after problems have already occurred and you've found strange entries in there. There's definitely a case to be made that sys.path should be a smarter kind of object by default, one that accepts callbacks to be triggered when modifications occur. (Such behaviour would be useful for updating namespace package __path__ attributes, for instance). However, for purely debugging purposes, it should be sufficient to monkeypatch sys.path with an object that writes any changes through to the original path, while overriding the various mutation methods to report the source of the modification (via stack introspection). It's the same kind of technique you can use to investigate faults in *any* kind of container. (Versions based on UserDict and UserList might be interesting Python Cookbook recipes). > Now I hope it gives an overview what difficulties a person who is
Maybe I don't really want to > think about complex PEP matters, because the idea is just an episode in the > daily workflow. I'd also really prefer to keep complicated matters (e.g. > discussions) around tiny user stories, that don't require much time to load > into the brain and you can only concentrate on two or three of them that are > conflicting. Proposal to read 15 page technical paper doesn't work well with > this scenario, so if you just said - "Yes. You have to read that.", I'd > reply "Well, ok. Next time then.". The fault was mine - I didn't understand your suggestion correctly, so I didn't realise that PEP 395 doesn't actually address it. >> (But then, it's been suggested many times in the past that >> you may get better responses if you don't make a habit of effectively >> calling the current core developers a bunch of incompetent idiots, and >> that doesn't appear to have had the slightest effect on your style of >> communication. Why should this be any different?). > > > I am not an English writer, but I am interested to know where did this > impression of me calling core developers a bunch of incompetent idiots is > coming from. If anybody can quote concrete example and explain in private - > I may have a chance to change something. My English is a result of > learning?legal and technical English texts, not love letters, and I may not > possess the communication skills required to write proper letters in > informal language (which also I prefer more than business stuff). I can > write in third person without *you* or *I* other personal pronounce, but it > takes more time to compose the proper form, so the note like this one can > take an hour or more (it already took more), and time is that I really lack. > Not me alone, though, but I may be too obsessed with saving someone else's > time by placing too much attention to it, indeed. Anatoly, thanks for taking the time to explain that. 
The impression comes from the fact that many of the things you object to within Python are largely a result of limited availability of development resources, so even things that are at least arguably good ideas simply don't get investigated. The universe of good ideas is vast, the universe of bad ideas is even larger, but the space we have the capacity to explore is actually relatively tiny. Since there's such an enormous number of things that *could* be done, the answer to "why aren't they done?" is almost always going to be "because people don't think they're important enough to do them instead of all the other things they're doing". Deciding how to spend our time on Python-related efforts is a matter of perceived priorities and potential payoffs and those are always going to vary substantially across individuals. Being more willing to accept that as a rationale for not doing things would go a long way towards reducing the negative reactions - I know mine arise not so much from your initial suggestions (which are often, although not always, quite reasonable ideas in a world where we had unlimited development resources), but from subsequently continuing to push them in the face of "because we're simply not interested in doing that" and "the status quo may not be perfect, but it's good enough that it isn't worth the hassle of changing" responses. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From victor.stinner at haypocalc.com Tue Feb 21 00:34:59 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 21 Feb 2012 00:34:59 +0100 Subject: [Python-ideas] Lack a API WCHAR* PyUnicode_WCHAR_DATA(PyObject *o), This is important for OS dependent feature port. In-Reply-To: References: <20120219192921.0e366a41@pitrou.net> Message-ID: It's not a lack, it's a design choice. Unicode strings are no longer stored as wchar_t* in Python 3.3, but in a compact representation (1, 2 or 4 bytes per character).
The conversion to wchar_t* requires a copy in most cases (no conversion is needed if the string already uses sizeof(wchar_t) bytes per character). From steve at pearwood.info Tue Feb 21 00:40:10 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 21 Feb 2012 10:40:10 +1100 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: <4F421B2A.6090100@kozea.fr> Message-ID: <4F42D9DA.9040502@pearwood.info> anatoly techtonik wrote: > On Mon, Feb 20, 2012 at 1:06 PM, Simon Sapin wrote: > >> I often find this in my scripts/projects, that I run directly from >>> checkout: >>> >>> DEVPATH = os.path.dirname(os.path.abspath(__file__)) >>> sys.path.insert(0, DEVPATH) >>> >>> >> You shouldn't have to do that if you're running 'python something.py' >> > > But I did for some reason, and right now I can't even say if it was > Windows, Linux, FreeBSD, PyPy, IPython, gdb or debugging from IDE. If you can't say why you did it, how can we judge whether you did it for a good reason or a bad reason? Having a user-accessible search path is not a hack, or if it is, it is a hack in the positive sense: a feature, not a design bug. The same concept is used by Unix tools, via the PATH environment variable. It is "the simplest thing that could possibly work" for solving the problem of configurable search paths. Personally, I don't believe sys.path needs to be brought back under control, because I don't believe it is out of control. Most code doesn't need to mess with the path; of that which does, most does not lead to problems. The only time I have seen path problems is when I have accidentally shadowed standard library modules, and they are simple to solve. Perhaps others have experienced harder problems, and if so, they have my sympathy, but I don't believe this is a problem so great that it needs to break backward compatibility.
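As an aside, the write-through wrapper suggested earlier in the thread for debugging sys.path changes can be sketched roughly as follows. This is an illustration only; the AuditedPath name and the report format are invented for the example and are not part of any real API:

```python
import sys
import traceback

class AuditedPath(list):
    """Write-through sys.path replacement that reports each mutation.

    Illustrative only: this class is a sketch, not a stdlib facility.
    """

    def _report(self, action, value):
        # With limit=3 the oldest of the captured frames is the code
        # that actually called the mutation method on the path.
        caller = traceback.extract_stack(limit=3)[0]
        print("sys.path.%s(%r) called from %s:%s"
              % (action, value, caller.filename, caller.lineno))

    def append(self, item):
        self._report("append", item)
        super().append(item)

    def insert(self, index, item):
        self._report("insert", item)
        super().insert(index, item)

# To investigate who is touching the path, install the wrapper early:
#     sys.path = AuditedPath(sys.path)
```

Remaining mutation methods (extend, __setitem__, remove, and so on) would be overridden the same way if a full audit were needed.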
I would say, though, that nearly every time I have changed sys.path, I would have been satisfied with some way of importing directly from a known location. import spam from 'this/is/a/relative/path' from spam import ham from '/and/this/is/an/absolute/path' sort of thing, although I can see that import...from and from...import are too similar for comfort. -- Steven From wuwei23 at gmail.com Tue Feb 21 04:37:55 2012 From: wuwei23 at gmail.com (alex23) Date: Mon, 20 Feb 2012 19:37:55 -0800 (PST) Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: References: Message-ID: <32d96ac7-487c-4428-8aa7-7c2d7f43b296@i10g2000pbc.googlegroups.com> On Feb 20, 11:18 pm, anatoly techtonik wrote: > tl;dr :( You're _constantly_ bemoaning the "obvious" lack of clear communication paths in the Python community, and yet when you're pointed to an _explicit piece of documentation that answers your concerns_ you can't even be bothered to read it. It's pretty damn "obvious" that your only real issue with communication is when it isn't being spoon fed to you. From ncoghlan at gmail.com Tue Feb 21 05:09:00 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 Feb 2012 14:09:00 +1000 Subject: [Python-ideas] sys.path is a hack - bringing it back under control In-Reply-To: <32d96ac7-487c-4428-8aa7-7c2d7f43b296@i10g2000pbc.googlegroups.com> References: <32d96ac7-487c-4428-8aa7-7c2d7f43b296@i10g2000pbc.googlegroups.com> Message-ID: On Tue, Feb 21, 2012 at 1:37 PM, alex23 wrote: > On Feb 20, 11:18 pm, anatoly techtonik wrote: >> tl;dr :( > > You're _constantly_ bemoaning the "obvious" lack of clear > communication paths in the Python community, and yet when you're > pointed to an _explicit piece of documentation that answers your > concerns_ you can't even be bothered to read it. > > It's pretty damn "obvious" that your only real issue with > communication is when it isn't being spoon fed to you.
In Anatoly's defence (and as he clarified in a later message), PEP 395 really *didn't* answer his question, and he had made his way through quite a bit of it (and PEP 3155, which it references) before giving up on trying to figure out how it was relevant - I had simply misunderstood the original email. After I *did* understand it, I pointed out that investigating unexpected or undesirable modifications to mutable containers, when the data changes alone aren't enough to pinpoint the culprit, is actually one of the valid use cases for monkeypatching rather than a reason to change the language behaviour. (That said, there are other, more valid, arguments in favour of providing a notification mechanism for sys.path changes, mainly relating to namespace packages. That's a different discussion, though, and one more appropriate for import-sig). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From dcolish at gmail.com Wed Feb 22 16:49:14 2012 From: dcolish at gmail.com (Dan Colish) Date: Wed, 22 Feb 2012 07:49:14 -0800 Subject: [Python-ideas] Make Difflib example callable as module __main__ Message-ID: <4F450E7A.4060804@gmail.com> Hey, I was reading over the difflib docs this morning and when I got to the bottom, I expected, probably due to lack of coffee, that the example would be callable as the module from the command line. There are already a number of modules which export command line functionality, ie. unittest, and I thought it would be great if difflib module offered the same. The code is pretty much there in the example from the documentation. It would just need to be included in the module itself.
--Dan From dreamingforward at gmail.com Wed Feb 22 20:47:57 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Wed, 22 Feb 2012 12:47:57 -0700 Subject: [Python-ideas] doctest (re-send to list) Message-ID: On Mon, Feb 20, 2012 at 11:28 AM, Barry Warsaw wrote: > On Feb 17, 2012, at 02:57 PM, Mark Janssen wrote: > FWIW, I think doctests are fantastic and I use them all the time. There are > IMO a couple of things to keep in mind: > > - doctests are documentation first. Specifically, they are testable > documentation. What better way to ensure that your documentation is > accurate and up-to-date? (And no, I do not generally find skew between the > code and the separate-file documentation.) > > - I personally dislike docstring doctests, and much prefer separate reST > documents. These have several advantages, such as the ability to inject > names into doctests globals (use with care though), and the ability to set > up the execution context for doctests (see below). The fact that it's so > easy to turn these into documentation with Sphinx is a huge win. > > Since so many people point this out, let me say that I completely agree that > doctests are not a *replacement* for unittests, but they are a fantastic > *complement* to unittests. When I TDD, I always start writing the > (testable) documentation first, because if I cannot explain the component > under test in clearly intelligible English, then I probably don't really > understand what it is I'm trying to write. > > My doctests usually describe mostly the good path through the API. > Occasionally I'll describe error modes if I think those are important for > understanding how to use the code. However, for all those fuzzy corner cases, > weird behaviors, bug fixes, etc., unittests are much better suited because > ensuring you've fixed these problems and don't regress in the future doesn't > help the narrative very much.
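The sorted-keys technique that comes up in this exchange (iterate over the sorted keys of a dictionary so the printed output is deterministic) can be sketched as a small doctest. The show helper is purely illustrative:

```python
def show(mapping):
    """Print a dict deterministically so doctest output is stable.

    >>> show({"banana": 2, "apple": 1})
    apple 1
    banana 2
    """
    # Dict iteration order is not something a doctest should rely on;
    # sorting the keys makes the printed output reproducible.
    for key in sorted(mapping):
        print(key, mapping[key])

if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

Running the module directly checks the docstring example against the actual output, which is the pattern most stdlib modules historically used for self-testing.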
I think this is an example of (mal)adapting to an incomplete module, rather than fixing it. I think doctest can handle all the points you're making. See clarification pointers below... >>1. Execution context determined by outer-scope doctest definitions. > Can you explain this one? I gave an example in a prior message on this thread, dated Feb 17. I think it's clear there but let me know. Basically, the idea is that since the class def can also have a docstring, where better would setup and teardown code go to provide the execution context of the inner method docstrings? Now the question: is it useful or appropriate to put setup and teardown code in a classdef docstring? Well, I think this requires a commitment on the part of the coder/documenter to concoct a useful (didactic) example that could go there. For example, (as in the prior-referenced message) I imagine putting an example of defining a variable of the class's type (">>> g = Graph({some complex, interesting initialization})"), which might return a (testable) value upon creation. Now this could, logically, be put in the class's __init__ method, but that doesn't make sense for defining an execution context, and *in addition*, that can be saved for those complex corner cases you mentioned earlier. > I usually put all this in an additional_tests() method, such as: Yes, I do the same for my modules with doctests. A dummy function which can catch all the non-interesting tests. This is still superior, in my opinion, to unittest. It is easier syntactically, as well as for casual users of your code (it has no learning curve like understanding unittest). This superiority to unittest, by the way, is only realized if the second suggestion (smart comparisons) is implemented into doctest. >>2. Smart Comparisons that will detect output of a non-ordered type >>(dict/set), lift and recast it and do a real comparison. > I'm of mixed mind with these.
Yes, you must be careful with ordering, but I > find it less readable to just sort() some dictionary output for example. What > I've found much more useful is to iterate over the sorted keys of a dictionary > and print the key/value pairs. Yes, but you see you're destroying the very intent and spirit of doctest. The point is to make literate documentation. If you adapt to its incompleteness, you reduce the power of it. >>Without #1, "literate testing" becomes awash with re-defining re-used >>variables which, generally, also detracts from exact purpose of the >>test -- this creates testdoc noise and the docs become less useful. >>Without #2, "readable docs" nicely co-aligning with "testable docs" >>tends towards divergence. > I've no doubt that doctests could be improved, but I actually find them quite > usable as is, with just a little bit of glue code to get it all hooked up. As > I say though, I'm biased against docstring doctests. Well, hopefully, I've convinced you a little that the limitations in doctests over unittests are due almost, if not entirely, to the incompleteness of the module. If the two items I mentioned were implemented I think it would be far superior to unittest. (Corner cases, etc. can all find a place, because every corner case should be documented somewhere anyway!!) Cheers!! mark santa fe, nm From tjreedy at udel.edu Wed Feb 22 22:40:32 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 22 Feb 2012 16:40:32 -0500 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F450E7A.4060804@gmail.com> References: <4F450E7A.4060804@gmail.com> Message-ID: On 2/22/2012 10:49 AM, Dan Colish wrote: > I was reading over the difflib docs this morning and when I got to the > bottom, I expected, probably due to lack of coffee, that the example > would be callable as the module from the command line. This is slightly garbled, but after looking, I see what you mean.
As the doc says, the 'example' is available as Tools/Scripts/diff. Tools/Scripts/ndiff is another command-line front end for difflib. I believe difflib was extracted from the original version of ndiff. > There are already > a number of modules which export command line functionality, ie. > unittest, and I thought it would be great if difflib module offered the > same. If you run difflib directly, it runs difflib._test, which runs a doctest on difflib. Most modules do something similar. Having a real command-line interface in the module itself is unusual. > The code is pretty much there in the example from the > documentation. It would just need to be included in the module itself. I don't immediately see it as worth the trouble. I bet someone somewhere has a script that uses the interface in its current location. -- Terry Jan Reedy From dcolish at gmail.com Wed Feb 22 22:47:59 2012 From: dcolish at gmail.com (Dan Colish) Date: Wed, 22 Feb 2012 13:47:59 -0800 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> Message-ID: <4F45628F.6060400@gmail.com> On 2/22/12 1:40 PM, Terry Reedy wrote: > On 2/22/2012 10:49 AM, Dan Colish wrote: > >> I was reading over the difflib docs this morning and when I got to the >> bottom, I expected, probably due to lack of coffee, that the example >> would be callable as the module from the command line. > > This is slightly garbled, but after looking, I see what you mean. > As the doc says, the 'example' is available as Tools/Scripts/diff. > Tools/Scripts/ndiff is another command-line front end for difflib. > I believe difflib was extracted from the original version of ndiff. > Yes, I realized shortly after sending how unintelligible that sounded. Yes, even though those tools exist, they are not installed as part of the Python build. > > There are already
>> unittest, and I thought it would be great if difflib module offered the >> same. > > If you run difflib directly, it runs difflib._test. which runs a > doctest on difflib. Most modules do something similar. Having a real > command-line interface in the module itself is unusual. > Oh, I was unaware of that behavior. That's really good to know. Is this behavior documented? > > The code is pretty much there in the example from the >> documentation. It would just need to be included in the module itself. > > I don't immediately see it as worth the trouble. I bet someone > somewhere has a script that uses the interface in its current location. > I didn't think it would be that much trouble. It would be simple to install the scripts from Tools/Scripts. Either way I liked the idea of providing a cli frontend to difflib as part of the python install. --Dan From ncoghlan at gmail.com Wed Feb 22 23:08:08 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2012 08:08:08 +1000 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> Message-ID: On Thu, Feb 23, 2012 at 7:40 AM, Terry Reedy wrote: > If you run difflib directly, it runs difflib._test. which runs a doctest on > difflib. Most modules do something similar. Having a real command-line > interface in the module itself is unusual. That's largely a historical artifact though - prior to -m direct execution was a pain, so the only time it really happened was in a source checkout during development. (plus I don't believe regrtest always had selective test execution, so run the library directly was a good way to only run some of the tests). If there's useful functionality that can be provided via -m, I'm a fan of moving tests out of the way to make room for it (it's also a good opportunity to make sure regrtest is covering whatever __main__ execution tests). 
I think there's also an open tracker issue suggesting the creation of a dedicated section in the standard library docs that summarises all the modules that offer useful -m functionality. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From masklinn at masklinn.net Wed Feb 22 23:24:59 2012 From: masklinn at masklinn.net (Masklinn) Date: Wed, 22 Feb 2012 23:24:59 +0100 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> Message-ID: <99329814-E535-4C2E-9762-6D53BBE6D65B@masklinn.net> On 2012-02-22, at 23:08 , Nick Coghlan wrote: > On Thu, Feb 23, 2012 at 7:40 AM, Terry Reedy wrote: >> If you run difflib directly, it runs difflib._test, which runs a doctest on >> difflib. Most modules do something similar. Having a real command-line >> interface in the module itself is unusual. > > That's largely a historical artifact though - prior to -m direct > execution was a pain, so the only time it really happened was in a > source checkout during development. (plus I don't believe regrtest > always had selective test execution, so running the library directly was a > good way to only run some of the tests). > > If there's useful functionality that can be provided via -m, I'm a fan > of moving tests out of the way to make room for it (it's also a good > opportunity to make sure regrtest is covering whatever __main__ > execution tests). > > I think there's also an open tracker issue suggesting the creation of > a dedicated section in the standard library docs that summarises all > the modules that offer useful -m functionality. Last time this popped up, Raymond Hettinger noted undocumented command-line interfaces to stdlib modules are mostly intentional: http://mail.python.org/pipermail/docs/2011-February/003171.html Maybe things have changed since; at the time, the sentiment Raymond expressed was pretty much "not going to happen".
But if you want a list, there's one at http://www.reddit.com/r/Python/comments/fofan/suggestion_for_a_python_blogger_figure_out_what/ Though things may have changed since and it's for Python 2, it's a starting point. From ncoghlan at gmail.com Thu Feb 23 01:27:45 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2012 10:27:45 +1000 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <99329814-E535-4C2E-9762-6D53BBE6D65B@masklinn.net> References: <4F450E7A.4060804@gmail.com> <99329814-E535-4C2E-9762-6D53BBE6D65B@masklinn.net> Message-ID: On Thu, Feb 23, 2012 at 8:24 AM, Masklinn wrote: > Last time this popped up, Raymond Hettinger noted undocumented > command-line interfaces to stdlib modules are mostly intentional: > http://mail.python.org/pipermail/docs/2011-February/003171.html In my view, the most important points in Raymond's email are the first and the last: * Many of the undocumented command-line interfaces are intentionally undocumented -- they were there for the convenience of the developer for exercising the module as it was being developed and are not part of the official API. Most are not production quality and would have been done much differently if that had been the intent. * All that being said, there are some exceptions and it make may sense to document the interface in some where we really do want a command-line app. I'll look at any patches you want to submit, but try to not go wild turning the library into a suite of applications. For the most part, that is not what the standard library is about. What I'm envisioning is a dedicated section along the lines of X. Command Line Functionality in the Standard Library X.1 Supported Command Line Interfaces This section would list modules that provide a command line interface as detailed in the module documentation. A brief description would be given here, along with a link to the relevant section of the module docs. 
It would mainly consist of Python specific utilities for dumping diagnostic information about the interpreter's own state or analysing Python programs. Any CLIs in this section should also have associated unittests in their regression test suites. Interpreter Diagnostics - site - platform - locale Execution and Analysis of Python Code - runpy - unittest - doctest - pydoc - timeit - dis - tokenize - pdb - profile - pstats - modulefinder X.2 Unsupported Command Line Interfaces This section would list modules that offer command line functionality that is *not* designed to be production quality, but rather exists primarily as an interactive testing tool for sanity checking when working on the modules themselves. The only documentation of the functionality would be the brief descriptions here and the module's own interactive help (if any). It should be made clear that these interfaces are *not* covered by the regression test suite and they may break without warning. All the simple cross-platform file processing, networking and protocol handling utilities would be listed here. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From dcolish at gmail.com Thu Feb 23 05:56:31 2012 From: dcolish at gmail.com (Dan Colish) Date: Wed, 22 Feb 2012 20:56:31 -0800 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> <99329814-E535-4C2E-9762-6D53BBE6D65B@masklinn.net> Message-ID: <4F45C6FF.9060402@gmail.com> On 2/22/12 4:27 PM, Nick Coghlan wrote: > On Thu, Feb 23, 2012 at 8:24 AM, Masklinn wrote: >> Last time this popped up, Raymond Hettinger noted undocumented >> command-line interfaces to stdlib modules are mostly intentional: >> http://mail.python.org/pipermail/docs/2011-February/003171.html > In my view, the most important points in Raymond's email are the first > and the last: > > * Many of the undocumented command-line interfaces are > intentionally undocumented -- they were there for the > convenience of the developer for exercising the module > as it was being developed and are not part of the official API. > Most are not production quality and would have been done > much differently if that had been the intent. This makes perfect sense. If they are going to be documented then they need to work well. Just going over a few of the ones listed on the reddit list, I ran into a number of issues with their behavior. Dis was one example of a very useful module with a cli interface that could use some improvement. > > What I'm envisioning is a dedicated section along the lines of > > X. Command Line Functionality in the Standard Library > X.1 Supported Command Line Interfaces > This section would list modules that provide a command line interface > as detailed in the module documentation. A brief description would be > given here, along with a link to the relevant section of the module > docs. It would mainly consist of Python specific utilities for dumping > diagnostic information about the interpreter's own state or analysing > Python programs. 
Any CLIs in this section should also have associated > unittests in their regression test suites. > > Interpreter Diagnostics > - site > - platform > - locale > > Execution and Analysis of Python Code > - runpy > - unittest > - doctest > - pydoc > - timeit > - dis > - tokenize > - pdb > - profile > - pstats > - modulefinder > > X.2 Unsupported Command Line Interfaces > > This section would list modules that offer command line functionality > that is *not* designed to be production quality, but rather exists > primarily as an interactive testing tool for sanity checking when > working on the modules themselves. The only documentation of the > functionality would be the brief descriptions here and the module's > own interactive help (if any). It should be made clear that these > interfaces are *not* covered by the regression test suite and they may > break without warning. > > All the simple cross-platform file processing, networking and protocol > handling utilities would be listed here. That sounds like a good guide to getting started. I like the idea of only supporting modules which help with python development. I am also wondering if libraries which are not going to be supported should have their cli removed? I've come around to see difflib is probably not that critical for that since we're all using hg these days. Finally, I tried a number of searches in the bug tracker to see if a ticket for something like this existed and I found nothing. Nick had mentioned that a ticket might already exist? 
--Dan From techtonik at gmail.com Thu Feb 23 11:56:37 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 23 Feb 2012 12:56:37 +0200 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F450E7A.4060804@gmail.com> References: <4F450E7A.4060804@gmail.com> Message-ID: On Wed, Feb 22, 2012 at 6:49 PM, Dan Colish wrote: > Hey, > > I was reading over the difflib docs this morning and when I got to the > bottom, I expected, probably due to lack of coffee, that the example > would be callable as the module from the command line. There are already > a number of modules which export command line functionality, ie. > unittest, and I thought it would be great if difflib module offered the > same. The code is pretty much there in the example from the > documentation. It would just need to be included in the module itself. +1 if it will produce git-style unified patches by default It seems that every single VCS in Python reinvents own differ. -m option will help it become more polished/useful -- anatoly t. From rob.cliffe at btinternet.com Thu Feb 23 12:01:53 2012 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Thu, 23 Feb 2012 11:01:53 +0000 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> Message-ID: <4F461CA1.3020503@btinternet.com> Can I put in a plea that postings to this list try to minimise the use of acronyms and jargon that may not be universally intelligible? This list is often read with interest by non-specialists such as myself. I have no idea for example what "VCS" means. Thanks Rob Cliffe On 23/02/2012 10:56, anatoly techtonik wrote: > On Wed, Feb 22, 2012 at 6:49 PM, Dan Colish wrote: >> Hey, >> >> I was reading over the difflib docs this morning and when I got to the >> bottom, I expected, probably due to lack of coffee, that the example >> would be callable as the module from the command line. 
There are already >> a number of modules which export command line functionality, ie. >> unittest, and I thought it would be great if difflib module offered the >> same. The code is pretty much there in the example from the >> documentation. It would just need to be included in the module itself. > +1 if it will produce git-style unified patches by default > It seems that every single VCS in Python reinvents its own differ. > -m option will help it become more polished/useful > -- > anatoly t. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From p.f.moore at gmail.com Thu Feb 23 12:22:48 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 Feb 2012 11:22:48 +0000 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F461CA1.3020503@btinternet.com> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> Message-ID: On 23 February 2012 11:01, Rob Cliffe wrote: > I have no idea for example what "VCS" means. Version Control System (things like Subversion, Mercurial, or Git) Paul. From anacrolix at gmail.com Thu Feb 23 12:32:30 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Thu, 23 Feb 2012 19:32:30 +0800 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> Message-ID: I don't think it was an actual question, and clearly for Rob it's not a sustainable approach to be expanding acronyms on request. I'd suggest an acronym FAQ but that also isn't sustainable, and Google won't always help. Status: Won't fix, maintain status quo. On Feb 23, 2012 7:25 PM, "Paul Moore" wrote: > On 23 February 2012 11:01, Rob Cliffe wrote: > > I have no idea for example what "VCS" means. > > Version Control System (things like Subversion, Mercurial, or Git) > > Paul.
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ncoghlan at gmail.com Thu Feb 23 13:15:40 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 Feb 2012 22:15:40 +1000 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> Message-ID: On Thu, Feb 23, 2012 at 9:32 PM, Matt Joiner wrote: > Status: Won't fix, maintain status quo. But also, since language-related discussions *will* occasionally encounter domain-specific discussions, people shouldn't be afraid to ask that such jargon be clarified. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Thu Feb 23 13:31:21 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 23 Feb 2012 23:31:21 +1100 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F461CA1.3020503@btinternet.com> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> Message-ID: <4F463199.9090504@pearwood.info> Rob Cliffe wrote: > Can I put in a plea that postings to this list try to minimise the use > of acronyms and jargon that may not be universally intelligible? "Universally intelligible" is an awfully big request. There are English speakers who don't know what you mean by either "postings" or "list", since both of those are themselves jargon. (My parents, for two.) To say nothing of children or non-English speakers who may not know what "acronym" means. > This list is often read with interest by non-specialists such as myself. > I have no idea for example what "VCS" means. While I sympathise, this is a list aimed at programmers, and while non-specialists are welcome, they are not the primary audience.
I think you will be better off trying to learn programmer's jargon than asking programmers not to use common, if specialised, words in their technical conversations. You wouldn't expect (say) car enthusiasts to stop using the word "torque", or doctors not to use "dialysis", just because a non-specialist might wander by and be listening in. -- Steven From rob.cliffe at btinternet.com Thu Feb 23 13:42:55 2012 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Thu, 23 Feb 2012 12:42:55 +0000 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F463199.9090504@pearwood.info> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> <4F463199.9090504@pearwood.info> Message-ID: <4F46344F.2040503@btinternet.com> I am a programmer, of some 30-odd years full-time. But that doesn't mean I understand every acronym of every specialised field under the sun. "Version Control System" instead of "VCS" is perfectly comprehensible and only takes a little longer to type. "VCS" meant nothing to me. I follow the postings on python-dev and python-ideas with keen interest. On 23/02/2012 12:31, Steven D'Aprano wrote: > Rob Cliffe wrote: >> Can I put in a plea that postings to this list try to minimise the >> use of acronyms and jargon that may not be universally intelligible? > > "Universally intelligible" is an awfully big request. There are > English speakers who don't know what you mean by either "postings" or > "list", since both of those are themselves jargon. (My parents, for > two.) To say nothing of children or non-English speakers who may not > know what "acronym" means. > > >> This list is often read with interest by non-specialists such as myself. >> I have no idea for example what "VCS" means. > > While I sympathise, this is a list aimed at programmers, and while > non-specialists are welcome, they are not the primary audience. 
> > I think you will be better off trying to learn programmer's jargon > than asking programmers not to use common, if specialised, words in > their technical conversations. You wouldn't expect (say) car > enthusiasts to stop using the word "torque", or doctors not to use > "dialysis", just because a non-specialist might wander by and be > listening in. > > From breamoreboy at yahoo.co.uk Thu Feb 23 14:36:29 2012 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Thu, 23 Feb 2012 13:36:29 +0000 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F46344F.2040503@btinternet.com> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> <4F463199.9090504@pearwood.info> <4F46344F.2040503@btinternet.com> Message-ID: On 23/02/2012 12:42, Rob Cliffe wrote: > I am a programmer, of some 30-odd years full-time. > But that doesn't mean I understand every acronym of every specialised > field under the sun. > "Version Control System" instead of "VCS" is perfectly comprehensible > and only takes a little longer to type. "VCS" meant nothing to me. > I follow the postings on python-dev and python-ideas with keen interest. > > On 23/02/2012 12:31, Steven D'Aprano wrote: >> Rob Cliffe wrote: >>> Can I put in a plea that postings to this list try to minimise the >>> use of acronyms and jargon that may not be universally intelligible? >> >> "Universally intelligible" is an awfully big request. There are >> English speakers who don't know what you mean by either "postings" or >> "list", since both of those are themselves jargon. (My parents, for >> two.) To say nothing of children or non-English speakers who may not >> know what "acronym" means. >> >> >>> This list is often read with interest by non-specialists such as myself. >>> I have no idea for example what "VCS" means. >> >> While I sympathise, this is a list aimed at programmers, and while >> non-specialists are welcome, they are not the primary audience. 
>> >> I think you will be better off trying to learn programmer's jargon >> than asking programmers not to use common, if specialised, words in >> their technical conversations. You wouldn't expect (say) car >> enthusiasts to stop using the word "torque", or doctors not to use >> "dialysis", just because a non-specialist might wander by and be >> listening in. >> >> Including the postings that repeatedly ask people not to top post? -- Cheers. Mark Lawrence. From ned at nedbatchelder.com Thu Feb 23 15:35:31 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 23 Feb 2012 09:35:31 -0500 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F461CA1.3020503@btinternet.com> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> Message-ID: <4F464EB3.8020300@nedbatchelder.com> On 2/23/2012 6:01 AM, Rob Cliffe wrote: > Can I put in a plea that postings to this list try to minimise the use > of acronyms and jargon that may not be universally intelligible? > This list is often read with interest by non-specialists such as myself. > I have no idea for example what "VCS" means. > Thanks > Rob Cliffe > Googling either "vcs git" or "vcs python" shows "Version Control System" clearly highlighted right on the search results page. --Ned. From ethan at stoneleaf.us Thu Feb 23 15:00:48 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 23 Feb 2012 06:00:48 -0800 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F46344F.2040503@btinternet.com> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> <4F463199.9090504@pearwood.info> <4F46344F.2040503@btinternet.com> Message-ID: <4F464690.3000906@stoneleaf.us> Rob Cliffe wrote: > I am a programmer, of some 30-odd years full-time. > But that doesn't mean I understand every acronym of every specialised > field under the sun. 
> "Version Control System" instead of "VCS" is perfectly comprehensible > and only takes a little longer to type. "VCS" meant nothing to me. I also sympathize, but the reality is it's not going to happen. If the search engines don't help then post the question. ~Ethan~ From guido at python.org Thu Feb 23 19:00:21 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 23 Feb 2012 10:00:21 -0800 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F464690.3000906@stoneleaf.us> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> <4F463199.9090504@pearwood.info> <4F46344F.2040503@btinternet.com> <4F464690.3000906@stoneleaf.us> Message-ID: On Thu, Feb 23, 2012 at 6:00 AM, Ethan Furman wrote: > Rob Cliffe wrote: >> >> I am a programmer, of some 30-odd years full-time. >> But that doesn't mean I understand every acronym of every specialised >> field under the sun. >> "Version Control System" instead of "VCS" is perfectly comprehensible and >> only takes a little longer to type. "VCS" meant nothing to me. > > > I also sympathize, but the reality is it's not going to happen. If the > search engines don't help then post the question. +1 to this advice. I don't even sympathize. I have to look up the new jargon invented by the youngsters *all the time*. But using a search engine to educate myself is much more effective than asking around. And yes, if the search engine somehow doesn't help, just ask for an explanation of a specific term. Not every problem can be fixed by asking everyone else to change their behavior. This is a technical list and technical jargon will be flaunted. Deal with it. 
-- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Thu Feb 23 19:21:33 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 23 Feb 2012 19:21:33 +0100 Subject: [Python-ideas] Make Difflib example callable as module __main__ References: <4F450E7A.4060804@gmail.com> Message-ID: <20120223192133.7fdfc60f@pitrou.net> On Thu, 23 Feb 2012 08:08:08 +1000 Nick Coghlan wrote: > On Thu, Feb 23, 2012 at 7:40 AM, Terry Reedy wrote: > > If you run difflib directly, it runs difflib._test. which runs a doctest on > > difflib. Most modules do something similar. Having a real command-line > > interface in the module itself is unusual. > > That's largely a historical artifact though - prior to -m direct > execution was a pain, so the only time it really happened was in a > source checkout during development. (plus I don't believe regrtest > always had selective test execution, so run the library directly was a > good way to only run some of the tests). > > If there's useful functionality that can be provided via -m, I'm a fan > of moving tests out of the way to make room for it (it's also a good > opportunity to make sure regrtest is covering whatever __main__ > execution tests). +1 for moving self-tests to the regular test suite. Nobody, and especially not the buildbots, runs self-tests included in __main__ sections. (and, as a matter of fact, many of those may be broken without anyone noticing) Regards Antoine. 
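[Editor's note: the kind of "python -m difflib" command line discussed in this thread can be sketched with the documented difflib.unified_diff API. The function name and argument handling below are illustrative only, not the actual patch under discussion.]

```python
# Sketch of a minimal "python -m difflib"-style interface built on the
# documented difflib.unified_diff API; names here are illustrative only.
import difflib
import sys

def unified(a_lines, b_lines, fromfile='a', tofile='b'):
    # unified_diff yields diff(1)-compatible ---/+++/@@ hunks
    return ''.join(difflib.unified_diff(a_lines, b_lines, fromfile, tofile))

diff_text = unified(['one\n', 'two\n'], ['one\n', 'three\n'])
print(diff_text)

if __name__ == '__main__' and len(sys.argv) == 3:
    # usage: python thisfile.py old.txt new.txt
    with open(sys.argv[1]) as f, open(sys.argv[2]) as g:
        sys.stdout.writelines(
            difflib.unified_diff(f.readlines(), g.readlines(),
                                 sys.argv[1], sys.argv[2]))
```

This also illustrates Antoine's point: the useful behaviour lives in an importable function that a test suite can exercise, while the `__main__` block stays a thin wrapper.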
From tjreedy at udel.edu Thu Feb 23 20:24:35 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 23 Feb 2012 14:24:35 -0500 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F464EB3.8020300@nedbatchelder.com> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> <4F464EB3.8020300@nedbatchelder.com> Message-ID: On 2/23/2012 9:35 AM, Ned Batchelder wrote: > On 2/23/2012 6:01 AM, Rob Cliffe wrote: >> Can I put in a plea that postings to this list try to minimise the use >> of acronyms and jargon that may not be universally intelligible? >> This list is often read with interest by non-specialists such as myself. >> I have no idea for example what "VCS" means. > Googling either "vcs git" or "vcs python" shows "Version Control System" > clearly highlighted right on the search results page. Googling just vcs returns as third hit "Version Control System" and a Wikipedia link. Alternatives like Verified Carbon Standard and Veterans Canteen Service are easily rejected in the context of this list ;-). -- Terry Jan Reedy From phd at phdru.name Thu Feb 23 20:56:27 2012 From: phd at phdru.name (Oleg Broytman) Date: Thu, 23 Feb 2012 23:56:27 +0400 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> <4F464EB3.8020300@nedbatchelder.com> Message-ID: <20120223195627.GD8946@iskra.aviel.ru> On Thu, Feb 23, 2012 at 02:24:35PM -0500, Terry Reedy wrote: > On 2/23/2012 9:35 AM, Ned Batchelder wrote: > >On 2/23/2012 6:01 AM, Rob Cliffe wrote: > >>Can I put in a plea that postings to this list try to minimise the use > >>of acronyms and jargon that may not be universally intelligible? > >>This list is often read with interest by non-specialists such as myself. > >>I have no idea for example what "VCS" means. 
> > >Googling either "vcs git" or "vcs python" shows "Version Control System" > >clearly highlighted right on the search results page. > > Googling just vcs returns as third hit "Version Control System" and > a Wikipedia link. Alternatives like Verified Carbon Standard and > Veterans Canteen Service are easily rejected in the context of this > list ;-). http://www.acronymfinder.com/VCS.html lists VCS at the second place. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From techtonik at gmail.com Thu Feb 23 21:19:16 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 23 Feb 2012 23:19:16 +0300 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> Message-ID: On Thu, Feb 23, 2012 at 2:22 PM, Paul Moore wrote: > On 23 February 2012 11:01, Rob Cliffe wrote: >> I have no idea for example what "VCS" means. > > Version Control System (things like Subversion, Mercurial, or Git) Bazaar and Mercurial in this case. Mercurial's differ: http://selenic.com/hg/file/816211dfa3a5/mercurial/pure/bdiff.py Bazaar's: http://bazaar.launchpad.net/~bzr-pqm/bzr/bzr.dev/view/head:/bzrlib/diff.py -- anatoly t. From techtonik at gmail.com Thu Feb 23 21:22:09 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 23 Feb 2012 23:22:09 +0300 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: <4F46344F.2040503@btinternet.com> References: <4F450E7A.4060804@gmail.com> <4F461CA1.3020503@btinternet.com> <4F463199.9090504@pearwood.info> <4F46344F.2040503@btinternet.com> Message-ID: On Thu, Feb 23, 2012 at 3:42 PM, Rob Cliffe wrote: > I am a programmer, of some 30-odd years full-time. > But that doesn't mean I understand every acronym of every specialised field > under the sun. 
> "Version Control System" instead of "VCS" is perfectly comprehensible and > only takes a little longer to type. "VCS" meant nothing to me. > I follow the postings on python-dev and python-ideas with keen interest. VCS is a good new word to know in difflib context. Thanks for asking, -- anatoly t. From g.brandl at gmx.net Thu Feb 23 22:43:51 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 23 Feb 2012 22:43:51 +0100 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> Message-ID: Am 23.02.2012 11:56, schrieb anatoly techtonik: > On Wed, Feb 22, 2012 at 6:49 PM, Dan Colish wrote: >> Hey, >> >> I was reading over the difflib docs this morning and when I got to the >> bottom, I expected, probably due to lack of coffee, that the example >> would be callable as the module from the command line. There are already >> a number of modules which export command line functionality, ie. >> unittest, and I thought it would be great if difflib module offered the >> same. The code is pretty much there in the example from the >> documentation. It would just need to be included in the module itself. > > +1 if it will produce git-style unified patches by default > It seems that every single VCS in Python reinvents own differ. "Every single" makes it sound like there are dozens... Apart from that: a diff/patch algorithm is such an integral part of version control that I would *not* expect them to use difflib, but something more sophisticated/optimized/etc. Georg From victor.stinner at haypocalc.com Fri Feb 24 00:34:49 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 24 Feb 2012 00:34:49 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ Message-ID: Hi, I'm trying to create read-only objects using a "frozendict" class. frozendict is a read-only dict. 
I would like to use frozendict for the class dict using a metaclass, but type.__new__() expects a dict and creates a copy of the input dict. It would be nice to support custom dict types: OrderedDict and frozendict for example. It looks possible to patch CPython to implement this feature, but first I would like to know your opinion about this idea :-) Victor From pyideas at rebertia.com Fri Feb 24 00:51:22 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 23 Feb 2012 15:51:22 -0800 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 3:34 PM, Victor Stinner wrote: > Hi, > > I'm trying to create read-only objects using a "frozendict" class. > frozendict is a read-only dict. I would like to use frozendict for the > class dict using a metaclass, but type.__new__() expects a dict and > creates a copy of the input dict. And you can't use __slots__ because...? Cheers, Chris From victor.stinner at haypocalc.com Fri Feb 24 01:27:37 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 24 Feb 2012 01:27:37 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: > And you can't use __slots__ because...? Hum, here is an example:

---
def Enum(**kw):
    class _Enum(object):
        __slots__ = list(kw.keys())
        def __new__(cls, **kw):
            inst = object.__new__(cls)
            for key, value in kw.items():
                setattr(inst, key, value)
            return inst
    return _Enum(**kw)

components = Enum(red=0, green=1, blue=2)
print(components.red)
components.red=2
print(components.red)
components.unknown=10
---

components.unknown=10 raises an error, but not components.red=2. __slots__ denies adding new attributes, but not modifying existing attributes. The idea of using a frozendict is to deny the modification of an attribute value after the creation of the object. I don't see how to use __slots__ to implement such constraints. 
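[Editor's note: for readers who want to try the "frozendict" idea, a minimal read-only dict of the kind Victor describes can be built by blocking dict's mutating methods. This is an illustrative sketch only, not the actual frozendict class from the thread; the `_readonly` helper name is mine.]

```python
# Minimal read-only dict sketch (not the thread's actual implementation):
# block every mutating dict method so the mapping cannot change after creation.
class frozendict(dict):
    def _readonly(self, *args, **kwargs):
        raise TypeError('frozendict is read-only')
    __setitem__ = _readonly
    __delitem__ = _readonly
    clear = _readonly
    pop = _readonly
    popitem = _readonly
    setdefault = _readonly
    update = _readonly

d = frozendict(red=0, green=1, blue=2)
print(d['red'])  # lookups still work as on a plain dict
```

Any attempt at `d['red'] = 2` or `d.update(...)` then raises TypeError, which is exactly the behaviour __slots__ alone cannot provide.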
Victor From pyideas at rebertia.com Fri Feb 24 01:56:02 2012 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 23 Feb 2012 16:56:02 -0800 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: On Thu, Feb 23, 2012 at 4:27 PM, Victor Stinner wrote: >> And you can't use __slots__ because...? > components.unknown=10 raises an error, but not components.red=2. > __slots__ denies to add new attributes, but not to modify existing > attributes. > > The idea of using a frozendict is to deny the modification of an > attribute value after the creation of the object. I don't see how to > use __slots__ to implement such constraints. Right, stupid question; didn't think that one all the way through. - Chris From ncoghlan at gmail.com Fri Feb 24 02:08:05 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2012 11:08:05 +1000 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 9:34 AM, Victor Stinner wrote: > Hi, > > I'm trying to create read-only objects using a "frozendict" class. > frozendict is a read-only dict. I would like to use frozendict for the > class dict using a metaclass, but type.__new__() expects a dict and > creates a copy of the input dict. Do you have a particular reason for doing it that way rather than just overriding __setattr__ and __delattr__ to raise TypeError? Or overriding the __dict__ descriptor to return a read-only proxy? There are a *lot* of direct calls to the PyDict APIs in the object machinery. Without benchmark results clearly showing a negligible speed impact, I'd be -1 on increasing the complexity of all that code (and making it slower) to support niche use cases that can already be handled a couple of other ways. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Fri Feb 24 04:48:32 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2012 13:48:32 +1000 Subject: [Python-ideas] Current status of PEP 403 (was Re: peps: Switch back to named functions, since the Ellipsis version degenerated badly) Message-ID: (switching lists to python-ideas, dropping python-dev and python-checkins) Context for anyone not following python-checkins: I recently moved PEP 3150 (statement local namespaces) to Withdrawn, rewrote PEP 403 with a different syntax proposal, retitled it to "Statement local class and function definitions" and moved it to Deferred. On Fri, Feb 24, 2012 at 2:37 AM, Jim Jewett wrote: > I understand that adding a colon and indent has its own problems, but > ... I'm not certain this is better, and I am certain that the desire > for indentation is strong enough to at least justify discussion in the > PEP. Fair point. The reason for the flat structure is that allowing a full suite screws with the scoping rules and gets us into the land of insane complexity that was PEP 3150. A decorator-inspired syntax makes it very clear (both to the reader and to the compiler) that there's only *one* name being forward referenced, rather than potentially hiding declarations of forward references an arbitrary distance from the statement that uses them. This doesn't actually lose any flexibility, since you can just make a forward reference to a class instead and use that as your local namespace (with ordinary attribute access semantics), rather than the brain-bender that was the proposed scoping rules for PEP 3150's given clause. 
The only other alternative syntax would be to use a custom suite definition that allowed only a single class or function definition statement, but I think having something that looks like a suite, but isn't one would be significantly worse than the current proposed syntax that merely allows a function (or class) definition's implied local name binding to be overridden with a custom statement. To (almost*) recreate the effect of an ordinary function definition with the in statement, you could write: in f = f def f(): pass And a decorated definition like: @deco1 @deco2 @deco3 def f(): pass Could (almost*) be expressed as: in f = deco1(deco2(deco3(f))) def f(): pass * The reason for the "almost" caveat is that, given the current PEP 403 semantics, recursive references to f() will resolve differently for the "in" statement cases - for the in statement, they will resolve directly to the innermost function definition, while for ordinary definitions they will be resolved according to the scoping rules for any name lookup. This could be an argument in favour of allowing *decorated* function and class definitions, rather than requiring that they be undecorated - if decorators are allowed, then recursive references would resolve directly to the post-decorated version. Alternatively, people could adopt a convention of prepending an underscore to the actual function name in cases where it mattered, meaning they would have easy access to *both* forms of the function (decorated and undecorated): in f = deco1(deco2(deco3(_f))) def _f(): return f, _f # decorated, undecorated In either case, whereas an ordinary recursive function definition can get confused by reassignments in the outer scope, an in-statement based definition would be truly recursive (via a cell reference) and hence ignore any subsequent changes in the outer namespace. 
Something else the PEP should mention explicitly is that, like __class__, a class object obviously won't be available while the class body is being executed. Only methods will be able to refer to the class by name, just as only methods can use __class__. Updates to PEP 403 are going to be pretty sporadic until some time after 3.3 release though - it's still very much in "this is a problem I am thinking about" territory rather than "this is a language addition I am proposing" (the latest round of updates were just to make sure I recorded my latest idea before I forgot about the details). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From stephen at xemacs.org Fri Feb 24 05:55:10 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 24 Feb 2012 13:55:10 +0900 Subject: [Python-ideas] Make Difflib example callable as module __main__ In-Reply-To: References: <4F450E7A.4060804@gmail.com> Message-ID: <8762ewn841.fsf@uwakimon.sk.tsukuba.ac.jp> Georg Brandl writes: > > +1 if it will produce git-style unified patches by default > > It seems that every single VCS in Python reinvents own differ. > > "Every single" makes it sounds like there are dozens... > > Apart from that: a diff/patch algorithm is such an integral part of > version control that I would *not* expect them to use difflib, but > something more sophisticated/optimized/etc. But Anatoly isn't talking about the algorithm. He's talking about the output, and actually, I would expect them to use something diff(1) and diff3(1) compatible for hunk-oriented changes.[1] My experience with home-grown diff functions suggests that very few produce output as good as that of diff(1), and only git seems to be an improvement (but it's not backward compatible, as the tracker/review tool maintainers regularly mention). 
It's true that there are better algorithms than the one used by diff(1) (such as the "patience diff" Bazaar uses, and git offers as an option), but there's no need to change the hunk format as far as I have seen, and the file headers could easily be standardized I would think. Footnotes: [1] Darcs for one allows non-hunk-based changes, specifically a token-replace patch. And there are binary diffs such as xdelta, and word diffs like wdiff, which necessarily use a different format since they are not line-oriented. From aquavitae69 at gmail.com Fri Feb 24 06:33:48 2012 From: aquavitae69 at gmail.com (David Townshend) Date: Fri, 24 Feb 2012 07:33:48 +0200 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 3:08 AM, Nick Coghlan wrote: > On Fri, Feb 24, 2012 at 9:34 AM, Victor Stinner > wrote: > > Hi, > > > > I'm trying to create read-only objects using a "frozendict" class. > > frozendict is a read-only dict. I would like to use frozendict for the > > class dict using a metaclass, but type.__new__() expects a dict and > > creates a copy of the input dict. > > Do you have a particular reason for doing it that way rather than just > overriding __setattr__ and __delattr__ to raise TypeError? > > Or overriding the __dict__ descriptor to return a read-only proxy? > > There are a *lot* of direct calls to the PyDict APIs in the object > machinery. Without benchmark results clearly showing a negligible > speed impact, I'd be -1 on increasing the complexity of all that code > (and making it slower) to support niche use cases that can already be > handled a couple of other ways. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > Can't this also be done using metaclasses? 
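[Editor's note: the metaclass hook relevant to David's question is __prepare__ (Python 3), which lets the class body execute in a custom mapping; type.__new__ still copies it into a plain dict, so any extra information must be captured before that copy. A sketch - the class names and the `_member_order` attribute are mine, not from the thread.]

```python
# Sketch: run a class body in an OrderedDict via __prepare__, then record
# the definition order before type.__new__ copies it to a plain dict.
# Names (OrderedMeta, Point, _member_order) are invented for illustration.
from collections import OrderedDict

class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return OrderedDict()  # the class body executes in this mapping

    def __new__(mcls, name, bases, ns, **kwds):
        cls = super().__new__(mcls, name, bases, dict(ns))
        cls._member_order = tuple(ns)  # capture order before it is lost
        return cls

class Point(metaclass=OrderedMeta):
    x = 1
    y = 2
```

`Point._member_order` then records the order of names in the class body (including implicit entries such as `__module__`), with `x` appearing before `y`.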
-------------- next part -------------- An HTML attachment was scrubbed... URL: From Ronny.Pfannschmidt at gmx.de Fri Feb 24 09:07:31 2012 From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt) Date: Fri, 24 Feb 2012 09:07:31 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: <4F474543.7090008@gmx.de> On 02/24/2012 01:27 AM, Victor Stinner wrote: >> And you can't use __slots__ because...? > > Hum, here is an example: > --- note untested, since written in mail client:

class Enum(object):
    __slots__ = ("_data",)
    _data = WriteOnceDescr('_data')  # left as exercise

    def __init__(self, **kw):
        self._data = frozendict(kw)

    def __getattr__(self, key):
        try:
            return self._data[key]
        except KeyError:
            raise AttributeError(key)

> def Enum(**kw):
>     class _Enum(object):
>         __slots__ = list(kw.keys())
>         def __new__(cls, **kw):
>             inst = object.__new__(cls)
>             for key, value in kw.items():
>                 setattr(inst, key, value)
>             return inst
>     return _Enum(**kw)
>
> components = Enum(red=0, green=1, blue=2)
> print(components.red)
> components.red=2
> print(components.red)
> components.unknown=10
> ---
>
> components.unknown=10 raises an error, but not components.red=2. > __slots__ denies to add new attributes, but not to modify existing > attributes. > > The idea of using a frozendict is to deny the modification of an > attribute value after the creation of the object. I don't see how to > use __slots__ to implement such constraints. 
> > Victor > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From simon.sapin at kozea.fr Fri Feb 24 10:22:32 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Fri, 24 Feb 2012 10:22:32 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: <4F4756D8.6000109@kozea.fr> Le 24/02/2012 06:33, David Townshend a écrit : > Can't this also be done using metaclasses? Hi, Are you thinking of __prepare__? I did too, but I read the details of this: http://docs.python.org/py3k/reference/datamodel.html#customizing-class-creation The class body can be executed "in" any mapping. Then I'm not sure, but it looks like type.__new__ only takes a real dict. You have to do something in your overridden __new__ to e.g. keep the OrderedDict's order. Regards, -- Simon Sapin From techtonik at gmail.com Fri Feb 24 11:36:46 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 24 Feb 2012 12:36:46 +0200 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 4:08 AM, Nick Coghlan wrote: > On Fri, Feb 24, 2012 at 9:34 AM, Victor Stinner > wrote: >> Hi, >> >> I'm trying to create read-only objects using a "frozendict" class. >> frozendict is a read-only dict. I would like to use frozendict for the >> class dict using a metaclass, but type.__new__() expects a dict and >> creates a copy of the input dict. > > Do you have a particular reason for doing it that way rather than just > overriding __setattr__ and __delattr__ to raise TypeError? > > Or overriding the __dict__ descriptor to return a read-only proxy? > > There are a *lot* of direct calls to the PyDict APIs in the object > machinery. 
Without benchmark results clearly showing a negligible > speed impact, I'd be -1 on increasing the complexity of all that code > (and making it slower) to support niche use cases that can already be > handled a couple of other ways. I also think about the reverse process of removing things that were proved to be underused. That probably requires an AST spider that crawls existing Python projects to see how various constructs are used. -- anatoly t. From techtonik at gmail.com Fri Feb 24 11:52:10 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 24 Feb 2012 12:52:10 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout Message-ID: Hello, subprocess is low level, cryptic, does too much, with poor usability, i.e. "don't make me think" is not about it. I don't know about you, but I can hardly write any subprocess call without spending at least 5-10 minutes meditating over the documentation. So, I propose two high level KISS functions for the shell utils (shutil) module:

runret(command) - run command through shell, return ret code
runout(command) - run command through shell, return output

To avoid the subprocess story (that makes Python too complicated) I deliberately limit the scope to:

- executing from shell only
- return one thing at a time

I hope that this covers 80% of what _users_ need to execute commands from Python. If somebody needs more - there is `subprocess`. But if your own scripts are mostly outside these 80% - feel free to provide your user story and arguments, why this should be done in shutil and not in subprocess. Open questions:

- security quoting for 'command'

-- anatoly t. From dirkjan at ochtman.nl Fri Feb 24 11:58:21 2012 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Fri, 24 Feb 2012 11:58:21 +0100 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 11:52, anatoly techtonik wrote: > subprocess is low level, cryptic, does too much, with poor usability, > i.e. 
"don't make me think" is not about it. I don't know about you, > but I can hardly write any subprocess call without spending at least > 5-10 meditating over the documentation. So, I propose two high level > KISS functions for shell utils (shutil) module: > > runret(command) - run command through shell, return ret code > runout(command) - run command through shell, return output Have you seen subprocess.check_call() and subprocess.check_output()? I don't think your proposed functions add much benefit over these two. Cheers, Dirkjan From ncoghlan at gmail.com Fri Feb 24 11:59:42 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2012 20:59:42 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 8:52 PM, anatoly techtonik wrote: > Hello, > > subprocess is low level, cryptic, does too much, with poor usability, > i.e. "don't make me think" is not about it. I don't know about you, > but I can hardly write any subprocess call without spending at least > 5-10 meditating over the documentation. Hi Anatoly, I believe you'll find the simple convenience methods you are requesting already exist, in the form of subprocess.call(), subprocess.check_call() and subprocess.check_output(). The documentation has also been updated to emphasise these convenience functions over the Popen swiss army knife. If you do "pip install shell-command" you can also access the shell_call(), shell_check_call() and shell_output() functions I currently plan to include in subprocess for 3.3. (I'm not sure which versions of Python that module currently supports though - 2.7 and 3.2, IIRC). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ziade.tarek at gmail.com Fri Feb 24 12:10:08 2012 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Fri, 24 Feb 2012 12:10:08 +0100 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 11:52 AM, anatoly techtonik wrote: > Hello, > > subprocess is low level, cryptic, does too much, with poor usability, > i.e. "don't make me think" is not about it. I don't know about you, > but I can hardly write any subprocess call without spending at least > 5-10 meditating over the documentation. So, I propose two high level > KISS functions for shell utils (shutil) module: > > runret(command) - run command through shell, return ret code > mmm you are describing subprocess.call() here... I don't see how this new command makes things better, besides shell=True. > runout(command) - run command through shell, return output > what is 'output' ? the stderr ? the stdout ? a merge of both ? what about subprocess.check_output() ? > To avoid subprocess story (that makes Python too complicated) > It seems to me that the only complication here is shell=True, which seems ok to me to have it at False for security reasons. Cheers Tarek -- Tarek Ziadé | http://ziade.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Fri Feb 24 12:12:29 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 24 Feb 2012 13:12:29 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 1:59 PM, Nick Coghlan wrote: > On Fri, Feb 24, 2012 at 8:52 PM, anatoly techtonik wrote: >> Hello, >> >> subprocess is low level, cryptic, does too much, with poor usability, >> i.e. "don't make me think" is not about it. I don't know about you, >> but I can hardly write any subprocess call without spending at least >> 5-10 meditating over the documentation. 
> > I believe you'll find the simple convenience methods you are > requesting already exist, in the form of subprocess.call(), > subprocess.check_call() and subprocess.check_output(). The > documentation has also been updated to emphasise these convenience > functions over the Popen swiss army knife. I don't find the names of these functions more intuitive than Popen(). I also think they far from being simple, because (in the order of appearance): 1. they require try/catch 2. docs still refer Popen, which IS complicated 3. contain shell FUD 4. completely confuse users with stdout=PIPE or stderr=PIPE stuff http://docs.python.org/library/subprocess.html#subprocess.check_call My verdict - these fail to be simple, and require the same low-level system knowledge as Popen() for confident use. > If you do "pip install shell-command" you can also access the > shell_call(), shell_check_call() and shell_output() functions I > currently plan to include in subprocess for 3.3. (I'm not sure which > versions of Python that module currently supports though - 2.7 and > 3.2, IIRC). Don't you find strange that shell utils module don't have any functions for the main shell function - command execution? In game development current state of subprocess bloat is called "featurecrepping" and the "scope definition" is a method to cope with this disease. -- anatoly t. From techtonik at gmail.com Fri Feb 24 12:23:57 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 24 Feb 2012 13:23:57 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 2:10 PM, Tarek Ziad? wrote: > On Fri, Feb 24, 2012 at 11:52 AM, anatoly techtonik > wrote: >> >> subprocess is low level, cryptic, does too much, with poor usability, >> i.e. "don't make me think" is not about it. I don't know about you, >> but I can hardly write any subprocess call without spending at least >> 5-10 meditating over the documentation. 
So, I propose two high level >> KISS functions for shell utils (shutil) module: >> >> runret(command) ? - run command through shell, return ret code > > > mmm you are describing subprocess.call()? here... I don't see how this new > command makes thing better, besides shell=True. shutil.runret() - by definition has shell=True >> >> runout(command) ?- run command through shell, return output > > > what is 'output' ? the stderr ? the stdout ? a merge of both ? That's a high-level _user_ function. When user runs command in shell he sees both. So, this 'shell util' is an analogue. If you have you own user scripts that require stdout or stderr separately, I am free to discuss the cases. The main purpose of this function is to be useful from Python console, so the interface should be very simple to remember from the first try. Like runout(command, ret='stdout|stderr|both'). No universal PIPEs. > what about subprocess.check_output() ? See my reply above. >> To avoid subprocess story (that makes Python too complicated) > > > I seems to me that the only complication here is shell=True, which seems ok > to me to have it at False for security reasons. It won't be 'shell util' function anymore. If you're using shell execution functions, you already realize that will happen if your input parameters are not validated properly. Isolating calls that require shell execution in shutil module will also simplify security analysis for 3rd party libraries. -- anatoly t. From mwm at mired.org Fri Feb 24 12:25:25 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 24 Feb 2012 06:25:25 -0500 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: <20120224062525.0e168a39@bhuda.mired.org> On Fri, 24 Feb 2012 12:10:08 +0100 Tarek Ziad? wrote: > On Fri, Feb 24, 2012 at 11:52 AM, anatoly techtonik wrote: > > Hello, > > subprocess is low level, cryptic, does too much, with poor usability, > > i.e. "don't make me think" is not about it. 
I don't know about you, > > but I can hardly write any subprocess call without spending at least > > 5-10 meditating over the documentation. So, I propose two high level > > KISS functions for shell utils (shutil) module: > > runret(command) - run command through shell, return ret code > mmm you are describing subprocess.call() here... I don't see how this new > command makes thing better, besides shell=True. The stated purpose of the new functions is to allow people to run shell commands without thinking about them. That's a bad idea (isn't most programming without thinking about it?). The first problem is that it's a great way to add data injection vulnerabilities to your application. It's also a good way to introduce bugs in your application when asked to (for instance) process user-provided file names. -1 http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ncoghlan at gmail.com Fri Feb 24 12:31:19 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2012 21:31:19 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 9:12 PM, anatoly techtonik wrote: > Don't you find strange that shell utils module don't have any > functions for the main shell function - command execution? > In game development current state of subprocess bloat is called > "featurecrepping" and the "scope definition" is a method to cope with > this disease. They may still end up in shutil. I haven't really decided which location I like better. However, if you (or anyone else) wants to see Python's innate capabilities improve in this area (and they really are subpar compared to Perl 5, for example), your best bet is to download my Shell Command module and give me feedback on any problems you find with it via the BitBucket issue tracker. http://shell-command.readthedocs.org Cheers, Nick. -- Nick Coghlan?? 
| ncoghlan at gmail.com | Brisbane, Australia From techtonik at gmail.com Fri Feb 24 12:46:12 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 24 Feb 2012 13:46:12 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: <20120224062525.0e168a39@bhuda.mired.org> References: <20120224062525.0e168a39@bhuda.mired.org> Message-ID: On Fri, Feb 24, 2012 at 2:25 PM, Mike Meyer wrote: > On Fri, 24 Feb 2012 12:10:08 +0100 > Tarek Ziadé wrote: >> On Fri, Feb 24, 2012 at 11:52 AM, anatoly techtonik wrote: >> > Hello, >> > subprocess is low level, cryptic, does too much, with poor usability, >> > i.e. "don't make me think" is not about it. I don't know about you, >> > but I can hardly write any subprocess call without spending at least >> > 5-10 meditating over the documentation. So, I propose two high level >> > KISS functions for shell utils (shutil) module: >> > runret(command) - run command through shell, return ret code >> mmm you are describing subprocess.call() here... I don't see how this new >> command makes thing better, besides shell=True. > > The stated purpose of the new functions is to allow people to run > shell commands without thinking about them. That's a bad idea (isn't > most programming without thinking about it?). The first problem is > that it's a great way to add data injection vulnerabilities to your > application. It's also a good way to introduce bugs in your > application when asked to (for instance) process user-provided file > names. > > -1 The proposal doesn't took into account security implications, so your -1 is premature. I agree with your point that users should think about *security* when they run commands. But they should not think about how tons of different ways to execute their command and different combinations on different operating systems, *and* security implications about this. This is *the main point* that make subprocess module a failure, and a basis (main reason) of this proposal.
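The "security implications" both sides keep circling is shell injection. A minimal sketch (an editor's illustration, not code from the thread; the filename stands in for hypothetical untrusted input) of the difference between `shell=True` and an argument list:

```python
import subprocess

# Editor's sketch, not code from the thread. "filename" stands in for
# hypothetical untrusted input.
filename = "harmless.txt; echo INJECTED"

# shell=True hands the whole string to /bin/sh, so the text after ";"
# runs as a second command.
merged = subprocess.check_output("echo " + filename, shell=True).decode()

# An argument list with the default shell=False passes the string to
# echo as a single literal argument; nothing extra is executed.
literal = subprocess.check_output(["echo", filename]).decode()

print("INJECTED" in merged.splitlines())   # True: the injected command ran
print(literal.strip() == filename)         # True: just a literal string
```

(Assumes a POSIX shell; the quoting rules of the Windows shell differ.)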
If users choose to trade security over simplicity, they should know what the risks are, and what to do if they want to avoid them. So I completely support the idea of shutil docs containing a user friendly explanation of how to exploit and how to protect (i.e. use subprocess) from the flaws provided by this method of execution - if they need to protect. Python is not a Java - it should give users a choice of simple API when they don't need security, and let this choice of shooting themselves in the foot be explicit.. and simple. -- anatoly t. From masklinn at masklinn.net Fri Feb 24 12:50:35 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 24 Feb 2012 12:50:35 +0100 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On 2012-02-24, at 12:12 , anatoly techtonik wrote: > > 1. they require try/catch No. > 2. docs still refer Popen, which IS complicated True. > 3. contain shell FUD No, they contain warnings, against shell injection security risks. Warnings are not FUD, it's not trying to sell some sort of alternative it's just warning that `shell=True` is dangerous on untrusted input. > 4. completely confuse users with stdout=PIPE or stderr=PIPE stuff > > http://docs.python.org/library/subprocess.html#subprocess.check_call On the one hand, these notes are a bit clumsy. On the other hand, piping is a pretty fundamental concept of shell execution, I see nothing wrong about saying that these functions *can't* be involved in pipes. In fact stating it upfront looks sensible. >> If you do "pip install shell-command" you can also access the >> shell_call(), shell_check_call() and shell_output() functions I >> currently plan to include in subprocess for 3.3. (I'm not sure which >> versions of Python that module currently supports though - 2.7 and >> 3.2, IIRC). > > Don't you find strange that shell utils module don't have any > functions for the main shell function - command execution? What "shell utils" module? 
Subprocess has exactly that in `call` and its variants. And "shutil" does not bill itself as a "shell utils" module right now, its description is "High-level file operations". > shutil.runret() - by definition has shell=True Great, so your recommendation is to be completely insecure by default? > That's a high-level _user_ function. When user runs command in shell > he sees both. So, this 'shell util' is an analogue. That makes no sense, when users invoke shell commands programmatically (which is what these APIs are about), they expect two semantically different reporting streams to be split, not to be merged, indistinguishable and unusable as a default. Dropping stderr on the ground may be an acceptable default but munging stdout and stderr is not. > The main purpose of this function is to be useful from Python console Then I'm not sure it belongs in subprocess or shutil, and users with that need should probably be driven towards iPython which provides extensive means of calling into the system shell in interactive sessions[0]. bpython may also provide such facilities. It *may* belong in the interactive interpreter's own namespace. > The main purpose of this function is to be > useful from Python console, so the interface should be very simple to > remember from the first try. Like runout(command, > ret='stdout|stderr|both'). As opposed to `check_output(command)`? > It won't be 'shell util' function anymore. If you're using shell > execution functions, you already realize that will happen if your > input parameters are not validated properly. This assertion demonstrably does not match reality, shell injections (the very reason for this warning) would not exist if this were the case. 
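The try/catch point above is easy to verify. A short sketch (editor's illustration): `subprocess.call()` reports a non-zero exit status through its return value and never raises for it; only the `check_*` variants convert the same status into `CalledProcessError`:

```python
import subprocess
import sys

# A command that exits with status 3, using the current interpreter.
failing = [sys.executable, "-c", "raise SystemExit(3)"]

# call() just returns the status code -- no try/except needed.
rc = subprocess.call(failing)

# check_call() raises CalledProcessError for the same command.
caught = None
try:
    subprocess.check_call(failing)
except subprocess.CalledProcessError as exc:
    caught = exc

print(rc, caught.returncode)  # 3 3
```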
[0] http://ipython.org/ipython-doc/rel-0.12/interactive/reference.html#system-shell-access From mwm at mired.org Fri Feb 24 13:13:25 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 24 Feb 2012 07:13:25 -0500 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> Message-ID: <20120224071325.08f07d32@bhuda.mired.org> On Fri, 24 Feb 2012 13:46:12 +0200 anatoly techtonik wrote: > On Fri, Feb 24, 2012 at 2:25 PM, Mike Meyer wrote: > > On Fri, 24 Feb 2012 12:10:08 +0100 > > Tarek Ziad? wrote: > >> On Fri, Feb 24, 2012 at 11:52 AM, anatoly techtonik wrote: > >> > Hello, > >> > subprocess is low level, cryptic, does too much, with poor usability, > >> > i.e. "don't make me think" is not about it. I don't know about you, > >> > but I can hardly write any subprocess call without spending at least > >> > 5-10 meditating over the documentation. So, I propose two high level > >> > KISS functions for shell utils (shutil) module: > >> > runret(command) ? - run command through shell, return ret code > >> mmm you are describing subprocess.call() ?here... I don't see how this new > >> command makes thing better, besides shell=True. > > The stated purpose of the new functions is to allow people to run > > shell commands without thinking about them. That's a bad idea (isn't > > most programming without thinking about it?). The first problem is > > that it's a great way to add data injection vulnerabilities to your > > application. It's also a good way to introduce bugs in your > > application when asked to (for instance) process user-provided file > > names. > > -1 > The proposal doesn't took into account security implications, so your > -1 is premature. Failing to take into account security implications means the -1 isn't premature, it's mandatory! > I agree with your point that users should think about *security* when > they run commands. 
But they should not think about how tons of > different ways to execute their command and different combinations on > different operating systems, *and* security implications about this. This sounds like a documentation issue, not a code issue. In fact, checking the shutil docs (via pydoc) turns up: shutil - Utility functions for copying and archiving files and directory trees. Clearly, running commands is *not* part of this functionality, so these new functions don't belong there. > If users choose to trade security over simplicity, they should know > what the risks are, and what to do if they want to avoid them. So I > completely support the idea of shutil docs containing a user friendly > explanation of how to exploit and how to protect (i.e. use subprocess) > from the flaws provided by this method of execution - if they need to > protect. Python is not a Java - it should give users a choice of > simple API when they don't need security, and let this choice of > shooting themselves in the foot be explicit.. and simple. So now look at use cases. The "simple" method you propose is *only* safe to use on a very small set of constant strings. If any of the values in the string are supplied by the user in any way, you can't use it. If any of the arguments contain shell meta-characters, you either have to quote them or not use your method. Since you're explicitly proposing passing the command to the shell, the programmer doesn't even know which characters are meta-characters when they write the code. This means these functions - as proposed - are more attractive nuisances than useful utilities. Oddly enough, I read the Julia docs on external commands between my first answer and your reply, and their solution is both as simple as what you want, and safe. This inspired a counter proposal: How about adding your new function to subprocess, except instead of passing them to the shell, they use shlex to parse them, then call Popen with the appropriate arguments? 
shlex might need some work for this. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ncoghlan at gmail.com Fri Feb 24 13:14:42 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2012 22:14:42 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> Message-ID: On Fri, Feb 24, 2012 at 9:46 PM, anatoly techtonik wrote: > This is *the main point* that make subprocess module a failure, and a > basis (main reason) of this proposal. Anatoly, this is the exact kind of blanket statement that pisses people off and makes them stop listening to you. The subprocess module is not a failure by any means. Safely invoking subprocesses is a *hard problem*. Other languages make the choice "guarding against shell injections is a problem for the user to deal with" and allow them by default in their subprocess invocation interfaces. They also make the choice that the risk of data leakage through user provided format strings is something for the developer to worry about and allow implicit string interpolation. Python doesn't allow either of those as a *deliberate design choice*. The current behaviour isn't an accident, or due to neglect, or because we're stupid. Instead, we default to the more secure, less convenient options, and allow people to explicitly request the insecure behaviour if they either: 1. don't care; or 2. do care, but also know it isn't actually a problem for their use case. This is a *good thing* if you're an application programmer - secure defaults lets you conduct security audits by looking specifically for cases where the safety checks have been bypassed. 
However, it mostly sucks if you're wanting to use Python for system administration (or similar) tasks where the shell is an essential tool rather than a security risk and there's no untrusted data that comes anywhere near your script. I'll repeat my suggestion: if you want to do something *constructive* about this, get Shell Command from PyPI and start using it, as it aims to address both the shell invocation and the string interpolation aspects of this issue. If you find problems, report them on the module's issue tracker (although I'll point out in advance that STDERR being separate from STDOUT by default is *deliberate*. If people want them merged they can include a redirection in their shell command. Otherwise STDERR needs to remain mapped to the same stream as it is in the parent process so that tools like getpass() will still work in an invoked shell command). Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Fri Feb 24 13:19:07 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2012 22:19:07 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: <20120224071325.08f07d32@bhuda.mired.org> References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: On Fri, Feb 24, 2012 at 10:13 PM, Mike Meyer wrote: > How about adding your new function to subprocess, except instead of > passing them to the shell, they use shlex to parse them, then call > Popen with the appropriate arguments? shlex might need some work for > this. 
http://shell-command.readthedocs.org

>>> from shell_command import shell_call
>>> shell_call("ls *.py")
setup.py shell_command.py test_shell_command.py
0
>>> shell_call("ls {}", "*.py")
ls: cannot access *.py: No such file or directory
2
>>> shell_call("ls {!u}", "*.py")
setup.py shell_command.py test_shell_command.py
0

Unless someone uncovers a major design flaw in the next few months, at least ShellCommand, shell_call, shell_check_call and shell_output are likely to make an appearance in subprocess for 3.3. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Feb 24 13:41:50 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 Feb 2012 22:41:50 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: <20120224071325.08f07d32@bhuda.mired.org> References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: On Fri, Feb 24, 2012 at 10:13 PM, Mike Meyer wrote: > Oddly enough, I read the Julia docs on external commands between my > first answer and your reply, and their solution is both as simple as > what you want, and safe. That *is* rather nice, although they never get around to actually explaining *how* to capture the output from the child processes (http://julialang.org/manual/running-external-programs/, for anyone else that's interested). It should definitely be possible to implement something along those lines as a third party library on top of subprocess (although it would be a lot more complicated than Shell Command is). Kenneth Reitz (author of "requests") has also spent some time tinkering with subprocess invocation API design concepts: https://github.com/kennethreitz/envoy Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From mwm at mired.org Fri Feb 24 13:59:51 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 24 Feb 2012 07:59:51 -0500 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: <20120224075951.0ec1076d@bhuda.mired.org> On Fri, 24 Feb 2012 22:19:07 +1000 Nick Coghlan wrote: > On Fri, Feb 24, 2012 at 10:13 PM, Mike Meyer wrote: > > How about adding your new function to subprocess, except instead of > > passing them to the shell, they use shlex to parse them, then call > > Popen with the appropriate arguments? shlex might need some work for > > this. > > http://shell-command.readthedocs.org That says: This module aims to take over where subprocess leaves off, providing convenient, low-level access to the system shell, that automatically handles filenames and paths containing whitespace, as well as protecting naive code from shell injection vulnerabilities. That's a backwards approach to security. Rather than allowing anything and turning off what you know isn't safe, you should disallow everything and turn on what you know is safe. So rather than trying to make the strings you pass to the shell safe, you should parse them yourself and avoid calling the shell at all. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From techtonik at gmail.com Fri Feb 24 14:00:25 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 24 Feb 2012 15:00:25 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 2:50 PM, Masklinn wrote: > On 2012-02-24, at 12:12 , anatoly techtonik wrote: >> >> 1. they require try/catch > > No. Quote from the docs: "Run command with arguments. Wait for command to complete. 
If the return code was zero then return, otherwise raise CalledProcessError." http://docs.python.org/library/subprocess.html#subprocess.check_call >> 2. docs still refer Popen, which IS complicated > > True. > >> 3. contain shell FUD > > No, they contain warnings, against shell injection security > risks. Warnings are not FUD, it's not trying to sell some sort > of alternative it's just warning that `shell=True` is dangerous > on untrusted input. Warnings would be o.k. if they provided at least some guidelines where shell=True can be useful and where do you need to use Popen (or escaping). Without positive examples, and a little research to show attack vectors (so that users can analyse if they are applicable in their specific case) it is FUD IMO. >> 4. completely confuse users with stdout=PIPE or stderr=PIPE stuff >> >> http://docs.python.org/library/subprocess.html#subprocess.check_call > > On the one hand, these notes are a bit clumsy. On the other hand, > piping is a pretty fundamental concept of shell execution, I see > nothing wrong about saying that these functions *can't* be involved > in pipes. In fact stating it upfront looks sensible. The point is that it makes things more complicated than necessary. As a system programmer I feel confident about all this stuff, but users struggle to get it and they blame Python for complexity, and I have to agree. We can change that with high level API. The API that will automatically provide a rolling buffer for output if required to avoid locks (for the missing info as a drawback), and remove headache about "what to do about that?". >>> If you do "pip install shell-command" you can also access the >>> shell_call(), shell_check_call() and shell_output() functions I >>> currently plan to include in subprocess for 3.3. (I'm not sure which >>> versions of Python that module currently supports though - 2.7 and >>> 3.2, IIRC). 
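For concreteness, here is a minimal sketch of the runret()/runout() helpers being proposed (an editor's reconstruction under the shell=True, merged-output semantics described in the thread; these function names come from the proposal, not from any existing API):

```python
import subprocess

def runout(command):
    # Hypothetical helper sketching the proposal: run *command* through
    # the shell and return its combined stdout/stderr text, like the
    # console view a user sees; the exit status is ignored.
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return out.decode()

def runret(command):
    # Hypothetical helper: run *command* through the shell and return
    # only its exit status -- which is what subprocess.call() already
    # does once shell=True is passed.
    return subprocess.call(command, shell=True)

print(runout("echo out; echo err 1>&2"))  # both streams, merged
print(runret("exit 7"))                   # 7
```

(Assumes a POSIX shell for the redirection syntax.)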
>> >> Don't you find strange that shell utils module don't have any >> functions for the main shell function - command execution? > > What "shell utils" module? Subprocess has exactly that in `call` > and its variants. And "shutil" does not bill itself as a > "shell utils" module right now, its description is > "High-level file operations". > >> shutil.runret() ?- by definition has shell=True > > Great, so your recommendation is to be completely insecure by default? Not "by default" - only if it is impossible to make shutil.run*() functions more secure. They only make sense with shell=True, so my recommendation is to analyse security implications and *let* users make their grounded choice. Not frighten them, but making them think about security. The difference. User friendly docs for shutil.run*() docs should be structured as following: 1. you are free to use these functions 2. but know that they are insecure 3. in these cases: 3.1 3.2 3.3 4. if you think these cases won't apply to your project, then feel free to use, otherwise look at subprocess Of course, if some cases 3.1-3.3 have workarounds, they should be mentioned. >> That's a high-level _user_ function. When user runs command in shell >> he sees both. So, this 'shell util' is an analogue. > > That makes no sense, when users invoke shell commands programmatically > (which is what these APIs are about), they expect two semantically > different reporting streams to be split, not to be merged, > indistinguishable and unusable as a default. Dropping stderr on the > ground may be an acceptable default but munging stdout and stderr is not. Conflict point: Do users care about stdout/stderr when they invoke shell commands? Do users care about stdout/stderr when they use Python syntax for invoking shell commands? These functions is no a syntax sugar for developers (as the aforementioned "alternatives" from subprocess modules are). They are helper for users. 
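The stdout/stderr "conflict point" above can be shown concretely (editor's sketch): `check_output()` captures only stdout by default, and passing `stderr=subprocess.STDOUT` opts into the merged, console-like view:

```python
import subprocess
import sys

# A child process that writes one line to each stream.
child = [sys.executable, "-c",
         "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"]

# Default: only stdout is captured (stderr is discarded here just to
# keep the demo quiet; normally it would pass through to the parent).
only_out = subprocess.check_output(child, stderr=subprocess.DEVNULL).decode()

# Merged: stderr is redirected into the captured stdout stream,
# matching what a user sees on a console.
merged = subprocess.check_output(child, stderr=subprocess.STDOUT).decode()

print(only_out.splitlines())        # ['to stdout']
print(sorted(merged.splitlines()))  # ['to stderr', 'to stdout']
```

Sorting the merged lines sidesteps the fact that stdout is block-buffered when piped, so the two lines may arrive in either order.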
If you're a developer, who cares about pipes and needs programmatic acces - there is already a low level subprocess API with developer's defaults. If we speak about users: The standard shell console behaviour is to output both streams to the screen. That means that if I want to process this output, I don't know if it comes from stderr or stdout. So, if I want to process the output - I use Python to do this. If I know what I need the output from stderr only, I specify this explicitly. That's my default user story. >> The main purpose of this function is to be useful from Python console > > Then I'm not sure it belongs in subprocess or shutil, and users with that > need should probably be driven towards iPython which provides extensive > means of calling into the system shell in interactive sessions[0]. > bpython may also provide such facilities. I think it is a good idea to unify interface across interactive mode in Python. Hopefully shutil.copy and friends are already good enough so that they don't have reasons to reimplement them (and users to learn new commands). >> The main purpose of this function is to be >> useful from Python console, so the interface should be very simple to >> remember from the first try. Like runout(command, >> ret='stdout|stderr|both'). > > As opposed to `check_output(command)`? As opposed to check_output(command, *, stdin=None, stdout=None, stderr=None, shell=True) >> It won't be 'shell util' function anymore. If you're using shell >> execution functions, you already realize that will happen if your >> input parameters are not validated properly. > > This assertion demonstrably does not match reality, shell injections > (the very reason for this warning) would not exist if this were the > case. 
It is not assertion, it is a wannabe for shutil documentation to clarify shell injections problems to the level that allow users to make a reasonable choice, so if the user is "using shell execution functions he already realizes that will happen if his input parameters are not validated properly". -- anatoly t. From mwm at mired.org Fri Feb 24 14:09:31 2012 From: mwm at mired.org (Mike Meyer) Date: Fri, 24 Feb 2012 08:09:31 -0500 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: <20120224080931.6cad78db@bhuda.mired.org> On Fri, 24 Feb 2012 15:00:25 +0200 anatoly techtonik wrote: > On Fri, Feb 24, 2012 at 2:50 PM, Masklinn wrote: > > On 2012-02-24, at 12:12 , anatoly techtonik wrote: > >> 1. they require try/catch > > No. > Quote from the docs: > "Run command with arguments. Wait for command to complete. If the > return code was zero then return, otherwise raise CalledProcessError." > http://docs.python.org/library/subprocess.html#subprocess.check_call Quote from the docs: subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False) Run the command described by args. Wait for command to complete, then return the returncode attribute. No documented exceptions raised, so no need for try/catch. > >> 2. docs still refer Popen, which IS complicated > > True. > >> 3. contain shell FUD > > No, they contain warnings, against shell injection security > > risks. Warnings are not FUD, it's not trying to sell some sort > > of alternative it's just warning that `shell=True` is dangerous > > on untrusted input. > Warnings would be o.k. if they provided at least some guidelines where > shell=True can be useful and where do you need to use Popen (or > escaping). Without positive examples, and a little research to show > attack vectors (so that users can analyse if they are applicable in > their specific case) it is FUD IMO. 
You mean something like (quoting from the docs): Warning Executing shell commands that incorporate unsanitized input from an untrusted source makes a program vulnerable to shell injection, a serious security flaw which can result in arbitrary command execution. For this reason, the use of shell=True is strongly discouraged in cases where the command string is constructed from external input: http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From tshepang at gmail.com Fri Feb 24 14:11:06 2012 From: tshepang at gmail.com (Tshepang Lekhonkhobe) Date: Fri, 24 Feb 2012 15:11:06 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 13:31, Nick Coghlan wrote: > However, if you (or anyone else) wants to see Python's innate > capabilities improve in this area (and they really are subpar compared > to Perl 5, for example), your best bet is to download my Shell Command > module and give me feedback on any problems you find with it via the > BitBucket issue tracker. Just curious: If put in the stdlib, will the above-mentioned module bring CPython shell handling to Perl 5 level? From ncoghlan at gmail.com Fri Feb 24 15:11:57 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Feb 2012 00:11:57 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: <20120224075951.0ec1076d@bhuda.mired.org> References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <20120224075951.0ec1076d@bhuda.mired.org> Message-ID: On Fri, Feb 24, 2012 at 10:59 PM, Mike Meyer wrote: > That's a backwards approach to security. Rather than allowing anything > and turning off what you know isn't safe, you should disallow > everything and turn on what you know is safe. 
So rather than trying to > make the strings you pass to the shell safe, you should parse them > yourself and avoid calling the shell at all. Yes, that's why these are *separate functions* (each with "shell" in the name to make the shell's involvement rather hard to miss). Any application (rather than system administration script) that calls them with user provided data should immediately fail a security audit. The new APIs are intended specifically for system administrators that want the *system shell*, not a language level "cross platform" reinvention of it (and when it comes to shells, "cross platform" generally means, "POSIX even if you're on Windows, because we're not interesting in trying to reproduce Microsoft's idiosyncratic way of doing things"). The automatic quoting feature is mainly there to handle spaces in filenames - providing poorly structured programs with some minimal defence against shell injections is really just a bonus (although I admit I wasn't thinking about it that way when I wrote the current docs). As things stand, Python is a lousy language for system administration tasks - the standard APIs are either *very* low level (os.system()) or they're written almost entirely from the point of view of an application programmer (subprocess). Even when I *am* the administrator writing automation scripts for my own use, the subprocess library still keeps getting in the way, telling me it isn't safe to access my own shell. Normally, Python is pretty good about striking a sensible balance between "safe defaults" and "consenting adults", but it currently fails badly on this particular point. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From masklinn at masklinn.net Fri Feb 24 15:12:30 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 24 Feb 2012 15:12:30 +0100 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On 2012-02-24, at 14:00 , anatoly techtonik wrote: > On Fri, Feb 24, 2012 at 2:50 PM, Masklinn wrote: >> On 2012-02-24, at 12:12 , anatoly techtonik wrote: >>> >>> 1. they require try/catch >> >> No. > > Quote from the docs: > "Run command with arguments. Wait for command to complete. If the > return code was zero then return, otherwise raise CalledProcessError." > http://docs.python.org/library/subprocess.html#subprocess.check_call Yes. If you want to run commands you just do. try/except are only needed if you call commands which may fail and want to handle them without quitting the whole interpreter. And for your stated use case of interactively calling those functions, there is no need whatsoever for try/catch. And `subprocess.call` returns the status code, no exception ever thrown. >>> 3. contain shell FUD >> >> No, they contain warnings, against shell injection security >> risks. Warnings are not FUD, it's not trying to sell some sort >> of alternative it's just warning that `shell=True` is dangerous >> on untrusted input. > > Warnings would be o.k. if they provided at least some guidelines where > shell=True can be useful and where do you need to use Popen (or > escaping). Without positive examples, and a little research to show > attack vectors (so that users can analyse if they are applicable in > their specific case) it is FUD IMO. http://docs.python.org/library/subprocess.html#frequently-used-arguments >>> 4. completely confuse users with stdout=PIPE or stderr=PIPE stuff >>> >>> http://docs.python.org/library/subprocess.html#subprocess.check_call >> >> On the one hand, these notes are a bit clumsy. 
On the other hand, >> piping is a pretty fundamental concept of shell execution, I see >> nothing wrong about saying that these functions *can't* be involved >> in pipes. In fact stating it upfront looks sensible. > > The point is that it makes things more complicated than necessary. How? > As > a system programmer I feel confident about all this stuff You feel confident about something which does not work, without warning? >>>> If you do "pip install shell-command" you can also access the >>>> shell_call(), shell_check_call() and shell_output() functions I >>>> currently plan to include in subprocess for 3.3. (I'm not sure which >>>> versions of Python that module currently supports though - 2.7 and >>>> 3.2, IIRC). >>> >>> Don't you find strange that shell utils module don't have any >>> functions for the main shell function - command execution? >> >> What "shell utils" module? Subprocess has exactly that in `call` >> and its variants. And "shutil" does not bill itself as a >> "shell utils" module right now, its description is >> "High-level file operations". >> >>> shutil.runret() - by definition has shell=True >> >> Great, so your recommendation is to be completely insecure by default? > > Not "by default" Oh? Because this: > - only if it is impossible to make shutil.run*() > functions more secure. They only make sense with shell=True, so my > recommendation is to analyse security implications and *let* users > make their grounded choice. Not frighten them, but making them think > about security. > > The difference. User friendly docs for shutil.run*() docs should be > structured as following: > 1. you are free to use these functions > 2. but know that they are insecure > 3. in these cases: > 3.1 > 3.2 > 3.3 > 4. if you think these cases won't apply to your project, then feel > free to use, otherwise look at subprocess > > Of course, if some cases 3.1-3.3 have workarounds, they should be mentioned. 
states precisely that the function would be insecure by default, and would have caveat warnings in the docs. Which is the correct approach to security? never as far as I know. >>> The main purpose of this function is to be useful from Python console >> >> Then I'm not sure it belongs in subprocess or shutil, and users with that >> need should probably be driven towards iPython which provides extensive >> means of calling into the system shell in interactive sessions[0]. >> bpython may also provide such facilities. > > I think it is a good idea to unify interface across interactive mode > in Python. Considering IPython uses syntactic extentions (a "!" prefix) and does not require any importing effort currently, I doubt that's going to happen. >>> It won't be 'shell util' function anymore. If you're using shell >>> execution functions, you already realize that will happen if your >>> input parameters are not validated properly. >> >> This assertion demonstrably does not match reality, shell injections >> (the very reason for this warning) would not exist if this were the >> case. > > It is not assertion, You may want to look up the definition of that word, I did not remove any context, you asserted people using shell-exec functions are aware of the risks. Which is, as, factually wrong. > it is a wannabe for shutil documentation to > clarify shell injections problems to the level that allow users to > make a reasonable choice, so if the user is "using shell execution > functions he already realizes that will happen if his input parameters > are not validated properly". Not sufficient when the default behavior is unsafe (and broken), as numerous users *will* discover the function through third parties and may never come close to the caveats they *should* know for the default usage of the function. 
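[To make the dispute above concrete: the difference between the argument-list form and `shell=True` can be shown in a few lines. The hostile string is invented for illustration; `shlex.quote` is the Python 3.3 spelling — before that the same helper lived at `pipes.quote`.]

```python
import shlex
import subprocess

hostile = "innocent.txt; echo PWNED"  # attacker-controlled input

# Argument-list form: no shell is involved, so ';' is just data
# inside a single argument.
safe = subprocess.check_output(["echo", hostile])
# b'innocent.txt; echo PWNED\n'

# shell=True with naive interpolation: the shell runs a second command.
unsafe = subprocess.check_output("echo %s" % hostile, shell=True)
# b'innocent.txt\nPWNED\n'

# shell=True with quoting: the injection is neutralised.
quoted = subprocess.check_output("echo %s" % shlex.quote(hostile),
                                 shell=True)
# b'innocent.txt; echo PWNED\n'
```

[Nothing here removes the need for the warning in the docs; it just shows why the list form is the safe default and where quoting fits when the shell is genuinely wanted.]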
From ncoghlan at gmail.com Fri Feb 24 15:16:37 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Feb 2012 00:16:37 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: On Fri, Feb 24, 2012 at 11:11 PM, Tshepang Lekhonkhobe wrote: > Just curious: If put in the stdlib, will the above-mentioned module > bring CPython shell handling to Perl 5 level? Closer, but it's hard to match backticks and implicit interpolation for convenience (neither of which is going to happen in Python). However, the trade-off is that you get things like the ability to create pre-defined commands and easier invocation of shlex.quote when appropriate, along with exceptions for some errors that would otherwise pass silently. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From steve at pearwood.info Fri Feb 24 15:23:25 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 25 Feb 2012 01:23:25 +1100 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: <4F479D5D.6080102@pearwood.info> Nick Coghlan wrote: > On Fri, Feb 24, 2012 at 11:11 PM, Tshepang Lekhonkhobe > wrote: >> Just curious: If put in the stdlib, will the above-mentioned module >> bring CPython shell handling to Perl 5 level? > > Closer, but it's hard to match backticks and implicit interpolation > for convenience (neither of which is going to happen in Python). Anyone wanting to use Python as a system shell should look at IPython rather than the standard Python interactive interpreter. 
http://ipython.org/ipython-doc/dev/interactive/shell.html -- Steven From p.f.moore at gmail.com Fri Feb 24 15:32:05 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 24 Feb 2012 14:32:05 +0000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: On 24 February 2012 12:41, Nick Coghlan wrote: > Kenneth Reitz (author of "requests") has also spent some time > tinkering with subprocess invocation API design concepts: > https://github.com/kennethreitz/envoy Vinay Sanjip extended this with "sarge" (available on PyPI, IIRC). One key advantage of sarge for me is that it handles piping and redirection in a cross-platfom manner, rather than just deferring to the shell. (I think envoy does this too, but it's not very reliable on WIndows from what I recall of my brief experiments). Paul. From ncoghlan at gmail.com Fri Feb 24 15:32:23 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Feb 2012 00:32:23 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: <4F479D5D.6080102@pearwood.info> References: <4F479D5D.6080102@pearwood.info> Message-ID: On Sat, Feb 25, 2012 at 12:23 AM, Steven D'Aprano wrote: > Anyone wanting to use Python as a system shell should look at IPython rather > than the standard Python interactive interpreter. > > http://ipython.org/ipython-doc/dev/interactive/shell.html Sure, but unless we add ! statements to Python itself, that doesn't help with shell *scripting*. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ncoghlan at gmail.com Fri Feb 24 15:54:19 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 Feb 2012 00:54:19 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: On Sat, Feb 25, 2012 at 12:32 AM, Paul Moore wrote: > On 24 February 2012 12:41, Nick Coghlan wrote: >> Kenneth Reitz (author of "requests") has also spent some time >> tinkering with subprocess invocation API design concepts: >> https://github.com/kennethreitz/envoy > > Vinay Sanjip extended this with "sarge" (available on PyPI, IIRC). One > key advantage of sarge for me is that it handles piping and > redirection in a cross-platfom manner, rather than just deferring to > the shell. (I think envoy does this too, but it's not very reliable on > WIndows from what I recall of my brief experiments). Ah, I knew I'd seen a more polished version of that somewhere - Vinay posted about it a while back. As I see it, the two complement each other fairly nicely: shell_command is for direct access to the system shell. Appropriate when you're writing platform specific administration scripts. sarge is for cross platform scripting support. I'm actually not sure what this is useful for (since the default Windows shell has different spellings for so many basic commands and different syntax for environment variable expansion, it seems easier to just use the *actual* cross platform abstractions in the os module instead), but apparently it's good for something (or Vinay wouldn't have taken the time to write it). Of course, since it's just a convenience wrapper around Popen, ShellCommand does let you get pretty cute: >>> import sys >>> from functools import partial >>> from shell_command import ShellCommand >>> code = """ ... def f(): ... print("Python in a subprocess, easy as!") ... f() ... 
""" >>> PyCmd = partial(ShellCommand, executable=sys.executable) >>> PyCmd(code).shell_call() Python in a subprocess, easy as! 0 >>> x = PyCmd("print('Reporting for duty!')").shell_output() >>> x 'Reporting for duty!' (I didn't actually do a great deal in ShellCommand to enable that - it's just a matter of passing all the keyword args through to subprocess.Popen) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From simon.sapin at kozea.fr Fri Feb 24 16:04:06 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Fri, 24 Feb 2012 16:04:06 +0100 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: <4F47A6E6.5030409@kozea.fr> Le 24/02/2012 11:52, anatoly techtonik a ?crit : > runret(command) - run command through shell, return ret code > runout(command) - run command through shell, return output Hi, Brevity is nice, but I had no idea what either of these functions is supposed to do before reading these descriptions. The names could be more explicit. (By the way, I agree with other issues raised in this thread. This was only my first impression.) Regards, -- Simon Sapin From amcnabb at mcnabbs.org Fri Feb 24 18:24:54 2012 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Fri, 24 Feb 2012 10:24:54 -0700 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <20120224075951.0ec1076d@bhuda.mired.org> Message-ID: <20120224172454.GA3795@mcnabbs.org> On Sat, Feb 25, 2012 at 12:11:57AM +1000, Nick Coghlan wrote: > > As things stand, Python is a lousy language for system administration > tasks - the standard APIs are either *very* low level (os.system()) or > they're written almost entirely from the point of view of an > application programmer (subprocess). 
Even when I *am* the > administrator writing automation scripts for my own use, the > subprocess library still keeps getting in the way, telling me it isn't > safe to access my own shell. > > Normally, Python is pretty good about striking a sensible balance > between "safe defaults" and "consenting adults", but it currently > fails badly on this particular point. I disagree with this analysis. Python, with its fantastic subprocess module, is the only language I really trust for system administration tasks. Most languages provide "shell=True" as the default, making them extremely frustrating for system administration. Every time I choose to write a shell script instead of using Python, the lack of robustness makes me eventually regret it (and then rewrite in Python with subprocess). Setting "shell=True" (or equivalent) seems really convenient in the short term, but in the long term, scripts behave erratically and are vulnerable to attacks. The subprocess module (with "shell=False") is a wonderful balance between "safe defaults" and "consenting adults". -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From g.brandl at gmx.net Fri Feb 24 19:17:08 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 24 Feb 2012 19:17:08 +0100 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: Message-ID: Am 24.02.2012 12:12, schrieb anatoly techtonik: > On Fri, Feb 24, 2012 at 1:59 PM, Nick Coghlan wrote: >> On Fri, Feb 24, 2012 at 8:52 PM, anatoly techtonik wrote: >>> Hello, >>> >>> subprocess is low level, cryptic, does too much, with poor usability, >>> i.e. "don't make me think" is not about it. I don't know about you, >>> but I can hardly write any subprocess call without spending at least >>> 5-10 meditating over the documentation. 
>> >> I believe you'll find the simple convenience methods you are >> requesting already exist, in the form of subprocess.call(), >> subprocess.check_call() and subprocess.check_output(). The >> documentation has also been updated to emphasise these convenience >> functions over the Popen swiss army knife. > > I don't find the names of these functions more intuitive than Popen(). > I also think they far from being simple, because (in the order of appearance): > > 1. they require try/catch > 2. docs still refer Popen, which IS complicated > 3. contain shell FUD > 4. completely confuse users with stdout=PIPE or stderr=PIPE stuff > > http://docs.python.org/library/subprocess.html#subprocess.check_call > > My verdict - these fail to be simple, and require the same low-level > system knowledge as Popen() for confident use. And therefore they need to be completely replaced by something incompatible and in another module? Sorry, Anatoly, this is not how Python development happens. We usually work incrementally, improving on what we have rather than throwing all out the door. I think this is what rubs most people wrong about your posts: you invariably propose radical changes that invalidate all previous work in the related area. That's something apart from your style of expression, which was discussed recently. So here's some constructive advice: your point 1 was shown invalid. The points 2-4 are "merely" documentation related: how about you think about how to improve these docs to be less confusing? Georg From victor.stinner at haypocalc.com Sat Feb 25 03:19:19 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 25 Feb 2012 03:19:19 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F4756D8.6000109@kozea.fr> References: <4F4756D8.6000109@kozea.fr> Message-ID: >> Can't this also be done using metaclasses? Yes, my current proof-of-concept (PoC) uses a metadata with __prepare__. 
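[For readers who haven't used the hook: a toy `__prepare__` metaclass — not the actual PoC, all names invented — that records definition order, and then has to copy that information out, because `type.__new__` copies the namespace into a plain dict, which is exactly the limitation under discussion.]

```python
from collections import OrderedDict

class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # The class body is executed inside this mapping.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # type.__new__ has copied the namespace into a plain dict, so
        # any extra information (here: order) must be saved now or lost.
        cls._order = tuple(k for k in namespace
                           if not k.startswith('__'))
        return cls

class Point(metaclass=OrderedMeta):
    x = 1
    y = 2

print(Point._order)   # ('x', 'y')
```

[The resulting `Point.__dict__` is the copy, not the `OrderedDict` the body ran in — hence the request to support other dict types.]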
> The class body can be executed "in" any mapping. Then I?m not sure but it > looks like type.__new__ only takes a real dict. You have to do something in > your overridden __new__ to eg. keep the OrderedDict?s order. type.__new__ accepts any class inheriting from dict. My frozendict PoC inherits from dict, so it just works. But the point is that type.__new__ makes a copy of the dict and later it is no more possible to replace the dict. I would like to be able to choose the type of the __dict__ of my class. Victor From victor.stinner at haypocalc.com Sat Feb 25 03:29:12 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 25 Feb 2012 03:29:12 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: Hum, after thinking twice, using a "frozendict" for type.__dict__ is maybe overkill for my needs (and intrused as noticed Nick). Attached patch for Python 3.3 is a simpler approach: add __final__ special value for class. If this variable is present, the type is constant. Example: --- class Test: __final__=True x = 1 Test.x = 2 # raise a TypeError Test.new_attr = 1 # raise a TypeError del Test.x # raise a TypeError --- There are various ways to deny the modification of a class attribute, but I don't know how to block the removal of an attribute of the addition of a new attribute without my patch. -- My patch is just a proof-of-concept. For example, it doesn't ensure that values are read-only too. By the way, how can I check that "a value is constant"? Except builtin immutable types, I suppose that the only way is to call hash(obj) and excepts an expect a TypeError. Victor -------------- next part -------------- A non-text attachment was scrubbed... 
Name: type_final.patch Type: text/x-patch Size: 2345 bytes Desc: not available URL: From yselivanov.ml at gmail.com Sat Feb 25 05:58:25 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Fri, 24 Feb 2012 23:58:25 -0500 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: <7A1BA916-26A6-4FE5-9508-03C5D5F58AF0@gmail.com> On 2012-02-24, at 9:29 PM, Victor Stinner wrote: > Hum, after thinking twice, using a "frozendict" for type.__dict__ is > maybe overkill for my needs (and intrused as noticed Nick). Attached > patch for Python 3.3 is a simpler approach: add __final__ special > value for class. If this variable is present, the type is constant. > Example: > --- > class Test: > __final__=True > x = 1 -1 on this. The next move would be adding friend classes and protected methods ;) __setattr__ works perfectly for those purposes. Moreover, you can emulate your idea on unpatched python by using metaclasses. - Yury From ned at nedbatchelder.com Sat Feb 25 14:17:58 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sat, 25 Feb 2012 08:17:58 -0500 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: <4F48DF86.7060600@nedbatchelder.com> On 2/24/2012 9:29 PM, Victor Stinner wrote: > Hum, after thinking twice, using a "frozendict" for type.__dict__ is > maybe overkill for my needs (and intrused as noticed Nick). Attached > patch for Python 3.3 is a simpler approach: add __final__ special > value for class. If this variable is present, the type is constant. The Python answer for people who want read-only data structures has always been, "Don't modify them if you don't want to, and write docs that tell other people not to as well." What are you building that this answer isn't good enough? --Ned. From stephen at xemacs.org Sat Feb 25 15:03:11 2012 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Sat, 25 Feb 2012 23:03:11 +0900 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <20120224075951.0ec1076d@bhuda.mired.org> Message-ID: <87wr7bko2o.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > As things stand, Python is a lousy language for system administration > tasks Yeah, the worst possible sysadmin language except for all the others. AFAICT it more than holds its own with distro maintainers, no? From steve at pearwood.info Sun Feb 26 00:05:56 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 26 Feb 2012 10:05:56 +1100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F48DF86.7060600@nedbatchelder.com> References: <4F48DF86.7060600@nedbatchelder.com> Message-ID: <4F496954.30101@pearwood.info> Ned Batchelder wrote: > The Python answer for people who want read-only data structures has > always been, "Don't modify them if you don't want to, and write docs > that tell other people not to as well." What are you building that this > answer isn't good enough? That is silly. That alleged "Python answer" is like telling people that they don't need test frameworks or debuggers because the "Python answer" for people wanting to debug their code is not to write buggy code in the first place. Python has read-only data structures: tuple, frozenset, str, etc. If you ask yourself why Python has immutable types, it might give you a clue why Victor wants the ability to create other immutable types like frozendict, and why "don't modify them" is not a good enough answer: - Immutable types can be used as keys in dicts. - Immutable types protect you from errors. While you might intend not to modify a data structure, bugs do happen. 
Immutability gives you an immediate exception at the exact time and place you attempt to modify the data structure instead of at some arbitrary time later far from the actual bug. Python has excellent support for read-only data structures, so long as you write them in C. -- Steven From masklinn at masklinn.net Sun Feb 26 00:32:52 2012 From: masklinn at masklinn.net (Masklinn) Date: Sun, 26 Feb 2012 00:32:52 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F496954.30101@pearwood.info> References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> Message-ID: On 2012-02-26, at 00:05 , Steven D'Aprano wrote: > - Immutable types can be used as keys in dicts. *technically*, you can use mutable types as dict keys if you define their __hash__ no? That is of course a bad idea when the instances are *expected* to be modified, but it should "work". > - Immutable types protect you from errors. While you might intend not > to modify a data structure, bugs do happen. Immutables are also inherently thread-safe (since thread safety is about shared state, and shared immutables are not state). Which is a nice guarantee. > Python has excellent support for read-only data structures, so long as you write them in C. There's also good support of the "consenting adults" variety (use _-prefixed attributes for the actual state and expose what needs to be exposed via properties and methods). That can be simplified with a custom descriptor type which can only be set once (similar to java's `final`), it would be set in the type's constructor and never re-set from this. 
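[The set-once descriptor mentioned above can be sketched in a few lines — a toy version, with all names invented.]

```python
class Final:
    """Data descriptor allowing exactly one assignment per instance."""

    def __init__(self, name):
        self._slot = '_final_' + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self._slot)

    def __set__(self, obj, value):
        if hasattr(obj, self._slot):
            raise AttributeError("attribute is final")
        # Store in the instance dict under a private name.
        object.__setattr__(obj, self._slot, value)

class Point:
    x = Final('x')

    def __init__(self, x):
        self.x = x          # first (and only) assignment succeeds

p = Point(3)
print(p.x)                  # 3
try:
    p.x = 4
except AttributeError:
    print("re-assignment rejected")
```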
From ncoghlan at gmail.com Sun Feb 26 07:27:19 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 16:27:19 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: <87wr7bko2o.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <20120224075951.0ec1076d@bhuda.mired.org> <87wr7bko2o.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, Feb 26, 2012 at 12:03 AM, Stephen J. Turnbull wrote: > Nick Coghlan writes: > > ?> As things stand, Python is a lousy language for system administration > ?> tasks > > Yeah, the worst possible sysadmin language except for all the others. > AFAICT it more than holds its own with distro maintainers, no? For applications where correctness in all circumstances is the dominant criterion? Sure. For throwaway scripts, though, most of the Linux sysadmins I know just use shell scripts or Perl. For the devops (and deployment automation in general) crowd, there's no real Python-based competitor to Chef and Puppet (both Ruby based) (my understanding is that the Python-based Fabric doesn't play in *quite* the same space as the other two). As things currently stand, Python deliberately makes it hard to say "I want my individual commands to be shell commands, but I also want Python's superior flow control constructs to decide which shell commands to run". For an application, that's a good thing. For personal automation, it's not. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From aquavitae69 at gmail.com Sun Feb 26 09:07:39 2012 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 26 Feb 2012 10:07:39 +0200 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> Message-ID: On Feb 26, 2012 1:35 AM, "Masklinn" wrote: > > On 2012-02-26, at 00:05 , Steven D'Aprano wrote: > > - Immutable types can be used as keys in dicts. > > *technically*, you can use mutable types as dict keys if you define > their __hash__ no? That is of course a bad idea when the instances > are *expected* to be modified, but it should "work". I wouldn't say this is necessarily a bad thing at all. It just depends what defines the object. If an instance represent a specific object (e.g. a database record) you wouldn't expect the hash to change if you modified an attribute of it, since the instance still represents the same object. > > > - Immutable types protect you from errors. While you might intend not > > to modify a data structure, bugs do happen. > > Immutables are also inherently thread-safe (since thread safety is about > shared state, and shared immutables are not state). Which is a nice > guarantee. > > > Python has excellent support for read-only data structures, so long as you write them in C. > > There's also good support of the "consenting adults" variety (use > _-prefixed attributes for the actual state and expose what needs to be > exposed via properties and methods). That can be simplified with a > custom descriptor type which can only be set once (similar to java's > `final`), it would be set in the type's constructor and never re-set > from this. 
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas Maybe I'm missing something here, but what's wrong with just using __getattr__, __setattr__ and __delattr__ to restrict access? -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon.sapin at kozea.fr Sun Feb 26 09:26:50 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Sun, 26 Feb 2012 09:26:50 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: <4F49ECCA.7060802@kozea.fr> Le 24/02/2012 00:34, Victor Stinner a ?crit : > I'm trying to create read-only objects using a "frozendict" class. > frozendict is a read-only dict. I would like to use frozendict for the > class dict using a metaclass, but type.__new__() expects a dict and > creates a copy of the input dict. > > I would be nice to support custom dict type: OrderedDict and > frozendict for example. It looks possible to patch CPython to > implement this feature, but first I would like first to know your > opinion about this idea:-) Hi, Combining ideas from other messages in this thread: would this work? 1. Inherit from frozendict 2. Define a __getattr__ that defers to frozendict.__getitem__ 3. Use an empty __slots__ so that there is no "normal" instance attribute. Thinking about it a bit more, it?s probably the same as having a normal __dict__ and raising in __setattr__ and __delattr__. Isn?t this how you implement frozendict? (Raise in __setitem__, __delitem__, update, etc.) 
Regards, -- Simon Sapin From eliben at gmail.com Sun Feb 26 09:53:28 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 26 Feb 2012 10:53:28 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <20120224075951.0ec1076d@bhuda.mired.org> <87wr7bko2o.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, Feb 26, 2012 at 08:27, Nick Coghlan wrote: > On Sun, Feb 26, 2012 at 12:03 AM, Stephen J. Turnbull > wrote: > > Nick Coghlan writes: > > > > > As things stand, Python is a lousy language for system administration > > > tasks > > > > Yeah, the worst possible sysadmin language except for all the others. > > AFAICT it more than holds its own with distro maintainers, no? > > For applications where correctness in all circumstances is the > dominant criterion? Sure. > > For throwaway scripts, though, most of the Linux sysadmins I know just > use shell scripts or Perl. For the devops (and deployment automation > in general) crowd, there's no real Python-based competitor to Chef and > Puppet (both Ruby based) (my understanding is that the Python-based > Fabric doesn't play in *quite* the same space as the other two). > > As things currently stand, Python deliberately makes it hard to say "I > want my individual commands to be shell commands, but I also want > Python's superior flow control constructs to decide which shell > commands to run". For an application, that's a good thing. For > personal automation, it's not. > Personally I find Python just find for all kinds of automation, including bash/Perl replacement. Yes, some things may be a few characters more to type than in Perl, but I'm happy to have all the other Python features and libraries in my arsenal. Sysadmins use what they learned, and it also depends on culture. Some places do use Python for sysadmin stuff too. 
The Chef/Puppet/Fabric example is a good one to support this point - Ruby, like Python, is also more a dev language than a sysadmin language, and yet Chef & Puppet are written in Ruby and not Perl. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From aquavitae69 at gmail.com Sun Feb 26 10:42:54 2012 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 26 Feb 2012 11:42:54 +0200 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F49ECCA.7060802@kozea.fr> References: <4F49ECCA.7060802@kozea.fr> Message-ID: On Feb 26, 2012 10:27 AM, "Simon Sapin" wrote: > Le 24/02/2012 00:34, Victor Stinner a écrit : > >> I'm trying to create read-only objects using a "frozendict" class. >> frozendict is a read-only dict. I would like to use frozendict for the >> class dict using a metaclass, but type.__new__() expects a dict and >> creates a copy of the input dict. >> >> It would be nice to support custom dict types: OrderedDict and >> frozendict for example. It looks possible to patch CPython to >> implement this feature, but first I would like to know your >> opinion about this idea :-) >> > > Hi, > > Combining ideas from other messages in this thread: would this work? > > 1. Inherit from frozendict > 2. Define a __getattr__ that defers to frozendict.__getitem__ > 3. Use an empty __slots__ so that there is no "normal" instance attribute. > > Thinking about it a bit more, it's probably the same as having a normal > __dict__ and raising in __setattr__ and __delattr__. Isn't this how you > implement frozendict? (Raise in __setitem__, __delitem__, update, etc.) > > Regards, > -- > Simon Sapin > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas Using frozendict, and especially inheriting from it, sounds unnecessarily complicated to me.
A simple class which doesn't allow changes to instance attributes could be implemented something like this:

    class Three_inst:

        @property
        def value(self):
            return 3

        def __setattr__(self, attr, value):
            raise AttributeError

        def __delattr__(self, attr):
            raise AttributeError

Or, if you're worried about changes to the class attributes, you could do basically the same thing using a metaclass (python 2.7 syntax):

    class FinalMeta(type):

        def __setattr__(cls, attr, value):
            if attr in cls.__dict__ or '__done__' in cls.__dict__:
                raise AttributeError
            else:
                type.__setattr__(cls, attr, value)

        def __delattr__(cls, attr):
            raise AttributeError

    class Three:
        __metaclass__ = FinalMeta
        value = 3
        __done__ = True  # There may be a neater way to do this...

Each of the following examples will fail:

    >>> Three.another_value = 4
    >>> Three.value = 4
    >>> del Three.value
    >>> three = Three(); three.value = 4

Actually, I think this is quite a nice illustration of what can be done with metaclasses! David -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Feb 26 12:46:33 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 Feb 2012 21:46:33 +1000 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <20120224075951.0ec1076d@bhuda.mired.org> <87wr7bko2o.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, Feb 26, 2012 at 6:53 PM, Eli Bendersky wrote: > The Chef/Puppet/Fabric example is a good one to support this point - Ruby, > like Python, is also more a dev language than a sysadmin language, and yet > Chef & Puppet are written in Ruby and not Perl. For the key operation I'm talking about here, though, Ruby works the same way Perl does: it supports shell command execution via backtick quoted strings with implicit string interpolation.
Is it really that hard to admit that there are some tasks that other languages are currently just plain better for than Python, and perhaps we can learn something useful from that? (And no, I'm not suggesting we adopt backtick command execution or implicit string interpolation. A convenience API that combines shell invocation, explicit string interpolation and whitespace and shell metacharacter quoting, though, *that* I support). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From storchaka at gmail.com Sun Feb 26 12:49:00 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 26 Feb 2012 13:49:00 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: 24.02.12 14:41, Nick Coghlan wrote: > On Fri, Feb 24, 2012 at 10:13 PM, Mike Meyer wrote: >> Oddly enough, I read the Julia docs on external commands between my >> first answer and your reply, and their solution is both as simple as >> what you want, and safe. Yes, I want this in Python: readall(cmd('cut -d: -f3 $file', file='/etc/passwd') | cmd('sort -n') | cmd('tail -n5')) or cmd('cut', '-d:', '-f3', '/etc/passwd').pipe('sort', '-n').pipe('tail', '-n5').readlines() or something similar. > That *is* rather nice, although they never get around to actually > explaining *how* to capture the output from the child processes > (http://julialang.org/manual/running-external-programs/, for anyone > else that's interested). https://github.com/JuliaLang/julia/blob/10aabddc3834223568a87721149d05765e7e9997/j/process.j See readall and each_line.
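For concreteness, the kind of convenience API Nick describes — shell invocation plus explicit interpolation plus automatic metacharacter quoting — can be sketched with nothing but the stdlib. The sh() helper below is invented for illustration, not an existing API (shlex.quote is Python 3.3+; on the Pythons of 2012 it lived at pipes.quote, and subprocess.run arrived in 3.5):

```python
import shlex
import subprocess

def sh(template, **kwargs):
    """Hypothetical helper: explicit string interpolation where every
    substituted value is shell-quoted before the command runs."""
    quoted = {k: shlex.quote(str(v)) for k, v in kwargs.items()}
    command = template.format(**quoted)
    return subprocess.run(command, shell=True, check=True,
                          capture_output=True, text=True).stdout

# A hostile value is passed as data, not interpreted as shell syntax:
out = sh('printf %s {name}', name='; rm -rf /')
print(out)  # ; rm -rf /
```

Flow control stays in Python, the individual commands stay shell commands, and injection through interpolated values is off the table because every value is quoted before the shell ever sees it.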
From anacrolix at gmail.com Sun Feb 26 12:59:17 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 26 Feb 2012 19:59:17 +0800 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: I strongly suspect such 3rd party library exists. On Feb 26, 2012 7:49 PM, "Serhiy Storchaka" wrote: > 24.02.12 14:41, Nick Coghlan wrote: > > On Fri, Feb 24, 2012 at 10:13 PM, Mike Meyer wrote: > >> Oddly enough, I read the Julia docs on external commands between my > >> first answer and your reply, and their solution is both as simple as > >> what you want, and safe. > > Yes, I want this in Python: > > readall(cmd('cut -d: -f3 $file', file='/etc/passwd') | cmd('sort -n') | > cmd('tail -n5')) > > or > > cmd('cut', '-d:', '-f3', '/etc/passwd').pipe('sort', '-n').pipe('tail', > '-n5').readlines() > > or something similar. > > > That *is* rather nice, although they never get around to actually > > explaining *how* to capture the output from the child processes > > (http://julialang.org/manual/running-external-programs/, for anyone > > else that's interested). > > > https://github.com/JuliaLang/julia/blob/10aabddc3834223568a87721149d05765e7e9997/j/process.j > See readall and each_line. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sun Feb 26 13:58:01 2012 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Sun, 26 Feb 2012 21:58:01 +0900 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <20120224075951.0ec1076d@bhuda.mired.org> <87wr7bko2o.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87r4xhlpk6.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > For throwaway scripts, though, most of the Linux sysadmins I know just > use shell scripts Sure, but it's really hard to beat *sh plus GNU readline for brevity in using recent history to create a script. At some point, we "just don't want to go there." As for the Perl arm of your disjunction, do those sysadmins use Python for anything? There's a lot of history in the Linux sysadmin community favoring Perl. (Although the l33t Perlmonger I know is a Ruby hacker now....) > For the devops (and deployment automation in general) crowd, > there's no real Python-based competitor to Chef and Puppet (both > Ruby based) (my understanding is that the Python-based Fabric > doesn't play in *quite* the same space as the other two). No, there isn't, but creating one could be rather hard, as Puppet and Chef both make heavy use of Ruby features conducive to writing DSLs. Note that although Fabric plays in a distinct space, its implementation looks like Chef, far more so than Puppet (ie, you write Fabric configs in Python, and Chef configs in a (domain-specific extension of) Ruby, while Puppet is a restricted DSL). One of the Puppet rationales for using Puppet rather than Chef is telling here: 3. Choice of configuration languages The language which Puppet uses to configure servers is designed specifically for the task: it is a domain language optimised for the task of describing and linking resources such as users and files. Chef uses an extension of the Ruby language. 
Ruby is a good general-purpose programming language, but it is not designed for configuration management - and learning Ruby is a lot harder than learning Puppet's language. Some people think that Chef's lack of a special-purpose language is an advantage. "You get the power of Ruby for free," they argue. Unfortunately, there are many things about Ruby which aren't so intuitive, especially for beginners, and there is a large and complex syntax that has to be mastered. -- http://bitfieldconsulting.com/puppet-vs-chef That applies equally well to "DSL"s that are extensions of (function calls in) Python. Making it easier to write DSLs in Python has come up many times, and so far the answer has always been "if you want to write a DSL in Python, write a DSL in Python; but you can't, and won't soon be able to, run it directly in the Python interpreter." DSLs have been done; there's configparser for one, argparse and ancestors, and things like gitosis. But it's hard to see Python beating Ruby at that game. > As things currently stand, Python deliberately makes it hard to say > "I want my individual commands to be shell commands, but I also > want Python's superior flow control constructs to decide which > shell commands to run". I don't think that's ever been my motivation for writing a script in Python. Really, is Python's for loop so much better than bash's? For me, it's data structures: something where my sed fu isn't enough, or the content has to persist longer than into the next pipe. And quoting. Shell quoting is such a pain, especially if there's an ssh remote command in there somewhere. This is not to say I'm opposed to making it easier to use Python as a command shell in principle, but I have to wonder whether it can be done as easily as all that, and without sacrificing some of the things we've insisted on in past discussions.
On the other hand, for things where avoiding shell makes sense, Python is one of my tools of choice (the other being Emacs Lisp, where I want integration with my editor and don't much care about performance). From storchaka at gmail.com Sun Feb 26 14:07:35 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 26 Feb 2012 15:07:35 +0200 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: 26.02.12 13:59, Matt Joiner wrote: > I strongly suspect such 3rd party library exists. I also hope this. And such a library would be a better candidate for inclusion in the stdlib. From victor.stinner at haypocalc.com Sun Feb 26 14:56:08 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sun, 26 Feb 2012 14:56:08 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F49ECCA.7060802@kozea.fr> Message-ID: type.__setattr__(Three, 'value', 4) changes the value. Victor

> class FinalMeta(type):
>
>     def __setattr__(cls, attr, value):
>         if attr in cls.__dict__ or '__done__' in cls.__dict__:
>             raise AttributeError
>         else:
>             type.__setattr__(cls, attr, value)
>
>     def __delattr__(cls, attr):
>         raise AttributeError
>
>
> class Three:
>     __metaclass__ = FinalMeta
>     value = 3
>     __done__ = True  # There may be a neater way to do this...
>
> Each of the following examples will fail:
>
>>>> Three.another_value = 4
>>>> Three.value = 4
>>>> del Three.value
>>>> three = Three(); three.value = 4

From stephen at xemacs.org Sun Feb 26 15:02:42 2012 From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Sun, 26 Feb 2012 23:02:42 +0900 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> Message-ID: <87pqd1lmkd.fsf@uwakimon.sk.tsukuba.ac.jp> Serhiy Storchaka writes: > Yes, I want this in Python: > > readall(cmd('cut -d: -f3 $file', file='/etc/passwd') | cmd('sort -n') | cmd('tail -n5')) > > or > > cmd('cut', '-d:', '-f3', '/etc/passwd').pipe('sort', '-n').pipe('tail', '-n5').readlines() > > or something similar. But you can already do sorted([l.split(":")[2] for l in open('/etc/passwd')])[-5:] (and I don't really care whether you were being ironic or not; either way that one-liner is an answer). Actually, I wrote that off the top of my head and it almost worked. The problem I ran into is that I'm on a Mac, and there was a bunch of cruft comments (which don't contain any colons) in the beginning of the file. So I got a list index out of range when accessing the split line. In this case, cut | sort | tail would produce the expected output. But cut | sort | head would just produce garbage (the leading comments in sorted order). So the failure modes differ. It might be useful for people used to shell failure modes. From aquavitae69 at gmail.com Sun Feb 26 15:30:09 2012 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 26 Feb 2012 16:30:09 +0200 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F49ECCA.7060802@kozea.fr> Message-ID: Ah, I think I misunderstood exactly what you were trying to achieve. To me, that is essentially immutable - if I ever found myself using type.__setattr__ to change a variable I'd have to seriously question what I was doing! But that would be a way around it, and I don't think it would be possible to implement it fully in python. 
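The bypass Victor pointed out and David is replying to can be shown directly. This uses Python 3 spelling of the FinalMeta example from earlier in the thread, trimmed to the relevant part:

```python
class FinalMeta(type):
    def __setattr__(cls, attr, value):
        raise AttributeError("class is read-only")

class Three(metaclass=FinalMeta):
    value = 3

try:
    Three.value = 4            # routed through FinalMeta.__setattr__
except AttributeError:
    print("blocked")           # blocked

type.__setattr__(Three, 'value', 4)   # calls the base implementation
print(Three.value)                    # 4 -- the metaclass guard is skipped
```

Normal attribute assignment on the class goes through the metaclass, but calling type.__setattr__ directly dispatches to type's own slot and never consults FinalMeta — which is exactly the Class._Class__var-style loophole David compares it to.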
On the other hand, the same argument could be made for the introduction of private variables; Class.__var is not private because it can be changed through Class._Class__var. I'd also consider having to do this to be indicative of a design flaw in my code. On Sun, Feb 26, 2012 at 3:56 PM, Victor Stinner < victor.stinner at haypocalc.com> wrote: > type.__setattr__(Three, 'value', 4) changes the value. > > Victor > > > class FinalMeta(type): > > > > def __setattr__(cls, attr, value): > > if attr in cls.__dict__ or '__done__' in cls.__dict__: > > raise AttributeError > > else: > > type.__setattr__(cls, attr, value) > > > > def __delattr__(cls, attr): > > raise AttributeError > > > > > > class Three: > > __metaclass__ = FinalMeta > > value = 3 > > __done__ = True # There may be a neater way to do this... > > > > Each of the following examples will fail: > > > >>>> Three.another_value = 4 > >>>> Three.value = 4 > >>>> del Three.value > >>>> three = Three(); three.value = 4 > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Sun Feb 26 16:10:00 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Sun, 26 Feb 2012 23:10:00 +0800 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: <87pqd1lmkd.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <87pqd1lmkd.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: I did recently see "pyp" touted as a Python-like sed/awk. I guess this stuff always comes down to what you're used to. To me it is insane to be still using Perl yet I prefer perl regex over posix anyday :) On Feb 26, 2012 10:03 PM, "Stephen J. 
Turnbull" wrote: > Serhiy Storchaka writes: > > > Yes, I want this in Python: > > > > readall(cmd('cut -d: -f3 $file', file='/etc/passwd') | cmd('sort -n') | > cmd('tail -n5')) > > > > or > > > > cmd('cut', '-d:', '-f3', '/etc/passwd').pipe('sort', '-n').pipe('tail', > '-n5').readlines() > > > > or something similar. > > But you can already do > > sorted([l.split(":")[2] for l in open('/etc/passwd')])[-5:] > > (and I don't really care whether you were being ironic or not; either > way that one-liner is an answer). > > Actually, I wrote that off the top of my head and it almost worked. > The problem I ran into is that I'm on a Mac, and there was a bunch of > cruft comments (which don't contain any colons) in the beginning of > the file. So I got a list index out of range when accessing the split > line. In this case, cut | sort | tail would produce the expected > output. But cut | sort | head would just produce garbage (the leading > comments in sorted order). So the failure modes differ. It might be > useful for people used to shell failure modes. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mwm at mired.org Sun Feb 26 16:26:27 2012 From: mwm at mired.org (Mike Meyer) Date: Sun, 26 Feb 2012 10:26:27 -0500 Subject: [Python-ideas] shutil.runret and shutil.runout In-Reply-To: References: <20120224062525.0e168a39@bhuda.mired.org> <20120224071325.08f07d32@bhuda.mired.org> <20120224075951.0ec1076d@bhuda.mired.org> <87wr7bko2o.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20120226102627.56a86232@bhuda.mired.org> On Sun, 26 Feb 2012 21:46:33 +1000 Nick Coghlan wrote: > On Sun, Feb 26, 2012 at 6:53 PM, Eli Bendersky wrote: > > The Chef/Puppet/Fabric example is a good one to support this point - Ruby, > > like Python, is also more a dev language than a sysadmin language, and yet > > Chef & Puppet are written in Ruby and not Perl. > For the key operation I'm talking about here, though, Ruby works the > same way Perl does: it supports shell command execution via backtick > quoted strings with implicit string interpolation. Does Ruby also have something like Perl's -t/-T options and supporting functions? > Is it really that hard to admit that there are some tasks that other > languages are currently just plain better for than Python, and perhaps > we can learn something useful from that? The key word is "perhaps". There are some things other languages are better at than Python, and Python is the better off for it. I think that "supporting code injection attacks" is one such feature. > (And no, I'm not suggesting > we adopt backtick command execution or implicit string interpolation. > A convenience API that combines shell invocation, explicit string > interpolation and whitespace and shell metacharacter quoting, though, > *that* I support). I'm only willing to support it if it's at least as safe as Perl. Meaning that either 1) It doesn't really invoke the shell, but provides those features explicitly, or 2) it throws errors if passed tainted strings. On the other hand, my support (or lack of it) isn't worth very much.
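Mike's option 1 is already expressible with the stdlib: subprocess never touches a shell when given argument lists, so every argument is passed as data. Here is Serhiy's cut | sort | tail pipeline from earlier in the thread written that way — a sketch assuming a Unix system with /etc/passwd:

```python
import subprocess

# cut -d: -f3 /etc/passwd | sort -n | tail -n5, with no shell involved
cut = subprocess.Popen(['cut', '-d:', '-f3', '/etc/passwd'],
                       stdout=subprocess.PIPE)
sort = subprocess.Popen(['sort', '-n'], stdin=cut.stdout,
                        stdout=subprocess.PIPE)
tail = subprocess.Popen(['tail', '-n5'], stdin=sort.stdout,
                        stdout=subprocess.PIPE, text=True)
cut.stdout.close()    # so cut sees SIGPIPE if a later stage exits early
sort.stdout.close()
output = tail.communicate()[0]
print(output)
```

Since no string is ever parsed by /bin/sh there is nothing to quote and nothing to taint-check; the cost is exactly the verbosity that the convenience-API proposals in this thread are trying to remove.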
http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From simon.sapin at kozea.fr Sun Feb 26 17:51:08 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Sun, 26 Feb 2012 17:51:08 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F49ECCA.7060802@kozea.fr> Message-ID: <4F4A62FC.6080403@kozea.fr> Le 26/02/2012 14:56, Victor Stinner a écrit : > type.__setattr__(Three, 'value', 4) changes the value. Then there is the question of how much craziness you want to protect from. Nothing is ever truly private or immutable in CPython, given enough motivation and ctypes. See for example Armin Ronacher's "Bad Ideas" presentation, especially the "Interpreter Warfare" part near the end: https://ep2012.europython.eu/media/conference/slides/5-years-of-bad-ideas.pdf I think that the code patching tracebacks is in production in Jinja2. I'm sure frozensets could be modified in a similar way. The point is: immutable data types protect against mistakes more than someone truly determined to break the rules. With that in mind, I think that having to go through __setattr__ is good enough to make sure it's not accidental. Regards, -- Simon Sapin From aquavitae69 at gmail.com Sun Feb 26 20:02:21 2012 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 26 Feb 2012 21:02:21 +0200 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F4A62FC.6080403@kozea.fr> References: <4F49ECCA.7060802@kozea.fr> <4F4A62FC.6080403@kozea.fr> Message-ID: My point exactly! On Feb 26, 2012 6:51 PM, "Simon Sapin" wrote: > Le 26/02/2012 14:56, Victor Stinner a écrit : >> type.__setattr__(Three, 'value', 4) changes the value. >> > > Then there is the question of how much craziness you want to protect from. > Nothing is ever truly private or immutable in CPython, given enough > motivation and ctypes.
> > See for example Armin Ronacher's "Bad Ideas" presentation, especially the > "Interpreter Warfare" part near the end: > > https://ep2012.europython.eu/**media/conference/slides/5-** > years-of-bad-ideas.pdf > > I think that the code patching tracebacks is in production in Jinja2. I'm > sure frozensets could be modified in a similar way. > > The point is: immutable data types protect against mistakes more than > someone truly determined to break the rules. With that in mind, I think > that having to go through __setattr__ is good enough to make sure it's not > accidental. > > Regards, > -- > Simon Sapin > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at haypocalc.com Mon Feb 27 10:54:45 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 27 Feb 2012 10:54:45 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F4A62FC.6080403@kozea.fr> References: <4F49ECCA.7060802@kozea.fr> <4F4A62FC.6080403@kozea.fr> Message-ID: >> type.__setattr__(Three, 'value', 4) changes the value. > > Then there is the question of how much craziness you want to protect from. > Nothing is ever truly private or immutable in CPython, given enough > motivation and ctypes. My pysandbox project uses various hacks to secure Python. The attacker doesn't care about writing Pythonic code; (s)he just wants to break the sandbox :-) See my pysandbox project for more information: https://github.com/haypo/pysandbox/ See sandbox/test/ if you like weird code :-) Tests ensure that the sandbox is safe. Constant types would also help optimization, especially PyPy JIT.
Victor From dreamingforward at gmail.com Mon Feb 27 19:32:05 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 11:32:05 -0700 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> Message-ID: On Sat, Feb 25, 2012 at 4:05 PM, Steven D'Aprano wrote: > Ned Batchelder wrote: > >> The Python answer for people who want read-only data structures has always >> been, "Don't modify them if you don't want to, and write docs that tell >> other people not to as well." What are you building that this answer isn't >> good enough? > > That is silly. That alleged "Python answer" is like telling people that they > don't need test frameworks or debuggers because the "Python answer" for > people wanting to debug their code is not to write buggy code in the first > place. Perhaps a good middle ground for this is to NOT tie it to particular data structures (like tuples vs lists), but abstract it by making an "immutable bit" that is part of the basic Object type. This doesn't give complete security, but does *force* a choice by a human agent to deliberately modify data. (This was actually going to be implemented in a sort of python fork several years ago.) There could be a "mutable?" check that returns True or False.
mark Santa Fe, NM From rob.cliffe at btinternet.com Mon Feb 27 19:35:57 2012 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Mon, 27 Feb 2012 18:35:57 +0000 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> Message-ID: <4F4BCD0D.8040706@btinternet.com> On 27/02/2012 18:32, Mark Janssen wrote: > On Sat, Feb 25, 2012 at 4:05 PM, Steven D'Aprano wrote: >> Ned Batchelder wrote: >> >>> The Python answer for people who want read-only data structures has always >>> been, "Don't modify them if you don't want to, and write docs that tell >>> other people not to as well." What are you building that this answer isn't >>> good enough? >> That is silly. That alleged "Python answer" is like telling people that they >> don't need test frameworks or debuggers because the "Python answer" for >> people wanting to debug their code is not to write buggy code in the first >> place. > Perhaps a good middle ground for this is to NOT tie it to particular > data structures (like tuples vs lists), but abstract it by making an > "immutable bit" that is part of the basic Object type. This doesn't > give complete security, but does *force* a choice by a human agent to > deliberately modify data. (This was actually going to be implemented > in a sort of python fork several years ago.) There could be a > "mutable?" check that returns True or False. > > mark > Santa Fe, NM > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > I suggested a "mutable" attribute some time ago. This could lead to finally doing away with one of Python's FAQs: Why does python have lists AND tuples? They could be unified into a single type. Rob Cliffe. 
From ethan at stoneleaf.us Mon Feb 27 19:46:49 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 10:46:49 -0800 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F4BCD0D.8040706@btinternet.com> References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: <4F4BCF99.6030508@stoneleaf.us> > I suggested a "mutable" attribute some time ago. > This could lead to finally doing away with one of Python's FAQs: Why > does python have lists AND tuples? They could be unified into a single > type. If a tuple is just an immutable list it will become worse with regards to performance and memory space. ~Ethan~ From dreamingforward at gmail.com Mon Feb 27 19:45:45 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 11:45:45 -0700 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F4BCD0D.8040706@btinternet.com> References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: On Mon, Feb 27, 2012 at 11:35 AM, Rob Cliffe wrote: > I suggested a "mutable" attribute some time ago. > This could lead to finally doing away with one of Python's FAQs: Why does > python have lists AND tuples? They could be unified into a single type. > Rob Cliffe. Yeah, that would be cool. It would force (ok, *allow*) the documenting of any non-mutable attributes (i.e. when they're mutable, and why they're being set immutable, etc.). There's an interesting question, then: should the mutable bit be on the Object itself (the whole type) or in each instance....? There's probably no "provable" or abstract answer to this, but rather just an organization principle to the language....
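What Mark's per-instance variant of the "immutable bit" might look like as plain Python — a sketch only, with invented names (freeze, is_mutable), not a proposal-level design:

```python
class Freezable:
    """Instances start mutable; freeze() flips a one-way immutable bit."""
    _frozen = False          # class-level default, shadowed per instance

    def freeze(self):
        object.__setattr__(self, '_frozen', True)

    def is_mutable(self):    # the "mutable?" check from the thread
        return not self._frozen

    def __setattr__(self, attr, value):
        if self._frozen:
            raise AttributeError("instance is frozen")
        object.__setattr__(self, attr, value)

    def __delattr__(self, attr):
        if self._frozen:
            raise AttributeError("instance is frozen")
        object.__delattr__(self, attr)


p = Freezable()
p.x = 1                # fine while mutable
p.freeze()
print(p.is_mutable())  # False
```

Putting the bit on the type instead would freeze every instance at once; Ethan's performance point applies either way, since the per-write check is pure overhead for objects that never freeze.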
m From dreamingforward at gmail.com Mon Feb 27 19:47:29 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 11:47:29 -0700 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F4BCF99.6030508@stoneleaf.us> References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> <4F4BCF99.6030508@stoneleaf.us> Message-ID: On Mon, Feb 27, 2012 at 11:46 AM, Ethan Furman wrote: > If a tuple is just an immutable list it will become worse with regards to > performance and memory space. That's a good point also.... m From phd at phdru.name Mon Feb 27 19:49:42 2012 From: phd at phdru.name (Oleg Broytman) Date: Mon, 27 Feb 2012 22:49:42 +0400 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <4F4BCD0D.8040706@btinternet.com> References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: <20120227184942.GA12927@iskra.aviel.ru> On Mon, Feb 27, 2012 at 06:35:57PM +0000, Rob Cliffe wrote: > I suggested a "mutable" attribute some time ago. > This could lead to finally doing away with one of Python's FAQs: Why > does python have lists AND tuples? They could be unified into a > single type. The main difference between lists and tuples is not mutability but usage: lists are for a (unknown) number of similar items (a list of messages, e.g.), tuples are for a (known) number of different items at fixed positions (an address is a tuple of (country, city, street address), for example). Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
From ethan at stoneleaf.us Mon Feb 27 20:02:27 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 11:02:27 -0800 Subject: [Python-ideas] [Fwd: Re: Support other dict types for type.__dict__] Message-ID: <4F4BD343.7040900@stoneleaf.us> [forwarding on to list] On 27/02/2012 18:46, Ethan Furman wrote: >> I suggested a "mutable" attribute some time ago. >> This could lead to finally doing away with one of Python's FAQs: Why >> does python have lists AND tuples? They could be unified into a >> single type. > > If a tuple is just an immutable list it will become worse with regards > to performance and memory space. > > ~Ethan~ > Doesn't that depend on how smart the implementation is? (Of course, toggling the mutable flag could cause performance penalties, but that's something you can't do at all at the moment.) Rob From victor.stinner at haypocalc.com Mon Feb 27 19:55:50 2012 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 27 Feb 2012 19:55:50 +0100 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <20120227184942.GA12927@iskra.aviel.ru> References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> <20120227184942.GA12927@iskra.aviel.ru> Message-ID: 2012/2/27 Oleg Broytman : > On Mon, Feb 27, 2012 at 06:35:57PM +0000, Rob Cliffe wrote: >> I suggested a "mutable" attribute some time ago. >> This could lead to finally doing away with one of Python's FAQs: Why >> does python have lists AND tuples? They could be unified into a >> single type. > > The main difference between lists and tuples is not mutability but > usage: lists are for a (unknown) number of similar items (a list of > messages, e.g.), tuples are for a (known) number of different items at > fixed positions (an address is a tuple of (country, city, street > address), for example). And tuple doesn't have append, extend, remove, ... methods.
Victor From mwm at mired.org Mon Feb 27 20:12:23 2012 From: mwm at mired.org (Mike Meyer) Date: Mon, 27 Feb 2012 14:12:23 -0500 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: <20120227141223.6329ab8f@bhuda.mired.org> On Mon, 27 Feb 2012 11:45:45 -0700 Mark Janssen wrote: > On Mon, Feb 27, 2012 at 11:35 AM, Rob Cliffe wrote: > > I suggested a "mutable" attribute some time ago. > > This could lead to finally doing away with one of Python's FAQs: Why does > > python have lists AND tuples? They could be unified into a single type. > > Rob Cliffe. > Yeah, that would be cool. It would force (ok, *allow*) the > documenting of any non-mutable attributes (i.e. when they're mutable, > and why they're being set immutable, etc.). This also has implications for people working on making python friendlier for concurrent and parallel programming. > There's an interesting question, then: should the mutable bit be on the > Object itself (the whole type) or in each instance....? There's > probably no "provable" or abstract answer to this, but rather just an > organization principle to the language.... Ok, you said "non-mutable attributes" in the first paragraph. That to me implies that the object bound to that attribute can't be changed. This is different from the attribute being bound to an immutable object, which this paragraph implies. Which do you want here? http://www.mired.org/ Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ericsnowcurrently at gmail.com Mon Feb 27 20:18:05 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 27 Feb 2012 12:18:05 -0700 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: On Mon, Feb 27, 2012 at 11:45 AM, Mark Janssen wrote: > On Mon, Feb 27, 2012 at 11:35 AM, Rob Cliffe wrote: >> I suggested a "mutable" attribute some time ago. >> This could lead to finally doing away with one of Python's FAQs: Why does >> python have lists AND tuples? ?They could be unified into a single type. >> Rob Cliffe. > > Yeah, that would be cool. ?It would force (ok, *allow*) the > documenting of any non-mutable attributes (i.e. when they're mutable, > and why they're being set immutable, etc.). > > There an interesting question, then, should the mutable bit be on the > Object itself (the whole type) or in each instance....? ?There's > probably no "provable" or abstract answer to this, but rather just an > organization principle to the language.... In contrast to a flag on objects, one alternative is to have a __mutable__() method for immutable types and __immutable__() for mutable types. I'd be nervous about being able to make an immutable object mutable at an arbitrary moment with the associated effect on the hash of the object. -eric From dreamingforward at gmail.com Mon Feb 27 20:23:15 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 12:23:15 -0700 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: Message-ID: I just realized I've been replying personally to these replies instead of the whole list (damn I hate that!). So resending a bunch of messages that went to individuals. [Mark] On Fri, Feb 17, 2012 at 3:12 PM, Nick Coghlan wrote: > On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen wrote: >> Anyway... 
of course patches welcome, yes... ;^) > Not really. doctest is for *testing code example in docs*. I understand. This is exactly what I was wanting to use it for. As Tim says, "literate testing" or "executable documentation". The suggestions I made are for enhancing those two. Personally, I don't find unittest very suitable for test-driven *development*, although it *is* obviously well-suited for writing assurance tests otherwise. The key difference, to me, is that doctest promotes tests being written in order to have the *additional functionality* of documentation. That makes it fun since you're getting "twice the value for the cost of one", and that alone is the major item which drives test-driven development (IMHO) within the spirit of python; otherwise unittest is rather bulky to write in and of itself. Does anyone really use unittest outside the context of shop policy? mark From dreamingforward at gmail.com Mon Feb 27 20:24:31 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 12:24:31 -0700 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 5:43 PM, Devin Jeanpierre wrote... I, firstly, thank you for your thoughtful reply. I myself am rather busy, but totally think it's worth the effort. > On Fri, Feb 17, 2012 at 4:57 PM, Mark Janssen wrote: >> 1. Execution context determined by outer-scope doctest definitions. > > I'm not sure what you mean, but it might be relevant that Sphinx lets > you define multiple scopes for doctests. Something like this: In a class definition doctest, I may put various definitions ("SetUp" constructs) of interesting and useful class initializations. If my class is a Graph, say, I might define several Graphs (which might produce testable output, or in any case should not throw an error), which could then be used (as "globals") in the inner doctests of the various class methods. It makes no sense to define them again in each doctest.
The hitch I suppose would be defining the possible "TearDown" code which would have to be done after the inner-scope doctests all run. This would require some syntactical feature either in doctest or python itself (this latter would be something like an extra docstring at the end of a class definition, but such a construct would really only be interesting if test()ing was built in to the python interpreter itself (if even then)). This really is the only way doctest scoping should make sense. Any other way is probably not organizing codetest to docs well. (In other words, it would enforce a certain testing standard of good practices.) > I feel like its approach is > the right one, but it isn't reusable in Python docstrings. That said, > I think users of doctest have moved away from embedded doctests in > docstrings -- it encourages doctests to have way too many "examples" > (test cases), which reduces their usefulness as documentation. Again, I think this is an example of python not really having test-driven development built-in. Complicated doctests are a result of too coarse a grain in the method definitions, usually (or probably) a result of other called methods not having their own doctests, so the slack is being picked up in an ad-hoc way. This is just mostly speculation, I haven't actually gone through any examples. I'd be interested in viewing some though if you have them. >> 2. Smart Comparisons that will detect output of a non-ordered type >> (dict/set), lift and recast it and do a real comparison. > I think it's better to just always use ast.literal_eval on the output > as another form of testing for equivalence. This could break code, but > probably not any code worth caring about. > > (in particular, > >>> print 'r""' > "" Hmm, I think that would pass in doctest's current framework, which just tests syntactic characters without regard to semantics.
However, if one were to fix the dict ordering issue, it would have to gain a minimum semantic knowledge (like an unordered grouping starts and terminates with the characters "{" and "}"). >> Anyway... of course patches welcome, yes... ;^) > Not exactly... doctest has no maintainer, and so no patches ever get > accepted. If you want to improve it, you'll have to fork it. I hope > you're that sort of person, because doctest can totally be improved. > It suffers a lot from people thinking of what it is rather than what > it could be. :( I agree! I'm a bit like yourself though, swamped with other priorities. But I'm glad to know about your fork, although it looks like our efforts are a bit orthogonal to each other.... > This is all assuming your intentions are to contribute rather than > only suggest. Not that suggestions aren't welcome, I suppose, but > maybe not here. doctest is not actively developed or maintained > anywhere, as far as I know. (I want to say "except by me", because > that'd make me seem all special and so on, but I haven't committed a > thing in months.) Well, I appreciate your taking the time. I will take another look at the code and see what it would take. > I definitely hope you help to make the doctest world better. I think > it fills a role that should be filled, and its neglect is unfortunate. I'm glad someone appreciates it. I really think the idea should be integrated more deeply so that it becomes a natural habit for python programmers. The test() as a built-in idea came from another doctest fan. It would sit right alongside the help() built-in. Maybe the idea will gain traction...
mark From dreamingforward at gmail.com Mon Feb 27 20:25:53 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 12:25:53 -0700 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On Fri, Feb 17, 2012 at 9:23 PM, Ian Bicking wrote: > On Feb 17, 2012 3:58 PM, "Mark Janssen" wrote: >> Without #1, "literate testing" becomes awash with re-defining re-used >> variables which, generally, also detracts from exact purpose of the >> test -- this creates testdoc noise and the docs become less useful. > > I dunno... I find the discipline of defining your prerequesites to be a > helpful feature of doctest (I find TestCase.setUp to be smelly). Yeah, I kinda agree, but in this case the doctests are always confined to the same module (or class) and have a standardized location, so always near at hand (at least if you're using them well) if you want to see what a variable used in a sub-test has been defined. >? You can > include a namespace in doctest invocations, but I'm guessing the problem is > that you aren't able to give these settings when using some kind of test > collector/runner?? More flexible ways of defining doctest options (e.g., > ELLIPSIS) would be helpful. Yeah, doctests has these (globs and M.__test__), it's just that it takes you out of the mode of "executable documentation" and becomes less fun. >> Without #2, "readable docs" nicely co-aligning with "testable docs" >> tends towards divergence. > > IMHO this could be more easily solved by replacing the standard repr with > one that is more predictable. Yes, but then this gets more into "my" idea of making a test() builtin, like help(). ?In that case, you could do fancy stuff where you wouldn't even have to test string output. Cheers, mark PS. Darn, I hate when I forget to reply-all... 
From dreamingforward at gmail.com Mon Feb 27 20:26:23 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 12:26:23 -0700 Subject: [Python-ideas] doctest In-Reply-To: References: <4F3F2E32.7070907@pearwood.info> Message-ID: On Fri, Feb 17, 2012 at 9:50 PM, Steven D'Aprano wrote: > Really? Not in my experience, although I admit I haven't tried to push the > envelope too far. > > But I haven't had any problem with a literate programming model: > > * Use short, self-contained but not necessarily exhaustive examples in the > code's docstrings (I don't try to give examples of *every* combination of > good and bad data, special cases, etc. in the docstring). > > * Write extensive (ideally exhaustive) examples with explanatory text, in a > separate text file. Hmmm, interesting. ?I generally like to keep it all in one file and define a dummy "test" function that just contains doctest code so that it can be all kept in one file and in-sync. > If my tests require setting up and tearing down > resources, I stick to unittest which has better setup/teardown support. (It > would be hard to have *less* support for setup and teardown than doctest.) If doctest had context-scoping, I think it would be superior to unittest. ?SetUp functionality would be contained in the class definition's __doc__, or out in the module's own __doc__. ?If any teardown functionality was necessary in the class's code a dummy teardown method could be defined at the end of the class definition. (Not as ideal as a more integrated test-driven development approach, but likely acceptable...) mark From dreamingforward at gmail.com Mon Feb 27 20:27:23 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 12:27:23 -0700 Subject: [Python-ideas] doctest In-Reply-To: References: <4F3F3240.4090104@pearwood.info> Message-ID: On Fri, Feb 17, 2012 at 10:08 PM, Steven D'Aprano wrote: > Mark Janssen wrote: >> 1. 
Execution context determined by outer-scope doctest definitions. > Can you give an example of how you would like this to work? > Sure, I wish I had a good example off the top of my head, but perhaps this will convey the idea:

class MyClass():
    """Yadda Yadda: foo's bars.

    >>> m = MyClass({some, sufficiently, interesting, initialization})  #POINT1: this variable (m) now accessible by all methods.
    "foobar check"  #POINT2: possible output here is a useful test case not well-definable elsewhere without losing context.
    """
    def method1(self, other):
        """Method method method method.

        >>> m.method("foo")  #Now we see m is already defined and usable.
        "bar"
        """
    def meth2(self, other):
        """Method to foo all bars.

        >>> m.method("bar")  #would have to decide whether a fresh m is redefined with each inner-scope doctest (if we want side-effects to carry across inner doctests).
        """

(END) This is a basic example, sorry it's rather crude. There's probably a better example. (Think establishing a network socket connection or something in the class' doc which is then used by all the methods, for example.) >> 2. Smart Comparisons that will detect output of a non-ordered type >> (dict/set), lift and recast it and do a real comparison. > > I would love to see a doctest directive that accepted differences in output > order, e.g. would match {1, 2, 3} and {3, 1, 2}. But I think that's a hard > problem to solve in the general case. I think this would be as simple as lifting the (string) output and doing an eval("{1,2,3}")=={3,2,1}, or (for security) using ast.literal_eval like Devin suggested. > I'd like a #3 as well: an abbreviated way to spell doctest directives, > because they invariably push my tests well past the 80 character mark. Hmm, seems like an alias could be defined easily enough, but I'll try to think about this when I have more time.
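The "smart comparison" idea can be prototyped against doctest's real extension point, OutputChecker.check_output(), using ast.literal_eval as Devin suggested; the subclass below is a sketch with an invented name, not anything in the stdlib:

```python
import ast
import doctest

class LiteralEvalChecker(doctest.OutputChecker):
    """Fall back to comparing parsed values when the literal text of
    want/got differs, so set and dict ordering stops mattering."""
    def check_output(self, want, got, optionflags):
        if doctest.OutputChecker.check_output(self, want, got, optionflags):
            return True
        try:
            # e.g. want="{3, 2, 1}\n", got="{1, 2, 3}\n" compare equal
            return ast.literal_eval(want) == ast.literal_eval(got)
        except (SyntaxError, ValueError):
            return False  # output isn't a Python literal; keep the failure

checker = LiteralEvalChecker()
print(checker.check_output("{3, 2, 1}\n", "{1, 2, 3}\n", 0))  # True
print(checker.check_output("{1, 2}\n", "{1, 3}\n", 0))        # False
```

A DocTestRunner accepts such a checker via its ``checker`` argument, so this slots into an ordinary doctest run without patching the module.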
mark From dreamingforward at gmail.com Mon Feb 27 20:28:17 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 12:28:17 -0700 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: <20120220132832.76b772da@resist.wooz.org> Message-ID: On Mon, Feb 20, 2012 at 11:28 AM, Barry Warsaw wrote: > On Feb 17, 2012, at 02:57 PM, Mark Janssen wrote: > FWIW, I think doctests are fantastic and I use them all the time. There are > IMO a couple of things to keep in mind: > > - doctests are documentation first. Specifically, they are testable > documentation. What better way to ensure that your documentation is > accurate and up-to-date? (And no, I do not generally find skew between the > code and the separate-file documentation.) > > - I personally dislike docstring doctests, and much prefer separate reST > documents. These have several advantages, such as the ability to inject > names into doctests globals (use with care though), and the ability to set > up the execution context for doctests (see below). The fact that it's so > easy to turn these into documentation with Sphinx is a huge win. > > Since so many people point this out, let me say that I completely agree that > doctests are not a *replacement* for unittests, but they are a fantastic > *complement* to unittests. When I TDD, I always start writing the > (testable) documentation first, because if I cannot explain the component > under test in clearly intelligible English, then I probably don't really > understand what it is I'm trying to write. > > My doctests usually describe mostly the good path through the API. > Occasionally I'll describe error modes if I think those are important for > understanding how to use the code. However, for all those fuzzy corner cases, > weird behaviors, bug fixes, etc., unittests are much better suited because > ensuring you've fixed these problems and don't regress in the future doesn't > help the narrative very much.
I think this is an example of (mal)adapting to an incomplete module, rather than fixing it. I think doctest can handle all the points you're making. See clarification pointers below... >> 1. Execution context determined by outer-scope doctest definitions. > > Can you explain this one? I gave an example in a prior message on this thread, dated Feb 17. I think it's clear there but let me know. Basically, the idea is that since the class def can also have a docstring, where better would setup and teardown code go to provide the execution context of the inner method docstrings? Now the question: is it useful or appropriate to put setup and teardown code in a classdef docstring? Well, I think this requires a commitment on the behalf of the coder/documenter to concoct useful (didactic) examples that could go there. For example, (as in the prior-referenced message) I imagine putting an example of defining a variable of the class's type (">>> g = Graph({some complex, interesting initialization})"), which might return a (testable) value upon creation. Now this could, logically, be put in the class's __init__ method, but that doesn't make sense for defining an execution context, and *in addition*, that can be saved for those complex corner cases you mentioned earlier. > I usually put all this in an additional_tests() method, such as: Yes, I do the same for my modules with doctests. A dummy function which can catch all the non-interesting tests. This is still superior, in my opinion, to unittest. It is easier syntactically, as well as for casual users of your code (it has no learning curve like understanding unittest). This superiority to unittest, by the way, is only realized if the second suggestion (smart comparisons) is implemented into doctest. >> 2. Smart Comparisons that will detect output of a non-ordered type >> (dict/set), lift and recast it and do a real comparison. > I'm of mixed mind with these.
Yes, you must be careful with ordering, but I > find it less readable to just sort() some dictionary output for example. What > I've found much more useful is to iterate over the sorted keys of a dictionary > and print the key/value pairs. Yes, but you see you're destroying the very intent and spirit of doctest. The point is to make literate documentation. If you adapt to its incompleteness, you reduce the power of it. >> Without #1, "literate testing" becomes awash with re-defining re-used >> variables which, generally, also detracts from exact purpose of the >> test -- this creates testdoc noise and the docs become less useful. >> Without #2, "readable docs" nicely co-aligning with "testable docs" >> tends towards divergence. > I've no doubt that doctests could be improved, but I actually find them quite > usable as is, with just a little bit of glue code to get it all hooked up. As > I say though, I'm biased against docstring doctests. Well, hopefully, I've convinced you a little that the limitations of doctests versus unittests are almost, if not entirely, due to the incompleteness of the module. If the two items I mentioned were implemented I think it would be far superior to unittest. (Corner cases, etc. can all find a place, because every corner case should be documented somewhere anyway!!) Cheers!! mark santa fe, nm From dreamingforward at gmail.com Mon Feb 27 21:01:49 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 13:01:49 -0700 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: <4F4BE07C.1000505@stoneleaf.us> Message-ID: On Mon, Feb 27, 2012 at 12:58 PM, Ethan Furman wrote: > Mark Janssen wrote: >> Personally, I don't find unittest very suitable for test-driven >> *development*, although it *is* obviously well-suited for writing >> assurance tests otherwise. > > I like unittest for TDD. I should probably correct myself. It is suitable, just not enjoyable.
?But now I know you are someone who likes all that arcana of unittest module. > unittest can be a bit bulky, but definitely worth it IMO, especially when > covering the corner cases. Corner cases are generally useful for the developer to know about, so its worth it to mention (==> test) in the documentation. > I have not used doctest, but I can say that I strongly dislike having more > than one or two examples in a docstring. This is often just a failure to separate tests property among different methods. > The other gripe I have (possibly easily fixed): my python prompt is '-->' > (makes email posting easier) -- should my doctests still use '>>>'? ?Will > doctest fail on my machine? As written, yes, but easily changeable in the module code for your unique case.... mark From ethan at stoneleaf.us Mon Feb 27 21:22:10 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 12:22:10 -0800 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: <4F4BE07C.1000505@stoneleaf.us> Message-ID: <4F4BE5F2.0@stoneleaf.us> Mark Janssen wrote: > On Mon, Feb 27, 2012 at 12:58 PM, Ethan Furman wrote: >> Mark Janssen wrote: >>> Personally, I don't find unittest very suitable for test-driven >>> *development*, although it *is* obviously well-suited for writing >>> assurance tests otherwise. >> I like unittest for TDD. > > I should probably correct myself. It is suiltable, just not > enjoyable. But now I know you are someone who likes all that arcana > of unittest module. I'm not sure about *that* -- having to exactly reproduce the output of the interpreter seems kind of arcane to me. ;) >> unittest can be a bit bulky, but definitely worth it IMO, especially when >> covering the corner cases. > > Corner cases are generally useful for the developer to know about, so > its worth it to mention (==> test) in the documentation. Absolutely. 
I can see great value to using doctest on documentation, and even on code itself -- as I mentioned already, I just hate having code cluttered with lots of non-code. The other thing I like about unittest as opposed to doctest is the ability to be exhaustive. For an example, take a look at the tests I have for my dbf module on PyPI -- not even sure how I could convert that into a doctest format. >> I have not used doctest, but I can say that I strongly dislike having more >> than one or two examples in a docstring. > > This is often just a failure to separate tests property among different methods. > >> The other gripe I have (possibly easily fixed): my python prompt is '-->' >> (makes email posting easier) -- should my doctests still use '>>>'? Will >> doctest fail on my machine? > > As written, yes, but easily changeable in the module code for your > unique case.... Go with the Source, eh? I can live with that. :) ~Ethan~ From ned at nedbatchelder.com Mon Feb 27 21:14:55 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Mon, 27 Feb 2012 15:14:55 -0500 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: Message-ID: <4F4BE43F.6090003@nedbatchelder.com> On 2/27/2012 2:23 PM, Mark Janssen wrote: > I just realized I've been replying personally to these replies instead > of the whole list (damn I hate that!). So resending a bunch of > messages that went to individuals. [Mark] > On Fri, Feb 17, 2012 at 3:12 PM, Nick Coghlan wrote: >> On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen wrote: >>> Anyway... of course patches welcome, yes... ;^) >> Not really. doctest is for *testing code example in docs*. > I understand. This is exactly what I was wanting to use it for. As > Tim says "literate testing" or "executable documentation". I think you misunderstand: Nick meant, "doctest is only useful for testing the snippets of code that naturally appear in documentation meant for people to read." 
Many people agree with this sentiment, and find doctest unsuitable for writing comprehensive tests. > The suggestions I made are for enhancing those two. > > Personally, I don't find unittest very suitable for test-driven > *development*, although it *is* obviously well-suited for writing > assurance tests otherwise. > > The key difference, to me, is in that doctest promotes tests being > written in order to have the *additional functionality* of > documentation. That makes it fun since your getting "twice the > value for the cost of one", and that alone is the major item which > drives test-driven development (IMHO) within the spirit of python, > otherwise unittest is rather bulky to write in and of itself. > > Does anyone really use unittest outside the context of shop policy? Many, many people use unittest, namely, all of us that think doctest is a cute idea, but its many limitations hobble it for serious work. > > mark > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From fuzzyman at gmail.com Mon Feb 27 21:35:08 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Mon, 27 Feb 2012 20:35:08 +0000 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On 18 February 2012 04:24, Ian Bicking wrote: > On Feb 17, 2012 4:12 PM, "Nick Coghlan" wrote: > > > > On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen > wrote: > > > Anyway... of course patches welcome, yes... ;^) > > > > Not really. doctest is for *testing code example in docs*. If you try > > to use it for more than that, it's likely to drive you up the wall, so > > proposals to make it more than it is usually don't get a great > > reception (docs patches to make it's limitations clearer are generally > > welcome, though). 
The stdib solution for test driven development is > > unittest (the vast majority of our own regression suite is written > > that way - only a small proportion uses doctest). > > This pessimistic attitude is why doctest is challenging to work with at > times, not anything to do with doctest's actual model. The constant > criticisms of doctest keep contributors away, and keep its many resolvable > problems from being resolved. > Personally I think there are several fundamental problems with doctest *as a unit testing tool*. doctest is *awesome* for testing documentation examples but in particular this one: * Every line becomes an assertion - in a unit test you typically follow the arrange -> act -> assert pattern. Only the results of the *assertion* are relevant to the test. (Obviously unexpected exceptions at any stage are relevant....). With doctest you have to take care to ensure that the exact output of *every line* of your arrange and act steps also match, even if they are irrelevant to your assertion. (The arrange and act steps will often include lines where you are creating state, and their output is irrelevant so long as they put the right things in place.) The particular implementation of doctest means that there are additional, potentially resolvable problems that are also a damn nuisance in a unit testing fail: Execution of an individual testing section continues after a failure. So a single failure results in the *reporting* of potentially many failures. The problem of being dependent on order of unorderable types (actually very difficult to solve). Things like shared fixtures and mocking become *harder* (although by no means impossible) in a doctest environment. Another thing I dislike is that it encourages a "test last" approach, as by far the easiest way of generating doctests is to copy and paste from the interactive interpreter. 
The alternative is lots of annoying typing of '>>>' and '...', and as you're editing text and not code IDE support tends to be worse (although this is a tooling issue and not a problem with doctest itself). So whilst I'm not against improving doctest, I don't promote it as a unit testing tool and disagree that it is suited to that task. All the best, Michael Foord > > An interesting third party alternative that has been created recently > > is behave: http://crate.io/packages/behave/ > > This style of test is why it's so sad that doctest is ignored and > unmaintained. It's based on testing patterns developed by people who care > to promote what they are doing, but I'm of the strong opinion that they are > inferior to doctest. > > Ian > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Mon Feb 27 21:39:37 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 27 Feb 2012 15:39:37 -0500 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> Message-ID: On Sat, Feb 25, 2012 at 6:32 PM, Masklinn wrote: > On 2012-02-26, at 00:05 , Steven D'Aprano wrote: >> - Immutable types can be used as keys in dicts. Not always; for example, you can't use a tuple of lists, even though the tuple itself is immutable. > *technically*, you can use mutable types as dict keys if you define > their __hash__ no? That is of course a bad idea when the instances > are *expected* to be modified, but it should "work". 
Not even a bad idea, if you define the hash carefully. (Similar to java final.) Once hash(obj) returns something other than -1, it should return that same value forever. Attributes which do not contribute to the hash can certainly still change. That said, I would be nervous about changes to attributes that contribute to __eq__, just because third party code may be so surprised. >>> class Str(str): pass >>> a=Str("a") >>> a.x=5 >>> a == "a" True >>> "x" in dir("a") False >>> "x" in dir(a) True -jJ From dreamingforward at gmail.com Mon Feb 27 21:43:14 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 13:43:14 -0700 Subject: [Python-ideas] Fwd: doctest (and.... python3000) Message-ID: On Mon, Feb 27, 2012 at 1:22 PM, Ethan Furman wrote: >> I should probably correct myself. It is suiltable, just not >> enjoyable. But now I know you are someone who likes all that arcana >> of unittest module. > > I'm not sure about *that* -- having to exactly reproduce the output of the > interpreter seems kind of arcane to me. ;) Well, you're an interesting test case for a theory -- some people shouldn't be coding in python... Python, as I see, is "the coder's language". It's meant for a programmers who want to write code for the sake of their art -- coding for him/herself firstly (and their community) and secondly for "industrial productions" -- shops that just churn out working apps without a consideration for the art. In the latter case, tests won't be for future coders in your community, but for maintaining "*la machine*" -- the simple, logical machine in your code. This, to me, is the primary split between those of us who still have high hopes for a true Python3000 (now evolved into python4000 because of release v3) and the rest.... Accurate in your case? mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Mon Feb 27 20:58:52 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 11:58:52 -0800 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: Message-ID: <4F4BE07C.1000505@stoneleaf.us> Mark Janssen wrote: > On Fri, Feb 17, 2012 at 3:12 PM, Nick Coghlan wrote: >> On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen wrote: >>> Anyway... of course patches welcome, yes... ;^) >> Not really. doctest is for *testing code examples in docs*. > > I understand. This is exactly what I was wanting to use it for. As > Tim says "literate testing" or "executable documentation". > > The suggestions I made are for enhancing those two. > > Personally, I don't find unittest very suitable for test-driven > *development*, although it *is* obviously well-suited for writing > assurance tests otherwise. I like unittest for TDD. > The key difference, to me, is that doctest promotes tests being > written in order to have the *additional functionality* of > documentation. That makes it fun since you're getting "twice the > value for the cost of one", and that alone is the major item which > drives test-driven development (IMHO) within the spirit of python, > otherwise unittest is rather bulky to write in and of itself. unittest can be a bit bulky, but definitely worth it IMO, especially when covering the corner cases. I have not used doctest, but I can say that I strongly dislike having more than one or two examples in a docstring. Having all possibilities (including corner cases) in a separate file I am okay with (as that would be documentation -- when I'm reading code I want to see code, and I'll look up the docs if I have a question). The other gripe I have (possibly easily fixed): my python prompt is '-->' (makes email posting easier) -- should my doctests still use '>>>'? Will doctest fail on my machine? > Does anyone really use unittest outside the context of shop policy? Yup.
From ben+python at benfinney.id.au Mon Feb 27 22:35:37 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 28 Feb 2012 08:35:37 +1100 Subject: [Python-ideas] Fwd: doctest References: Message-ID: <87k438q7rq.fsf@benfinney.id.au> Mark Janssen writes: > The key difference, to me, is that doctest promotes tests being > written in order to have the *additional functionality* of > documentation. I think that doctest promotes docstrings being written with the additional functionality of tests. To that extent, it is very good. -- \ “The fact that I have no remedy for all the sorrows of the | `\ world is no reason for my accepting yours. It simply supports | _o__) the strong probability that yours is a fake.” —Henry L. Mencken | Ben Finney From phd at phdru.name Mon Feb 27 22:53:08 2012 From: phd at phdru.name (Oleg Broytman) Date: Tue, 28 Feb 2012 01:53:08 +0400 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> <20120227184942.GA12927@iskra.aviel.ru> Message-ID: <20120227215307.GA20426@iskra.aviel.ru> On Mon, Feb 27, 2012 at 07:55:50PM +0100, Victor Stinner wrote: > 2012/2/27 Oleg Broytman : > > On Mon, Feb 27, 2012 at 06:35:57PM +0000, Rob Cliffe wrote: > >> I suggested a "mutable" attribute some time ago. > >> This could lead to finally doing away with one of Python's FAQs: Why > >> does python have lists AND tuples? They could be unified into a > >> single type. > > > > The main difference between lists and tuples is not mutability but > > usage: lists are for an (unknown) number of similar items (a list of > > messages, e.g.), tuples are for a (known) number of different items at > > fixed positions (an address is a tuple of (country, city, street > > address), for example). > > And tuple doesn't have append, extend, remove, ... methods.
Tuples are *also* read only, but being read only lists is not their main purpose. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From dreamingforward at gmail.com Mon Feb 27 23:25:14 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 15:25:14 -0700 Subject: [Python-ideas] Fwd: doctest (and.... python3000) In-Reply-To: References: <4F4C00B6.9020406@stoneleaf.us> Message-ID: On Mon, Feb 27, 2012 at 3:16 PM, Ethan Furman wrote: > Mark Janssen wrote: > > On Mon, Feb 27, 2012 at 1:22 PM, Ethan Furman wrote: >> >>> I should probably correct myself. It is suiltable, just not >>>> enjoyable. But now I know you are someone who likes all that arcana >>>> of unittest module. >>>> >>> >>> I'm not sure about *that* -- having to exactly reproduce the output of >>> the interpreter seems kind of arcane to me. ;) >>> >> >> Well, you're an interesting test case for a theory -- some people >> shouldn't be coding in python... >> > > Wow. Talk about mixed emotions -- on the one hand I totally agree with > you, on the other I haven't been that offended in quite some time. ;) > > Haha, okay. Sorry, I was a bit blunt there. > > Python, as I see, is "the coder's language". It's meant for a >> programmers who want to write code for the sake of their art -- coding for >> him/herself firstly (and their community) and secondly for "industrial >> productions" -- shops that just churn out working apps without a >> consideration for the art. >> > > While Python is the most enjoyable language I have ever used, I strive for > mastery and beauty in all the languages I work with. One of Python's big > strengths is it's simplicity, while still allowing for great power (with > it's data structures, exception handling, metaclasses (okay, not so simple > there ;)). > Have you seen Ada, Oberon? For some reason I couldn't begin to describe, I think you might actually like them better. 
But, hey, I'm happy there're people who enjoy python. > In the latter case, tests won't be for future coders in your community, >> but for maintaining "/la machine/" -- the simple, logical machine in your >> code. >> > > I fail to see your point here with regards to doctest versus unittest. > When I actually write the docs for my dbf module (simple Sphinx generated > at the moment), I will have examples in it and run it through doctest. > However, I will still have the unit tests as the primary test bench for it. > Hmm, I guess you're kind of a hybrid, then... > As an example, for the dBase III table type there are five field types. > There is a test for a table with each possible combination (not > permutation) of one to five of those field types (okay, so I'm slightly > paranoid, too ;) -- would you really want to see that in your documentation? No, you are right there. But this looks like a case of hybridized code -- you aren't able to make doctests at a fine-enough granularity in order to ensure your code, so they go up a level of abstraction where it gets bulky and no longer self-documenting. If Python 3 is so hope-dashing, perhaps you should fork your own version? > > Well, I still have hopes for it, it's just still in progress... I appreciate your reply, mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Feb 27 23:16:22 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 14:16:22 -0800 Subject: [Python-ideas] Fwd: doctest (and.... python3000) In-Reply-To: References: Message-ID: <4F4C00B6.9020406@stoneleaf.us> Mark Janssen wrote: > On Mon, Feb 27, 2012 at 1:22 PM, Ethan Furman wrote: >>> I should probably correct myself. It is suitable, just not >>> enjoyable. But now I know you are someone who likes all that arcana >>> of the unittest module. >> >> I'm not sure about *that* -- having to exactly reproduce the output >> of the interpreter seems kind of arcane to me.
;) > > Well, you're an interesting test case for a theory -- some people > shouldn't be coding in python... Wow. Talk about mixed emotions -- on the one hand I totally agree with you, on the other I haven't been that offended in quite some time. ;) > Python, as I see it, is "the coder's language". It's meant for > programmers who want to write code for the sake of their art -- coding > for him/herself firstly (and their community) and secondly for > "industrial productions" -- shops that just churn out working apps > without a consideration for the art. While Python is the most enjoyable language I have ever used, I strive for mastery and beauty in all the languages I work with. One of Python's big strengths is its simplicity, while still allowing for great power (with its data structures, exception handling, metaclasses (okay, not so simple there ;)). > In the latter case, tests won't be for future coders in your community, > but for maintaining "/la machine/" -- the simple, logical machine in > your code. I fail to see your point here with regards to doctest versus unittest. When I actually write the docs for my dbf module (simple Sphinx generated at the moment), I will have examples in it and run it through doctest. However, I will still have the unit tests as the primary test bench for it. As an example, for the dBase III table type there are five field types. There is a test for a table with each possible combination (not permutation) of one to five of those field types (okay, so I'm slightly paranoid, too ;) -- would you really want to see that in your documentation? > This, to me, is the primary split between those of us who still have > high hopes for a true Python3000 (now evolved into python4000 because of > release v3) and the rest.... Overall I am quite happy with Py3k. I seriously doubt that I would be 100% satisfied with somebody else's language simply because we are not the same individual and so have different preferences.
I can say I am at least 95% happy with Python, which is the best approval rating I have been able to give since Assembly. If Python 3 is so hope-dashing, perhaps you should fork your own version? > Accurate in your case? That I shouldn't be using Python? No, inaccurate. That I am part of the bunch so disappointed with Py3k that I am yearning for Py4k? No, inaccurate. ~Ethan~ From fuzzyman at gmail.com Mon Feb 27 23:59:17 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Mon, 27 Feb 2012 22:59:17 +0000 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On 27 February 2012 20:35, Michael Foord wrote: > > > On 18 February 2012 04:24, Ian Bicking wrote: > >> On Feb 17, 2012 4:12 PM, "Nick Coghlan" wrote: >> > >> > On Sat, Feb 18, 2012 at 7:57 AM, Mark Janssen < >> dreamingforward at gmail.com> wrote: >> > > Anyway... of course patches welcome, yes... ;^) >> > >> > Not really. doctest is for *testing code example in docs*. If you try >> > to use it for more than that, it's likely to drive you up the wall, so >> > proposals to make it more than it is usually don't get a great >> > reception (docs patches to make it's limitations clearer are generally >> > welcome, though). The stdib solution for test driven development is >> > unittest (the vast majority of our own regression suite is written >> > that way - only a small proportion uses doctest). >> >> This pessimistic attitude is why doctest is challenging to work with at >> times, not anything to do with doctest's actual model. The constant >> criticisms of doctest keep contributors away, and keep its many resolvable >> problems from being resolved. >> > > Personally I think there are several fundamental problems with doctest *as > a unit testing tool*. doctest is *awesome* for testing documentation > examples but in particular this one: > > * Every line becomes an assertion - in a unit test you typically follow > the arrange -> act -> assert pattern. 
Only the results of the *assertion* > are relevant to the test. (Obviously unexpected exceptions at any stage are > relevant....). With doctest you have to take care to ensure that the exact > output of *every line* of your arrange and act steps also match, even if > they are irrelevant to your assertion. (The arrange and act steps will > often include lines where you are creating state, and their output is > irrelevant so long as they put the right things in place.) > > The particular implementation of doctest means that there are additional, > potentially resolvable problems that are also a damn nuisance in a unit > testing fail: > Jeepers, I changed direction mid-sentence there. It should have read something along the lines of: As well as fundamental problems, the particular implementation of doctest suffers from these potentially resolvable problems: > > Execution of an individual testing section continues after a failure. So a > single failure results in the *reporting* of potentially many failures. > > The problem of being dependent on order of unorderable types (actually > very difficult to solve). > > Things like shared fixtures and mocking become *harder* (although by no > means impossible) in a doctest environment. > > Another thing I dislike is that it encourages a "test last" approach, as > by far the easiest way of generating doctests is to copy and paste from the > interactive interpreter. The alternative is lots of annoying typing of > '>>>' and '...', and as you're editing text and not code IDE support tends > to be worse (although this is a tooling issue and not a problem with > doctest itself). > > More fundamental-ish problems: Putting debugging prints into a function can break a myriad of tests (because they're output based). With multiple doctest blocks in a test file running an individual test can be difficult (impossible?). I may be misremembering, but I think debugging support is also problematic because of the stdout redirection. So yeah. 
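Michael's arrange → act → assert point can be made concrete. In the unittest style below only the final assertion is checked; a doctest version of the same steps would also have to match the echoed output of every arrange and act line. The `Basket` class is a hypothetical stand-in, invented so the sketch runs:

```python
import io
import unittest

class Basket:
    """Minimal stand-in class, assumed for illustration."""
    def __init__(self):
        self.items = []
    def add(self, item):
        self.items.append(item)
        return self    # in a doctest, this returned object's repr would print
    def total(self):
        return len(self.items)

class TestBasket(unittest.TestCase):
    def test_total(self):
        basket = Basket()                    # arrange: output irrelevant
        basket.add("egg").add("spam")        # act: output irrelevant
        self.assertEqual(basket.total(), 2)  # assert: the only thing checked

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestBasket)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
assert result.wasSuccessful()
```

In a doctest, the `basket.add(...)` line would echo something like `<__main__.Basket object at 0x...>`, and that incidental repr would become one more thing the expected output has to match.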
Not a huge fan. All the best, Michael > So whilst I'm not against improving doctest, I don't promote it as a unit > testing tool and disagree that it is suited to that task. > > All the best, > > Michael Foord > > > > >> > An interesting third party alternative that has been created recently >> > is behave: http://crate.io/packages/behave/ >> >> This style of test is why it's so sad that doctest is ignored and >> unmaintained. It's based on testing patterns developed by people who care >> to promote what they are doing, but I'm of the strong opinion that they are >> inferior to doctest. >> >> Ian >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> >> > > > -- > > http://www.voidspace.org.uk/ > > May you do good and not evil > May you find forgiveness for yourself and forgive others > > May you share freely, never taking more than you give. > -- the sqlite blessing http://www.sqlite.org/different.html > > > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at gmail.com Tue Feb 28 00:20:44 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Mon, 27 Feb 2012 23:20:44 +0000 Subject: [Python-ideas] doctest In-Reply-To: References: <4F3F3240.4090104@pearwood.info> Message-ID: On 27 February 2012 19:27, Mark Janssen wrote: > On Fri, Feb 17, 2012 at 10:08 PM, Steven D'Aprano > wrote: > > Mark Janssen wrote: > >> 1. Execution context determined by outer-scope doctest defintions. > > > > Can you give an example of how you would like this to work? 
> > Sure, I wish I had a good example off the top of my head, but perhaps > this will convey the idea: > > class MyClass(): > """Yadda Yadda: foo's bars. > > >>> m = MyClass({some, sufficiently, interesting, initialization}) > #POINT1: this variable (m) now accessible by all methods. > "foobar check" #POINT2: possible output here is a useful test case > not well-definable elsewhere without losing context. > """ > > def method1(self, other): > """Method method method method. > > >>> m.method("foo") #Now we see m is already defined and usable. > "bar" > """ > > def meth2(self, other): > """Method to foo all bars > > >>> m.method("bar") #would have to decide whether a fresh m is > redefined with each innerscope doctest (if we want side-effects to > carry across inner doctests). > > (END) > > This is a basic example, sorry it's rather crude. There's probably a > better example. (Think establishing a network socket connection or > something in the class' doc which is then used by all the methods, for > example.) > > >> 2. Smart Comparisons that will detect output of a non-ordered type > >> (dict/set), lift and recast it and do a real comparison.> > > > > I would love to see a doctest directive that accepted differences in > output > > order, e.g. would match {1, 2, 3} and {3, 1, 2}. But I think that's a > hard > > problem to solve in the general case. > > I think this would be as simple as lifting the (string) output and > doing an eval("{1,2,3}")=={3,2,1}, or (for security) using > ast.literal_eval like Devin suggested. > > How will that handle not-particularly-obscure code like this: >>> class Foo(object): ... def __init__(self, a): ... self.a = a ... def __repr__(self): ... return '<Foo %s>' % self.a ... >>> a = {Foo(1), Foo(2), Foo(3)} >>> b = {Foo(4), Foo(5), Foo(6)} >>> {'first': a, 'second': b} {'second': set([<Foo 4>, <Foo 5>, <Foo 6>]), 'first': set([<Foo 1>, <Foo 2>, <Foo 3>])} I don't think a *general* solution for unordered types is even possible because you can't parse arbitrary reprs.
All the best, Michael > > I'd like a #3 as well: an abbreviated way to spell doctest directives, > > because they invariably push my tests well past the 80 character mark. > > Hmm, seem like an alias could be defined easily enough, but I'll try > to think about this when I have more time. > > mark > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.cliffe at btinternet.com Tue Feb 28 00:18:57 2012 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Mon, 27 Feb 2012 23:18:57 +0000 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: <20120227215307.GA20426@iskra.aviel.ru> References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> <20120227184942.GA12927@iskra.aviel.ru> <20120227215307.GA20426@iskra.aviel.ru> Message-ID: <4F4C0F61.5020405@btinternet.com> On 27/02/2012 21:53, Oleg Broytman wrote: > On Mon, Feb 27, 2012 at 07:55:50PM +0100, Victor Stinner wrote: >> 2012/2/27 Oleg Broytman: >>> On Mon, Feb 27, 2012 at 06:35:57PM +0000, Rob Cliffe wrote: >>>> I suggested a "mutable" attribute some time ago. >>>> This could lead to finally doing away with one of Python's FAQs: Why >>>> does python have lists AND tuples? They could be unified into a >>>> single type. 
>>> The main difference between lists and tuples is not mutability but >>> usage: lists are for a (unknown) number of similar items (a list of >>> messages, e.g.), tuples are for a (known) number of different items at >>> fixed positions (an address is a tuple of (country, city, street >>> address), for example). >> And tuple doesn't have append, extend, remove, ... methods. > Tuples are *also* read only, but being read only lists is not their > main purpose. > > Oleg. With respect, I think you are thinking too narrowly, conditioned by familiar usage. Items of a list do not have to be similar (there is nothing in the language that implies that). And tuples are often - conceptually - extended, even though it actually has to be done by building a new tuple - Python even allows you to write tuple1 += tuple2 A unified type would have "mutating" methods such as append - it's just that they would raise an error if the object's flag (however it was implemented) defined it as immutable. I visualised an actual object attribute, e.g. __mutable__, that could be set to a boolean value. But having __mutable__() and __immutable__() methods as suggested by Eric is an alternative. And there may well be others. Rob Cliffe From fuzzyman at gmail.com Tue Feb 28 00:31:17 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Mon, 27 Feb 2012 23:31:17 +0000 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On 27 February 2012 23:23, Mark Janssen wrote: > On Mon, Feb 27, 2012 at 3:59 PM, Michael Foord wrote: >> >> As well as fundamental problems, the particular implementation of doctest >> suffers from these potentially resolvable problems: >> >> >>> Execution of an individual testing section continues after a failure. So >>> a single failure results in the *reporting* of potentially many failures. >>> >>> Hmm, perhaps I don't understand you. doctest reports how many failures > occur, without blocking on any single failure. > Right. 
But you typically group a bunch of actions into a single "test". If a doctest fails in an early action then every line after that will probably fail - a single test failure will cause multiple *reported* failures. > > >> The problem of being dependent on order of unorderable types (actually >>> very difficult to solve). >>> >> > Well, a crude solution is just to lift any output text that denotes a > non-ordered type and pass it through an "eval" operation. > Not a general solution - not all reprs are reversible (in fact very few > are as a proportion of all objects). > > >> Things like shared fixtures and mocking become *harder* (although by no >>> means impossible) in a doctest environment. >>> >>> > This, I think, is what I was suggesting with doctest "scoping" where the > execution environment is a matter of how nested the docstring is in > relation to the "python semantic environment", with a final scope of > "globs" that can be passed into the test environment, for anything with > global scope. > > >> Another thing I dislike is that it encourages a "test last" approach, as >>> by far the easiest way of generating doctests is to copy and paste from the >>> interactive interpreter. The alternative is lots of annoying typing of >>> '>>>' and '...', and as you're editing text and not code IDE support tends >>> to be worse (although this is a tooling issue and not a problem with >>> doctest itself). >>> >> > This is where I think the idea of having a test() built-in, like help(), > would really be nice. One could run test(myClass.mymethod) iteratively while > one codes, encouraging TDD and writing tests *along with* your code. My > TDD sense says it couldn't get any better. > > >> More fundamental-ish problems: >> >> Putting debugging prints into a function can break a myriad of tests >> (because they're output based). >> > > That's a good point.
But then it's a fairly simple matter of adding the > output device: 'print >> stderr, 'here I am'", another possibility, if TDD > were to become more a part of the language, is a special debug exception: > "raise Debug("Am at the test point, ", x)" Such special exceptions could > be caught and ignored by doctest. > > >> With multiple doctest blocks in a test file running an individual >> test can be difficult (impossible?). >> >> This again is solved with the test() built-in and making TDD something that > is a feature of the language itself. > I don't fully follow you, but it shouldn't be hard to add this to doctest and see if it is really useful. > > >> I may be misremembering, but I think debugging support is also >> problematic because of the stdout redirection >> > > Interesting, I try to pre-conceive tests well enough so I never need to > invoke the debugger. > Heh. When I'm adding new features to existing code it is very common for me to write a test that drops into the debugger after setting up some state - and potentially using the test infrastructure (fixtures, django test client perhaps, etc). So not being able to run a single test or drop into a debugger puts the kybosh on that. Michael > > >> So yeah. Not a huge fan. >> >> That's good feedback. Thanks. > > Mark -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed...
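On the "running an individual test" complaint above: the stdlib does offer `doctest.run_docstring_examples()`, which executes only the examples in one object's docstring rather than a whole module or file — not a `test()` built-in, but close in spirit. A sketch with hypothetical `add`/`mul` functions:

```python
import doctest

def add(a, b):
    """
    >>> add(2, 3)
    5
    """
    return a + b

def mul(a, b):
    """
    >>> mul(2, 3)
    7
    """
    return a * b  # the docstring above is deliberately wrong

# Run only add()'s examples; mul()'s failing example is never executed.
doctest.run_docstring_examples(add, {'add': add}, verbose=False)
```

Failures, if any, are printed to stdout; here nothing is printed because `add`'s single example passes, while the broken `mul` docstring is simply never touched.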
URL: From ianb at colorstudy.com Tue Feb 28 00:44:02 2012 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 27 Feb 2012 17:44:02 -0600 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On Mon, Feb 27, 2012 at 5:31 PM, Michael Foord wrote: > > > On 27 February 2012 23:23, Mark Janssen wrote: > >> On Mon, Feb 27, 2012 at 3:59 PM, Michael Foord wrote: >>> >>> As well as fundamental problems, the particular implementation of >>> doctest suffers from these potentially resolvable problems: >>> >>> >>>> Execution of an individual testing section continues after a failure. >>>> So a single failure results in the *reporting* of potentially many failures. >>>> >>>> Hmm, perhaps I don't understand you. doctest reports how many failures >> occur, without blocking on any single failure. >> > > > Right. But you typically group a bunch of actions into a single "test". > If a doctest fails in an early action then every line after that will > probably fail - a single test failure will cause multiple *reported* > failures. > > >> >> >>> The problem of being dependent on order of unorderable types (actually >>>> very difficult to solve). >>>> >>> >> Well, a crude solution is just to lift any output text that denotes an >> non-ordered type and pass it through an "eval" operation. >> > > > Not a general solution - not all reprs are reversible (in fact very few > are as a proportion of all objects). > Just an implementation suggestion - Guido's suggestion of using sys.displayhook will work to change the repr of objects (I had never heard of it until then, and had to test to convince myself). Doctest needs reliable repr's more than reversable repr's, and you can create them using that. You'll still get a lot of strings, which suck... but if you are committed to doctest then maybe better to provide good __repr__ methods on your custom objects! 
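A minimal sketch of the `sys.displayhook` idea Ian describes — normalising set reprs so that interactive (and therefore doctest-style) output becomes deterministic. Sorting the element reprs is just one possible normalisation strategy, assumed here for illustration:

```python
import builtins
import sys

def stable_displayhook(value):
    """Like the default display hook, but prints sets with sorted elements."""
    if value is None:
        return                      # the default hook also ignores None
    builtins._ = value              # the default hook also stores the result
    if isinstance(value, (set, frozenset)):
        print("{%s}" % ", ".join(sorted(repr(v) for v in value)))
    else:
        print(repr(value))

sys.displayhook = stable_displayhook
# At the interactive prompt, evaluating {3, 1, 2} would now always
# display as {1, 2, 3} regardless of hash ordering.
```

This only controls what the *interpreter* echoes, so it helps when generating examples by copy-and-paste; doctest's own comparison of expected vs. actual output still happens on the text level.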
For doctest.js (where I implemented a number of changes I would have wanted for doctest in Python) I have found this sort of thing sufficient, but Javascript objects tend to be a little more bare and there aren't existing conventions for repr/print/etc, so I have some more flexibility in my implementation. Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Tue Feb 28 00:48:29 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 27 Feb 2012 16:48:29 -0700 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: On Mon, Feb 27, 2012 at 12:18 PM, Eric Snow wrote: > On Mon, Feb 27, 2012 at 11:45 AM, Mark Janssen > wrote: >> On Mon, Feb 27, 2012 at 11:35 AM, Rob Cliffe wrote: >>> I suggested a "mutable" attribute some time ago. >>> This could lead to finally doing away with one of Python's FAQs: Why does >>> python have lists AND tuples? ?They could be unified into a single type. >>> Rob Cliffe. >> >> Yeah, that would be cool. ?It would force (ok, *allow*) the >> documenting of any non-mutable attributes (i.e. when they're mutable, >> and why they're being set immutable, etc.). >> >> There an interesting question, then, should the mutable bit be on the >> Object itself (the whole type) or in each instance....? ?There's >> probably no "provable" or abstract answer to this, but rather just an >> organization principle to the language.... > > In contrast to a flag on objects, one alternative is to have a > __mutable__() method for immutable types and __immutable__() for > mutable types. ?I'd be nervous about being able to make an immutable > object mutable at an arbitrary moment with the associated effect on > the hash of the object. 
Just to be clear, I meant that __mutable__() would return a mutable version of the object, of a distinct mutable type, if the object supported one. So for a tuple, it would return the corresponding list. These would be distinct objects. Likewise obj.__immutable__() would return a separate, immutable version of obj. Such an approach could be applied to lists/tuples, sets/frozensets, strings/bytearrays, bytes/bytearrays, and any other pairings we already have. Unless a frozendict were added as a standard type, dict would not have a match so an __immutable__() method would not be added. In that case, trying to call dict.__immutable__() would be an AttributeError, as happens now. -eric From ncoghlan at gmail.com Tue Feb 28 01:15:19 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2012 10:15:19 +1000 Subject: [Python-ideas] doctest In-Reply-To: References: Message-ID: On Tue, Feb 28, 2012 at 9:44 AM, Ian Bicking wrote: > Just an implementation suggestion - Guido's suggestion of using > sys.displayhook will work to change the repr of objects (I had never heard > of it until then, and had to test to convince myself).? Doctest needs > reliable repr's more than reversable repr's, and you can create them using > that.? You'll still get a lot of > strings, which suck... but if you are committed to doctest then maybe better > to provide good __repr__ methods on your custom objects!? For doctest.js > (where I implemented a number of changes I would have wanted for doctest in > Python) I have found this sort of thing sufficient, but Javascript objects > tend to be a little more bare and there aren't existing conventions for > repr/print/etc, so I have some more flexibility in my implementation. You can actually do some pretty cool doctest hacks via displayhook and excepthook. 
I created a hacked together doctest variant [1] years ago that could run doctests from ODT files and also pay attention to sys.excepthook/displayhook before deciding that the test had failed. [1] http://svn.python.org/view/sandbox/trunk/userref/ Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue Feb 28 01:26:38 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 Feb 2012 10:26:38 +1000 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: On Tue, Feb 28, 2012 at 9:48 AM, Eric Snow wrote: > Such an approach could be applied to lists/tuples, sets/frozensets, > strings/bytearrays, bytes/bytearrays, and any other pairings we > already have. Unless a frozendict were added as a standard type, dict > would not have a match so an __immutable__() method would not be > added. In that case, trying to call dict.__immutable__() would be an > AttributeError, as happens now. Folks, before retreading this ground, please make sure to review the relevant past history and decide what (if anything) has changed since Barry proposed the freeze protocol 5 years ago and the PEP was rejected: http://www.python.org/dev/peps/pep-0351/ While hypergeneralisation of this behaviour is tempting, it really isn't a solid abstraction. It's better to make use case specific design decisions that handle all the corner cases relating to mutable vs immutable variants of *particular* container types. The issues you have to consider when converting a list to a tuple are not the same as those that exist when converting bytearray to bytes or a set to a frozenset. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From sven at marnach.net Tue Feb 28 00:39:06 2012 From: sven at marnach.net (Sven Marnach) Date: Mon, 27 Feb 2012 23:39:06 +0000 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: Message-ID: <20120227233906.GB3406@pantoffel-wg.de> An easy way to create immutable instances is 'collections.namedtuple':

    X = namedtuple("X", "a b")
    x = X(a=4, b=2)
    x.a + x.b   # fine
    x.a = 5     # AttributeError: can't set attribute
    x.c = 5     # AttributeError: 'X' object has no attribute 'c'

Tricks using 'object.__setattr__()' etc. will fail since the instance doesn't have a '__dict__'. The only data in the instance is stored in a tuple, so it's as immutable as a tuple. You can also derive from 'X' to add further methods. Remember to set '__slots__' to an empty iterable to maintain immutability. Cheers, Sven From ericsnowcurrently at gmail.com Tue Feb 28 01:51:17 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 27 Feb 2012 17:51:17 -0700 Subject: [Python-ideas] Support other dict types for type.__dict__ In-Reply-To: References: <4F48DF86.7060600@nedbatchelder.com> <4F496954.30101@pearwood.info> <4F4BCD0D.8040706@btinternet.com> Message-ID: On Mon, Feb 27, 2012 at 5:26 PM, Nick Coghlan wrote: > On Tue, Feb 28, 2012 at 9:48 AM, Eric Snow wrote: >> Such an approach could be applied to lists/tuples, sets/frozensets, >> strings/bytearrays, bytes/bytearrays, and any other pairings we >> already have. Unless a frozendict were added as a standard type, dict >> would not have a match so an __immutable__() method would not be >> added. In that case, trying to call dict.__immutable__() would be an >> AttributeError, as happens now.
> > Folks, before retreading this ground, please make sure to review the > relevant past history and decide what (if anything) has changed since > Barry proposed the freeze protocol 5 years ago and the PEP was > rejected: http://www.python.org/dev/peps/pep-0351/ > > While hypergeneralisation of this behaviour is tempting, it really > isn't a solid abstraction. It's better to make use case specific > design decisions that handle all the corner cases relating to mutable > vs immutable variants of *particular* container types. The issues you > have to consider when converting a list to a tuple are not the same as > those that exist when converting bytearray to bytes or a set to a > frozenset. Point taken. :) I knew I'd heard the idea somewhere. I appreciate how Raymond reacts here: http://mail.python.org/pipermail/python-dev/2006-February/060802.html and how Greg Ewing responds here: http://mail.python.org/pipermail/python-dev/2006-February/060822.html My point was that an __immutable__ flag was not a good idea. However, I agree that the generic protocol is likewise inadvisable because it fosters a generic design approach where a generic one is not appropriate. -eric From ethan at stoneleaf.us Tue Feb 28 01:00:02 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 27 Feb 2012 16:00:02 -0800 Subject: [Python-ideas] Fwd: doctest (and.... python3000) In-Reply-To: References: <4F4C00B6.9020406@stoneleaf.us> <4F4C0A18.3060802@stoneleaf.us> Message-ID: <4F4C1902.4040802@stoneleaf.us> Mark Janssen wrote: > On Mon, Feb 27, 2012 at 3:56 PM, Ethan Furman > wrote: > > As probably the easiest example, what is gained by having regression > tests as a document? With unittest you write a test with the > expected output and your done. I would imagine a doctest being > something like > > """This bug introduced in version 2.7.1, fixed in 2.7.2 > >>> this = quibble('that') > >>> this.attr > 'correct value' > """ > > > Huh? 
Perhaps I'm being dumb, but this is generally done outside of > unittest and within the code itself, something like:

> if sys.version > 2.4:
>     this = quibbleV4
> else:
>     this = quibbleV3

The choice of version numbers similar to Python's was probably a mistake. The point is quibbleV3 has a bug in it, and I want to make sure that bug doesn't come back in later versions -- so I add a test in my unit tests to make sure that it doesn't. ~Ethan~ From dreamingforward at gmail.com Tue Feb 28 05:34:02 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 21:34:02 -0700 Subject: [Python-ideas] adding a Debug exception? Message-ID: Had an idea on another thread (doctest) about a special exception called "Debug" that could be raised to generate arbitrary output to stderr. This would be used instead of spurious print statements in code to inform developers during debugging (which might throw off doctest, for example). It could also replace "assert" (and improve upon it), which seems to be deprecated. Also, the __debug__ global could actually gain some functionality... Its "argument" could be an `eval`uatable string (checked at compile time) and its output, this very string *plus* the output of it (if it evaluates to something different than itself). Just an idea.... mark santa fe -------------- next part -------------- An HTML attachment was scrubbed... URL: From dreamingforward at gmail.com Tue Feb 28 05:52:10 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 21:52:10 -0700 Subject: [Python-ideas] Fwd: [Python-Dev] matrix operations on dict :) In-Reply-To: References: <87d39oycrv.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: More messages I didn't realize weren't being sent to the group....[mark] On Wed, Feb 8, 2012 at 7:13 PM, Stephen J. Turnbull wrote: > Mark Janssen writes: > > > The math (in my world) simply decided that factorial(0)=1 as the > > convention of "an empty product" (Wikipedia::Factorial).
> > In modern math (ie, post-Eilenberg-Mac Lane), it's not really a > convention (unlike, say, Euclid's Parallel Postulate); it's the only > way to go if you want the idea of product to generalize. If you don't > understand that, I have serious doubts that you know what you're > talking about. If you do understand that, please take care to be more > precise. > Awesome. I didn't know anyone else really understood this kind of issue. Yes, I want the idea to generalize. In this case, not of "product" and arithmetic (in a mathematical space), but of "object model" and the notion of "grouping" (in a set-theoretical space). So a formalization must be made, and perhaps this arena will be the place to do that. I have to say that I'm approaching this from in the domain of computer science, so in some ways creating a definition in a new "space", or at least a space separate from the Platonian "Abstract" of mathematics. Love it! cheers! Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From dreamingforward at gmail.com Tue Feb 28 05:53:41 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 27 Feb 2012 21:53:41 -0700 Subject: [Python-ideas] [Python-Dev] matrix operations on dict :) In-Reply-To: References: <87d39oycrv.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: Perhaps more specifically, I want to define a "grouping" (as encapsulated semantically and syntactically in a dict), at the place where the transition from atomic element into a group occurs. Syntactically this will be *denoted* by the curly brackets {}, operationally this will be defined in the CPython code itself. But the semantics in the middle must be "hashed out" for either to occur (pardon the pun). The question is whether it's premature to attempt such a task or not.... mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Tue Feb 28 07:39:50 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 28 Feb 2012 17:39:50 +1100 Subject: [Python-ideas] adding a Debug exception? In-Reply-To: References: Message-ID: <20120228063949.GA22075@ando> On Mon, Feb 27, 2012 at 09:34:02PM -0700, Mark Janssen wrote: > Had an idea on another thread (doctest) about a special exception called > "Debug" that could be raised to generate arbitrary output to stderr. > This would be used instead of spurious print statements in code to inform > developers during debugging (which might throw off doctest, for example). Printing to sys.stderr does not throw off doctest. If you use

    print(something, file=sys.stderr)  # Python 3
    print >>sys.stderr, something      # Python 2

the output is invisible to doctest. > It could also replace "assert" (and improve upon it) which seems to be > deprecated. What makes you think assert is deprecated? Informational messages printed to stderr and assertions are completely different functions. You can't replace one with the other. > Also, the __debug__ global could actually gain some > functionality... What makes you think it doesn't? __debug__ is very useful for conditional compilation of debugging code that is safe to optimise away when running under -O. I use it in most of my projects. > Its "argument" could be an `eval`uatable string (checked at compile time) > and its output, this very string *plus* the output of it (if it evaluates > to something different than itself). So you mean, anything except a quine would be printed? I don't get what you mean, or how you intend for this to be used.
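Steven's two points above can be combined in one small runnable sketch (the function and its debug message are invented for illustration): output sent to sys.stderr never appears in a doctest's expected output, and the `if __debug__:` guard lets -O strip the debugging code entirely.

```python
import sys

def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    """
    if __debug__:
        # Sent to stderr, so doctest (which compares only stdout) ignores it;
        # the whole branch is optimised away under python -O.
        print(f"add called with {a!r}, {b!r}", file=sys.stderr)
    return a + b
```

Running `python -m doctest` on a module containing this function passes even though the debug line fires on every call.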
-- Steven From steve at pearwood.info Tue Feb 28 08:59:56 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 28 Feb 2012 18:59:56 +1100 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: <20120220132832.76b772da@resist.wooz.org> Message-ID: <20120228075956.GB22075@ando> On Mon, Feb 27, 2012 at 12:28:17PM -0700, Mark Janssen wrote: > On Mon, Feb 20, 2012 at 11:28 AM, Barry Warsaw wrote: > > On Feb 17, 2012, at 02:57 PM, Mark Janssen wrote: > > FWIW, I think doctests are fantastic and I use them all the time. There are > > IMO a couple of things to keep in mind: > > > > - doctests are documentation first. Specifically, they are testable > > documentation. What better way to ensure that your documentation is > > accurate and up-to-date? (And no, I do not generally find skew between the > > code and the separate-file documentation.) I second what Barry says here. Doctests are for documentation, or at least, doctests in docstrings are for documentation, which means they should be simple and minimal, and only cover the most important parts of your function. Certainly they should only cover the interface, and never the implementation, so not used for regression testing or coverage of odd corner cases. I have no problem with extensive doctests if they are put in an external document. I do this myself. But when I call help(func), I want to learn how to use func, not see seven pages of tests that don't help me understand how to use the function. > > My doctests usually describe mostly the good path through the API. > > Occasionally I'll describe error modes if I think those are important for > > understanding how to use the code. However, for all those fuzzy corner cases, > > weird behaviors, bug fixes, etc., unittests are much better suited because > > ensuring you've fixed these problems and don't regress in the future doesn't > > help the narrative very much.
And again, +1 with what Barry says here, which means I disagree with your response: > I think this is an example of (mal)adapting to an incomplete module, rather > than fixing it. I think doctest can handle all the points you're > making. See clarification pointers below... I don't accept this argument. Doctest is designed for including example code in documentation, and ensuring that the examples are correct. For that, it does a very good job. It makes a great hammer. Don't use it when you need a spanner. It's not that doctest can't handle regression tests, but that regression tests shouldn't be put inside the function docstring. Why should people see a test case for some bug that occurred three versions back in the documentation? Put it in a separate test suite, either unit tests, or a literate programming doc using doctest. Don't pollute the docstring with tests that aren't useful documentation. When people read your docstring, you have to expect that they are reading it in isolation. They want to know "How do I use function spam?", and any example code should show them how to use function spam:

    >>> data = (23, 42, 'foo', 9)
    >>> collector = [5]
    >>> spam(data, collector)
    >>> collector
    [5, 28, 70, None, 79]

That works as documentation first, and as a test second. This does not:

    >>> spam(data, collector)  # data and collector defined elsewhere
    >>> collector
    [5, 28, 70, None, 79]

The fact that each docstring sees a fresh execution context is a good thing, not a bug. > >>1. Execution context determined by outer-scope doctest definitions. > > > > Can you explain this one? > > I gave an example in a prior message on this thread, dated Feb 17. I > think it's clear there but let me know. > > Basically, the idea is that since the class def can also have a > docstring, where better would setup and teardown code go to provide > the execution context of the inner method docstrings? Is that a trick question?
I don't want docstrings to have automatic setup and teardown code at all, and if they did, I certainly don't want them to be in some other object's docstring (related or not). > Now the question: is it useful or appropriate to put setup and > teardown code in a classdef docstring? In my opinion, no, neither useful nor appropriate. It would be counter-productive, by reducing the value of documentation as documentation, while still being insufficiently powerful to replace unit tests. A big -1 on this. [...] > Well, hopefully, I've convinced you a little that the limitations in > doctests over unittests are almost, if not entirely due, to the > incompleteness of the module. If the two items I mentioned were > implemented I think it would be far superior to unittest. I already think that doctest is far superior to unittest, for testing executable examples in documentation. I don't think it is superior to unittest for unit testing, or regression testing. Nor is it inferior -- it's just different. > (Corner > cases, etc. can all find a place, because every corner case should be > documented somewhere anyway!!) I think you have a different idea of "corner case" than I do. Corner cases, in my experience, refer to the implementation: does the function work correctly when the input is in the corner? Since this is testing the implementation, it shouldn't be in the documentation. The classic example is, does this list-function work when the list is empty? So I would expect that the unit tests for, say, the sorted() built-in will include a test case for sorted([]). (This applies regardless of whether you use unittest, doctest, nose, or some other testing framework.) But the documentation for sorted() doesn't need to explicitly state that sorting an empty list returns an empty list. That's a given from the accepted meaning of sorting -- if there's nothing to sort, you get nothing. Nor does it need to explicitly state that sorting a list with one item returns a list with one item.
A single example of sorting a list of (say) four items is sufficient to document the purpose of sorted(), but it would be completely insufficient for unit testing purposes. -- Steven From ben+python at benfinney.id.au Tue Feb 28 11:24:46 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 28 Feb 2012 21:24:46 +1100 Subject: [Python-ideas] Fwd: doctest References: <20120220132832.76b772da@resist.wooz.org> <20120228075956.GB22075@ando> Message-ID: <87r4xfp85t.fsf@benfinney.id.au> Steven D'Aprano writes: > On Mon, Feb 27, 2012 at 12:28:17PM -0700, Mark Janssen wrote: > > Well, hopefully, I've convinced you a little that the limitations in > > doctests over unittests are almost, if not entirely due, to the > > incompleteness of the module. I don't think 'doctest' is incomplete. It comprehensively covers the use case for which it is designed. The 'unittest' module is limited in usefulness at testing code examples in documentation. That doesn't make it incomplete, either. > > If the two items I mentioned were implemented I think it would be > > far superior to unittest. > > I already think that doctest is far superior to unittest, for testing > executable examples in documentation. I don't think it is superior to > unittest for unit testing, or regression testing. Nor is it inferior > -- it's just different. +1 QotW. -- \ "The fact of your own existence is the most astonishing fact | `\ you'll ever have to confront. Don't dare ever see your life as | _o__) boring, monotonous, or joyless." 
--Richard Dawkins, 2010-03-10 | Ben Finney From ben+python at benfinney.id.au Tue Feb 28 12:40:37 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 28 Feb 2012 22:40:37 +1100 Subject: [Python-ideas] doctest References: Message-ID: <87ipirp4ne.fsf@benfinney.id.au> Ian Bicking writes: > On Feb 17, 2012 4:12 PM, "Nick Coghlan" wrote: > > An interesting third party alternative that has been created > > recently is behave: http://crate.io/packages/behave/ > > This style of test is why it's so sad that doctest is ignored and > unmaintained. I don't see why you draw a connection. There doesn't, to me, seem any need to expand the capabilities of 'doctest': it does what it says on the tin, and does it well. Other tasks require other tools. > [the 'behave' library is] based on testing patterns developed by > people who care to promote what they are doing, but I'm of the strong > opinion that they are inferior to doctest. I think the code-examples-in-documentation approach is a good thing to have, and it's what 'doctest' excels at. I don't think distorting behaviour-driven specifications, of the kind 'behave' is designed to read, to fit the doctest model would be a good thing. Can you present an argument why you think it would? -- \ "Now Maggie, I'll be watching you too, in case God is busy | `\ creating tornadoes or not existing." --Homer, _The Simpsons_ | _o__) | Ben Finney From ned at nedbatchelder.com Tue Feb 28 13:17:52 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Tue, 28 Feb 2012 07:17:52 -0500 Subject: [Python-ideas] adding a Debug exception? In-Reply-To: References: Message-ID: <4F4CC5F0.1070004@nedbatchelder.com> On 2/27/2012 11:34 PM, Mark Janssen wrote: > Had an idea on another thread (doctest) about a special exception > called "Debug" that could be raised to generate arbitrary output > to stderr. 
This would be used instead of spurious print statements in > code to inform developers during debugging (which might throw off > doctest, for example). It could also replace "assert" (and improve > upon it) which seems to be deprecated. Also, the __debug__ global > could actually gain some functionality... > Raising an exception to generate a log message? You'd never execute the statement after the raise, completely destroying the flow of the code. Perhaps this idea needs a little more thought... Have you looked into the logging module? --Ned. > > mark > santa fe > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Tue Feb 28 21:59:53 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 28 Feb 2012 15:59:53 -0500 Subject: [Python-ideas] Fwd: doctest In-Reply-To: <20120228075956.GB22075@ando> References: <20120220132832.76b772da@resist.wooz.org> <20120228075956.GB22075@ando> Message-ID: On 2/28/2012 2:59 AM, Steven D'Aprano wrote: > Corner cases, in my experience, refer to the implementation: does the > function work correctly when the input is in the corner? Since this is > testing the implementation, it shouldn't be in the documentation. It depends on the function. fact(0) = 1 could go in a doc string. So, I think, could combo(0,0) = 1. But sometimes implementations introduce a special case that is an artifact of the implementation and definitely does not belong in a doc string. A real example is approximating the normal integral (cumulative normal distribution) with different approximations for [0,a] and (a,infinity). Unit tests should test f(a) and f(a+epsilon) and the difference (should be >= 0) to make sure the two approximations 'join' properly so as to be transparent to the user. If the join point changes, so does the special unit test.
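Terry's join-point check might look like the following unit-test sketch, with a toy piecewise function standing in for the two normal-integral approximations (the join point, formulas, and tolerances here are invented for illustration):

```python
JOIN = 1.0   # hypothetical point where the two approximations meet
EPS = 1e-9

def approx(x):
    # Toy piecewise approximation: one formula on [0, JOIN],
    # a different expression on (JOIN, infinity) that continues it.
    if x <= JOIN:
        return 0.5 * x
    return 0.5 * JOIN + 0.5 * (x - JOIN)

def test_join_is_transparent():
    lo = approx(JOIN)
    hi = approx(JOIN + EPS)
    diff = hi - lo
    # The approximations must meet monotonically at the seam,
    # with no visible jump for the user.
    assert diff >= 0
    assert diff < 1e-6

test_join_is_transparent()
```

If the implementation moved the join point, only JOIN (and this test) would change, while the documented behaviour stays the same, which is exactly why such a test belongs in the unit tests rather than the docstring.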
-- Terry Jan Reedy From barry at python.org Tue Feb 28 23:14:14 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 28 Feb 2012 17:14:14 -0500 Subject: [Python-ideas] doctest References: Message-ID: <20120228171414.3cc7e38a@limelight.wooz.org> On Feb 27, 2012, at 08:35 PM, Michael Foord wrote: >The problem of being dependent on order of unorderable types (actually very >difficult to solve). Actually, not so much, only because IME, I find that I rarely want to just dump the repr of such objects. That's usually going to be hard to read even if the output were sorted. Instead, I very often iterate over the items (in sorted order of course), and use ellipses to ignore the lines (i.e. items) I don't care about. In practice, I haven't found this one to be so bad. >Things like shared fixtures and mocking become *harder* (although by no >means impossible) in a doctest environment. Not if you use separate DocFileSuites. >Another thing I dislike is that it encourages a "test last" approach, as by >far the easiest way of generating doctests is to copy and paste from the >interactive interpreter. The alternative is lots of annoying typing of >'>>>' and '...', and as you're editing text and not code IDE support tends >to be worse (although this is a tooling issue and not a problem with >doctest itself). Actually, Emacs users should use rst-mode, which has not so bad support for separate file doctests. Of course, the mode is useful for reST documentation even if your documentation is untested. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Tue Feb 28 23:19:18 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 28 Feb 2012 17:19:18 -0500 Subject: [Python-ideas] doctest References: Message-ID: <20120228171918.3db69bd2@limelight.wooz.org> On Feb 27, 2012, at 05:44 PM, Ian Bicking wrote: >Doctest needs reliable reprs more than reversible reprs, and you can create >them using that. You'll still get a lot of <... object at 0x391a9df> strings, which suck... but if you are committed to doctest then >maybe better to provide good __repr__ methods on your custom objects! +1 even if you don't use doctests! I can't tell you how many times adding a useful repr has vastly improved debugging. I urge everyone to flesh out your reprs with a little bit of useful information so you can quickly identify your instances at a pdb prompt. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From ethan at stoneleaf.us Tue Feb 28 23:16:21 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 28 Feb 2012 14:16:21 -0800 Subject: [Python-ideas] Fwd: doctest In-Reply-To: References: <4F4BE07C.1000505@stoneleaf.us> Message-ID: <4F4D5235.5000601@stoneleaf.us> Mark Janssen wrote: > On Mon, Feb 27, 2012 at 12:58 PM, Ethan Furman wrote: >> The other gripe I have (possibly easily fixed): my python prompt is '-->' >> (makes email posting easier) -- should my doctests still use '>>>'? Will >> doctest fail on my machine? > > As written, yes, but easily changeable in the module code for your > unique case.... Which means my doctests will then fail on other's machines unless they also change their local module *and* their python prompt. Not good. ~Ethan~ From ncoghlan at gmail.com Wed Feb 29 00:07:13 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 Feb 2012 09:07:13 +1000 Subject: [Python-ideas] More helpers in reprlib (was Re: doctest) Message-ID: On Wed, Feb 29, 2012 at 8:19 AM, Barry Warsaw wrote: > On Feb 27, 2012, at 05:44 PM, Ian Bicking wrote: > >>Doctest needs reliable repr's more than reversable repr's, and you can create >>them using that. ?You'll still get a lot of >0x391a9df> strings, which suck... but if you are committed to doctest then >>maybe better to provide good __repr__ methods on your custom objects! > > +1 even if you don't use doctests! ?I can't tell you how many times adding a > useful repr has vastly improved debugging. ?I urge everyone to flesh out your > reprs with a little bit of useful information so you can quickly identify your > instances at a pdb prompt. Since this question came up recently, what do you think of adding some more helpers to reprlib to make this even easier to do? I know I just added some utility functions to PulpDist [1] to avoid reinventing that particular wheel for each of my class definitions. Cheers, Nick. 
[1] http://git.fedorahosted.org/git/?p=pulpdist.git;a=blob;f=src/pulpdist/core/util.py Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From steve at pearwood.info Wed Feb 29 00:49:32 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 29 Feb 2012 10:49:32 +1100 Subject: [Python-ideas] More helpers in reprlib (was Re: doctest) In-Reply-To: References: Message-ID: <4F4D680C.9000603@pearwood.info> Nick Coghlan wrote: > On Wed, Feb 29, 2012 at 8:19 AM, Barry Warsaw wrote: >> On Feb 27, 2012, at 05:44 PM, Ian Bicking wrote: >> >>> Doctest needs reliable repr's more than reversable repr's, and you can create >>> them using that. You'll still get a lot of >> 0x391a9df> strings, which suck... but if you are committed to doctest then >>> maybe better to provide good __repr__ methods on your custom objects! >> +1 even if you don't use doctests! I can't tell you how many times adding a >> useful repr has vastly improved debugging. I urge everyone to flesh out your >> reprs with a little bit of useful information so you can quickly identify your >> instances at a pdb prompt. > > Since this question came up recently, what do you think of adding some > more helpers to reprlib to make this even easier to do? Your question is too general. Of course people should be in favour of helpers to simplify making good reprs, but that's like asking if people are in favour of solving world hunger. Who wouldn't be? But the answer should depend on what the helpers do, how well they do them, and whether or not they actually help. I fear getting carried away with enthusiasm for repr helpers and dumping a lot of unnecessary, trivial or sub-optimal helpers in reprlib, where they will be enshrined as "the one obvious way to do it" when perhaps they shouldn't be. Since you are the author of them, I'm sure that they scratch your itches, but will they scratch other people's? 
I suggest publishing them as recipes on ActiveState first, and see what feedback you get. http://code.activestate.com/recipes/langs/python/top/ -- Steven From ben+python at benfinney.id.au Wed Feb 29 02:23:30 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 29 Feb 2012 12:23:30 +1100 Subject: [Python-ideas] [OT] rst-mode (was: doctest) References: <20120228171414.3cc7e38a@limelight.wooz.org> Message-ID: <87aa42ph4d.fsf_-_@benfinney.id.au> Barry Warsaw writes: > Actually, Emacs users should use rst-mode, which has no so bad support > for separate file doctests. Of course, the mode is useful for reST > documentation even if your documentation is untested . Any idea where I should send bug reports for ?rst-mode?? It's not clear to me who develops it. -- \ ?Welchen Teil von ?Gestalt? verstehen Sie nicht? [What part of | `\ ?gestalt? don't you understand?]? ?Karsten M. Self | _o__) | Ben Finney -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 835 bytes Desc: not available URL: From craigyk at me.com Wed Feb 29 08:02:51 2012 From: craigyk at me.com (Craig Yoshioka) Date: Tue, 28 Feb 2012 23:02:51 -0800 Subject: [Python-ideas] revisit pep 377: good use case? Message-ID: So I've recently been trying to implement something for which I had hoped the 'with' statement would be perfect, but it turns out, won't work because Python provides no mechanism by which to skip the block of code in a with statement. I want to create some functionality to make it easy too wrap command line programs in a caching architecture. 
To do this there are some things that need to happen before and after the wrapped CLI program is called, a try,except,finally version might look like this: def cachedcli(*args): try: hashedoutput = hashon(args) if iscached(): return hashedoutput acquirelock() cli(*args,hashedoutput) iscached(True) return hashedoutput except AlreadyLocked: while locked: wait() return example(*args) finally: releaselock() the 'with' version would look like def cachedcli(*args) hashedpath = hashon(args) with cacheon(hashedpath): cli(hashedpath,*args) return hashedpath So obviously the 'with' statement would be a good fit, especially since non-python programmers might be wrapping their CLI programs... unfortunately I can't use 'with' because I can't find a clean way to make the with block code conditional. PEP377 suggested some mechanics that seemed a bit complicated for getting the desired effect, but I think, and correct me if I'm wrong, that the same effect could be achieved by having the __enter__ function raise a StopIteration that would be caught by the context and skip directly to the __exit__ function. The semantics of this even make some sense too me, since the closest I've been able to get to what I had hoped for was using an iterator to execute the appropriate code before and after the loop block: def cachedcli(*args) hashedpath = hashon(args) for _ in cacheon(hashedpath): cli(hashedpath,*args) return hashedpath this still seems non-ideal to me... From ncoghlan at gmail.com Wed Feb 29 09:23:39 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 Feb 2012 18:23:39 +1000 Subject: [Python-ideas] revisit pep 377: good use case? 
In-Reply-To: References: Message-ID: On Wed, Feb 29, 2012 at 5:02 PM, Craig Yoshioka wrote: > PEP 377 suggested some mechanics that seemed a bit complicated for getting the desired effect, but I think, and correct me if I'm wrong, that the same effect could be achieved by having the __enter__ function raise a StopIteration that would be caught by the context and skip directly to the __exit__ function. It was the overhead of doing exception handling around the __enter__ call that got PEP 377 rejected. One way to handle this case is to use a separate if statement to make the flow control clear:

    with cm() as run_body:
        if run_body:
            # Do stuff

Depending on the use case, the return value from __enter__ may be a simple flag as shown, or it may be a more complex object. Alternatively, you may want to investigate contextlib2, which aims to provide improved support for conditional cleanup in with statements. (In the current version, this is provided by contextlib2.ContextStack, but the next version will offer an improved API as contextlib2.CallbackStack. No current ETA on the next update, though.) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From taleinat at gmail.com Wed Feb 29 15:07:19 2012 From: taleinat at gmail.com (Tal Einat) Date: Wed, 29 Feb 2012 16:07:19 +0200 Subject: [Python-ideas] revisit pep 377: good use case? In-Reply-To: References: Message-ID: On Wed, Feb 29, 2012 at 09:02, Craig Yoshioka wrote: > So I've recently been trying to implement something for which I had hoped the 'with' statement would be perfect, but it turns out, won't work because Python provides no mechanism by which to skip the block of code in a with statement. > > I want to create some functionality to make it easy to wrap command line programs in a caching architecture. To do this there are some things that need to happen before and after the wrapped CLI program is called; a try/except/finally version might look like this: > > def cachedcli(*args):
?try: > ? ? ? ?hashedoutput = hashon(args) > ? ? ? ?if iscached(): > ? ? ? ? ? ?return hashedoutput > ? ? ? ?acquirelock() > ? ? ? ?cli(*args,hashedoutput) > ? ? ? ?iscached(True) > ? ? ? ?return hashedoutput > ? ?except AlreadyLocked: > ? ? ? ?while locked: > ? ? ? ? ? ?wait() > ? ? ? ?return example(*args) > ? ?finally: > ? ? ? ?releaselock() > > the 'with' version would look like > > def cachedcli(*args) > ? ?hashedpath = hashon(args) > ? ?with cacheon(hashedpath): > ? ? ? ? cli(hashedpath,*args) > ? ?return hashedpath > > > So obviously the 'with' statement would be a good fit, especially since non-python programmers might be wrapping their CLI programs... unfortunately I can't use 'with' because I can't find a clean way to make the with block code conditional. > > PEP377 suggested some mechanics that seemed a bit complicated for getting the desired effect, but I think, and correct me if I'm wrong, that the same effect could be achieved by having the __enter__ function raise a StopIteration that would be caught by the context and skip directly to the __exit__ function. ?The semantics of this even make some sense too me, since the closest I've been able to get to what I had hoped for was using an iterator to execute the appropriate code before and after the loop block: > > def cachedcli(*args) > ? ?hashedpath = hashon(args) > ? ?for _ in cacheon(hashedpath): > ? ? ? ? cli(hashedpath,*args) > ? ?return hashedpath > > this still seems non-ideal to me... Specifically with regard to caching, I recommend writing a CLI execution class which implements the caching logic internally. If you really want to do this with some special syntax sugar, use decorators, which are good for wrapping functions/methods with caching. The "with" statement is IMO not suitable here (and rightfully so). 
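Tal's decorator recommendation needs no language change; a minimal in-memory sketch (the names cached, run_cli, and the dict store are hypothetical stand-ins, not the on-disk, lock-protected design discussed in the thread):

```python
import functools

def cached(func):
    # Memoize results by positional args so repeated calls reuse the
    # stored result instead of re-running the expensive work.
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:
            store[args] = func(*args)
        return store[args]
    return wrapper

@cached
def run_cli(x):
    # stand-in for an expensive external CLI invocation
    return x * 10000
```

A real version would hash the args to an output path and store results on disk, as in Craig's examples.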
- Tal Einat

From fuzzyman at gmail.com  Wed Feb 29 15:24:30 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Wed, 29 Feb 2012 14:24:30 +0000
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
References:
Message-ID:

On 29 February 2012 08:23, Nick Coghlan wrote:
> On Wed, Feb 29, 2012 at 5:02 PM, Craig Yoshioka wrote:
> > PEP377 suggested some mechanics that seemed a bit complicated for
> getting the desired effect, but I think, and correct me if I'm wrong, that
> the same effect could be achieved by having the __enter__ function raise a
> StopIteration that would be caught by the context and skip directly to the
> __exit__ function.
>
> It was the overhead of doing exception handling around the __enter__
> call that got PEP 377 rejected.
>
> One way to handle this case is to use a separate if statement to make
> the flow control clear.
>
>     with cm() as run_body:
>         if run_body:
>             # Do stuff
>
> Depending on the use case, the return value from __enter__ may be a
> simple flag as shown, or it may be a more complex object.
>

The trouble with this is it indents all your code an extra level. One
possibility would be allowing continue in a with statement as an early
exit:

    with cm() as run_body:
        if not run_body:
            continue

Michael

>
> Alternatively, you may want to investigate contextlib2, which aims to
> provide improved support for conditional cleanup in with statements.
> (in the current version, this is provided by contextlib2.ContextStack,
> but the next version will offer an improved API as
> contextlib2.CallbackStack. No current ETA on the next update though)
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From craigyk at me.com  Wed Feb 29 18:49:17 2012
From: craigyk at me.com (Craig Yoshioka)
Date: Wed, 29 Feb 2012 17:49:17 +0000 (GMT)
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
Message-ID: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>

I've tried classes, decorators, and passing the conditional using 'as', as suggested by Michael, so I disagree that with is not suitable here since I have yet to find a better alternative. If you want I can give pretty concrete examples of the ways they aren't as good. Furthermore, I think it could be argued that it makes more sense to be able to safely skip the with body without the user of the with statement having to manually catch the exception themselves... we don't make people catch the StopIteration exception manually when using iterators...

1) I can't think of many instances in Python where a block of code cannot be conditionally executed safely:
    if - obvious
    functions - need to be called
    loops - can have 0 or more iterations
    try/except/finally - even here there is the same notion of the code blocks being conditionally executed, just a bit more scrambled

In my view, the 'with' statement exists just because it is nice sugar for bracketing boilerplate around a block of code, so it might as well do that in the most general, reasonable way. And I think this behavior is pretty reasonable.
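For reference, the flag-based workaround Nick suggested (and Michael commented on) is expressible today; a minimal sketch, with uncached as a hypothetical simplified stand-in for the thread's locking version:

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def uncached(path):
    # __enter__ hands the body a flag saying whether it should run;
    # any cleanup (e.g. releasing a lock) would go after the yield.
    yield not os.path.exists(path)

target = os.path.join(tempfile.gettempdir(), "demo-output-not-created")
with uncached(target) as run_body:
    if run_body:
        pass  # compute and write the result here
```

The body still has to test the flag itself, which is exactly the extra indentation and boilerplate the thread is complaining about.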
On Feb 29, 2012, at 06:07 AM, Tal Einat wrote:

On Wed, Feb 29, 2012 at 09:02, Craig Yoshioka wrote:
> So I've recently been trying to implement something for which I had hoped the 'with' statement would be perfect, but it turns out, won't work because Python provides no mechanism by which to skip the block of code in a with statement.
>
> I want to create some functionality to make it easy to wrap command line programs in a caching architecture. To do this there are some things that need to happen before and after the wrapped CLI program is called; a try/except/finally version might look like this:
>
> def cachedcli(*args):
>     try:
>         hashedoutput = hashon(args)
>         if iscached():
>             return hashedoutput
>         acquirelock()
>         cli(hashedoutput, *args)
>         iscached(True)
>         return hashedoutput
>     except AlreadyLocked:
>         while locked:
>             wait()
>         return example(*args)
>     finally:
>         releaselock()
>
> the 'with' version would look like:
>
> def cachedcli(*args):
>     hashedpath = hashon(args)
>     with cacheon(hashedpath):
>         cli(hashedpath, *args)
>     return hashedpath
>
> So obviously the 'with' statement would be a good fit, especially since non-Python programmers might be wrapping their CLI programs... unfortunately I can't use 'with' because I can't find a clean way to make the with block code conditional.
>
> PEP377 suggested some mechanics that seemed a bit complicated for getting the desired effect, but I think, and correct me if I'm wrong, that the same effect could be achieved by having the __enter__ function raise a StopIteration that would be caught by the context and skip directly to the __exit__ function. The semantics of this even make some sense to me, since the closest I've been able to get to what I had hoped for was using an iterator to execute the appropriate code before and after the loop block:
>
> def cachedcli(*args):
>     hashedpath = hashon(args)
>     for _ in cacheon(hashedpath):
>         cli(hashedpath, *args)
>     return hashedpath
>
> this still seems non-ideal to me...

Specifically with regard to caching, I recommend writing a CLI execution class which implements the caching logic internally. If you really want to do this with some special syntax sugar, use decorators, which are good for wrapping functions/methods with caching.

The "with" statement is IMO not suitable here (and rightfully so).

- Tal Einat
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From raymond.hettinger at gmail.com  Wed Feb 29 20:30:36 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 29 Feb 2012 11:30:36 -0800
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Feb 17, 2012, at 1:57 PM, Mark Janssen wrote:
> I find myself wanting to use doctest for some test-driven development,
> and find myself slightly frustrated

ISTM that you're doing it wrong ;-)

Doctests are all about testing documentation, not about unit testing. And because they are very literal (in fact, intentionally stupid with respect to whitespace), doctests are inappropriate for test-driven development. It is *much* easier to test the function by hand and then cut-and-paste the test/result pair into the docstring.

Extending the doctest module to support your style of using it would likely be counter-productive, as that would encourage more people to use the wrong tool for the job -- the doctest style is almost completely at odds with the principles of unit testing (i.e. isolated/independent tests, etc).

My clients tend to use doctests quite a bit (that is what I teach), yet the need for doctest extensions almost never arises when it is being used as designed. I suggest that you try out some other third-party testing packages that are designed to accommodate other testing styles.

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From craigyk at me.com  Wed Feb 29 20:44:01 2012
From: craigyk at me.com (Craig Yoshioka)
Date: Wed, 29 Feb 2012 11:44:01 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <4F4E7CA8.7010109@stoneleaf.us>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us>
Message-ID: <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com>

Ok, I'll go clean them up to try and present them as concisely as possible. The code to skip the with body would have to go in the __enter__ method because whether the body should be executed is dependent on the semantics of the context being used. Imagine a context that looks like:

    with uncached('file') as file:
        # write data to file

Making the context skippable only from __enter__ means the person writing the context can be more confident of the possible code paths. And the person writing the body code could always just 'skip' manually anyways by returning early, i.e. per Michael's suggestion:

    with uncached('file') as file:
        if not file:
            return

which isn't so bad, except it is overloading the meaning of file a bit, and why shouldn't the with block be skippable? I can see a couple of ways it might work:

1) catch a raised StopIteration, or a new 'SkipWithBlock', exception thrown from the __enter__ code
2) skip the with block when __enter__ returns a unique value like SkipWithBlock, otherwise assign the returned value using 'as'

In my mind 2 should be easy to implement, and shouldn't break any existing code since the new sentinel value didn't exist before anyways. Maybe it would also be more efficient than wrapping __enter__ in yet another try/except/finally?

On Feb 29, 2012, at 11:29 AM, Ethan Furman wrote:

> Craig Yoshioka wrote:
>> I've tried classes, decorators, and passing the conditional using 'as', as suggested by Michael, so I disagree that with is not suitable here since I have yet to find a better alternative. If you want I can give pretty concrete examples of the ways they aren't as good.
>
> I would be interested in your concrete examples.
>
> As far as conditionally skipping the with body, where would that code go? In __enter__? How would it know whether or not to skip?
>
> ~Ethan~

From ethan at stoneleaf.us  Wed Feb 29 20:29:44 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 29 Feb 2012 11:29:44 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>
Message-ID: <4F4E7CA8.7010109@stoneleaf.us>

Craig Yoshioka wrote:
> I've tried classes, decorators, and passing the conditional using 'as',
> as suggested by Michael, so I disagree that with is not suitable here
> since I have yet to find a better alternative. If you want I can give
> pretty concrete examples of the ways they aren't as good.

I would be interested in your concrete examples.

As far as conditionally skipping the with body, where would that code go? In __enter__? How would it know whether or not to skip?

~Ethan~

From arnodel at gmail.com  Wed Feb 29 21:29:43 2012
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Wed, 29 Feb 2012 20:29:43 +0000
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>
Message-ID:

On 29 February 2012 17:49, Craig Yoshioka wrote:
> I've tried classes, decorators, and passing the conditional using 'as', as
> suggested by Michael, so I disagree that with is not suitable here since I
> have yet to find a better alternative. If you want I can give pretty
> concrete examples of the ways they aren't as good. Furthermore, I think it
> could be argued that it makes more sense to be able to safely skip the with
> body without the user of the with statement having to manually catch the
> exception themselves....
From PEP 343:

    But the final blow came when I read Raymond Chen's rant about
    flow-control macros[1].  Raymond argues convincingly that hiding
    flow control in macros makes your code inscrutable, and I find
    that his argument applies to Python as well as to C.

So it is explicitly stated that the with statement should not be
capable of controlling the flow.

--
Arnaud

[1] Raymond Chen's article on hidden flow control
http://blogs.msdn.com/oldnewthing/archive/2005/01/06/347666.aspx

From ethan at stoneleaf.us  Wed Feb 29 20:55:30 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 29 Feb 2012 11:55:30 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com>
Message-ID: <4F4E82B2.5050809@stoneleaf.us>

Craig Yoshioka wrote:
> Ok, I'll go clean them up to try and present them as concisely as possible. The code to skip the with body would have to go in the __enter__ method because whether the body should be executed is dependent on the semantics of the context being used. Imagine a context that looks like:
>
>     with uncached('file') as file:
>         # write data to file
>
> Making the context skippable only from __enter__ means the person writing the context can be more confident of the possible code paths. And the person writing the body code could always just 'skip' manually anyways by returning early, i.e. per Michael's suggestion:
>
>     with uncached('file') as file:
>         if not file:
>             return

Can you give an example of the code that would be in __enter__?

~Ethan~

From ncoghlan at gmail.com  Wed Feb 29 22:11:39 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Mar 2012 07:11:39 +1000
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>
Message-ID:

On Thu, Mar 1, 2012 at 6:29 AM, Arnaud Delobelle wrote:
> On 29 February 2012 17:49, Craig Yoshioka wrote:
>> I've tried classes, decorators, and passing the conditional using 'as', as
>> suggested by Michael, so I disagree that with is not suitable here since I
>> have yet to find a better alternative. If you want I can give pretty
>> concrete examples of the ways they aren't as good. Furthermore, I think it
>> could be argued that it makes more sense to be able to safely skip the with
>> body without the user of the with statement having to manually catch the
>> exception themselves....
>
> From PEP 343:
>
>     But the final blow came when I read Raymond Chen's rant about
>     flow-control macros[1].  Raymond argues convincingly that hiding
>     flow control in macros makes your code inscrutable, and I find
>     that his argument applies to Python as well as to C.
>
> So it is explicitly stated that the with statement should not be
> capable of controlling the flow.

Indeed.

Craig, if you want to pursue this to the extent of writing up a full
PEP, I suggest starting with the idea I briefly wrote up a while ago
[1].

Instead of changing the semantics of __enter__, add a new optional
method __entered__ to the protocol that executes inside the with
statement's implicit try/except block.

That is (glossing over the complexities in the real with statement
expansion), something roughly like:

    _exit = cm.__exit__
    _entered = getattr(cm, "__entered__", None)
    _var = cm.__enter__()
    try:
        if _entered is not None:
            _var = _entered(_var)
        VAR = _var  # if 'as' clause is present
        # with statement body
    finally:
        _exit(*sys.exc_info())

Then CMs would be free to skip directly from __entered__ to __exit__
by raising a custom exception. GeneratorContextManagers could
similarly be updated to handle the case where the underlying generator
doesn't yield.
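The proposed protocol is hypothetical (no Python version implements __entered__), but the sketched expansion can be emulated today with a helper function to experiment with the semantics; all names below are illustrative:

```python
import sys

class _SkipBody(Exception):
    # Raised from __entered__ to skip the body (hypothetical protocol).
    pass

def run_with(cm, body):
    # Emulates the sketched expansion: __enter__, then the optional
    # __entered__ hook inside the try block, then the body, with
    # __exit__ always invoked.
    _exit = cm.__exit__
    _entered = getattr(cm, "__entered__", None)
    _var = cm.__enter__()
    try:
        if _entered is not None:
            _var = _entered(_var)
        body(_var)
    except _SkipBody:
        pass
    finally:
        _exit(*sys.exc_info())

class MaybeRun:
    # A toy CM that decides in __entered__ whether the body runs.
    def __init__(self, run):
        self.run = run
    def __enter__(self):
        return "resource"
    def __entered__(self, var):
        if not self.run:
            raise _SkipBody
        return var
    def __exit__(self, *exc_info):
        return False
```

Here run_with(MaybeRun(False), body) calls __enter__ and __exit__ but never the body, which is the behavior PEP 377 was after.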
However, that last point highlights why I no longer like the idea: it
makes it *really* easy to accidentally create CMs that, instead of
throwing an exception if you try to reuse them inappropriately, will
instead silently skip the with statement body.

The additional expressiveness provided by such a construct is minimal,
but the additional risk of incorrectly silencing errors is quite high -
that's not a good trade-off for the overall language design.

[1] http://readthedocs.org/docs/ncoghlan_devs-python-notes/en/latest/pep_ideas/skip_with.html

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From barry at python.org  Wed Feb 29 22:35:56 2012
From: barry at python.org (Barry Warsaw)
Date: Wed, 29 Feb 2012 16:35:56 -0500
Subject: [Python-ideas] [OT] rst-mode (was: doctest)
References: <20120228171414.3cc7e38a@limelight.wooz.org> <87aa42ph4d.fsf_-_@benfinney.id.au>
Message-ID: <20120229163556.67d31009@resist.wooz.org>

On Feb 29, 2012, at 12:23 PM, Ben Finney wrote:

>Barry Warsaw writes:
>
>> Actually, Emacs users should use rst-mode, which has not-so-bad support
>> for separate file doctests. Of course, the mode is useful for reST
>> documentation even if your documentation is untested.
>
>Any idea where I should send bug reports for ‘rst-mode’? It's not clear
>to me who develops it.

From the head of the file that I have in my personal elisp:

    ;;; rst.el --- Mode for viewing and editing reStructuredText-documents.
    ;; Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
    ;;   Free Software Foundation, Inc.
    ;; Maintainer: Stefan Merten
    ;; Author: Martin Blais ,
    ;;         David Goodger ,
    ;;         Wei-Wei Guo

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:

From craigyk at me.com  Wed Feb 29 22:47:45 2012
From: craigyk at me.com (Craig Yoshioka)
Date: Wed, 29 Feb 2012 13:47:45 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <4F4E82B2.5050809@stoneleaf.us>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com> <4F4E82B2.5050809@stoneleaf.us>
Message-ID: <3395104A-4BE9-4372-BC6F-23AB5EA89A85@me.com>

On Feb 29, 2012, at 11:55 AM, Ethan Furman wrote:

> From PEP 343:
>
>     But the final blow came when I read Raymond Chen's rant about
>     flow-control macros[1].  Raymond argues convincingly that hiding
>     flow control in macros makes your code inscrutable, and I find
>     that his argument applies to Python as well as to C.
>
> So it is explicitly stated that the with statement should not be
> capable of controlling the flow.

I read the rant, and I agree in principle, but I think it's also a far stretch to draw a line between a very confusing non-standard example of macros in C and documentable behavior of a built-in statement. That is, the only reason you might say with would be hiding flow control is because people don't currently expect it to. I also think that when people use non-builtin context managers it's usually within a very specific... context (*dammit*), and so they are likely to look up why they are using an object as a context manager. That's where you would document the behavior:

    with uncached(path):
        # code here only executes if the path does not exist

> Indeed.
>
> Craig, if you want to pursue this to the extent of writing up a full
> PEP, I suggest starting with the idea I briefly wrote up a while ago
> [1].
>
> Instead of changing the semantics of __enter__, add a new optional
> method __entered__ to the protocol that executes inside the with
> statement's implicit try/except block.
>
> That is (glossing over the complexities in the real with statement
> expansion), something roughly like:
>
>     _exit = cm.__exit__
>     _entered = getattr(cm, "__entered__", None)
>     _var = cm.__enter__()
>     try:
>         if _entered is not None:
>             _var = _entered(_var)
>         VAR = _var  # if 'as' clause is present
>         # with statement body
>     finally:
>         _exit(*sys.exc_info())

That is an interesting alternative... do you see that as much better than __enter__ passing some sort of unique value to signal the skip? I can't say I'm enamored of doing it with a signal value, just thought it would be easier to implement (and not require more exception handling):

    _exit = cm.__exit__
    _var = cm.__enter__()
    if _var is SkipWithBody:
        _exit(None, None, None)
    else:
        try:
            VAR = _var  # if 'as' clause is present
            # with statement body
        finally:
            _exit(*sys.exc_info())

From ethan at stoneleaf.us  Wed Feb 29 23:03:00 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 29 Feb 2012 14:03:00 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <3395104A-4BE9-4372-BC6F-23AB5EA89A85@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com> <4F4E82B2.5050809@stoneleaf.us> <3395104A-4BE9-4372-BC6F-23AB5EA89A85@me.com>
Message-ID: <4F4EA094.30907@stoneleaf.us>

Craig Yoshioka wrote:
> On Feb 29, 2012, at 11:55 AM, Ethan Furman wrote:
>
>> From PEP 343:
>>
>>     But the final blow came when I read Raymond Chen's rant about
>>     flow-control macros[1].  Raymond argues convincingly that hiding
>>     flow control in macros makes your code inscrutable, and I find
>>     that his argument applies to Python as well as to C.
>>
>> So it is explicitly stated that the with statement should not be
>> capable of controlling the flow.
>>
>> I read the rant, and I agree in principle, but I think it's also a far stretch to draw a line between a very confusing non-standard example of macros in C and documentable behavior of a built-in statement. That is, the only reason you might say with would be hiding flow control is because people don't currently expect it to. I also think that when people use non-builtin context managers it's usually within a very specific... context (*dammit*), and so they are likely to look up why they are using an object as a context manager. That's where you would document the behavior:
>>
>>     with uncached(path):
>>         # code here only executes if the path does not exist

I am -1 on the idea.

if / while / for / try are *always* flow control.

Your proposal would have 'with' sometimes being flow control, and sometimes not, and the only way to know is to look at the object's code and/or docs. This makes for a lot more complication for very little gain.

~Ethan~

From ironfroggy at gmail.com  Wed Feb 29 23:23:05 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Wed, 29 Feb 2012 17:23:05 -0500
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <4F4EA094.30907@stoneleaf.us>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com> <4F4E82B2.5050809@stoneleaf.us> <3395104A-4BE9-4372-BC6F-23AB5EA89A85@me.com> <4F4EA094.30907@stoneleaf.us>
Message-ID:

On Feb 29, 2012 4:56 PM, "Ethan Furman" wrote:
>
> Craig Yoshioka wrote:
>>
>> On Feb 29, 2012, at 11:55 AM, Ethan Furman wrote:
>>
>>> From PEP 343:
>>>
>>>     But the final blow came when I read Raymond Chen's rant about
>>>     flow-control macros[1].  Raymond argues convincingly that hiding
>>>     flow control in macros makes your code inscrutable, and I find
>>>     that his argument applies to Python as well as to C.
>>>
>>> So it is explicitly stated that the with statement should not be
>>> capable of controlling the flow.
>>>
>>
>> I read the rant, and I agree in principle, but I think it's also a far stretch to draw a line between a very confusing non-standard example of macros in C and documentable behavior of a built-in statement. That is, the only reason you might say with would be hiding flow control is because people don't currently expect it to. I also think that when people use non-builtin context managers it's usually within a very specific... context (*dammit*), and so they are likely to look up why they are using an object as a context manager. That's where you would document the behavior:
>>
>>     with uncached(path):
>>         # code here only executes if the path does not exist
>
> I am -1 on the idea.
>
> if / while / for / try are *always* flow control.
>
> Your proposal would have 'with' sometimes being flow control, and sometimes not, and the only way to know is to look at the object's code and/or docs. This makes for a lot more complication for very little gain.
>
> ~Ethan~

I like the general idea, but a conditionally conditional control syntax is a readability nightmare. However, I wonder if the case in which the with statement acts as a conditional could be explicit, so a reader can distinguish between those that will always execute their body and those which may or may not:

    with cached(key):
        do_caching()
    else:
        update_exp(key)

> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ethan at stoneleaf.us  Wed Feb 29 22:48:26 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 29 Feb 2012 13:48:26 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <4F4E82B2.5050809@stoneleaf.us>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com> <4F4E82B2.5050809@stoneleaf.us>
Message-ID: <4F4E9D2A.9080709@stoneleaf.us>

Ethan Furman wrote:

re-posting to list

Craig Yoshioka wrote:
> Here is what the context might look like:
>
> class Uncached(object):
>     def __init__(self, path):
>         self.path = path
>         self.lock = path + '.locked'
>     def __enter__(self):
>         if os.path.exists(self.path):
>             return SkipWithBlock  # skips body, goes straight to __exit__
>         try:
>             os.close(os.open(self.lock, os.O_CREAT|os.O_EXCL|os.O_RDWR))
>         except OSError as e:
>             if e.errno != errno.EEXIST:
>                 raise
>             while os.path.exists(self.lock):
>                 time.sleep(0.1)
>             return self.__enter__()
>         return self.path
>     def __exit__(self, et, ev, st):
>         if os.path.exists(self.lock):
>             os.unlink(self.lock)
>
> class Cache(object):
>     def __init__(self, *args, **kwargs):
>         self.base = os.path.join(CACHE_DIR, hashon(args, kwargs))
>     #.....
>     def create(self, path):
>         return Uncached(os.path.join(self.base, path))
>     #.....
>
> def cached(func):
>     def wrapper(*args, **kwargs):
>         cache = Cache(*args, **kwargs)
>         return func(cache, *args, **kwargs)
>     return wrapper
>
> ---------------------------------------------------------------------
> Person using code:
> ---------------------------------------------------------------------
>
> @cached
> def createdata(cache, x):
>     path = cache.pathfor('output.data')
>     with cache.create(path) as cpath:
>         with open(cpath, 'wb') as cfile:
>             cfile.write(x*10000)
>     return path
>
> pool.map(createdata, ['x', 'x', 't', 'x', 't'])
>
> ---------------------------------------------------------------------
>
> so separate processes return the path to the cached data and create it
> if it doesn't exist, and even wait if another process is working on
> it.
>
> my collaborators could hopefully very easily wrap their programs with
> minimal effort using the cleanest syntax possible,
> and since inputs get hashed to consistent output paths for each
> wrapped function, the wrapped functions can be easily combined,
> chained, etc. and behind the scenes they are reusing as much work as
> possible.
>
> Here are the current possible alternatives:
>
> 1. use the passed var as a flag; they must insert the if for every use
> of the context -- if not, then cached results get recomputed:
>
> @cached
> def createdata(cache, x):
>     path = cache.pathfor('output.data')
>     with cache.create(path) as cpath:
>         if not cpath: return
>         with open(cpath, 'wb') as cfile:
>             cfile.write(x*10000)
>     return path
>
> 2. using a for loop and an iterator instead of a context is more
> fool-proof, but a bit confusing:
>
> @cached
> def createdata(cache, x):
>     path = cache.pathfor('output.data')
>     for cpath in cache.create(path):
>         if not cpath: return
>         with open(cpath, 'wb') as cfile:
>             cfile.write(x*10000)
>     return path
>
> 3. using a class, the outputs and caching function need to be specified
> separately so that calls can be scripted together; also a lot more
> boilerplate:
>
> class createdata(CachedWrapper):
>     def outputs(self, x):
>         self.outputs += [self.cache.pathfor('output.data')]
>     def tocache(self, x):
>         with open(self.outputs[0], 'wb') as cfile:
>             cfile.write(x*10000)